Thread: Mmaped buffers: Memory leaks and GART errors

  1. #1
    Join Date
    May 2013

    We are trying to process images from a framegrabber. The data arrives in an mmaped buffer.

    Every time we use the mmaped buffer with OpenCL, an error message appears in syslog and the Slab figure in /proc/meminfo increases. On our system we are leaking around 1 MB per second!

    syslog error: [fglrx:MCIL_LockMemory] *ERROR* Could not lock memory into GART space

    To demonstrate the error we have written a simple program with a loop that maps and unmaps an mmaped buffer. While the program runs, check dmesg and watch /proc/meminfo. We are using the AMD implementation of OpenCL: driver linux_x64 13.4.

    Is there anything special we should do when handling mmaped memory?

    This is the demo program:

    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <CL/opencl.h>

    #define BUF_SIZE 4096

    /* Abort with a message if an OpenCL call did not return CL_SUCCESS. */
    #define cl_err_exit(errnum, errstring) do { \
        if ((errnum) != CL_SUCCESS) { \
            fprintf(stderr, "%s failed on line %d: %d\n", (errstring), __LINE__, (errnum)); \
            exit(1); \
        } \
    } while (0)

    int main(void)
    {
        cl_platform_id platform_id = NULL;
        cl_device_id device_id = NULL;
        cl_context context = NULL;
        cl_command_queue gpu_queue = NULL;
        cl_int err;

        cl_mem pinned_buffer = NULL;
        cl_mem device_buffer = NULL;

        void *pinned_mem;   /* the mmaped region */
        void *mapped;       /* pointer returned by clEnqueueMapBuffer */
        int fd;

        fd = open("/dev/mem", O_RDONLY);
        if (fd == -1) {
            perror("open");
            return 1;
        }

        pinned_mem = mmap(NULL, BUF_SIZE, PROT_READ, MAP_SHARED, fd, 0);
        if (pinned_mem == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        err = clGetPlatformIDs(1, &platform_id, NULL);
        cl_err_exit(err, "clGetPlatformIDs");

        err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 1, &device_id, NULL);
        cl_err_exit(err, "clGetDeviceIDs");

        context = clCreateContext(NULL, 1, &device_id, NULL, NULL, &err);
        cl_err_exit(err, "clCreateContext");
        gpu_queue = clCreateCommandQueue(context, device_id, 0, &err);
        cl_err_exit(err, "clCreateCommandQueue");

        while (1) {
            /* Wrap the mmaped region in an OpenCL buffer. */
            pinned_buffer = clCreateBuffer(context, CL_MEM_USE_HOST_PTR, BUF_SIZE, pinned_mem, &err);
            cl_err_exit(err, "clCreateBuffer");

            /* Pass &err here too, otherwise the check below tests a stale value. */
            device_buffer = clCreateBuffer(context, CL_MEM_READ_ONLY, BUF_SIZE, NULL, &err);
            cl_err_exit(err, "clCreateBuffer");

            mapped = clEnqueueMapBuffer(gpu_queue, pinned_buffer, CL_TRUE, CL_MAP_WRITE, 0, BUF_SIZE, 0, NULL, NULL, &err);
            cl_err_exit(err, "clEnqueueMapBuffer");

            err = clEnqueueWriteBuffer(gpu_queue, device_buffer, CL_FALSE, 0, BUF_SIZE, mapped, 0, NULL, NULL);
            cl_err_exit(err, "clEnqueueWriteBuffer");

            err = clEnqueueUnmapMemObject(gpu_queue, pinned_buffer, mapped, 0, NULL, NULL);
            cl_err_exit(err, "clEnqueueUnmapMemObject");

            /* Drain the queue, then release this iteration's buffers so the
               program itself does not accumulate cl_mem objects. */
            clFinish(gpu_queue);
            clReleaseMemObject(device_buffer);
            clReleaseMemObject(pinned_buffer);
        }

        /* Not reached: the loop runs until the process is killed. */
        munmap(pinned_mem, BUF_SIZE);
        close(fd);
        return 0;
    }
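
    To put a number on the leak while the demo runs, the Slab line of /proc/meminfo can be polled from a small helper. This is only a sketch: the function names read_slab_kb and current_slab_kb are made up for illustration, and it assumes the usual "Slab: <n> kB" line format.

```c
#include <assert.h>
#include <stdio.h>

/* Return the "Slab:" value in kB from a meminfo-style stream,
 * or -1 if no such line is found. */
long read_slab_kb(FILE *fp)
{
    char line[256];
    long kb;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        /* Matches only lines that begin with "Slab:"; other
         * entries such as "SReclaimable:" are skipped. */
        if (sscanf(line, "Slab: %ld kB", &kb) == 1)
            return kb;
    }
    return -1;
}

/* Convenience wrapper that reads the real file (Linux only). */
long current_slab_kb(void)
{
    FILE *fp = fopen("/proc/meminfo", "r");
    long kb;

    if (fp == NULL)
        return -1;
    kb = read_slab_kb(fp);
    fclose(fp);
    return kb;
}
```

    Calling current_slab_kb() once per second and printing the difference from the previous reading gives the growth rate directly.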

  2. #2
    Senior Member
    Join Date
    Oct 2012
    Correct me if I'm wrong, but you don't have to call clEnqueueWriteBuffer to update the data. All you have to do is write to the mapped buffer and unmap it; that should sync the memory.

  3. #3
    Join Date
    May 2013
    Hello clint3112

    Right now there is no OpenCL kernel in the example, to rule a kernel out as the source of the problem. I believe that if there were a kernel launch before clEnqueueUnmapMemObject, you would need the clEnqueueWriteBuffer for the GPU to process valid data.

    Anyway: even if I comment out the clEnqueueWriteBuffer, I still get the "Could not lock memory into GART space" error and the Slab keeps rising.

  4. #4
    Junior Member
    Join Date
    Dec 2011
    Not sure exactly why you're doing the mmap yourself. Pinning the memory won't necessarily give you the performance you require. To get it working, just let the runtime allocate the memory for you: AMD should pin it if you use CL_MEM_ALLOC_HOST_PTR (the runtime creates the space). The point is that to gain an advantage from pinned memory, it needs to be both pinned and DMA host-accessible.

    Also, you don't need the clEnqueueWriteBuffer, as clint said. You just map -> edit -> unmap.

  5. #5
    Join Date
    May 2013

    In my real application I am using a framegrabber that places the frames in a memory-mapped area. I don't want to copy those frames to other pinned memory; I want to use them straight from the framegrabber. When I try to do that I see the GART error in syslog and the driver leaks memory.

    This error happens with every memory-mapped area, not only with that particular framegrabber. We wrote that program so everyone could experiment with the error; it is cheaper than buying a framegrabber.

    The program I posted shows the GART error, and you can watch the Slab figure keep rising on an AMD implementation.
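
    Since the demo opens /dev/mem, which normally needs root, a file-backed mapping may be an easier way to try this. The helper below is a hypothetical sketch (map_readonly is not part of the posted program) that maps any regular file with the same flags the demo uses; whether a regular-file mapping triggers the same driver error is an assumption based on the statement above, not something verified here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first `len` bytes of `path` read-only and shared,
 * using the same mmap flags as the demo program.
 * Returns the mapping, or NULL if open() or mmap() fails. */
void *map_readonly(const char *path, size_t len)
{
    void *p;
    int fd;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close */

    return p == MAP_FAILED ? NULL : p;
}
```

    The returned pointer can be passed to clCreateBuffer with CL_MEM_USE_HOST_PTR in place of pinned_mem, and released with munmap(ptr, len) when done.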


  6. #6
    Junior Member
    Join Date
    Dec 2011
    Seems the memory needs to be device accessible.

    Have you tried nvidia?
