Hello,
I am trying to use an image2d memory object to perform operations on pixels of YUV images. For testing, I just use a uchar array that I copy into the image2d object.
It works well with small arrays.
The problem is that I cannot use arrays with dimensions bigger than 128*64 or 64*128 (8192 bytes), which is far too small since I need to work with HD images x).
Here I create my image2d:
cl_image_format image_format;
image_format.image_channel_data_type=CL_UNSIGNED_INT8;
image_format.image_channel_order=CL_RGBA;
//Create OpenCL Image
g_inputImage= clCreateImage2D (g_context,
CL_MEM_READ_ONLY,
&image_format,
(size_t)stride/4, //RGBA! 4 bytes per pixel
(size_t)arrayNrows,
0,
NULL,
&ret);
Here I copy the array into the image object:
//Parameters for clEnqueueWriteImage
size_t origin[3]={0, 0, 0};
size_t region[3]={stride/4,arrayNrows,1}; //RGBA! 4 bytes per pixel
err = clEnqueueWriteImage (g_cmd_queue,
g_inputImage,
CL_TRUE,
origin,
region,
stride,
0,
inputArray,
0,
NULL,
NULL);
I take into account the fact that I'm using RGBA, whereas my input array simulates a 1-component image (1 byte/pixel). Here the "stride" is just the width of my image, in bytes. The RGBA image created therefore has four times fewer pixels than the "real" picture, but that is not a problem for my test.
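As a minimal sketch of that packing arithmetic (the names `packed_dims` and `packed_image_dims` are hypothetical and not part of the program above, and the sketch assumes the stride is a multiple of 4):

```c
#include <stddef.h>

/* Hypothetical helper: dimensions of an RGBA / CL_UNSIGNED_INT8 image that
   packs a 1-byte-per-pixel array, 4 source bytes per RGBA pixel. */
typedef struct {
    size_t width;       /* in RGBA pixels */
    size_t height;      /* in rows */
    size_t row_pitch;   /* in bytes */
    size_t total_bytes; /* size of the packed data */
} packed_dims;

/* Assumes stride_bytes is a non-zero multiple of 4.
   Returns 0 on success, -1 otherwise. */
int packed_image_dims(size_t stride_bytes, size_t rows, packed_dims *out)
{
    if (out == NULL || stride_bytes == 0 || stride_bytes % 4 != 0)
        return -1;
    out->width = stride_bytes / 4;   /* 4 bytes per RGBA pixel */
    out->height = rows;
    out->row_pitch = stride_bytes;   /* rows are tightly packed */
    out->total_bytes = stride_bytes * rows;
    return 0;
}
```

For the 128*128 test array this gives a width of 32 pixels, a row pitch of 128 bytes and 16384 bytes in total, which agrees with the CL_IMAGE_WIDTH, CL_IMAGE_ROW_PITCH and MEM_SIZE values in the output further down.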
The program stops after clEnqueueNDRangeKernel returns 1 instead of CL_SUCCESS. Since 1 does not tell me anything about the error, I do not understand why it fails. The kernel is not executed at all.
Here is how I run the kernel:
// set work-item dimensions
size_t global_work_size[2];
global_work_size[1] = (size_t) stride/4; // using 4 element vectors!
global_work_size[0] = (size_t) arrayNrows;///2; //number of quad items in input array
size_t local_work_size[2];
local_work_size[1] = (size_t) stride/4;
local_work_size[0] = (size_t) 1;
nd=2; // execute kernel (2D)
if ((err = clEnqueueNDRangeKernel(g_cmd_queue, g_kernel, nd, NULL, global_work_size, local_work_size, 0, NULL, NULL) != CL_SUCCESS))
{
printf("ERROR: Failed to execute kernel\n");
return false;
}
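One thing worth noting about the value 1: in the `if` above, the closing parenthesis of the assignment comes after `!= CL_SUCCESS`, so C evaluates the comparison first and stores its result (0 or 1) in `err`, not the status returned by clEnqueueNDRangeKernel. The sketch below shows the difference with a stand-in function. `fake_enqueue` and the value -54 are placeholders for illustration only and are not what the real call returned here.

```c
#define SUCCESS 0  /* stand-in for CL_SUCCESS, which is 0 */

/* Hypothetical stand-in for an API call that fails with a negative status. */
int fake_enqueue(void) { return -54; }

/* Same shape as the if above: err receives the result of the comparison. */
int status_as_written(void)
{
    int err;
    if ((err = fake_enqueue() != SUCCESS)) {
        return err;   /* 1, whatever the real status was */
    }
    return SUCCESS;
}

/* Assignment parenthesized on its own: err keeps the real status. */
int status_parenthesized(void)
{
    int err;
    if ((err = fake_enqueue()) != SUCCESS) {
        return err;   /* the status returned by the call */
    }
    return SUCCESS;
}
```

With the second form, printing `err` in the error branch would show the actual OpenCL error code instead of 1.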
Here is the output, with some information about the CL_DEVICE, and the memory:
No command line arguments specified, using default values.
Initializing OpenCL runtime...
Reading file 'ker4_FLADIntra_sum_c.cl' (size 3145 bytes)
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS= 3
CL_DEVICE_MAX_WORK_ITEM_SIZES= 1024, 1024, 1024
CL_DEVICE_MAX_WORK_GROUP_SIZE= 1024
CL_DEVICE_ADDRESS_BITS= 64
CL_DEVICE_IMAGE2D_MAX_WIDTH= 8192
CL_DEVICE_IMAGE2D_MAX_HEIGHT= 8192
CL_DEVICE_IMAGE_SUPPORT= 1
CL_DEVICE_LOCAL_MEM_SIZE= 32768
CL_KERNEL_WORK_GROUP_SIZE= 1024
Input size is 16384 items
Executing OpenCL kernel...
MEM OBJECT INFO:
MEM_SIZE= 16384
IMAGE INFO:
CL_IMAGE_ELEMENT_SIZE= 4
CL_IMAGE_ROW_PITCH= 128
CL_IMAGE_WIDTH= 32
ERROR: Failed to execute kernel
In that test I was trying to run it with a 128*128 array (16 KB). As I split the data into 128 work-groups, the amount of data should not be the problem, but I am certainly missing something.
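The work-group arithmetic for this test can be checked with a small sketch like the following. The function `count_work_groups` is hypothetical. It applies the rules that, as far as the OpenCL 1.x specification describes them, hold when a local size is given: each global size must be divisible by the matching local size, each local size must stay within CL_DEVICE_MAX_WORK_ITEM_SIZES, and the product of the local sizes must stay within the maximum work-group size.

```c
#include <stddef.h>

/* Hypothetical helper: returns the number of work-groups for a 2D range,
   or 0 if the sizes break one of the rules listed above. */
size_t count_work_groups(const size_t global[2], const size_t local[2],
                         const size_t max_item_sizes[2], size_t max_group_size)
{
    size_t groups = 1;
    size_t items_per_group = 1;
    size_t i;

    for (i = 0; i < 2; ++i) {
        if (local[i] == 0 || local[i] > max_item_sizes[i])
            return 0;   /* local size out of range */
        if (global[i] == 0 || global[i] % local[i] != 0)
            return 0;   /* global size is not a multiple of local size */
        groups *= global[i] / local[i];
        items_per_group *= local[i];
    }
    if (items_per_group > max_group_size)
        return 0;       /* work-group has too many work-items */
    return groups;
}
```

With the values from this test (global sizes {128, 32}, local sizes {1, 32}, and the limits of 1024 printed above), the function returns 128 work-groups of 32 work-items each, so the sizes themselves appear to be within the device limits.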
Thank you for your help,
Chris