Undefined reference errors with image2d functions

Not sure if this is an implementation bug or a problem on my end, and I’m stuck, so to the forums!

I’m still writing some simple benchmarking functions for OpenCL, currently working on image samplers. The kernel in question copies an image to a buffer iteratively (yes, I know it would be more efficient to copy one pixel per work item; this is to test instruction latency):


__constant sampler_t sampler =	CLK_NORMALIZED_COORDS_TRUE |
							CLK_ADDRESS_REPEAT         |
							CLK_FILTER_NEAREST;
__kernel
void read_imagei_test( __global image2d_t inputImage, __global int4 *result_data ) 
{ 
	int rows = get_image_width(inputImage);
	int cols = get_image_height(inputImage);
	int2 coords;
	for( int x=0; x<rows ; x++ ) 
	{ 
		for( int y=0; y<cols ; y++ ) 
		{ 
			coords.x = x;
			coords.y = y;
			result_data[x*rows + y] = read_imagei(inputImage,sampler,coords); 
		} 
	} 
} 

Disregard the confusion between “__constant” and “const” that ATI creates, which is currently under discussion in another topic…

I can compile this fine within ATI Stream Kernel Analyzer, but when I build it in my program I get the following errors:


C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0xf): undefined reference to `__get_image_width_image2d'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0x24): undefined reference to `__get_image_width_image2d'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0x31): undefined reference to `__get_image_width_image2d'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0x68): undefined reference to `__read_imagei_image2d2i32'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0x78): undefined reference to `__get_image_width_image2d'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0xa4): undefined reference to `__get_image_width_image2d'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0xb4): undefined reference to `__get_image_width_image2d'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0xe8): undefined reference to `__read_imagei_image2d2i32'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0xf8): undefined reference to `__get_image_width_image2d'
C:\Users\agimenez\AppData\Local\Temp\OCL3DB9.tmp.obj:fake:(.text+0x105): undefined reference to `__get_image_width_image2d'

No idea where to go from these errors… is ATI the issue, or am I missing something else?

Yeah, those error messages don’t look very friendly. Have you tried declaring the image as “image2d_t inputImage” without “__global”?

Unrelated to your problem, haven’t you switched “rows” and “cols”? And isn’t there something wrong with “result_data[x*rows + y]”?
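
To make the indexing question concrete, here is a small host-side sketch in plain C++ (no OpenCL needed). The helper names `pixel_index` and `posted_index` are made up for illustration. It assumes the usual row-major layout, where `x` is the column and `y` is the row. With the posted formula, distinct pixels of a non-square image can land on the same element; on a square image it only transposes the output.

```cpp
#include <cstddef>

// Row-major index of pixel (x, y) in an image that is `width` pixels wide.
// x is the column (0 <= x < width), y is the row (0 <= y < height).
inline std::size_t pixel_index(std::size_t x, std::size_t y, std::size_t width)
{
    return y * width + x;
}

// The indexing from the posted kernel, kept only for comparison.
// There, `rows` was assigned the image width, so this is x * width + y.
inline std::size_t posted_index(std::size_t x, std::size_t y, std::size_t rows)
{
    return x * rows + y;
}
```

For example, on an image 4 pixels wide and 8 pixels high, the posted formula sends pixels (0,4) and (1,0) to the same element, while the row-major formula keeps them apart.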

There is also an error with the values in the sampler that you are using with read_imagei.
The sampler you are using with read_imagei in your kernel is defined to be: (CLK_NORMALIZED_COORDS_TRUE | CLK_ADDRESS_REPEAT | CLK_FILTER_NEAREST).

The spec states the following for read_imagei and read_imageui:

"The read_image{i|ui} calls support a nearest filter only. The filter_mode specified in sampler
must be set to CLK_FILTER_NEAREST; otherwise the values returned are undefined.

Furthermore, the read_image{i|ui} calls that take integer coordinates must use a sampler with
normalized coordinates set to CLK_NORMALIZED_COORDS_FALSE and addressing mode set to
CLK_ADDRESS_CLAMP_TO_EDGE, CLK_ADDRESS_CLAMP or CLK_ADDRESS_NONE; otherwise the values returned are undefined."

You cannot use CLK_NORMALIZED_COORDS_TRUE and CLK_ADDRESS_REPEAT which you are using. In the code it looks like you are passing un-normalized coords but the sampler is programmed in your code to use normalized coords.

Thanks for the replies!

I tried declaring the image without “__global”, but I’m getting the same errors as before, again only in my program.

On the switched “rows” and “cols”: absolutely right, I hacked together the loops to get a test kernel going. It should be fine now.

On the sampler: also true. I was messing with different values during compilation, but these should be un-normalized coordinates.

The latest kernel with the same errors as listed above:


__constant sampler_t sampler =	CLK_NORMALIZED_COORDS_FALSE |
						CLK_ADDRESS_CLAMP         |
						CLK_FILTER_NEAREST;
__kernel
void read_imagei_test( image2d_t inputImage, __global int4 *result_data ) 
{ 
	int2 coords;
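	int cols = get_image_width(inputImage);   // pixels per row
	int rows = get_image_height(inputImage);  // number of rows
	for( int x=0; x<cols; x++ ) 
	{ 
		for( int y=0; y<rows; y++ ) 
		{ 
			coords.x = x;
			coords.y = y;
			// row-major output: one row of cols pixels per y
			result_data[y*cols+x] = read_imagei(inputImage,sampler,coords); 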
		} 
	} 
} 

Given those error messages I’m inclined to think that the problem is in your host program, not in your kernel. They look more like linker errors than compiler errors. Perhaps you are trying to use this kernel on a device with no image support?

Andrew, if that was the case, wouldn’t you expect parse errors around keywords like __constant, __kernel, __global, etc?

Andrew, looks like that was the issue. I was under the impression that I’d get an error from clCreateImage2D if image support was not available, but because I have several devices with different capabilities, things get confused…

I’m creating an OCL device for all available hardware (one CPU, two GPU), and if I disable the CPU device (which doesn’t have image support), it compiles, but gives me an access violation when I run it. This looks like a problem on my end with buffer sizes, will post later if it turns out to be something more involved.

So the big problem was that I have three devices (all held within a private class), two of which have image support and various extensions, and one of which has no image support and more limited extensions. Looks like I’ll have to run the kernels device-specifically for “special” instructions like images or atomic functions.
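
For anyone in the same multi-device situation, here is a rough sketch of the filtering step in plain C++. The `DeviceInfo` struct and `image_capable` function are hypothetical names, not part of OpenCL. The assumption is that the `image_support` flag has already been filled in per device by querying `CL_DEVICE_IMAGE_SUPPORT` with `clGetDeviceInfo`, and that the filtered list is then what gets passed as the device list to `clBuildProgram` for programs that use images.

```cpp
#include <string>
#include <vector>

// Hypothetical record of what the host has already queried for each device.
struct DeviceInfo {
    std::string name;
    bool image_support; // assumed to hold the CL_DEVICE_IMAGE_SUPPORT result
};

// Return only the devices that reported image support, preserving order.
std::vector<DeviceInfo> image_capable(const std::vector<DeviceInfo>& all)
{
    std::vector<DeviceInfo> out;
    for (const DeviceInfo& d : all) {
        if (d.image_support) {
            out.push_back(d);
        }
    }
    return out;
}
```

With one CPU lacking image support and two GPUs that have it, this would leave only the two GPUs in the build list for the image kernel.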

Thanks all!

OK, still playing with this memory access violation. Here’s how I set up my image and output buffer:


	// create image for image function tests
	size_t result=0;
	unsigned char header [54];

	FILE *input = fopen("lena.bmp", "rb");
	
	// read and store header
	result = fread(header,1,54,input); 

	// move to start of data
	fseek (input, 54, SEEK_SET);

	unsigned char tab [512][512];

	//read data from bmp file
	int i, j;
	for (i=0; i < 512; i++)
	{
		for (j=0; j < 512; j++)
		{
			result = fread(&tab[i][j],1,3,input );
		}
	}

	fclose(input);

	// Allocate the OpenCL buffer memory objects for source and result on the device GMEM
	size_t width = 512;
	size_t height = 512;
	size_t rowpitch = 0;

	cl_image_format format;
	format.image_channel_order = CL_RGBA;
	format.image_channel_data_type = CL_UNSIGNED_INT8;

	cl_mem_flags flags;
	flags = CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY;

	cl_mem inImage;
	inImage = clCreateImage2D( clState->ctx, flags, &format, width, height, rowpitch, tab, &err );

	outBuff = clCreateBuffer(clState->ctx,  CL_MEM_READ_WRITE, width*height*sizeof(cl_uint), NULL, &err);

I’m running the kernel only ONCE, with the parameter

 size_t globalWorkSize[] = {1}; 

and when I call clWaitForEvents with the kernel queued up, I get a memory access violation.

unsigned char tab [512][512];

This is only allocating 512x512 bytes of memory (one byte per pixel). The code is then trying to read 4 bytes per pixel (RGBA). You need more memory.
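
The size mismatch can be spelled out with a little arithmetic. This is only a sketch: `image_bytes` is a made-up helper, and it assumes tightly packed rows (row pitch 0, as in the posted code).

```cpp
#include <cstddef>

// Bytes the host array must hold for a width x height image with `channels`
// channels of `bytes_per_channel` bytes each, assuming tightly packed rows.
constexpr std::size_t image_bytes(std::size_t width, std::size_t height,
                                  std::size_t channels,
                                  std::size_t bytes_per_channel)
{
    return width * height * channels * bytes_per_channel;
}

// What `unsigned char tab[512][512]` provides: one byte per pixel.
constexpr std::size_t kAllocated = 512 * 512;

// What CL_RGBA with CL_UNSIGNED_INT8 needs: four one-byte channels per pixel.
constexpr std::size_t kRequired = image_bytes(512, 512, 4, 1);
```

So the runtime expects four times as many bytes as `tab` holds. The same check may be worth running on the output side: the kernel stores one int4 (16 bytes) per pixel, while outBuff is created with sizeof(cl_uint) (4 bytes) per pixel.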

Thanks again David, that was the issue. For anyone googling how to read a BMP into an OpenCL kernel (as I have), here’s the working code:



	size_t result=0;
	unsigned char header [54]; // 54 bytes for header
	
	// dimensions
	size_t width = 512;
	size_t height = 512;
	size_t rowpitch = 0;

	FILE *input = fopen("lena.bmp", "rb");
	
	// store and skip header
	result = fread(header,1,54,input); 
	fseek (input, 54, SEEK_SET);

	unsigned int* pixels = new unsigned int[width*height];

	result = fread( pixels, 1, sizeof(int)*width*height, input );

	fclose(input);

	cl_image_format format;
	format.image_channel_order = CL_RGBA;
	format.image_channel_data_type = CL_UNSIGNED_INT8;

	cl_mem_flags flags;
	flags = CL_MEM_USE_HOST_PTR;

	cl_mem inImage;
	inImage = clCreateImage2D (	clContext, flags, &format, width, height, rowpitch, pixels, &err 	);
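
One caveat for people reusing this: the code above reads four bytes per pixel straight from the file. If the BMP is a 24-bit one (an assumption; the thread does not say which kind lena.bmp is), the file stores three bytes per pixel in B, G, R order, with each row padded to a multiple of four bytes, so the data would need expanding before it matches CL_RGBA / CL_UNSIGNED_INT8. A minimal sketch of that expansion follows; the function names are made up for illustration, and the row order is left as stored in the file.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes per stored row of a 24-bit BMP: 3 bytes per pixel,
// rounded up to the next multiple of 4.
inline std::size_t bmp24_row_stride(std::size_t width)
{
    return (width * 3 + 3) / 4 * 4;
}

// Expand padded 24-bit BGR rows into tightly packed RGBA with alpha 255.
// Returns an empty vector if `src` is too small for the given dimensions.
std::vector<std::uint8_t> bgr24_to_rgba(const std::vector<std::uint8_t>& src,
                                        std::size_t width, std::size_t height)
{
    const std::size_t stride = bmp24_row_stride(width);
    if (src.size() < stride * height) {
        return {};
    }
    std::vector<std::uint8_t> dst(width * height * 4);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint8_t* p = &src[y * stride + x * 3]; // B, G, R
            std::uint8_t* q = &dst[(y * width + x) * 4];      // R, G, B, A
            q[0] = p[2];
            q[1] = p[1];
            q[2] = p[0];
            q[3] = 255;
        }
    }
    return dst;
}
```

The resulting vector's data() pointer could then be handed to clCreateImage2D in place of pixels, as long as the vector outlives the image when CL_MEM_USE_HOST_PTR is used.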