Constant memory issue

Hi, I'm getting a very strange issue with constant memory lately.

My code works fine if an int buffer (2 elements) is in global memory (__global), but when I switch it to __constant the results become completely wrong.

The weird part is that I tried not using this buffer in the kernel at all and the result is still wrong; with __global it works fine.

I don't really get why this address space qualifier breaks my code even when I don't use the buffer.

Here is the kernel signature from the PTX code:

.entry func(
	.param .align 4 .b8 func_param_0[52],
	.param .u32 func_param_1,
	.param .u32 func_param_2,
	.param .u32 .ptr .global .align 4 func_param_3,
	.param .u32 .ptr .global .align 64 func_param_4,
	.param .u32 .ptr .global .align 32 func_param_5,
	.param .u32 .ptr .global .align 16 func_param_6,
	.param .u32 .ptr .global .align 4 func_param_7,
	.param .u32 .ptr .const .align 4 func_param_8,
	.param .u32 .ptr .global .align 16 func_param_9,
	.param .u32 .ptr .global .align 16 func_param_10,
	.param .u32 .ptr .global .align 16 func_param_11,
	.param .u32 .ptr .global .align 4 func_param_12,
	.param .u32 .ptr .global .align 4 func_param_13,
	.param .u32 .ptr .global .align 4 func_param_14,
	.param .u32 .ptr .global .align 1 func_param_15,
	.param .u32 .ptr .global .align 4 func_param_16,
	.param .u32 .ptr .global .align 4 func_param_17,
	.param .u32 .ptr .global .align 16 func_param_18,
	.param .u32 .ptr .global .align 4 func_param_19,
	.param .u32 .ptr .global .align 16 func_param_20,
	.param .u32 .ptr .global .align 4 func_param_21,
	.param .u32 .ptr .global .align 16 func_param_22,
	.param .u32 .ptr .global .align 16 func_param_23
)

func_param_8 is the const one, and it completely messes up the code even when it is not used.

Does anyone have an idea? I am using a GeForce GTX 560 Ti.

Thanks

Roger512

Can you post your OpenCL kernel code, and also show us how you’re setting up your buffers?

Makes sense.

I don’t know why you’re talking about cudaMemcpyToSymbol() here; this is PTX generated from OpenCL code.

It’s a glitch in NVIDIA’s OpenCL implementation; my code works on ATI cards.

The only thing I can see so far is that there could be a problem because it is in the middle of your arguments. Have you tried moving the const argument to the end of the argument list? Then the address space of the arguments won’t switch from global to const and back.

That doesn’t work. I tried putting it at the beginning and at the end, and it does the same thing.

Maybe you’re using too much constant memory. It’s a very small area (with no performance benefits).
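One way to rule out the "too much constant memory" theory is to compare the size of every __constant argument against the limits the device reports. The sketch below is only that: constant_args_fit is a hypothetical helper, not an OpenCL function, and it assumes you have already read the two limits on the host with clGetDeviceInfo, using CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE (a cl_ulong, in bytes) and CL_DEVICE_MAX_CONSTANT_ARGS (a cl_uint).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of OpenCL): returns 1 if every __constant
 * kernel argument fits the limits reported by the device, 0 otherwise.
 *
 * sizes             - size in bytes of each __constant buffer argument
 * count             - number of __constant arguments in the kernel
 * max_buffer_bytes  - value read for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE
 * max_constant_args - value read for CL_DEVICE_MAX_CONSTANT_ARGS
 */
int constant_args_fit(const size_t *sizes, size_t count,
                      unsigned long long max_buffer_bytes,
                      unsigned int max_constant_args)
{
    assert(sizes != NULL || count == 0);

    /* Too many __constant arguments for a single kernel. */
    if (count > max_constant_args)
        return 0;

    for (size_t i = 0; i < count; ++i) {
        /* One constant buffer is larger than the device allows. */
        if (sizes[i] > max_buffer_bytes)
            return 0;
    }
    return 1;
}
```

For the kernel in this thread there is one __constant argument of 2 ints, so the call would be constant_args_fit(sizes, 1, max_bytes, max_args) with sizes[0] = 2 * sizeof(int). If I remember the spec correctly, a full-profile OpenCL 1.x device has to report at least 64 KB and 8 arguments, so a buffer that small should pass the check, which would point away from a size problem.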