CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR

Hi all,

I thought GPUs were essentially vector programming devices, but how can that be when querying a GPU device with clGetDeviceInfo and CL_DEVICE_PREFERRED_VECTOR_WIDTH_<TYPE> returns only 1 for every TYPE {CHAR, INT, …}?
When I query my CPU instead, it returns different values depending on the type.
For example, when I use:

cl_uint char_width;
clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR,
                sizeof(char_width), &char_width, NULL);

char_width becomes 16 (the device is a CPU),

whereas with

clGetDeviceInfo(device, CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR,
                sizeof(char_width), &char_width, NULL);

char_width becomes 1 (the device is a GPU).

It seems the CPU is better at vector programming than the GPU? What is going on?

What kind of GPU are you using?

The “vector programming” comes from the fact that buffers can be thought of as vectors of data, with each work item processing one of the vector elements. The preferred vector width refers to the type of each buffer element, which should not be a vector type (e.g. float4, int2) on NVIDIA GPUs. Most recent CPUs have some kind of SIMD instructions; for example, SSE supports 128-bit vector instructions, which is exactly 16 chars.

Hi,

An NVIDIA GeForce 9600 GT.

Suppose clGetDeviceInfo with CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT returns 16 when the device is a CPU. What I mean by vector programming is something like this:

int16 a = (int16)(1, 2, 3, 4, …, 16);
int16 b = (int16)(1, 2, 3, 4, …, 16);
int16 c;
c = a + b;

When clGetDeviceInfo with CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT returns 1 for the GPU (NVIDIA GeForce 9600 GT), how can I write vector code like the above when the preferred width it reports is 1?

If I understand your reply correctly, I can write exactly that same code in a kernel?
Is that so?

You can use this, but you probably won’t benefit. Are you processing large arrays of int16 vectors, or just one vector? If the latter, you should do it on the CPU anyway.

I have massive data; in fact, my data comes in blocks of 16 ints each. Given that, I need to use vector data in the kernel.
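For blocks of 16 ints, a minimal kernel sketch could look like the following (hypothetical kernel and buffer names; one work item per block; this is OpenCL C kernel source, not tuned for any particular device):

```c
// Hypothetical kernel: each work item loads one 16-int block,
// adds it component-wise, and stores the result.
__kernel void add_blocks(__global const int16 *a,
                         __global const int16 *b,
                         __global int16 *c)
{
    size_t gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];  // one int16 vector add per work item
}
```

Even where the reported preferred width is 1, the compiler is free to lower this to scalar adds; the source-level vector type remains valid OpenCL C either way.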

The preferred vector width mostly has to do with memory accesses and register sizes. You’ll probably find that on a GPU they all equate to 128-bit quantities: char16, short8, int4, float4, etc. And even if the GPU is scalar internally, the values will probably match those of vector architectures because of the shared heritage and the workloads they are optimised for.

Whether using a specific size benefits your algorithm on given hardware is up to you to try and discover - there is no absolute answer.