Regarding creating a very big array within an OpenCL Kernel

I have an OpenCL program intended to run on a CPU (4 cores, 2 GB RAM).
Inside the kernel, each work-item needs its own int array of 10000 * 2000 elements.
I cannot declare it as int array[10000][2000]; because of its size: the run fails if I try.
Can anyone suggest a way to make this work?
That is, the kernel running on each of the 4 cores should be able to create its own int array of 10000 * 2000 elements.
I tried passing global memory as a kernel argument, but the kernel instances appear to overwrite each other's data (the outputs were inconsistent from run to run).
Any help will be greatly appreciated, at least on how to handle the global memory without overwrites, if that is the way to go.

Are you using clCreateBuffer()? As far as I know, you can't pass 2D arrays to a kernel: you have to flatten them into a 1D array, since kernel arguments can't be pointers to pointers.

http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clCreateBuffer.html
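
To illustrate the flattening mentioned above, here is a minimal sketch in plain C of row-major indexing into a 1D array. The names ROWS, COLS and flat_index are made up for this example and are not part of the OpenCL API; the same arithmetic would work inside a kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Dimensions taken from the question; the names are hypothetical. */
enum { ROWS = 10000, COLS = 2000 };

/* Map a 2D position (row, col) to its offset in a row-major 1D array,
 * so that array2d[row][col] becomes array1d[flat_index(row, col)]. */
static size_t flat_index(size_t row, size_t col)
{
    assert(row < ROWS && col < COLS);
    return row * COLS + col;
}
```

With this layout, an access that would have been array[i][j] is written as array[flat_index(i, j)], i.e. array[i * 2000 + j].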

You can create one OpenCL buffer that is large enough to hold the arrays of all work-groups (I assume you are launching the kernel with 4 work-groups and a single work-item per work-group, since you want each kernel instance to execute on a separate core). Inside the kernel, you can then use the work-group ID to decide which section of the global buffer to use, with something like:

__global int *array = buffer + 10000 * 2000 * get_group_id(0);
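
As a rough check of the sizes involved, the sketch below works out in plain C where each work-group's slice starts and how large the shared buffer has to be. It assumes 4 work-groups and 4-byte ints (int is 32 bits in OpenCL C); NUM_GROUPS, slice_elems, slice_offset and total_bytes are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed setup: 4 work-groups, one 10000 x 2000 int array each. */
enum { ROWS = 10000, COLS = 2000, NUM_GROUPS = 4 };

/* Number of ints in one work-group's slice of the shared buffer. */
static size_t slice_elems(void)
{
    return (size_t)ROWS * COLS;
}

/* Offset, in ints, at which the slice of work-group `group_id` starts.
 * This mirrors buffer + 10000 * 2000 * get_group_id(0) in the kernel. */
static size_t slice_offset(size_t group_id)
{
    assert(group_id < NUM_GROUPS);
    return slice_elems() * group_id;
}

/* Size in bytes of the whole buffer, assuming 4-byte ints. */
static size_t total_bytes(void)
{
    return slice_elems() * NUM_GROUPS * 4;
}
```

Under these assumptions each slice is 80,000,000 bytes and the whole buffer is 320,000,000 bytes. That fits in 2 GB of RAM but may exceed what the device allows for a single allocation, so it is probably worth querying CL_DEVICE_MAX_MEM_ALLOC_SIZE with clGetDeviceInfo() first. Because the slices do not overlap, the work-groups should no longer overwrite each other's data.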