In OpenCL you can set the amount of local memory a kernel gets from host code: for a __local parameter, pass the size in bytes as the argument size and NULL as the argument value, with the command
Code :
clSetKernelArg(myKernel, 3, localHeight * localWidth * sizeof(float), NULL);
where the kernel looks like
Code :
__kernel void matrixMul_gpu(
	__global float* A, __global float* B, __global float* C,
	__local float * As, __local float * Bs,
	unsigned int HeightA, unsigned int WidthB, unsigned int WidthAHeightB
This works well if I want to create 1D arrays in local memory, but what if I want to make As and Bs 2D arrays whose sizes are specified dynamically from the host code?
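
For reference, one common workaround is to keep the buffer flat and compute the 2D index by hand, since a local buffer sized through clSetKernelArg arrives in the kernel as a plain pointer. Below is a minimal sketch of that indexing in plain C so it can be tried on the host. The helper names idx2d, local_tile_bytes and fill_tile are hypothetical and not part of OpenCL or of the kernel above.
Code :
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map (row, col) onto a flat row-major buffer
   that is treated as buffer[height][width]. */
static size_t idx2d(size_t row, size_t col, size_t width)
{
    assert(col < width);           /* column must stay inside one row */
    return row * width + col;
}

/* Hypothetical helper: number of bytes to request for one local tile,
   i.e. the size value passed to clSetKernelArg for a __local argument. */
static size_t local_tile_bytes(size_t height, size_t width)
{
    return height * width * sizeof(float);
}

/* Fill a flat tile as if it were a 2D array. Inside a kernel, the same
   index expression would be applied to a __local float* such as As. */
static void fill_tile(float *tile, size_t height, size_t width)
{
    for (size_t r = 0; r < height; ++r)
        for (size_t c = 0; c < width; ++c)
            tile[idx2d(r, c, width)] = (float)(r * 10 + c);
}
```
In the kernel this would presumably look like As[row * localWidth + col], assuming the tile width is made available to the kernel, for example as an extra argument or a compile-time define.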