input and output having different dimensions

Is it possible to have an OpenCL kernel that takes two float arrays of length N as input and writes its results to one output array of dimension NxN? Thanks!

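Yes, you can. Each buffer is sized independently, so the output buffer can hold N*N floats while each input holds N. Launch the kernel over a 2D global range of N x N so that each work-item writes one output element.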
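Here is a minimal sketch of what such a kernel could look like. The question does not say what the results are, so the outer product (out[i][j] = a[i] * b[j]) is only an assumed example, and the names outer_product, launch_outer_product, sim_gid and sim_gsize are made up for illustration. The two #defines and the get_global_* functions are host-side stand-ins so the kernel body compiles and runs as plain C without an OpenCL device; a real OpenCL C compiler provides them itself.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-ins (assumptions for this sketch) so the kernel below
   compiles as plain C. In real OpenCL C, kernel and global are keywords
   and get_global_id / get_global_size are built-ins. */
#define kernel
#define global
static size_t sim_gid[2];   /* hypothetical: global ID of the current work-item */
static size_t sim_gsize[2]; /* hypothetical: global size in each dimension */
static size_t get_global_id(unsigned dim)   { return sim_gid[dim]; }
static size_t get_global_size(unsigned dim) { return sim_gsize[dim]; }

/* Kernel: one work-item per output element. a and b hold N floats each,
   out holds N*N floats stored row by row. */
kernel void outer_product(global const float *a,
                          global const float *b,
                          global float *out)
{
    size_t i = get_global_id(0);   /* row index, selects a[i] */
    size_t j = get_global_id(1);   /* column index, selects b[j] */
    size_t n = get_global_size(1); /* row length, equal to N for an N x N range */
    out[i * n + j] = a[i] * b[j];
}

/* Emulates running the kernel over a 2D global range of N x N by
   visiting every (i, j) pair in turn on the CPU. */
void launch_outer_product(const float *a, const float *b,
                          float *out, size_t n)
{
    assert(a && b && out);
    sim_gsize[0] = n;
    sim_gsize[1] = n;
    for (sim_gid[0] = 0; sim_gid[0] < n; ++sim_gid[0])
        for (sim_gid[1] = 0; sim_gid[1] < n; ++sim_gid[1])
            outer_product(a, b, out);
}
```

On a real device the loop in launch_outer_product would be replaced by the host API: create the output buffer with clCreateBuffer using a size of N * N * sizeof(float), then call clEnqueueNDRangeKernel with work_dim set to 2 and a global work size of {N, N}.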

Thanks, I just figured out the logic of the NDRanges, work-items, memory objects, etc.