Incorrect result when increasing the local work size

Hello,

I’m trying to figure out why I get an incorrect result (for a simple matrix product) when I increase the local work size beyond 32, even though the maximum work-item size reported for my device is 1024.

Any ideas?
Thanks