The dot function (I could not link to the SDK definition due to my newbie forum status) takes either scalars or vectors (in the programming sense, i.e. short fixed-width types of up to four components such as float4). If your input matrix and vector are of arbitrary length, you will have to break them down into the largest vector components available. So as far as I know, you cannot apply the dot function in a single operation to arbitrarily sized inputs. You'll probably want each thread to work on float4 vectors.
I am pretty new to this whole thing, so there might be some inaccuracies.
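To make the "break it down" idea concrete, here is a rough sketch in plain C rather than kernel code, so it can be compiled anywhere. The names vec4, dot4 and chunked_dot are made up for this illustration; dot4 only mimics the arithmetic that dot() performs on two float4 values, and the leftover elements that do not fill a whole chunk are handled one at a time.

```c
#include <stddef.h>

/* Plain-C stand-in for a float4; the name vec4 is hypothetical. */
typedef struct { float x, y, z, w; } vec4;

/* Same arithmetic as a dot product of two 4-component vectors. */
float dot4(vec4 a, vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Dot product of two length-n arrays, taken four elements at a time. */
float chunked_dot(const float *a, const float *b, size_t n) {
    float sum = 0.0f;
    size_t i = 0;

    /* Full 4-wide chunks: one dot4 call per chunk. */
    for (; i + 4 <= n; i += 4) {
        vec4 va = { a[i], a[i + 1], a[i + 2], a[i + 3] };
        vec4 vb = { b[i], b[i + 1], b[i + 2], b[i + 3] };
        sum += dot4(va, vb);
    }

    /* Tail: elements left over when n is not a multiple of 4. */
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

In a real kernel you would load the chunks as float4 directly instead of copying element by element, and padding the inputs to a multiple of 4 is another way to avoid the tail loop.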
//printf("Greetings host from work-item [%03d]", i);
}
I think it would look something like that, though I have not tested whether it actually works.
EDIT: This was a horrible example by me! The intent was to show how vectors work, but the actual matrix computation doesn't make much sense. What the code above does is compute the dot product of each 4-element segment of the matrix array with the first 4 elements of the input vector.
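For what it's worth, a matrix-vector product that does make sense would pair each 4-element segment of a row with the matching segment of the input vector, not always the first one. Below is an untested-on-device sketch in plain C; dot4_at and matvec4 are names I invented, the matrix is assumed to be stored row-major, and the column count is assumed to be a multiple of 4. The outer loop over rows plays the role that separate work-items would play in a kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product of four consecutive floats starting at a and b; stands in
   for a 4-component vector dot on values loaded from those addresses. */
float dot4_at(const float *a, const float *b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* y = M * x for a row-major rows x cols matrix (assumption: cols % 4 == 0). */
void matvec4(const float *m, const float *x, float *y,
             size_t rows, size_t cols) {
    assert(cols % 4 == 0);
    for (size_t r = 0; r < rows; ++r) {      /* one row per "work-item" */
        float sum = 0.0f;
        for (size_t c = 0; c < cols; c += 4) {
            /* Segment c of row r meets segment c of x, not segment 0. */
            sum += dot4_at(&m[r * cols + c], &x[c]);
        }
        y[r] = sum;
    }
}
```

The only real difference from my broken example is the &x[c] offset: the vector segment advances together with the matrix segment, and the partial results for a row are summed into one output value.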