Beginner, strategy on first OpenCL project

Hi,

As I am new to OpenCL, I want to ask about the best strategy for parallelizing the work.

I have a host-allocated memory buffer (a stream).

So first of all, CL_MEM_USE_HOST_PTR should be used, because the memory for the stream is already allocated on the host by third-party software.

The data looks like this:

unsigned char data[packet_count][1024]

packet_count is variable and can be 1-100.

I receive the data in a callback function as a pointer:

void data_in(unsigned char *data, uint32_t len)

The packets fall into 3 groups, so I will first loop through them on the host to find out which
packets can be processed in parallel. The group is given by some bits in the first byte of each packet.

For example, suppose I receive a len of 10*1024 bytes.
This means I have 10 packets.
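
The arithmetic above can be sketched as a small host-side helper in C. The name packet_count_from_len and the choice to return 0 for an invalid length are assumptions made for illustration; only the 1024-byte packet size and the 1-100 range come from the description above.

```c
#include <stdint.h>

#define PACKET_SIZE 1024u   /* bytes per packet, as described above */
#define MAX_PACKETS 100u    /* upper bound on packet_count, as described above */

/* Hypothetical helper: returns the number of packets in a callback buffer
 * of `len` bytes, or 0 if `len` is not a whole number of packets or the
 * count falls outside 1..MAX_PACKETS. */
static uint32_t packet_count_from_len(uint32_t len)
{
    if (len == 0 || len % PACKET_SIZE != 0)
        return 0;

    uint32_t count = len / PACKET_SIZE;
    return (count <= MAX_PACKETS) ? count : 0;
}
```

Checking the length once in the callback keeps the later index arithmetic from reading past the end of the buffer.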

Let's assume that after checking the group bits I get this result:

Group 1:
	data[0][1024]
	data[3][1024]

Group 2:
	data[1][1024]
	data[2][1024]
	data[7][1024]
	data[9][1024]
	
Group 3:
	data[4][1024]
	data[5][1024]
	data[6][1024]
	data[8][1024]

As you can see, packets from different groups can be mixed.
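
The host-side grouping pass described above could look roughly like the following C sketch. The bit layout is an assumption: the post only says "some bits in the first byte", so group_of() here reads the group number (1..3) from the two lowest bits purely for illustration. The names group_lists and build_group_lists are hypothetical as well.

```c
#include <stddef.h>
#include <stdint.h>

#define PACKET_SIZE 1024u
#define MAX_PACKETS 100u
#define NUM_GROUPS  3u

/* ASSUMPTION: the group number (1..3) is stored in the two lowest bits
 * of the first byte. Replace the mask with the real bit layout. */
static unsigned group_of(const unsigned char *packet)
{
    return packet[0] & 0x03u;
}

/* One index list per group; indexes fit in a byte because there are
 * at most 100 packets. */
typedef struct {
    unsigned char index[NUM_GROUPS][MAX_PACKETS];
    size_t        count[NUM_GROUPS];
} group_lists;

/* Walk the packets once and record each packet index in its group's list.
 * The packet data itself is never copied. */
static void build_group_lists(const unsigned char *data, uint32_t len,
                              group_lists *out)
{
    uint32_t packet_count = len / PACKET_SIZE;
    if (packet_count > MAX_PACKETS)
        packet_count = MAX_PACKETS;

    for (size_t g = 0; g < NUM_GROUPS; ++g)
        out->count[g] = 0;

    for (uint32_t p = 0; p < packet_count; ++p) {
        unsigned g = group_of(data + (size_t)p * PACKET_SIZE);
        if (g < 1 || g > NUM_GROUPS)
            continue;  /* unknown group value: skip the packet */
        out->index[g - 1][out->count[g - 1]++] = (unsigned char)p;
    }
}
```

For the 10-packet example above, this would yield the lists { 0, 3 }, { 1, 2, 7, 9 } and { 4, 5, 6, 8 }, and each count[g] is the number of work-items needed for that group.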

So now OpenCL can run Group 1 first, then Group 2, and finally Group 3.
Each group has to be processed separately because each group has different init data.

So what is the best way to tell the __kernel which packets it should process?
The data is byte-aligned, so it can be addressed by pointer, but I want to avoid copying the data in memory
to save time.

My first idea is to do it this way:
Group 1:

size_t global_work_size[1] = { 2 };

Pass the __kernel a pointer to an array that holds the indexes of the packets to be processed:

uchar packets[2] = { 0, 3 };

Group 2:

size_t global_work_size[1] = { 4 };

indexes:

uchar packets[4] = { 1, 2, 7, 9 };

Group 3:

size_t global_work_size[1] = { 4 };

indexes:

uchar packets[4] = { 4, 5, 6, 8 };

The __kernel can then pick the right packet by its index like this:

__kernel void packets_handling(__global uchar in[], __global uchar packets[], __global uchar init_data[])
{
	int gid = get_global_id(0);
	uchar packet = packets[gid];

	// pointer to packet
	__global uchar *packet_data = &in[packet];
}

Should this work, or is there a better way to do this?

As there is no “Edit” option for the first post, here is a correction:

It should be:

__kernel void packets_handling(__global uchar in[], __global uchar packets[], __global uchar init_data[])
{
	int gid = get_global_id(0);
	uchar packet = packets[gid];
 
	// pointer to packet start (each packet is 1024 bytes; the pointer stays in __global memory)
	__global uchar *packet_data = &in[packet * 1024];
}
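
To check the corrected index arithmetic without a device, the address calculation of a single work-item can be mirrored in plain C on the host. This is only a sketch: packet_start is a hypothetical helper, and gid stands in for the value get_global_id(0) would return in the kernel.

```c
#include <stddef.h>

#define PACKET_SIZE 1024u   /* bytes per packet, as in the kernel above */

/* Host-side mirror of the kernel's address calculation: work-item `gid`
 * looks up its packet index in `packets` and gets a pointer to the first
 * byte of that packet inside the unmodified input buffer `in`. */
static const unsigned char *packet_start(const unsigned char *in,
                                         const unsigned char *packets,
                                         size_t gid)
{
    size_t packet = packets[gid];
    return in + packet * PACKET_SIZE;
}
```

With packets = { 1, 2, 7, 9 }, work-item 2 ends up at byte offset 7 * 1024 of the input, which is the start of data[7]; without the multiplication it would have pointed at byte 7 of the first packet.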