CL_INVALID_COMMAND_QUEUE error

I get CL_INVALID_COMMAND_QUEUE (-36) while trying to write to a buffer. I have no clue why, or how to debug this issue. Any ideas? I tried export CL_LOG_ERRORS=stdout to get a more meaningful error message, but nothing appeared. Do I have to call some extra functions for the log to appear?

Here is the code:

// GLOBALS

	extern cl::Context context;
	extern std::vector<cl::Device> devices;
	extern cl::CommandQueue queue;

	int initializeCL();
...

	int initializeCL()
	{
		cl_int status = 0;

		/*
		 * Have a look at the available platforms and pick either
		 * the AMD one if available or a reasonable default.
		 */
		vector<cl::Platform> platforms;
		status = cl::Platform::get(&platforms);
		if(status != CL_SUCCESS)
		{
			cerr << "Error: Platform::get() failed (" << status << ")\n";
			return 1;
		}

		vector<cl::Platform>::iterator i;
		if(platforms.size() > 0)
		{
			for(i = platforms.begin(); i != platforms.end(); ++i)
			{
				if(!strcmp((*i).getInfo<CL_PLATFORM_VENDOR>(&status).c_str(), "Advanced Micro Devices, Inc."))
				{
					break;
				}
			}
		}
		if(status != CL_SUCCESS)
		{
			cerr << "Error: Platform::getInfo() failed (" << status << ")\n";
			return 1;
		}

		/* 
		 * If we could find our platform, use it. Otherwise pass a NULL and get whatever the
		 * implementation thinks we should be using.
		 */
		cl_context_properties cps[3] = { CL_CONTEXT_PLATFORM, (cl_context_properties)(*i)(), 0 };

		// Create an OpenCL context
		cl::Context context(CL_DEVICE_TYPE_CPU, cps, NULL, NULL, &status);
		if (status != CL_SUCCESS) {
			cerr << "Error: Context::Context() failed (" << status << ")\n";
			return 1;
		}

		// Detect OpenCL devices
		vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();
		if (status != CL_SUCCESS) {
			cerr << "Error: Context::getInfo() failed (" << status << ")\n";
			return 1;
		}
		if (devices.size() == 0) {
			cerr << "Error: No device available\n";
			return 1;
		}

		// Create an OpenCL command queue
		cl::CommandQueue queue(context, devices[0], 0, &status);
		if (status != CL_SUCCESS)
		{
			cerr << "Error: CommandQueue::CommandQueue() failed (" << status << ")\n";
			return 1;
		}

		return 0;
	}

...

	template <typename T> Vector<T>::Vector(unsigned int size_):
		size(size_),
	{
		cl_int status = 0;

		// allocate value in GPU memory
	    value = cl::Buffer(context, 
		                   CL_MEM_READ_WRITE,
		                   sizeof(T) * size,
		                   &status);

	    if(status != CL_SUCCESS)
		{
			cerr << "Error: cl::Buffer failed. (" << status << ")\n";
	        exit(1);
		}
	}

...

	template <typename T> void Vector<T>::write(const vector<T> &source)
	{
		if (source.size() == size)
		{
			cl_int status = 0;

			/* Write data to buffer */
			status = queue.enqueueWriteBuffer(value,
			                                  CL_TRUE,
			                                  0,
			                                  size * sizeof(T),
			                                  &source.front(),
			                                  NULL,
			                                  NULL);

			if(status != CL_SUCCESS)
			{
				cerr << "Error queue::enqueueWriteBuffer failed. (" << status << ")\n";
				exit(1);
			}
		}
		else
		{
			cerr << "Error: write to vector failed. Sizes do not match\n";
			exit(1);
		}
	}

...

int main()
{
...
	Vector<cl_int> vec1(10);
	vector<cl_int> input(10, 3);
	vec1.write(input);
...
}

Output:
Error queue::enqueueWriteBuffer failed. (-36)

ps: I forgot to mention that I have initialization routine in main():

int main()
{
    if(initializeCL() == 1)
		exit(1);

...
   Vector<cl_int> vec1(10);
   vector<cl_int> input(10, 3);
   vec1.write(input);
...
}

pps: Queue initialization seems to pass without error messages; what causes the problem is this:

vec1.write(input);

I don’t know if you can pass a std::vector to an OCL buffer directly like this; you may have to use arrays to ensure that the memory you are trying to write is both contiguous and accessible. Does it work if you copy the vector to an array and write that instead?


memcpy( dstArray, &source[0], sizeof(T)*source.size() );

Otherwise, the initializeCL() function looks fine (obvious question: you ran it successfully before vec1.write(), correct?)

I like how you are hiding the OCL layer with Vectors though, it’d be nice if they worked together seamlessly.

I rewrote Vector::write() and it didn’t help :( … Passing &source[0] instead of dstArray didn’t work either. Yes, I run initializeCL() successfully before vec1.write()…

	template <typename T> void Vector<T>::write(const vector<T> &source)
	{
		if (source.size() == size)
		{
			cl_int status = 0;
			
			T *dstArray = new T[source.size()];
			memcpy( dstArray, &source[0], sizeof(T)*source.size() );
			
			/* Write data to buffer */
			status = queue.enqueueWriteBuffer(value,
			                                  CL_TRUE,
			                                  0,
			                                  size * sizeof(T),
			                                  dstArray,
			                                  NULL,
			                                  NULL);

			if(status != CL_SUCCESS)
			{
				cerr << "Error queue::enqueueWriteBuffer failed. (" << status << ")\n";
				exit(1);
			}
		}
		else
		{
			cerr << "Error: write to vector failed. Sizes do not match\n";
			exit(1);
		}
	}

Is there a way to get more details on the error?
export CL_LOG_ERRORS=stdout - didn’t provide any information in console…

You can try passing an event to enqueueWriteBuffer(…) and then querying that event’s status, but that isn’t always very useful, to be honest…

It’s strange that you’re getting CL_INVALID_COMMAND_QUEUE returned from enqueueWriteBuffer(…); the documentation doesn’t mention that as a possible error code from that function.

Hm… wish I could be of more help. I was thinking it may be larger than the maximum buffer size, but then you’d get a different error code.

I don’t quite understand what the source code is doing. On one hand there’s a global variable called ‘queue’ which, as far as I can see, is never initialized, and on the other hand there’s a local variable inside the function initializeCL() that is initialized and then automatically destroyed at the end of the function.

In other words, this line is not doing what you think it’s doing:

cl::CommandQueue queue(context, devices[0], 0, &status);

It initializes the local variable ‘queue’ and does nothing to the global variable of the same name. The uninitialized global variable ‘queue’ is the one that function Vector<T>::write() is accessing.

It’s strange that you’re getting CL_INVALID_COMMAND_QUEUE returned from enqueueWriteBuffer(…); the documentation doesn’t mention that as a possible error code from that function.

Which documentation are you referring to? Both the man page and the OpenCL 1.1 specification list CL_INVALID_COMMAND_QUEUE as the first error that may be returned by clEnqueueWriteBuffer.

david.garcia - you are correct (as always :) )

I changed those lines to:

context = cl::Context(CL_DEVICE_TYPE_CPU, cps, NULL, NULL, &status);
devices = context.getInfo<CL_CONTEXT_DEVICES>();
queue = cl::CommandQueue(context, devices[0], 0, &status);

and Vector’s constructor and write method are as in my first post…

but now I get another error: CL_INVALID_MEM_OBJECT (-38)?

2 more questions:

  1. Is there such a thing as “export CL_LOG_ERRORS=stdout” or not? And if there is, how is it supposed to work?

  2. Do I have to pass cl_int to my Vector class, like this: Vector<cl_int> vec1(10); or can it be a regular int? What’s the difference between them, if there is any?

thank you in advance!

Where do you get the CL_INVALID_MEM_OBJECT error? Is it in the write() function?

Is there such thing as “export CL_LOG_ERRORS=stdout” or not? and if there is - how is it supposed to work?

What you can do is pass a callback function pointer when you create the CL context. That function will be called every time there’s an error. See the documentation for clCreateContext(). You can make that function call into printf() if you want.

what’s the difference between them, if there is any?

cl_int is guaranteed to be a 32-bit signed integer in two’s complement, while ‘int’ could be almost anything, including a 16-bit integer. It is a good practice to always use cl_xxx types if you plan to pass them to OpenCL.

Yes, it’s in the same place: in the write function…

Could you check the value of variable Vector::value? That seems to be the only memory object passed to enqueueWriteBuffer().

How should I check it? I tried to “cout <<” it, but it refused…

then I tried to put:

	template <typename T> Vector<T>::Vector(unsigned int size_):
		size(size_),
	{
		cl_int status = 0;

		// allocate value in GPU memory
	    value = cl::Buffer(context, 
		                   CL_MEM_READ_WRITE,
		                   sizeof(T) * size,
		                   &status);

	    if(status != CL_SUCCESS)
		{
			cerr << "Error: cl::Buffer failed. (" << status << ")\n";
	        exit(1);
		}
		cout << "getInfo " << value.getInfo<CL_MEM_TYPE>() << endl;

	}

and in write:

	template <typename T> void Vector<T>::write(const vector<T> &source)
	{
		if (source.size() == size)
		{
			cl_int status = 0;
			cout << "getInfo 2: " << value.getInfo<CL_MEM_TYPE>() << endl;
			/* Write data to buffer */
			status = queue.enqueueWriteBuffer(value,
			                                  CL_TRUE,
			                                  0,
			                                  size * sizeof(T),
			                                  &source[0],
			                                  NULL,
			                                  NULL);

			if(status != CL_SUCCESS)
			{
				cerr << "Error queue::enqueueWriteBuffer failed. (" << status << ")\n";
				exit(1);
			}
		}
		else
		{
			cerr << "Error: write to vector failed. Sizes do not match\n";
			exit(1);
		}
	}

I don’t really know what I should get, but I expected the same result from both calls. That was not the case; I got:

getInfo 4294967259
getInfo 2: 32741

what is the problem here?

By “check” I mean make sure that the CL memory object that is wrapped by the C++ class is not NULL. I.e. “value.object_ != NULL”. You should be able to access that data from the debugger.

The value returned by that first “getInfo” suggests that the “value” variable is not initialized, which would explain why it is returning INVALID_MEM_OBJECT.

By “check” I mean make sure that the CL memory object that is wrapped by the C++ class is not NULL. I.e. “value.object_ != NULL”. You should be able to access that data from the debugger.

It doesn’t compile, since object_ is protected…

The value returned by that first “getInfo” suggests that the “value” variable is not initialized, which would explain why it is returning INVALID_MEM_OBJECT.

  1. why is it not initialized?! it passed:
	    if(status != CL_SUCCESS)
		{
			cerr << "Error: cl::Buffer failed. (" << status << ")\n";
	        exit(1);
		}
  2. where is it initialized if not in constructor? (it must happen somewhere, since in write it is already initialized, right?)

Here is the solution:

		// allocate value in GPU memory
	    value = cl::Buffer(context, 
		                   CL_MEM_READ_WRITE,
		                   sizeof(T) * size,
	                       NULL,
		                   &status);

I missed a NULL in the cl::Buffer constructor call: the host pointer argument has to come before &status…

But how did it compile? I need to have a look at the C++ bindings to understand this one.