C++ templated cl::Buffer creation problem on AMD platform

Hi.
I wrote two template member functions to simplify creating and writing
cl::Buffers, with the element type passed as a template argument:


template<typename T> 
	cl::Buffer MyOpenCLWrapper::CreateBuffer(const unsigned int numElements_, cl_mem_flags flag_) {
		const unsigned int numBytes = sizeof(T) * numElements_;	
		std::cout<<"Creating "<< typeid(T).name() <<" buffer with a length of "<<numBytes<<" bytes."<<std::endl;
		int bufferError = 0;
		cl::Buffer b = cl::Buffer(OpenCLClass::GetInstance().GetContext(), flag_, numBytes, &bufferError);
		if (bufferError != CL_SUCCESS) {
			std::cout<<"CL Buffer creation failed with error: "<<OpenCLClass::GetErrorCodeString(bufferError)<<std::endl;
			exit(-1);
		}
		return b;
	}

template<typename T> void MyOpenCLWrapper::WriteBuffer(cl::Buffer& buf_, const unsigned int numElements_, const bool flag_, T* dataIn_)
	{
		int err = 0;
		int size = numElements_*sizeof(T);
		std::cout<<"Writing buffer with "<<size<<" bytes to device."<<std::endl;
		err = m_Queue.enqueueWriteBuffer(buf_, flag_, 0, size, dataIn_);
		if (err != CL_SUCCESS) {
			std::cout<<"ERROR! Write buffer failed with error: "<<OpenCLClass::GetInstance().GetErrorCodeString(err)<<std::endl;
		}
	}

This lets me create buffers with calls such as


cl::Buffer clBuf0 =  dev.CreateBuffer<float>(numElements, CL_MEM_READ_ONLY);
cl::Buffer clBuf1 =  dev.CreateBuffer<unsigned int>(numElements, CL_MEM_READ_WRITE);

This code has been tested and works fine on an Nvidia 9600 GT.
On an AMD platform, the same code fails when writing to the buffer
with an INVALID_MEMORY_OBJECT error, although CreateBuffer(…) reports
CL_SUCCESS.
WriteBuffer itself does not seem to be the problem, since it works on AMD
when I create the Buffer object manually, i.e.


cl::Buffer clBuf1 =  cl::Buffer(OpenCLClass::GetInstance().GetContext(), CL_MEM_READ_ONLY, numElements*sizeof(float));

I am stuck with this. Any ideas what the problem could be?

Sorry, my fault. It was a pretty trivial error:
an argument was missing in this line of CreateBuffer:

cl::Buffer b = cl::Buffer(OpenCLClass::GetInstance().GetContext(), flag_, numBytes, &bufferError);

Changing it to

cl::Buffer b = cl::Buffer(OpenCLClass::GetInstance().GetContext(), flag_, numBytes, NULL, &bufferError);

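solved it. The fourth parameter of the cl::Buffer constructor is the host
pointer and the fifth is the error pointer, so the original call passed
&bufferError as the host pointer. bufferError was never written and kept
its initial value of 0, which is why CreateBuffer always reported CL_SUCCESS.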
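
For anyone who runs into the same thing, here is a minimal sketch of why the compiler accepted the four-argument call without complaint. It does not use OpenCL at all: makeFakeBuffer, FakeBuffer and kSentinel are hypothetical names made up for this illustration, and the function only mirrors the parameter order (flags, size, host pointer, error pointer) with the last two defaulted. Any object pointer converts implicitly to void*, so a pointer passed in fourth position ends up in the host-pointer slot.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in, NOT the real OpenCL API. It only mirrors the
// parameter order (flags, size, host_ptr = NULL, err = NULL) so the
// argument binding can be observed without an OpenCL device.
struct FakeBuffer {
    bool hostPtrGiven;  // a non-null pointer arrived in the host_ptr slot
    bool errPtrGiven;   // a non-null pointer arrived in the err slot
};

FakeBuffer makeFakeBuffer(unsigned long flags, std::size_t size,
                          void* hostPtr = nullptr, int* err = nullptr) {
    (void)flags;
    (void)size;
    if (err != nullptr) {
        *err = 0;  // "success" is reported only if the caller supplied err
    }
    return FakeBuffer{hostPtr != nullptr, err != nullptr};
}

const int kSentinel = -999;  // arbitrary value meaning "never written"

// Four arguments: int* converts implicitly to void*, so &status is taken
// as hostPtr and status is never written.
int statusAfterFourArgs() {
    int status = kSentinel;
    FakeBuffer b = makeFakeBuffer(0, 16, &status);
    assert(b.hostPtrGiven && !b.errPtrGiven);
    return status;
}

// Five arguments: the explicit null host pointer moves &status into err.
int statusAfterFiveArgs() {
    int status = kSentinel;
    FakeBuffer b = makeFakeBuffer(0, 16, nullptr, &status);
    assert(!b.hostPtrGiven && b.errPtrGiven);
    return status;
}
```

In this sketch statusAfterFourArgs() returns the untouched sentinel, while statusAfterFiveArgs() returns the value written by the callee. In my wrapper the error variable started at 0, so the untouched value happened to look like CL_SUCCESS; initializing it to a value that is not a valid success code would have exposed the mistake sooner.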