Should clGet*Info increment the reference to the cl_* type?

I’m not sure if this is a bug in the OpenCL C++ bindinsg or in the Apple implementation I’m using it on. Currently, this code sometimes crashes (race condition) at program shutdown:


#include <vector>

#define __CL_ENABLE_EXCEPTIONS
#include "cl.hpp"

using namespace std;

int main(int, char *[])
{
  cl::Context context(CL_DEVICE_TYPE_ALL);
  
  vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();
  
  cl::CommandQueue queue(context, devices[0], 0);

  printf("Context reference count = %d
", context.getInfo<CL_CONTEXT_REFERENCE_COUNT>());

  cl::Context context_copy(context);

  printf("Context reference count = %d
", context.getInfo<CL_CONTEXT_REFERENCE_COUNT>());

  cl::Context ccontext = queue.getInfo<CL_QUEUE_CONTEXT>(); 

  printf("Context reference count = %d
", context.getInfo<CL_CONTEXT_REFERENCE_COUNT>());

  return 0;
}

I’ve tracked the problem down to the fact that clReleaseContext is being called one too many times. The output of the program is as follows:


Context reference count = 2
Context reference count = 3
Context reference count = 3

That is, the copy construction of a cl::Context is properly calling clRetainContext to increment the reference count, but the queue.getInfo<CL_QUEUE_CONTEXT>() call is not. This causes an extra clReleaseContext call to occur, which crashes the Apple implementation sometimes (which it shouldn’t, but that’s a separate issue).

So should a clGetInfo call that returns another cl_ type automatically increment that reference. I haven’t found anything in the standard stating one way or the other.

I think you are correct this is a bug in the C++ bindings.

I am still a little confused why reference counts are the values you are printing. Running on our implementation we get:

Context reference count = 1
Context reference count = 2
Context reference count = 2

I’m assuming that they are starting from different initial base values.

This example does also crash on our implementation and is due to the bug in cl.hpp.

I’ll start working on a fix then to submit. It doesn’t look simple since the GetInfoHelper is used on both POD and the cl_* types. Also, retain and release are protected methods.

FYI, I added the following two more printfs about the reference counts. It appears creating the CommandQueue increments the reference count of the context. Does the AMD implementation have an alternative mechanism to keep the cl_context around if there’s a command queue that needs it?

The following code:


#include <vector>

#define __CL_ENABLE_EXCEPTIONS
#include "cl.hpp"

using namespace std;

int main(int, char *[])
{
  cl::Context context(CL_DEVICE_TYPE_ALL);
 
  vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();
 
  cl::CommandQueue queue(context, devices[0], 0);

  printf("Context reference count = %d
", context.getInfo<CL_CONTEXT_REFERENCE_COUNT>());

  cl::Context context_copy(context);

  printf("Context reference count = %d
", context.getInfo<CL_CONTEXT_REFERENCE_COUNT>());

  cl::Context ccontext = queue.getInfo<CL_QUEUE_CONTEXT>();

  printf("Context reference count = %d
", context.getInfo<CL_CONTEXT_REFERENCE_COUNT>());

  return 0;
}

Outputs the following on the Apple implementation:


Context reference count = 1
Context reference count = 1
Context reference count = 2
Context reference count = 3
Context reference count = 3

I think the issue is the use of the vector assign operator in get*Info helper function. One possible fix is to replace this with a for loop and construct from the wrapper, forcing a increment in the ref count. I have not had time to try it yet.

The problem isn’t limited to the getInfo overload for vector returns. As the problem occurred in my example for with the queue.getInfo<CL_QUEUE_CONTEXT>() template method which returns a straight cl::Context. This method eventually drops into this template helper struct:


// GetInfo help struct
template <typename Functor, typename T>
struct GetInfoHelper
{
    static cl_int
    get(Functor f, cl_uint name, T* param)
    {
        return f(name, sizeof(T), param, NULL);
    }
};

What is needed is a specialization of this for anything that inherits from detail::Wrapper, something like the following:


// Specialized for any of the Wrapper classes
template <typename Func, typename T>
struct GetInfoHelper<Func, Wrapper<T> >
{
    static cl_int
    get(Func f, cl_uint name, Wrapper<T>* param)
    {
      cl_uint err = f(name, sizeof(Wrapper<T>), param, NULL);
      if (err != CL_SUCCESS) {
        return err;
      }

      return param->retain();
    }
};

Two problems with this though:
[ul]
I can’t get the compiler to recognize that it should fall into that specialization when getting passed a cl::Context object.
It calls the retain method of detail::Wrapper, which is currently protected, and probably should remain so.
[/ul]

So I’ve got an initial proof of concept fix:


diff --git a/cl.hpp b/cl.hpp
index 854c688..84d9f9b 100644
--- a/cl.hpp
+++ b/cl.hpp
@@ -1109,9 +1109,10 @@ class Wrapper
 {
 protected:
     typedef T cl_type;
-    cl_type object_;
 
 public:
+    cl_type object_;
+
     Wrapper() : object_(NULL) { }
 
     ~Wrapper()
@@ -1689,6 +1690,24 @@ public:
     }
 };
 
+namespace detail {
+// Specialized for cl_context
+template <typename Func>
+struct GetInfoHelper<Func, cl::Context>
+{
+    static cl_int
+    get(Func f, cl_uint name, cl::Context* param)
+    {
+      cl_uint err = f(name, sizeof(cl::Context), param, NULL);
+      if (err != CL_SUCCESS) {
+        return err;
+      }
+      
+      return ReferenceHandler<cl::Context::cl_type>::retain(param->object_);
+    }
+};
+}
+
 /*! \class Event
  * \brief Event interface for cl_event.
  */

I don’t like this for two reasons. It will require a lot of code duplication, though I can refactor the GetInfoHelper specialization into a macro. The tricky part was that the specialization has to be declared after cl::Context was fully specified because of the sizeof operator being used.

The next reason I don’t like this is I had to make the object_ member of detail::Wrapper public. This seemed slightly less egregious then making the retain method public. Though what would you think about making a cl::Context (and friends) constructor that simply takes the cl_type it is associated with and calls retain on it? This adds a constructor to the public interface, though theoretically this could have utility outside of the C++ bindings. For example, if using another third party library returns cl_* types, then the C++ bindings could be used to wrap around those returned values (properly calling the clRetain* function).