clBuildProgram returns CL_INVALID_BINARY for double data types on a GTX 480

Today I tried to compile my program on a CUDA machine:

  • platform: NVIDIA CUDA [0]
  • version: OpenCL 1.1 CUDA 4.2.1
  • device: GeForce GTX 480 [0]
  • compute units: 15
  • global memory: 1535MB
  • max. buffer size: 383MB
  • max. work group size: 1024
  • floating point precision: double
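
One way to double-check the "double" entry above is to look for cl_khr_fp64 in the device's extension string. A minimal sketch, assuming the string was already fetched with clGetDeviceInfo and CL_DEVICE_EXTENSIONS; the helper name has_extension is made up for illustration and is not part of OpenCL:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` appears as a whole, space-separated token in
 * `extensions`. A plain strstr() alone would also match a longer name
 * that merely contains `name` as a substring.
 * Assumption: `extensions` is the string reported for
 * CL_DEVICE_EXTENSIONS, which lists names separated by spaces. */
bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += 1; /* keep searching after this partial match */
    }
    return false;
}
```

Calling has_extension(ext, "cl_khr_fp64") on the reported string tells whether the pragma in the kernel below refers to an extension the device actually advertises.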

The OpenCL program won’t build with this kernel (clBuildProgram returns CL_INVALID_BINARY):

#pragma OPENCL EXTENSION cl_khr_fp64: enable

__kernel void test(__global double *a, __global double *b)
{
    int i = get_global_id(0);
    a[i] = 99.9;
    b[i] = a[i] + 77.7;
}

But it runs fine with this one:

// #pragma OPENCL EXTENSION cl_khr_fp64: enable

__kernel void test(__global float *a, __global float *b)
{
    int i = get_global_id(0);
    a[i] = 99.9f;
    b[i] = a[i] + 77.7f;
}

Both kernels build and execute correctly on a second machine with an HD 7970. It gets even stranger: the following kernel builds and executes on the GTX 480 machine:

#pragma OPENCL EXTENSION cl_khr_fp64: enable

__kernel void test(__global double *a, __global double *b)
{
    int i = get_global_id(0);
    a[i] = 99.9;
    b[i] = 77.7;
}

The only difference from the first kernel is that I’ve removed the read of a[i] in the last line. It turns out that the program won’t build if I use the double data type and read from either array a or b, but writing to both arrays is no problem. Can anyone think of a reason? The driver is 304.54 and the CUDA version is 5.0.35. I also tried the same code on another machine with a GTX 480, driver version 319.37 and CUDA 5.5.22, and got exactly the same behavior.

I’ve added example code which reproduces the behavior. Can somebody try the code on a GTX 480 please?

It works fine on a GT 650M with driver version 331.65 (OpenCL 1.1 CUDA 6.0.1).
Maybe you should try the latest driver.
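
It would also help to see the build log. clGetProgramBuildInfo with CL_PROGRAM_BUILD_LOG returns the compiler output for the failed build, and printing the status by name makes a report easier to read. A small sketch of the second part only; the helper name build_status_name is made up here, and the numeric values are the ones defined in CL/cl.h:

```c
#include <stddef.h>

/* Maps a few status codes that clBuildProgram can return to their names.
 * The numbers are hard-coded from CL/cl.h so the snippet compiles without
 * the OpenCL headers; in real host code, use the CL_* macros instead. */
const char *build_status_name(int status)
{
    switch (status) {
    case 0:   return "CL_SUCCESS";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -33: return "CL_INVALID_DEVICE";
    case -42: return "CL_INVALID_BINARY";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    case -44: return "CL_INVALID_PROGRAM";
    default:  return "unknown (see CL/cl.h)";
    }
}
```

Pass the value returned by clBuildProgram to this helper and post the name together with the build log.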

Updating to the newest driver version didn’t help. I’m experiencing the same problem on another machine with a GTX 285.
Nvidia is probably neglecting OpenCL support for older cards. That’s sad …