Hi all!
I already started a thread in the NVIDIA forums, but it looks like no one there is interested in anything other than CUDA. Here is my problem again:
Some months ago I started working on a fractal raytracer in C++/OpenCL. I have already run into a lot of bugs in the NVIDIA OpenCL compiler (access violations when declaring variables without using them), but I was always able to find a workaround. This time it looks a bit more serious: in my raytracer I have to select the color of the nearest shape. I implemented this using a switch, but I always get a CL_OUT_OF_RESOURCES error when reading the output buffer (CL_INVALID_COMMAND_QUEUE if I call clFinish() before the read). This happens only with NVIDIA GPUs; it works correctly on AMD GPUs and Intel CPUs. This is the important part of the code:
TracerOut Trace(CameraOut in, SceneParams params)
{
    float dist[2];
    TracerOut out;
    Mandelbulb1_OO Mandelbulb1_oo = Mandelbulb1_Object(in, params);
    dist[0] = distance(Mandelbulb1_oo.intersection, in.origin);
    Mandelbulb2_OO Mandelbulb2_oo = Mandelbulb2_Object(in, params);
    dist[1] = distance(Mandelbulb2_oo.intersection, in.origin);
    uint nearestId = 0;
    float nearestDist = 10000000.0f;
    for (uint i = 0; i < 2; i++)
    {
        if (dist[i] < nearestDist)
        {
            nearestDist = dist[i];
            nearestId = i;
        }
    }
    // Trick needed to avoid access violation bug
    Mandelbulb1_SO Mandelbulb1_so;
    Mandelbulb1_so.color.x = 0.0f;
    Mandelbulb2_SO Mandelbulb2_so;
    Mandelbulb2_so.color.x = 0.0f;
    switch (nearestId)
    {
        case 0:
            Mandelbulb1_so = Mandelbulb1_Shader(in, Mandelbulb1_oo, params);
            out.color = Mandelbulb1_so.color;
            break;
        case 1:
            Mandelbulb2_so = Mandelbulb2_Shader(in, Mandelbulb2_oo, params);
            out.color = Mandelbulb2_so.color;
            break;
        default:
            out.color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
            break;
    }
    return out;
}
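A note on the two error codes mentioned above the listing: OpenCL commands are enqueued asynchronously, so a failure during kernel execution may only be reported by a later blocking call such as clFinish() or a blocking clEnqueueReadBuffer(), which would be consistent with the behaviour described. Below is a minimal sketch in plain C of how each host call could be checked to see where the error first surfaces. The helper names cl_error_name and cl_check are hypothetical, and the numeric values are copied from the Khronos CL/cl.h header only so that the snippet compiles without the OpenCL SDK:

```c
#include <stdio.h>

/* Values as defined in the Khronos CL/cl.h header. They are redefined here
   only so that this sketch compiles without the OpenCL SDK installed. */
#ifndef CL_SUCCESS
#define CL_SUCCESS 0
#endif
#ifndef CL_OUT_OF_RESOURCES
#define CL_OUT_OF_RESOURCES (-5)
#endif
#ifndef CL_INVALID_COMMAND_QUEUE
#define CL_INVALID_COMMAND_QUEUE (-36)
#endif

/* Hypothetical helper: names only the error codes discussed in this post. */
const char *cl_error_name(int err)
{
    switch (err) {
    case CL_SUCCESS:               return "CL_SUCCESS";
    case CL_OUT_OF_RESOURCES:      return "CL_OUT_OF_RESOURCES";
    case CL_INVALID_COMMAND_QUEUE: return "CL_INVALID_COMMAND_QUEUE";
    default:                       return "other OpenCL error";
    }
}

/* Hypothetical helper: prints the failing step to stderr.
   Returns 1 if err is CL_SUCCESS and 0 otherwise. */
int cl_check(int err, const char *step)
{
    if (err != CL_SUCCESS) {
        fprintf(stderr, "%s failed: %s (%d)\n", step, cl_error_name(err), err);
        return 0;
    }
    return 1;
}
```

Each host call would then be wrapped, for example cl_check(clFinish(queue), "clFinish"), where queue is assumed to be the command queue in use. Calling clFinish() right after enqueueing the kernel narrows the failure down to the kernel itself rather than to the read.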
I suspected that the switch construct might be causing the problem, so I tried plain if statements instead:
TracerOut Trace(CameraOut in, SceneParams params)
{
    //...
    Mandelbulb1_SO Mandelbulb1_so;
    Mandelbulb1_so.color.x = 0.0f;
    Mandelbulb2_SO Mandelbulb2_so;
    Mandelbulb2_so.color.x = 0.0f;
    out.color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
    Mandelbulb1_so = Mandelbulb1_Shader(in, Mandelbulb1_oo, params);
    Mandelbulb2_so = Mandelbulb2_Shader(in, Mandelbulb2_oo, params);
    if (nearestId == 0)
        out.color = Mandelbulb1_so.color;
    if (nearestId == 1)
        out.color = Mandelbulb2_so.color;
    return out;
}
And I still have the same problem. Removing one or both of the if statements solves the problem:
TracerOut Trace(CameraOut in, SceneParams params)
{
    //...
    out.color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);
    Mandelbulb1_so = Mandelbulb1_Shader(in, Mandelbulb1_oo, params);
    Mandelbulb2_so = Mandelbulb2_Shader(in, Mandelbulb2_oo, params);
    out.color = Mandelbulb1_so.color;
    if (nearestId == 1)
        out.color = Mandelbulb2_so.color;
    return out;
}
Removing the declarations of the structs also lets it run fine:
TracerOut Trace(CameraOut in, SceneParams params)
{
    //...
    switch (nearestId)
    {
        case 0:
            out.color = Mandelbulb1_Shader(in, Mandelbulb1_oo, params).color;
            break;
        case 1:
            out.color = Mandelbulb2_Shader(in, Mandelbulb2_oo, params).color;
            break;
        default:
            break;
    }
    return out;
}
But of course this is not what I want. I know that conditionals are very bad for GPUs, but at the moment I don’t have another solution (does anyone have suggestions?); optimization will come later. This is supposed to work, so I believe this is a bug in the NVIDIA OpenCL driver, right? Has anyone had a similar problem? Is a fix coming?
I tried to run my program on different PCs. I can run it without problems on the following devices:
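On the question of alternatives to conditionals, one common idea is to compute both colors and blend them with a 0/1 weight instead of branching. The following is only a sketch in plain C under stated assumptions: Float4 is a stand-in for OpenCL's float4, blend_select is a hypothetical name, and nothing here is known to avoid the NVIDIA problem. In OpenCL C itself the built-ins mix() or select() could play the same role.

```c
/* Stand-in for OpenCL's float4 so that the sketch compiles as plain C. */
typedef struct { float x, y, z, w; } Float4;

/* Hypothetical helper: returns a when nearest_id is 0 and b when it is 1,
   using a 0/1 weight instead of an if or a switch. Any other id also
   yields a, which differs from the default case of the original switch. */
Float4 blend_select(Float4 a, Float4 b, unsigned int nearest_id)
{
    float t = (float)(nearest_id == 1u); /* comparison yields 0 or 1 */
    float s = 1.0f - t;                  /* weight of the first color */
    Float4 r;
    r.x = a.x * s + b.x * t;
    r.y = a.y * s + b.y * t;
    r.z = a.z * s + b.z * t;
    r.w = a.w * s + b.w * t;
    return r;
}
```

The trade-off is that both shaders have to be evaluated for every ray, and a NaN or infinity in the color that is not selected still contaminates the result, because multiplying it by zero gives NaN.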
[ul]
[li]CPU Intel i7 2600K[/li]
[li]CPU Intel i7 920[/li]
[li]CPU Intel i7 2620M[/li]
[li]GPU Intel HD Graphics 3000[/li]
[li]GPU AMD HD 6470M[/li]
[/ul]
I get the CL_OUT_OF_RESOURCES / CL_INVALID_COMMAND_QUEUE errors on:
[ul]
[li]GPU NVIDIA GTX 680 (EVGA), driver 320.18[/li]
[li]GPU NVIDIA GTX 560 Ti OC (Gigabyte), driver 320.18[/li]
[li]GPU NVIDIA GTX 470 (Zotac), driver 320.18[/li]
[/ul]
One last remark: some parts of the code may look badly written. This is because I’m not writing the OpenCL code directly; I’m writing a program that assembles OpenCL scripts dynamically and then runs them.
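To give an idea of what assembling the scripts dynamically means, here is a minimal sketch in plain C that emits a color switch like the one in the last listing from a list of shape names. The function emit_color_switch and its interface are hypothetical and are not the actual generator; only the emitted text follows the listings in this post.

```c
#include <stdio.h>

/* Hypothetical generator: writes an OpenCL switch over the given shape
   names into buf. Returns the number of characters written (excluding
   the terminating NUL), or -1 if buf is too small. */
int emit_color_switch(char *buf, size_t size,
                      const char *const *names, size_t count)
{
    size_t used;
    int n = snprintf(buf, size, "switch (nearestId)\n{\n");
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    /* One case per shape, calling <name>_Shader with <name>_oo. */
    for (size_t i = 0; i < count; i++) {
        n = snprintf(buf + used, size - used,
                     "case %zu:\n"
                     "    out.color = %s_Shader(in, %s_oo, params).color;\n"
                     "    break;\n",
                     i, names[i], names[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    /* Fallback color for ids that match no shape. */
    n = snprintf(buf + used, size - used,
                 "default:\n"
                 "    out.color = (float4)(0.0f, 0.0f, 0.0f, 0.0f);\n"
                 "    break;\n"
                 "}\n");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    used += (size_t)n;
    return (int)used;
}
```

Called with the names "Mandelbulb1" and "Mandelbulb2" and a sufficiently large buffer, this produces two cases plus the default branch; the generated text would then be pasted into the kernel source before it is built.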
Thank you!
Mattia.