Are floating-point operations in OpenCL stochastic?

Hi guys,

I have some floating-point calculation code, and occasionally it gives different results from what it produces in most runs.

Sometimes I use “-cl-opt-disable” and sometimes I just pass “” (no options) when compiling the kernel.

As far as I know, there should be no race conditions.

My GPU is an NVIDIA GTX 9800/GTX 9800+. I think the compute capability is 1.1 or 1.0. The device version is 1.0.

Thanks!

No, of course they aren’t stochastic - floating-point operations work in a deterministic manner, otherwise they wouldn’t be much use, would they? If they weren’t deterministic on a GPU, GPUs wouldn’t even be useful for graphics, let alone anything more.

That doesn’t mean compilation options can’t alter the results: if different instructions or a different instruction order is generated, the result can change, because floating-point operations are not associative and the effect is data dependent. e.g. http://en.wikipedia.org/wiki/Floating_p … y_problems

But the only time I’ve seen ‘random’ results is with broken code, possibly because of one or more of: incorrect or missing initialisation, boundary over-runs, race conditions, or some other bug.

Thanks notzed. What are boundary over-runs?

Best regards,
Mingcheng

Hi notzed,

Is it possible that one of the processors on my card is malfunctioning?

Thanks!

Remember also that the order of floating-point operations matters both within a work item (notzed’s compiler point) and between work items - so any code where the order of operations depends on hardware scheduling may be nondeterministic even if locks etc. are used correctly. Without hardware error control there is also the possibility that an occasional bit error creeps in, but this is unlikely to show up in anything but very long runs.

Thanks LeeHowes!

However, why does the order of operations between work items influence the result?

Thanks!

As notzed said:

if different instructions or a different instruction order is generated, the result can change, because floating-point operations are not associative and the effect is data dependent. e.g. http://en.wikipedia.org/wiki/Floating_p … y_problems

For example, if you add the unsorted sequence:
2^-30, 2^30, 2^-30, 2^30 …
you’ll end up with the same result as adding:
0, 2^30, 0, 2^30 …

because the mantissa of a float isn’t big enough to hold the extra bits of precision.
However, if you sorted the list:
2^-30 … 2^30 …

You would be able to add pairs of 2^-30, and if you add enough of them the sum may become large enough to affect the addition with 2^30, so the result could be significantly different given the right combination of values and length of list. That’s an extreme case; in the general case any operation on floats incurs a possible loss of precision, and the order in which you do the operations changes which particular information is lost.

I see. Thanks!