Type: Posts; User: Bilog

Page 1 of 3

  1. Replies
    57
    Views
    15,929

    Sticky: This is in fact something that is sorely missing...

    This is in fact something that is sorely missing in OpenCL: a way to guarantee host buffer pinning for memory transfer. The only moderately reliable approach I've found is to use ALLOC_HOST_PTR, and...
  2. Sticky: I didn't have a specific application in mind, I...

    I didn't have a specific application in mind, I just thought it might be practical for introspection.



    I understand. I assume the OpenCL specification will be amended accordingly, too?

    ( I...
  3. Sticky: A couple of notes: * clSetKernelExecInfo has...

    A couple of notes:

    * clSetKernelExecInfo has no “complementary” clGetKernelExecInfo call;
    * the OpenCL C++ specification mentions (page 288, section 4.1) that “static selection of rounding mode...
  4. Read the data as a char16 and use as_uint4.

    Read the data as a char16 and use as_uint4.
  5. Replies
    1
    Views
    592

    OpenCL 2.x has a different memory model than...

    OpenCL 2.x has a different memory model than OpenCL 1.x, to fit SVM into the model. In OpenCL 1.x, all memory regions were device-specific. In OpenCL 2.x, the global and constant memory regions are...
  6. The preferred vector width is the vector width...

    The preferred vector width is the vector width that you should try to use on the device. If CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT is 4, for example, it means that on that device you should try to use...
  7. You are seeing the effect of the so-called...

    You are seeing the effect of the so-called delayed or lazy allocation, which is a very common technique used in many implementations.

    Basically, when an OpenCL buffer is created, it is not...
  8. Thank you very much for the reply, it's greatly...

    Thank you very much for the reply, it's greatly appreciated. Looking forward to the updated spec.
  9. Replies
    28
    Views
    13,336

    C++14 doesn't define built-in vector types nor...

    C++14 doesn't define built-in vector types nor memory spaces either, yet OpenCL C++ still has them. Removing the restrict keyword will also cause performance regressions in all those kernels and...
  10. Your platform might have limitations on the local...

    Your platform might have limitations on the local work size in the second and third dimensions. You can check this by retrieving the CL_DEVICE_MAX_WORK_ITEM_SIZES property, which returns a list of...
  11. Replies
    28
    Views
    13,336

    Another point that needs clarification (aside...

    Another point that needs clarification (aside from the meaning of CL_DEVICE_VENDOR_ID) is the behavior of sub-devices in terms of (pre-)existing contexts. I've opened a specific discussion about this...
  12. Ambiguity in the specification about sub-devices and contexts

    Hello all,

    what is the correct behavior in the cases of sub-devices created _after_ context creation?

    Let's say that I create a context C that only includes a single device, devA. I then...
  13. Replies
    28
    Views
    13,336

    An additional point, concerning the available...

    An additional point, concerning the available device information:

    * as I mentioned, it would be better to have a device info entry about the supported OpenCL C++ version; while currently there is...
  14. Replies
    28
    Views
    13,336

    Absolutely agreed. An important case where the...

    Absolutely agreed. An important case where the high-level feature exposed in OpenCL C (or C++) would be better replaced by lower-level functions is that of work-group and subgroup scans and...
  15. Replies
    28
    Views
    13,336

    A few things I've noticed on the first read of...

    A few things I've noticed on the first read of the OpenCL C++ 1.0 draft:

    * a minor missing point is that there is no device property retrievable by `clGetDeviceInfo` about the supported OpenCL C++...
  16. According to the specification, the requirement...

    According to the specification, the requirement is that the kernel signature (number and type of arguments) should be the same for all devices for which the program was built. If you build different...
  17. The preferred wg size multiple is what the OpenCL...

    The preferred wg size multiple is what the OpenCL platform thinks the local workgroup size should be a multiple of to achieve optimal performance. On NVIDIA GPUs, this is always returned as the warp...
  18. Replies
    32
    Views
    24,896

    work_group_prefixsum_{inclusive,exclusive}_{add,mi...

    work_group_prefixsum_{inclusive,exclusive}_{add,min,max} functions are not named correctly, since they are not necessarily additions. Is it too late to change them to...
  19. Replies
    3
    Views
    2,040

    -52 is CL_INVALID_KERNEL_ARGS, and indeed you are...

    -52 is CL_INVALID_KERNEL_ARGS, and indeed you are passing 4 args to a kernel that needs 5 of them.
  20. Replies
    8
    Views
    3,104

    You should probably report your problem to AMD...

    You should probably report your problem to AMD (they have a forum dedicated to OpenCL questions and issues over at their devgurus.amd.com site).
  21. Replies
    1
    Views
    1,469

    Nothing. Since OpenCL has separate sources for...

    Nothing. Since OpenCL has separate sources for the host and device parts, there is no need to qualify device functions.
  22. Replies
    1
    Views
    2,631

    In OpenCL all functions are automatically inlined.

    In OpenCL all functions are automatically inlined.
  23. Replies
    4
    Views
    2,506

    Re: get_global_id is undefined

    get_global_id() is a built-in of OpenCL C, so it is only defined inside kernels. Are you trying to use it in host code? Please post a minimal buildable example showing the problem.
  24. Replies
    3
    Views
    2,162

    Re: warp size vs # of SPs per SM

    On Fermi, each warp is physically executed as two half-warps; the 2.1 devices can effectively run 3 half-warps at once. (The thing is actually more complex, due to the device's ability to issue more...
  25. Replies
    3
    Views
    2,444

    Re: running on GPU but not on CPU

    Are you using the Intel OpenCL SDK on an AMD CPU? In my experience, this combination doesn't work, while the reverse (AMD APP with Intel CPU) works.
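
Result 4 above suggests reading the data as a char16 and using as_uint4. In a kernel, as_uint4 reinterprets the same 16 bytes as four 32-bit unsigned integers without converting any values. The sketch below is only a host-side analogue in plain C, meant to show what "same bits, different type" means; the names host_char16, host_uint4 and host_as_uint4 are hypothetical stand-ins and are not part of any OpenCL header.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-ins for the OpenCL C vector types
   char16 and uint4; both occupy exactly 16 bytes. */
typedef struct { int8_t s[16]; } host_char16;
typedef struct { uint32_t s[4]; } host_uint4;

/* Copy the 16 bytes of a char16 into a uint4 unchanged.
   This mirrors the bit reinterpretation done by as_uint4() in a kernel:
   no element is converted, the bytes are just viewed as a different type. */
static host_uint4 host_as_uint4(host_char16 v)
{
    host_uint4 out;
    memcpy(out.s, v.s, sizeof out.s);
    return out;
}
```

How the four integers come out for arbitrary input depends on the byte order of the machine, just as it depends on the byte order of the device when as_uint4 is used in a kernel.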
Results 1 to 25 of 51