Type: Posts; User: utnapishtim

  1. Replies: 1; Views: 189

    Since you're copying from a buffer to another...

    Since you're copying from a buffer to another buffer, you have to "signal" the source buffer that its content has changed. This is done by unmapping the source buffer before copying.

    If, as I...
  2. Replies: 2; Views: 271

    inputA and inputB are out parameters: int...

    inputA and inputB are out parameters:



    int MapBuffers(ocl_args_d_t *ocl, cl_float **inputA, cl_float **inputB, size_t dataSize)
    {
    *inputA = (cl_float*)clEnqueueMapBuffer(ocl->commandQueue,...
  3. Replies: 3; Views: 340

    From OpenCL C specs (Alignment of Types): "For...

    From OpenCL C specs (Alignment of Types):

    "For 3-component vector data types, the size of the data type is 4 * sizeof(component). This means that a 3-component vector data type will be aligned to...
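
    A host-side sketch of what this size rule means for buffer layouts. The `host_float3` and `packed_float3` types and the `float3_buffer_bytes()` helper are hypothetical names for illustration (the OpenCL headers provide `cl_float3` for real code), and the numbers assume a 4-byte `float`.

    ```c
    #include <stdalign.h>
    #include <stddef.h>

    /* Hypothetical mirror of a device-side float3: three used components
       plus one padding component, aligned to 4 * sizeof(float). */
    typedef struct {
        alignas(4 * sizeof(float)) float s[4];
    } host_float3;

    /* Tightly packed triple, as a host application might naively store it. */
    typedef struct {
        float x, y, z;
    } packed_float3;

    /* Bytes needed for a buffer that a kernel will index as float3[count]. */
    size_t float3_buffer_bytes(size_t count)
    {
        return count * sizeof(host_float3);
    }
    ```

    With these assumptions, 10 tightly packed triples occupy 120 bytes, whereas a kernel indexing the same data as float3 expects 160 bytes.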
  4. Replies: 5; Views: 544

    Use clGetDeviceInfo() with parameters...

    Use clGetDeviceInfo() with parameters CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE and CL_DEVICE_QUEUE_ON_DEVICE_MAX_SIZE.
    The minimum value for CL_DEVICE_QUEUE_ON_DEVICE_PREFERRED_SIZE is 16 KB, so...
  5. Please note that you can permute the two loops in...

    Please note that you can permute the two loops in your kernel to replace O(n^2) reads/writes to stackTracesOut and powerTracesOut with O(n) writes.
    Also use two temporary variables to compute the sums,...
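
    The original kernel is not shown here, so the following is only a plain C sketch of the general idea under assumed shapes: a hypothetical row-major `rows x cols` input, with a per-column sum and a per-column sum of squares standing in for the two outputs. All function and parameter names are made up for illustration.

    ```c
    #include <stddef.h>

    /* Row loop outside: every iteration reads and writes both output arrays. */
    void sums_naive(const float *in, size_t rows, size_t cols,
                    float *sum_out, float *power_out)
    {
        for (size_t j = 0; j < cols; ++j) {
            sum_out[j] = 0.0f;
            power_out[j] = 0.0f;
        }
        for (size_t i = 0; i < rows; ++i) {
            for (size_t j = 0; j < cols; ++j) {
                float v = in[i * cols + j];
                sum_out[j] += v;        /* read-modify-write per element */
                power_out[j] += v * v;  /* read-modify-write per element */
            }
        }
    }

    /* Loops permuted: accumulate in two temporaries, write each output once. */
    void sums_permuted(const float *in, size_t rows, size_t cols,
                       float *sum_out, float *power_out)
    {
        for (size_t j = 0; j < cols; ++j) {
            float sum = 0.0f;
            float power = 0.0f;
            for (size_t i = 0; i < rows; ++i) {
                float v = in[i * cols + j];
                sum += v;
                power += v * v;
            }
            sum_out[j] = sum;      /* single write per column */
            power_out[j] = power;  /* single write per column */
        }
    }
    ```

    In a kernel, the outer loop over `j` would typically be replaced by the work-item index, so each work-item keeps its two accumulators in private memory and touches global memory once per output.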
  6. Your GPU has 44 Compute Units, each one able to...

    Your GPU has 44 Compute Units, each one able to handle a work group of 256 work items.

    Homework:

    Think about what happens when you make a call such as: CheckFaster(25000, 2500, 250, 1, 1, ...
  7. Replies: 3; Views: 1,151

    Atomic operations can be very slow on GPUs...

    Atomic operations can be very slow on GPUs (before Kepler for NVIDIA, for instance), so we try to avoid them if they occur frequently in a kernel.

    Recent AMD GPUs have hardware atomic counters...
  8. Replies: 3; Views: 1,151

    A simple way (although not the most efficient if...

    A simple way (although not the most efficient if most of the pixels are to be returned) is to use an atomic counter.
    Your device has to support the cl_khr_global_int32_base_atomics extension.
    ...
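
    A minimal host-side sketch of the atomic-counter pattern, written in plain C11 rather than OpenCL C so it can run anywhere. `compact_above_threshold()` and its threshold test are hypothetical; the loop body plays the role of one work-item, and `atomic_fetch_add` stands in for the kernel-side increment (`atom_inc` with this extension, `atomic_inc` from OpenCL 1.1 on).

    ```c
    #include <stdatomic.h>
    #include <stddef.h>

    /* Writes the indices of all elements above threshold into out_idx and
       returns how many were written. out_idx must have room for n entries. */
    size_t compact_above_threshold(const float *in, size_t n, float threshold,
                                   size_t *out_idx)
    {
        atomic_uint counter;
        atomic_init(&counter, 0u);

        for (size_t gid = 0; gid < n; ++gid) {  /* one iteration = one work-item */
            if (in[gid] > threshold) {
                /* Reserve a unique output slot; returns the old counter value. */
                unsigned slot = atomic_fetch_add(&counter, 1u);
                out_idx[slot] = gid;
            }
        }
        return (size_t)atomic_load(&counter);
    }
    ```

    On a real device the work-items run concurrently, so the order of the returned indices is not guaranteed; only the count and the set of indices are.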
  9. Your kernel must not modify the w coordinate....

    Your kernel must not modify the w coordinate. Remember that a 4D-point (x, y, z, w) is equivalent to a 3D-point (x/w, y/w, z/w) (to make it simple).
    You can consider (x, y, z, 1) as a point and (x,...
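
    The equivalence between a 4D point and its 3D counterpart can be sketched in plain C as follows. The `vec3`/`vec4` types and `project_to_3d()` are hypothetical names, and the sketch assumes `w` is non-zero.

    ```c
    #include <assert.h>

    typedef struct { float x, y, z; } vec3;
    typedef struct { float x, y, z, w; } vec4;

    /* Maps the homogeneous point (x, y, z, w) to (x/w, y/w, z/w). */
    vec3 project_to_3d(vec4 p)
    {
        assert(p.w != 0.0f);  /* assumption: w == 0 is not handled here */
        vec3 r = { p.x / p.w, p.y / p.w, p.z / p.w };
        return r;
    }
    ```

    For instance, (2, 4, 6, 2) and (1, 2, 3, 1) both map to the 3D point (1, 2, 3), which is why a kernel that changes w unintentionally moves the point.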
  10. Is it normal that the global work size is three...

    Is it normal that the global work size is three times larger than the buffer size?
  11. Replies: 1; Views: 673

    1) This is true for desktop platforms. exp()...

    1) This is true for desktop platforms. exp() should return HUGE_VALF, which evaluates to +infinity.

    Embedded platforms do not necessarily handle infinity, in which case HUGE_VALF is the largest...
  12. Your OpenGL position is made of 3 consecutive...

    Your OpenGL position is made of 3 consecutive floats, whereas your OpenCL kernel expects a position made of 4 consecutive floats.

    So either use a position with 4 floats and pass a stride parameter...
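
    One way to obtain positions made of 4 consecutive floats is to repack them on the host. This is only a sketch: `pad_xyz_to_xyzw()` is a hypothetical helper, and setting w to 1.0f is an assumption that the data represents points.

    ```c
    #include <stddef.h>

    /* Expands count tightly packed (x, y, z) triples into (x, y, z, 1) quads.
       xyzw must have room for 4 * count floats. */
    void pad_xyz_to_xyzw(const float *xyz, size_t count, float *xyzw)
    {
        for (size_t i = 0; i < count; ++i) {
            xyzw[4 * i + 0] = xyz[3 * i + 0];
            xyzw[4 * i + 1] = xyz[3 * i + 1];
            xyzw[4 * i + 2] = xyz[3 * i + 2];
            xyzw[4 * i + 3] = 1.0f;  /* assumed w for a point */
        }
    }
    ```

    The repacked array then matches a kernel parameter declared as an array of float4, at the cost of one third more memory.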
  13. And in your initial case, use an internal format...

    And in your initial case, use an internal format GL_RGBA32F instead of GL_RGBA8 for a data type GL_FLOAT.
  14. Try with GL_RGBA8 as internal format instead of...

    Try with GL_RGBA8 as internal format instead of GL_RGBA.
  15. When you create a texture with glTexImage2D() and...

    When you create a texture with glTexImage2D() and a null data pointer, the texture is incomplete and clCreateFromGLTexture() fails with CL_INVALID_GL_OBJECT.
    You have to create it with a non-null...
  16. Where do you set the arguments of your kernel?

    Where do you set the arguments of your kernel?
  17. Replies: 1; Views: 622

    /* Read the kernel's output */ err =...

    /* Read the kernel's output */
    err = clEnqueueReadBuffer( queue, results_buffer, CL_TRUE, 0, nbPoints * sizeof(float), results, 0, NULL, NULL);
  18. Replies: 2; Views: 836

    I doubt that your kernel can compile: srand(),...

    I doubt that your kernel can compile: srand(), time() and rand() are not part of OpenCL C.
  19. From OpenCL specs about memory consistency:...

    From OpenCL specs about memory consistency: "Global memory is consistent across work-items in a single work-group at a work-group barrier, but there are no guarantees of memory consistency between...
  20. Replies: 7; Views: 1,615

    Only finite sums of powers of two can be exactly...

    Only numbers that are finite sums of powers of two (within the precision of the format) can be exactly represented by binary floating-point formats such as float or double.

    Since 3.03 and 7.03 are not finite sums of powers of two, they simply cannot be exactly represented.

    For...
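
    A quick way to check this on the host is to see whether a value survives a round trip through float. `is_exact_in_float()` is a hypothetical helper; note that the double literal passed in is itself already the nearest double, so the test really asks whether narrowing that double to float loses information.

    ```c
    /* Returns 1 if converting v to float and back yields the same double. */
    int is_exact_in_float(double v)
    {
        return (double)(float)v == v;
    }
    ```

    Values such as 0.5 or 0.75 (0.5 + 0.25) pass the check, while 3.03 and 7.03 do not.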
  21. You can't, but there's nothing wrong with a loop...

    You can't, but there's nothing wrong with a loop in a kernel.
  22. CL_DEVICE_LOCAL_MEM_SIZE returns the max amount...

    CL_DEVICE_LOCAL_MEM_SIZE returns the max amount of local memory that a work-group can allocate (and use). Since a work-group can run on only one compute unit, this amount of memory is for each...
  23. If you use the CPU device and your app is...

    If you use the CPU device and your app is compiled for x64, get_global_id() returns a size_t value which is 64 bits wide.
    In this case, as_uchar4(get_global_id(0)) is not legal.

    You should first...
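
    The size mismatch can be illustrated in plain C. This is a sketch of one possible approach, not necessarily the one the post goes on to describe: it assumes the id fits in 32 bits and narrows it before reinterpreting the bytes. `uchar4_t` and `id_as_uchar4()` are hypothetical host-side stand-ins for `uchar4` and `as_uchar4()`.

    ```c
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Host-side stand-in for the 4-byte OpenCL C uchar4 type. */
    typedef struct { uint8_t s[4]; } uchar4_t;

    /* Reinterpreting bits only makes sense between types of equal size, so
       the (possibly 64-bit) id is narrowed to 32 bits first. */
    uchar4_t id_as_uchar4(size_t global_id)
    {
        uint32_t narrow = (uint32_t)global_id;  /* assumption: id < 2^32 */
        uchar4_t out;
        memcpy(out.s, &narrow, sizeof narrow);  /* byte order is platform-dependent */
        return out;
    }
    ```

    Reassembling the four bytes with another `memcpy` gives back the narrowed id, which confirms that no bits were dropped by the reinterpretation itself.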
  24. Each compute unit has 32 ALUs. So the device has a...

    Each compute unit has 32 ALUs, so the device has a total of 4 x 32 = 128 ALUs.
    Each compute unit can run a work-group of up to 512 work-items.
  25. A work-group runs on one compute unit. It cannot...

    A work-group runs on one compute unit. It cannot be split among several compute units (first of all because local memory is local to a compute unit).

    The max work-group size is an indication of...
Results 1 to 25 of 149