Thread: OpenCL bit-matrix multiplication (implementing tiling in local memory)

  1. #1
    Newbie
    Join Date
    Mar 2018
    Posts
    2

    OpenCL bit-matrix multiplication (implementing tiling in local memory)

    Hello all

    I am multiplying two square bit-matrices using a specific formula (not the dot product of row/col; I describe it below). My implementation works correctly, but it produces wrong results as soon as I attempt tiling in local memory. I understand the principles behind tiling (i.e. copying part of my data into local arrays, then using barriers to synchronize my work-items), yet the output is still wrong.

    This is an example of how I set up the matrices:
    Code :
    int n = 256; // number of bits in a row/col (or any number that is evenly divisible by 64)
    unsigned long long A = new unsigned long long[n*n/64]; // stores bits in 64-bit integers
    unsigned long long B = new unsigned long long[n*n/64]; // for example, one row consists of 256 bits and uses 4x64-bit integers to store them
    int C  = new int[n*n];

    This is the actual formula:
    Row i of matrix A is XORed with row j of matrix B. Yes, it is a row-to-row operation. Then I count the number of 1 bits in the result and assign that count to C[i][j].
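
    To make the formula concrete, here is a minimal CPU sketch in C++ that could serve as a reference when checking kernel output. The function name bit_product_reference and the use of std::vector and std::bitset are assumptions made for illustration, not part of the setup above. It also assumes n is a multiple of 64 and that each row is stored as n/64 consecutive 64-bit words.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original post): CPU reference for
// C[i][j] = number of 1 bits in (row i of A XOR row j of B).
// Assumes n is a multiple of 64 and rows are packed as n/64 words each.
std::vector<int> bit_product_reference(std::size_t n,
                                       const std::vector<std::uint64_t>& A,
                                       const std::vector<std::uint64_t>& B)
{
    const std::size_t words = n / 64;  // 64-bit words per row
    std::vector<int> C(n * n, 0);

    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < words; ++k) {
                const std::uint64_t x = A[i * words + k] ^ B[j * words + k];
                sum += static_cast<int>(std::bitset<64>(x).count());
            }
            C[i * n + j] = sum;
        }
    }
    return C;
}
```

    Comparing the device result against this function element by element should show which (i, j) entries a tiled kernel gets wrong.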

    This is how I launch the kernel:
    Code :
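    const size_t global[2] = { (size_t)n, (size_t)n }; // one work-item per element of C
    // local work size is NULL, so the runtime chooses the work-group size
    clEnqueueNDRangeKernel(queue, kernel, 2, NULL, global, NULL, 0, NULL, &event);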

    This is the actual kernel which works correctly:
    Code :
    __kernel void BitProduct(const int N, const __global ulong* A, const __global ulong* B, __global int* C)
    {
        const int i = get_global_id(0); // row of A
        const int j = get_global_id(1); // row of B
        const int words = N / 64;       // 64-bit words per row

        ulong sum = 0;

        // XOR row i of A with row j of B word by word and count the set bits
        for (int k = 0; k < words; k++)
            sum += popcount(A[i * words + k] ^ B[j * words + k]);

        C[i * N + j] = (int) sum;
    }
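
    Since the tiled version is the one that misbehaves, the following is a host-side C++ sketch of one possible tiling scheme, written only to make the index arithmetic explicit. It is not the failing kernel and not a definitive implementation. The function name bit_product_tiled, the tile edge TS, and the choice of one 64-bit word per row per tile are all assumptions. The comments mark where a kernel would need barrier(CLK_LOCAL_MEM_FENCE).

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side emulation of a tiled kernel (not from the original post).
// Each "work-group" covers a TS x TS block of C. Assumes n is a multiple of
// both 64 and TS.
std::vector<int> bit_product_tiled(std::size_t n, std::size_t TS,
                                   const std::vector<std::uint64_t>& A,
                                   const std::vector<std::uint64_t>& B)
{
    const std::size_t words = n / 64;  // 64-bit words per row
    std::vector<int> C(n * n, 0);
    // Stand-ins for the __local arrays: one word from each of TS rows
    std::vector<std::uint64_t> tileA(TS), tileB(TS);

    for (std::size_t gi = 0; gi < n; gi += TS) {          // group origin along i
        for (std::size_t gj = 0; gj < n; gj += TS) {      // group origin along j
            for (std::size_t k = 0; k < words; ++k) {     // one word column per step
                // Load phase: in a kernel, work-items would share these copies.
                for (std::size_t l = 0; l < TS; ++l) {
                    tileA[l] = A[(gi + l) * words + k];
                    tileB[l] = B[(gj + l) * words + k];
                }
                // A kernel needs barrier(CLK_LOCAL_MEM_FENCE) here, so that
                // no work-item reads a tile entry before it has been written.
                for (std::size_t li = 0; li < TS; ++li) {
                    for (std::size_t lj = 0; lj < TS; ++lj) {
                        const std::uint64_t x = tileA[li] ^ tileB[lj];
                        C[(gi + li) * n + (gj + lj)] +=
                            static_cast<int>(std::bitset<64>(x).count());
                    }
                }
                // A kernel needs a second barrier here, so that the next load
                // does not overwrite tiles that are still being read.
            }
        }
    }
    return C;
}
```

    Note that the B tile is indexed by the j side of the work-group (gj + l), not by k. Because this formula is row-to-row, both tiles are loaded the same way, which differs from the usual row-by-column tiling examples.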

  2. #2
    Newbie
    Join Date
    Mar 2018
    Posts
    2
    I made a typo in the first post: I forgot to add the * in the declarations. Of course, I am initializing pointers to dynamically allocated memory.

    Code :
    int n = 256; // number of bits in a row/col (or any number that is evenly divisible by 64)
    unsigned long long *A = new unsigned long long[n*n/64]; // stores bits in 64-bit integers
    unsigned long long *B = new unsigned long long[n*n/64]; // for example, one row consists of 256 bits and uses 4x64-bit integers to store them
    int *C = new int[n*n]; // one int per (i, j) pair
