Meaning of threadIdx, blockIdx, blockDim, gridDim in CUDA (1D)

When we first study CUDA, thread indexing is very confusing.
So I tried to sort it out.

First, let's get a feel for it by looking at this example:
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

#include <stdio.h>

#define N 15

// Each thread writes its own global index into c.
__global__ void increase(int *c){
 int tid = threadIdx.x + blockIdx.x * blockDim.x;
 if(tid < N)
  c[tid] = tid;
}

int main(void)
{
 int c[N];
 int *dev_c;

 cudaMalloc( (void**)&dev_c, N*sizeof(int) );

 // Fill the host array with -1 so untouched elements are easy to spot.
 for(int i=0; i< N; ++i)
  c[i] = -1;

 cudaMemcpy(dev_c, c, N*sizeof(int), cudaMemcpyHostToDevice);

 // Launch 4 blocks of 3 threads each: 12 threads in total.
 increase<<< 4, 3>>>(dev_c);

 cudaMemcpy(c, dev_c, N*sizeof(int), cudaMemcpyDeviceToHost );

 for(int i=0; i< N; ++i)
  printf("c[%d] = %d \n" ,i, c[i] );

 cudaFree( dev_c );

 return 0;
}
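The result of this example source is: c[0] to c[11] hold the values 0 to 11, and c[12] to c[14] stay at -1, because only 12 threads are launched for 15 elements.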
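
To check this result without a GPU, the launch can be imitated on the host with two nested loops. This is only a sketch in plain C++: simulate_increase is a hypothetical helper, not part of CUDA, and the loops run one after another where the real threads run in parallel.

```cpp
#include <vector>

// Host-side imitation of increase<<<grid_dim, block_dim>>>(dev_c) on n elements.
// simulate_increase is a hypothetical helper for illustration, not a CUDA API.
std::vector<int> simulate_increase(int grid_dim, int block_dim, int n) {
    std::vector<int> c(n, -1);  // same -1 fill as the host array in the example
    for (int block_idx = 0; block_idx < grid_dim; ++block_idx) {          // plays blockIdx.x
        for (int thread_idx = 0; thread_idx < block_dim; ++thread_idx) {  // plays threadIdx.x
            int tid = thread_idx + block_idx * block_dim;  // same formula as the kernel
            if (tid < n)
                c[tid] = tid;
        }
    }
    return c;
}
```

Calling simulate_increase(4, 3, 15) returns the values 0 to 11 followed by three -1 entries, because only 4 * 3 = 12 indices are ever produced.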

In the source code, the kernel function (increase) is launched with <<< 4, 3 >>>, which means 4 blocks of 3 threads each, 12 threads in total.
These 12 threads run in parallel.
So each thread running the kernel needs to know: which thread number am I?
The answer comes from threadIdx and blockIdx.

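But we have to calculate the thread index ourselves, because threadIdx and blockIdx are indices in different spaces: threadIdx counts threads within one block, and blockIdx counts blocks within the grid.
It is done like this: " int tid = threadIdx.x + blockIdx.x * blockDim.x; "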
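
The formula can be tried out with plain integers. The sketch below is host-side C++, and global_index, locate, and Position are hypothetical names chosen for illustration; they only imitate the arithmetic done with the CUDA built-in variables.

```cpp
// Hypothetical helpers that imitate the CUDA built-ins with plain ints.
struct Position {
    int block_idx;   // plays blockIdx.x
    int thread_idx;  // plays threadIdx.x
};

// Same arithmetic as: threadIdx.x + blockIdx.x * blockDim.x
int global_index(int thread_idx, int block_idx, int block_dim) {
    return thread_idx + block_idx * block_dim;
}

// Inverse mapping: which block and which thread inside it produce this tid?
Position locate(int tid, int block_dim) {
    return Position{ tid / block_dim, tid % block_dim };
}
```

With 3 threads per block, thread 1 of block 2 gets global_index(1, 2, 3) = 1 + 2 * 3 = 7, and locate(7, 3) leads back to block 2, thread 1.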

threadIdx tells the index of the current thread within its block.
blockIdx tells the index of the current block within the grid.

gridDim tells the number of blocks in the grid.
blockDim tells the number of threads in a block.
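
The two dimension values also tell how many threads exist in total. Here is a small host-side sketch in plain C++; total_threads and blocks_needed are hypothetical helper names, and rounding the block count up is a common convention rather than something taken from the example above.

```cpp
// Hypothetical helpers; grid_dim plays gridDim.x and block_dim plays blockDim.x.

// Total number of threads in a 1D launch.
int total_threads(int grid_dim, int block_dim) {
    return grid_dim * block_dim;
}

// Smallest number of blocks whose threads cover n elements (integer round-up).
int blocks_needed(int n, int block_dim) {
    return (n + block_dim - 1) / block_dim;
}
```

For the launch <<< 4, 3 >>>, total_threads(4, 3) is 12, while blocks_needed(15, 3) is 5, so a launch of 5 blocks of 3 threads would reach all 15 elements of the array.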

Are you more confused now?

Here is my final explanation: see this figure.

Next time, I will introduce the 2D kernel.
Thank you.
