CUDA, GPGPU, 5 parallel communication patterns -> Map, Gather, Scatter, Stencil, Transpose

1. Map: tasks read from and write to specific data elements; there is a one-to-one correspondence between input and output.

2. Gather: each calculation gathers input data elements together from different places to compute an output result (many-to-one).

3. Scatter: tasks compute where to write their output (one-to-many).

4. Stencil: tasks read input from a fixed neighborhood of an array, and every output location is written.

5. Transpose: tasks reorder data elements in memory, e.g. converting a row-major layout to column-major, or an array of structures to a structure of arrays.


*out[i] = pi * in[i];
There is a 1-to-1 correspondence between the output and the input, so this is clearly a Map operation.

*out[i+j*128] = in[j+i*128];
The indices i and j are swapped to reorder the array, so this is a Transpose operation.

*out[i-1] += pi * in[i]; out[i+1] += pi * in[i];
Each computed value is placed into a couple of different places in the output,
so this is a Scatter operation.

*if (i % 2) out[i] = (in[i] + in[i-1] + in[i+1]) * pi/3.0f;
Every thread writes a single location in the output array, and it reads from multiple places in the input array, locations that it computes.
This looks very much like a stencil operation, since it reads from a local neighborhood, but because of "if (i % 2)" it is not writing into every location, so it is a Gather rather than a true Stencil.

Image captured from Udacity.

