Tricks and Tips: Submatrix multiplication in CUDA using cuBLAS

This post provides a full example of how to use cuBLAS cublas<t>gemm to multiply submatrices of two full matrices A and B and to assign the result to a submatrix of a full matrix C.

The code makes use of

  1. pointer arithmetic to access submatrices;
  2. the concept of the leading dimension and of submatrix dimensions.


The code available on our GitHub page considers three matrices:

  1. A – 10 x 9;
  2. B – 15 x 13;
  3. C – 10 x 12.

Matrix C is initialized to all 10s.
The code performs the following submatrix multiplication, written here in MATLAB notation:

C(1+x3:5+x3,1+y3:3+y3) = A(1+x1:5+x1,1+y1:4+y1) * B(1+x2:4+x2,1+y2:3+y2);
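
The submatrices involved are 5 x 4 (from A), 4 x 3 (from B) and 5 x 3 (into C), so the gemm sizes are m = 5, n = 3, k = 4, while the leading dimensions remain those of the full matrices: 10, 15 and 10. The sketch below is a host-side reference, not the GitHub code. It assumes column-major storage and zero-based shifts x1..y3, and the names gemmReference, Offsets and submatrixMultiply are hypothetical. The closing comment shows how the same arguments would map onto cublasSgemm for device pointers.

```cpp
#include <cassert>
#include <vector>

// Host-side reference with the argument meaning of cublas<t>gemm when both
// operands use CUBLAS_OP_N: C = alpha * A * B + beta * C, with A m x k,
// B k x n, C m x n, all column-major with leading dimensions lda, ldb, ldc.
void gemmReference(int m, int n, int k, float alpha,
                   const float* A, int lda,
                   const float* B, int ldb,
                   float beta, float* C, int ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[i + p * lda] * B[p + j * ldb];
            }
            C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc];
        }
    }
}

// Hypothetical container for the zero-based shifts of the MATLAB expression.
struct Offsets { int x1, y1, x2, y2, x3, y3; };

// Multiplies a 5 x 4 block of A (10 x 9) by a 4 x 3 block of B (15 x 13) and
// writes the 5 x 3 result into a block of C (10 x 12), leaving the rest of C
// untouched.
void submatrixMultiply(const std::vector<float>& A,
                       const std::vector<float>& B,
                       std::vector<float>& C,
                       const Offsets& o) {
    const int lda = 10, ldb = 15, ldc = 10;  // rows of the FULL matrices
    const int m = 5, n = 3, k = 4;           // submatrix dimensions
    const float alpha = 1.0f, beta = 0.0f;
    gemmReference(m, n, k, alpha,
                  A.data() + o.x1 + o.y1 * lda, lda,
                  B.data() + o.x2 + o.y2 * ldb, ldb,
                  beta,
                  C.data() + o.x3 + o.y3 * ldc, ldc);
    // With device copies d_A, d_B, d_C and a cublasHandle_t handle, the
    // analogous cuBLAS call would be:
    // cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k, &alpha,
    //             d_A + o.x1 + o.y1 * lda, lda,
    //             d_B + o.x2 + o.y2 * ldb, ldb,
    //             &beta, d_C + o.x3 + o.y3 * ldc, ldc);
}
```

With A filled with 1s, B with 2s, C with 10s and the illustrative shifts {1, 2, 3, 4, 2, 5}, the 5 x 3 target block of C becomes 8 (that is, 4 products of 1 * 2) and every other entry of C stays at 10.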

