Tricks and Tips: Submatrix multiplication in CUDA using cuBLAS

This post provides a full example of using cublas<t>gemm to multiply submatrices of two full matrices A and B and to assign the result to a submatrix of a full matrix C. The code relies on pointer arithmetic to access the submatrices, together with the concepts of leading dimension and submatrix dimensions. The code available on our GitHub page considers three matrices: A (10 x 9), B (15 x 13) and C (10 x 12). Matrix C is initialized to all 10s....
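The role of pointer offsets and leading dimensions can be illustrated with a minimal host-side sketch. It is not the GitHub code and does not call cuBLAS: gemm_ref is a hypothetical CPU reference that mimics the column-major, no-transpose semantics of gemm, and the block positions and sizes chosen in submatrix_demo are assumptions made only for illustration.

```cpp
#include <vector>

// Column-major offset, as cuBLAS assumes: element (row, col) of a matrix
// with leading dimension ld is stored at index row + col * ld.
inline int idx(int row, int col, int ld) { return row + col * ld; }

// Hypothetical host reference with gemm-like semantics (no transposition):
// C_sub = alpha * A_sub * B_sub + beta * C_sub, with A_sub m x k,
// B_sub k x n and C_sub m x n. Each pointer addresses the first element of
// its submatrix; lda/ldb/ldc are the row counts of the FULL matrices.
void gemm_ref(int m, int n, int k, float alpha,
              const float* A, int lda, const float* B, int ldb,
              float beta, float* C, int ldc) {
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[idx(i, p, lda)] * B[idx(p, j, ldb)];
            C[idx(i, j, ldc)] = alpha * acc + beta * C[idx(i, j, ldc)];
        }
}

// Illustrative choice (an assumption, not taken from the post): multiply the
// 3 x 4 block of A starting at (2, 1) by the 4 x 5 block of B starting at
// (3, 2) and store the result in the 3 x 5 block of C starting at (1, 4).
std::vector<float> submatrix_demo() {
    const int rowsA = 10, colsA = 9;    // A is 10 x 9
    const int rowsB = 15, colsB = 13;   // B is 15 x 13
    const int rowsC = 10, colsC = 12;   // C is 10 x 12
    std::vector<float> A(rowsA * colsA, 1.0f);
    std::vector<float> B(rowsB * colsB, 2.0f);
    std::vector<float> C(rowsC * colsC, 10.0f);  // C initialized to all 10s
    gemm_ref(3, 5, 4, 1.0f,
             A.data() + idx(2, 1, rowsA), rowsA,  // offset pointer, full-matrix ld
             B.data() + idx(3, 2, rowsB), rowsB,
             0.0f,
             C.data() + idx(1, 4, rowsC), rowsC);
    return C;
}
```

With a real cublas<t>gemm call the same three ingredients would be passed: the offset device pointers, the submatrix sizes m, n, k, and the leading dimensions of the full matrices. Entries of C outside the target block keep their initial value.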

Tricks and Tips: Covariance calculation with CUDA

Covariance calculation in CUDA can be easily performed by using cuBLAS in conjunction with Thrust. Considering N realizations of K random variables, the covariance estimation formula is the following: $q_{jk} = \frac{1}{N-1}\sum_{i=1}^{N}(x_{ij}-\bar{x}_j)(x_{ik}-\bar{x}_k)$, where $q_{jk}$, $j,k=1,\ldots,K$, are the covariance estimate values, $x_{ij}$ is the i-th realization of the j-th random variable, and $\bar{x}_j$ and $\bar{x}_k$ are the random variable means as estimated from the available realizations. A fully worked example of calculating the covariance matrix with Thrust and cuBLAS is available on our GitHub page.

Tricks and Tips: Reduce matrix columns with CUDA

We report here four approaches to reducing the columns of a matrix: three based on CUDA Thrust and one based on cublas<t>gemv() with a column of 1's. The CUDA Thrust approaches are analogous to those of our previous post, Reduce matrix rows with CUDA, with an implicit transposition obtained by thrust::make_permutation_iterator(d_matrix.begin(), thrust::make_transform_iterator(thrust::make_counting_iterator(0), (_1 % Nrows) * Ncols + _1 / Nrows)) The full code is available on our GitHub...
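The index map used by the permutation iterator can be checked with a small host-side sketch. It assumes the matrix is stored in row-major order as Nrows x Ncols, which is what the expression suggests; transposed_index and column_sums are hypothetical helper names, and the loop is a sequential stand-in for the Thrust reduction.

```cpp
#include <vector>

// Same arithmetic as the placeholder expression (_1 % Nrows) * Ncols + _1 / Nrows:
// position i of the column-by-column traversal maps to row i % Nrows and
// column i / Nrows of a row-major Nrows x Ncols matrix.
inline int transposed_index(int i, int Nrows, int Ncols) {
    return (i % Nrows) * Ncols + i / Nrows;
}

// Sequential stand-in for the reduction: walking i = 0 .. Nrows*Ncols - 1
// visits the matrix one column at a time, so each consecutive run of Nrows
// elements belongs to a single column.
std::vector<float> column_sums(const std::vector<float>& m, int Nrows, int Ncols) {
    std::vector<float> sums(Ncols, 0.0f);
    for (int i = 0; i < Nrows * Ncols; ++i)
        sums[i / Nrows] += m[transposed_index(i, Nrows, Ncols)];
    return sums;
}
```

For a 2 x 3 matrix with rows (1, 2, 3) and (4, 5, 6), the traversal reads 1, 4, 2, 5, 3, 6, so no explicit transposed copy of the matrix is needed before reducing.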

Tricks and Tips: Reduce matrix rows with CUDA

The rows of a matrix can be reduced with CUDA Thrust in three ways (there may be others, but covering them is out of scope here). An approach using cuBLAS is also possible. APPROACH #1 - reduce_by_key: This is the approach suggested at this Thrust example page. It includes a variant using make_discard_iterator. APPROACH #2 - transform: This is the approach suggested by Robert Crovella at CUDA Thrust: reduce_by_key on only some values in an array, based off values ...
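The idea behind a key-based row reduction can be sketched on the host. The following is a sequential illustration of reduce_by_key semantics, not the Thrust code itself; it assumes a row-major Nrows x Ncols matrix and a key equal to the row index i / Ncols, and row_sums_by_key is a hypothetical name.

```cpp
#include <vector>

// Sequential illustration of reduce_by_key: consecutive elements sharing the
// same key are summed into one output value. With a row-major matrix and
// key = i / Ncols, each run of equal keys is exactly one row.
std::vector<float> row_sums_by_key(const std::vector<float>& m, int Nrows, int Ncols) {
    std::vector<float> sums;
    int current_key = -1;                 // no run started yet
    for (int i = 0; i < Nrows * Ncols; ++i) {
        const int key = i / Ncols;        // row index of element i
        if (key != current_key) {         // a new run of equal keys begins
            sums.push_back(0.0f);
            current_key = key;
        }
        sums.back() += m[i];
    }
    return sums;
}
```

In a Thrust version the keys would typically be generated on the fly by an iterator rather than stored, and the output keys can be thrown away, which is presumably the purpose of the make_discard_iterator variant.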