In this post, we compute the residual between two vectors, that is, the sum of the squared differences between their elements. The residual is defined by the following pseudocode:
residual = 0; forall i : residual += (oldvector[i] - newvector[i])^2
This operation can be conveniently performed by CUDA CUB.
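As a minimal sketch of the device-wide route, the residual can be computed in two steps: a small kernel stores the squared differences in a scratch buffer, and cub::DeviceReduce::Sum adds them up. The names squared_difference and residual_device_reduce are hypothetical and not taken from the worked example; the sketch assumes float data already resident on the device, n > 0, and omits error checking for brevity.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>

// Element-wise squared difference: d_sq[i] = (d_old[i] - d_new[i])^2
__global__ void squared_difference(const float* d_old, const float* d_new,
                                   float* d_sq, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float diff = d_old[i] - d_new[i];
        d_sq[i] = diff * diff;
    }
}

// Hypothetical helper: returns sum_i (d_old[i] - d_new[i])^2.
// d_old and d_new are device pointers; n is assumed to be > 0.
float residual_device_reduce(const float* d_old, const float* d_new, int n)
{
    float* d_sq = nullptr;      // scratch buffer for the squared differences
    float* d_result = nullptr;  // single-element output of the reduction
    cudaMalloc(&d_sq, n * sizeof(float));
    cudaMalloc(&d_result, sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    squared_difference<<<blocks, threads>>>(d_old, d_new, d_sq, n);

    // The first call, with a null workspace, only reports the required size.
    void* d_temp = nullptr;
    size_t temp_bytes = 0;
    cub::DeviceReduce::Sum(d_temp, temp_bytes, d_sq, d_result, n);
    cudaMalloc(&d_temp, temp_bytes);

    // The second call performs the actual reduction.
    cub::DeviceReduce::Sum(d_temp, temp_bytes, d_sq, d_result, n);

    float h_result = 0.0f;
    cudaMemcpy(&h_result, d_result, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_temp);
    cudaFree(d_sq);
    cudaFree(d_result);
    return h_result;
}
```

The scratch buffer costs an extra pass over memory; it is used here only to keep the sketch short. Feeding the reduction through a transforming input iterator would avoid it.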
Unlike Thrust, CUB leaves performance-critical parameters, such as the choice of reduction algorithm and the degree of concurrency, unbound, so that the user can select them.
These parameters can be tuned to maximize performance for a particular architecture and application.
They can be specified at compile time, thus avoiding runtime performance penalties.
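To illustrate what compile-time selection can look like, here is a minimal sketch built on cub::BlockReduce, where the block size, the number of items per thread, and the reduction algorithm are all template parameters. The names residual_kernel and residual_block_reduce, and the values 128 and 4, are illustrative assumptions rather than tuned settings or the code of the worked example. Per-block sums are combined with atomicAdd, so the floating-point summation order may vary between runs; n > 0 is assumed and error checking is omitted.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>

// Each block reduces a tile of BLOCK_THREADS * ITEMS_PER_THREAD elements.
// All three tuning knobs are template parameters, fixed at compile time.
template <int BLOCK_THREADS, int ITEMS_PER_THREAD,
          cub::BlockReduceAlgorithm ALGORITHM>
__global__ void residual_kernel(const float* d_old, const float* d_new,
                                float* d_result, int n)
{
    using BlockReduce = cub::BlockReduce<float, BLOCK_THREADS, ALGORITHM>;
    __shared__ typename BlockReduce::TempStorage temp_storage;

    // Per-thread partial sum over a strided slice of the block's tile.
    float partial = 0.0f;
    const int tile_start = blockIdx.x * BLOCK_THREADS * ITEMS_PER_THREAD;
    #pragma unroll
    for (int k = 0; k < ITEMS_PER_THREAD; ++k) {
        int i = tile_start + k * BLOCK_THREADS + threadIdx.x;
        if (i < n) {
            float diff = d_old[i] - d_new[i];
            partial += diff * diff;
        }
    }

    // Block-wide sum; the result is only valid in thread 0.
    float block_sum = BlockReduce(temp_storage).Sum(partial);
    if (threadIdx.x == 0) {
        atomicAdd(d_result, block_sum);
    }
}

// Hypothetical host wrapper; d_old and d_new are device pointers, n > 0.
float residual_block_reduce(const float* d_old, const float* d_new, int n)
{
    constexpr int BLOCK_THREADS = 128;   // illustrative, not tuned
    constexpr int ITEMS_PER_THREAD = 4;  // illustrative, not tuned
    constexpr int TILE = BLOCK_THREADS * ITEMS_PER_THREAD;

    float* d_result = nullptr;
    cudaMalloc(&d_result, sizeof(float));
    cudaMemset(d_result, 0, sizeof(float));  // all-zero bytes represent 0.0f

    const int blocks = (n + TILE - 1) / TILE;
    residual_kernel<BLOCK_THREADS, ITEMS_PER_THREAD,
                    cub::BLOCK_REDUCE_WARP_REDUCTIONS>
        <<<blocks, BLOCK_THREADS>>>(d_old, d_new, d_result, n);

    float h_result = 0.0f;
    cudaMemcpy(&h_result, d_result, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_result);
    return h_result;
}
```

Swapping cub::BLOCK_REDUCE_WARP_REDUCTIONS for cub::BLOCK_REDUCE_RAKING, or changing the two integer constants, produces a differently specialized kernel with no branching at run time, which is the kind of tuning the paragraph above refers to.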
A fully worked example is available on our GitHub page.