One possibility to solve general sparse linear systems in CUDA is using cuSOLVER.

CuSOLVER has three useful routines:

1. cusolverSpDcsrlsvlu, which works for *square* linear systems (number of unknowns equal to the number of equations) and internally uses sparse LU factorization with partial pivoting;

2. cusolverSpDcsrlsvqr, which works for *square* linear systems (number of unknowns equal to the number of equations) and internally uses sparse QR factorization;

3. cusolverSpDcsrlsqvqr, which works for *rectangular* linear systems (number of unknowns different to the number of equations) and internally solves a least square problem.

Currently, the above three functions are for the *host channel* only, which means that they do not yet run on GPU. They must be called as:

1. cusolverSpDcsrlsvluHost;

2. cusolverSpDcsrlsvqrHost;

3. cusolverSpDcsrlsqvqrHost.

and the input argument should all reside on the host.

On our GitHun website a fully worked example showing the use of the above three mentioned routines is reported.