CUDA mex function using real data residing on the host and producing real results on the host

In the CUDA_mex_host_to_device GitHub directory, we provide an example of how to create a mex function that executes on the GPU when the real input data reside on the host and the final results are returned to the host. The first step is to recover the pointer to the first element of the real data of the Matlab input array/matrix: double *h_input = mxGetPr(prhs[0]); We can also recover the number of elements of the input variable (which may also be a matrix) as: in...

Compiling mex files with Visual Studio 2013

Configuration: Matlab 2015b, Visual Studio 2013, Intel 64-bit machine. In Visual Studio do the following: 1) File -> New Project; Select location and name; in the project type, select Templates -> Visual C++ -> Win32 -> Win32 Console Application -> OK; 2) In the Win32 Application Wizard, click Next, in the Application Type choose DLL, then click Finish. 3) Project -> Properties -> Configuration Manager -> Active Solution Platform -> New -> Type or Select ...

Radix-4 Decimation-In-Frequency Iterative FFT

On our GitHub web page, we have made available a fully worked Matlab implementation of a radix-4 Decimation-In-Frequency FFT algorithm. The code also provides an overall operation count in terms of complex multiplications and additions. Indeed, it can be shown that each radix-4 butterfly involves 3 complex multiplications and 8 complex additions. Since there are log4(N) = log2(N) / 2 stages and each stage involves N / 4 butterflies, the operations count is complex mul...
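The per-butterfly, per-stage figures quoted above can be multiplied out in a few lines. The following is only a sketch, not the Matlab code from the GitHub page: the names `Radix4Count` and `radix4Count` are hypothetical, N is assumed to be a power of 4, and the totals are the plain products of the quoted figures, so multiplications by trivial twiddle factors are counted too.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the totals of a length-N radix-4 FFT.
struct Radix4Count {
    std::uint64_t stages;               // log4(N)
    std::uint64_t butterfliesPerStage;  // N / 4
    std::uint64_t complexMults;         // 3 per butterfly
    std::uint64_t complexAdds;          // 8 per butterfly
};

// Assumption: N is a power of 4 (checked by the assertions).
Radix4Count radix4Count(std::uint64_t N) {
    assert(N >= 4);
    std::uint64_t stages = 0;
    for (std::uint64_t m = N; m > 1; m /= 4) {
        assert(m % 4 == 0);  // fails if N is not a power of 4
        ++stages;
    }
    const std::uint64_t butterflies = N / 4;
    return {stages, butterflies,
            3 * butterflies * stages,   // complex multiplications
            8 * butterflies * stages};  // complex additions
}
```

For instance, `radix4Count(16)` reports 2 stages of 4 butterflies each.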

Understanding the radix-2 FFT recursive algorithm

The recursive implementation of the radix-2 Decimation-In-Frequency algorithm can be understood using the following two figures. The first one refers to the stack-pushing phase, while the second one illustrates the stack-popping phase. In particular, the two figures illustrate the Matlab implementations that you may find on our GitHub website: Implementation I and Implementation II. The above recursive implementation is the counterpart of the iterative implemen...
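As a rough illustration of the two phases, here is a minimal C++ sketch of a recursive radix-2 Decimation-In-Frequency FFT. It is not a port of the Matlab implementations mentioned above; the name `fftDifRecursive` is hypothetical and the input length is assumed to be a power of two. The butterflies are computed on the way down the recursion (the pushing phase), and the half-size results are interleaved on the way back up (the popping phase).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Hypothetical recursive radix-2 DIF FFT; x.size() must be a power of two.
std::vector<cd> fftDifRecursive(const std::vector<cd>& x) {
    const std::size_t N = x.size();
    assert(N > 0 && (N & (N - 1)) == 0);
    if (N == 1) return x;  // bottom of the recursion

    const std::size_t half = N / 2;
    const double pi = std::acos(-1.0);
    std::vector<cd> top(half), bottom(half);
    for (std::size_t k = 0; k < half; ++k) {
        // DIF butterfly: sum feeds the even outputs, twiddled difference the odd ones.
        const cd w = std::polar(1.0, -2.0 * pi * double(k) / double(N));
        top[k] = x[k] + x[k + half];
        bottom[k] = (x[k] - x[k + half]) * w;
    }

    // Descend: each call pushes a new frame with a half-size problem.
    const std::vector<cd> even = fftDifRecursive(top);
    const std::vector<cd> odd = fftDifRecursive(bottom);

    // Return: interleave the half-size spectra into natural order.
    std::vector<cd> X(N);
    for (std::size_t r = 0; r < half; ++r) {
        X[2 * r] = even[r];
        X[2 * r + 1] = odd[r];
    }
    return X;
}
```

Because the interleaving is done at every level on the way back, this version needs no separate bit-reversal pass.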

Radix-2 Decimation-In-Frequency Iterative FFT

On our GitHub page, we provide an implementation of the radix-2 Decimation-In-Frequency FFT in Matlab. The code is iterative and follows the scheme in the following figure: A recursive approach is also possible. The implementation also calculates the number of performed multiplications and additions and compares them with the theoretical calculations reported in “Number of operation counts for radix-2 FFTs”. The code is obviously much slower than the highly optimized FFTW explo...
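For readers who prefer code to a flow graph, a minimal iterative radix-2 Decimation-In-Frequency sketch in C++ might look as follows. It is not the Matlab code from the GitHub page and does not count operations; the name `fftDif` is hypothetical and the input length is assumed to be a power of two. The butterflies run with natural-order input, and the output is reordered by a bit-reversal pass at the end.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;

// Hypothetical iterative radix-2 DIF FFT; x.size() must be a power of two.
std::vector<cd> fftDif(std::vector<cd> x) {
    const std::size_t N = x.size();
    assert(N > 0 && (N & (N - 1)) == 0);
    const double pi = std::acos(-1.0);

    // Stages: butterfly half-width goes N/2, N/4, ..., 1.
    for (std::size_t half = N / 2; half >= 1; half /= 2) {
        const std::size_t span = 2 * half;
        for (std::size_t start = 0; start < N; start += span) {
            for (std::size_t k = 0; k < half; ++k) {
                const cd w = std::polar(1.0, -2.0 * pi * double(k) / double(span));
                const cd a = x[start + k];
                const cd b = x[start + k + half];
                x[start + k] = a + b;               // sum, no twiddle
                x[start + k + half] = (a - b) * w;  // difference times twiddle
            }
        }
    }

    // DIF leaves the spectrum in bit-reversed order; undo that here.
    for (std::size_t i = 0, j = 0; i < N; ++i) {
        if (i < j) std::swap(x[i], x[j]);
        std::size_t bit = N >> 1;
        while (bit != 0 && (j & bit)) {
            j ^= bit;
            bit >>= 1;
        }
        j |= bit;  // j now holds the bit-reversed value of i + 1
    }
    return x;
}
```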

Radix-2 Decimation-In-Time Iterative FFT

On our GitHub page, we provide an implementation of the radix-2 Decimation-In-Time FFT in Matlab. The code is iterative and follows the scheme in the following figure: A recursive approach is also possible. The implementation also calculates the number of performed multiplications and additions and compares them with the theoretical calculations reported in “Number of operation counts for radix-2 FFTs”. The code is obviously much slower than the highly optimized FFTW ...
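A minimal iterative radix-2 Decimation-In-Time sketch in C++ is given below for comparison. Again, this is not the Matlab code from the GitHub page and it does not count operations; the name `fftDit` is hypothetical and the input length is assumed to be a power of two. Here the bit-reversal pass comes first, and the twiddle factor is applied before the sum and difference are formed.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cd = std::complex<double>;

// Hypothetical iterative radix-2 DIT FFT; x.size() must be a power of two.
std::vector<cd> fftDit(std::vector<cd> x) {
    const std::size_t N = x.size();
    assert(N > 0 && (N & (N - 1)) == 0);
    const double pi = std::acos(-1.0);

    // DIT consumes the input in bit-reversed order.
    for (std::size_t i = 0, j = 0; i < N; ++i) {
        if (i < j) std::swap(x[i], x[j]);
        std::size_t bit = N >> 1;
        while (bit != 0 && (j & bit)) {
            j ^= bit;
            bit >>= 1;
        }
        j |= bit;
    }

    // Stages: butterfly span goes 2, 4, ..., N.
    for (std::size_t span = 2; span <= N; span *= 2) {
        const std::size_t half = span / 2;
        for (std::size_t start = 0; start < N; start += span) {
            for (std::size_t k = 0; k < half; ++k) {
                const cd w = std::polar(1.0, -2.0 * pi * double(k) / double(span));
                const cd a = x[start + k];
                const cd b = w * x[start + k + half];  // twiddle applied first
                x[start + k] = a + b;
                x[start + k + half] = a - b;
            }
        }
    }
    return x;
}
```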

Compiling CUDA mex files with Visual Studio 2013

Configuration: Matlab 2015b, Visual Studio 2013, Intel 64-bit machine. In Visual Studio do the following: 1) File -> New Project; Select location and name; in the project type, select NVIDIA -> CUDA 8.0 (choose your CUDA version as appropriate); 2) Project -> Properties -> Configuration Manager -> Active Solution Platform -> choose x64; 3) Project -> Properties -> Configuration -> Release (possibly optional); 4) Project -> Properties -> Configuration ...

Sparse 3D Matrices in Matlab

As is well known, Matlab does not directly support sparse 3D matrices. A workaround is to use cell arrays of sparse matrices. Suppose that you want to create a sparse matrix containing only the elements (1, 1, 1) and (1, 3, 10), and suppose that A(1, 1, 1) = 3 and A(1, 3, 10) = 54. You can do the following: mySp{1} = sparse(3, 3); mySp{10} = sparse(3, 3); mySp{1}(1, 1) = 3; mySp{10}(1, 3) = 54; In this way, >> mySp mySp = [3x3 double] [] [] [] [] []...

Emulating Matlab’s meshgrid in CUDA

On our GitHub webpage, we have posted a worked example that implements Matlab's classical meshgrid function in CUDA. In Matlab:

x = [1 2 3]; y = [4 5 6 7]; [X, Y] = meshgrid(x, y);

produces:

X =
     1     2     3
     1     2     3
     1     2     3
     1     2     3

and

Y =
     4     4     4
     5     5     5
     6     6     6
     7     7     7

X is exactly the four-fold replication of the x array, while Y is the three-fold consecutive replication of each element of the y array.
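The replication pattern described above reduces to a simple index mapping. The sketch below is a CPU-side C++ reference, not the CUDA code from the GitHub page: the names `Mesh` and `meshgrid` are hypothetical, and the outputs are assumed to be stored row by row (Matlab itself stores matrices column by column). Each output element depends only on its own linear index, so the loop body is the kind of work that could be given to one GPU thread per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: rows x cols matrices stored row by row.
struct Mesh {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> X;
    std::vector<double> Y;
};

Mesh meshgrid(const std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t cols = x.size();  // one column per element of x
    const std::size_t rows = y.size();  // one row per element of y
    Mesh m{rows, cols,
           std::vector<double>(rows * cols),
           std::vector<double>(rows * cols)};
    for (std::size_t idx = 0; idx < rows * cols; ++idx) {
        const std::size_t r = idx / cols;  // row index
        const std::size_t c = idx % cols;  // column index
        m.X[idx] = x[c];  // every row of X is a copy of x
        m.Y[idx] = y[r];  // every row of Y repeats one element of y
    }
    return m;
}
```

With x = [1 2 3] and y = [4 5 6 7], the returned `X` and `Y` hold the 4 x 3 matrices shown above, read row by row.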

Tricks and Tips: Exchanging data between Matlab and C++ in ASCII format

To save data from C++ to a text file, do the following (full worked example storing an array of N double precision real numbers): #include <fstream> #include <cstdlib> int main() {     const int N = 6;     double *U = (double*)malloc(N * sizeof(double));          for (int i=0; i<N; i++) U[i] = 2.5 * (double)i;          std::ofstream outfile;     outfile.open("data.txt");     for(int i=0; i<N; i++){         outfile << U[i];         outfile << "\n";     }     outfile.close();     r...