Finite Impulse Response (FIR) Filter in CUDA implemented as a 1D convolution

In this post, we consider the problem of computing the output of a Finite Impulse Response (FIR) filter by directly evaluating the 1D convolution in CUDA. When the filter impulse response is long, an alternative is to perform the filtering in the conjugate domain using FFTs. A sample code using the cuFFT library is available on our GitHub website. It is a direct translation of the Matlab-based example reported at Low...
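As a point of reference for the direct evaluation mentioned above, here is a minimal host-side sketch in plain C++ of the convolution sum y[n] = sum over k of h[k] x[n - k]. The function name `firDirect` and the choice of returning an output of the same length as the input (zero initial state) are assumptions of this sketch, not taken from the post. In a CUDA version, the outer loop over n would typically be mapped to one thread per output sample.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for a direct-form FIR filter:
//   y[n] = sum_{k=0}^{K-1} h[k] * x[n - k],  with x[n] = 0 for n < 0.
// The output is truncated to the input length (an assumption of this sketch).
std::vector<float> firDirect(const std::vector<float>& x,
                             const std::vector<float>& h) {
    assert(!h.empty());
    std::vector<float> y(x.size(), 0.0f);
    for (std::size_t n = 0; n < x.size(); ++n) {
        float acc = 0.0f;
        // Only taps with n - k >= 0 contribute, since earlier samples are zero.
        for (std::size_t k = 0; k < h.size() && k <= n; ++k) {
            acc += h[k] * x[n - k];
        }
        y[n] = acc;
    }
    return y;
}
```

A reference like this can be used to check the output of a GPU kernel sample by sample. Its cost grows as the product of the input and filter lengths, which is why the FFT-based route becomes attractive for long impulse responses.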

Tricks and Tips: 1D batched FFTs of real arrays

As reported in the cuFFT User Guide - CUDA 6.5, batch sizes other than 1 for cufftPlan1d() have been deprecated, and cufftPlanMany() should be used for multiple batch execution. Below is a fully worked example using cufftPlanMany() instead of cufftPlan1d(). As can be seen,
int rank = 1; // --- 1D FFTs
int n[] = { DATASIZE }; // --- Size of the Fourier transform
int istride = 1, ostride = 1; // --- Distance between two successive input/output elements
int idist = DATASIZE, odist = (DATASIZE / 2 + ...
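The layout arithmetic behind a batched real-to-complex plan can be sketched on the host in plain C++. This relies on the standard property that a real-input FFT of length n has n / 2 + 1 non-redundant complex output bins, and it assumes the signals are stored back to back with unit stride. The struct and function names are hypothetical helpers for illustration and are not part of cuFFT.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a cuFFT type): sizes for `batch` contiguous
// real-to-complex 1D transforms of length n, stored back to back.
struct R2CBatchLayout {
    int idist;             // real elements between the starts of consecutive inputs
    int odist;             // complex elements between the starts of consecutive outputs
    std::size_t inCount;   // total number of real input elements
    std::size_t outCount;  // total number of complex output elements
};

R2CBatchLayout r2cBatchLayout(int n, int batch) {
    assert(n > 0 && batch > 0);
    const int half = n / 2 + 1;  // non-redundant bins of a real-input FFT
    return { n, half,
             static_cast<std::size_t>(n) * batch,
             static_cast<std::size_t>(half) * batch };
}
```

For example, three signals of length 8 would need 24 real input elements and 15 complex output elements under these assumptions; the distances returned here are the kind of values passed as the input and output distance arguments of a batched plan.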

Asynchronous execution of CUDA memory copies and cuFFT

Suppose we have to calculate the FFT of an array of size N. In this post, we address the question of whether it is possible to hide the latency by executing cuFFT concurrently with the memory copy of the array from the host to the device. In other words, we want to copy a first part of the array from host to device, begin the calculations on this portion of the array, then concurrently copy a second part of the array, and so on. This behavior can be "emulated" with a proper use of zero padding...

CuFFT and streams – Kepler architecture

This is a worked example of cuFFT execution and memory copies using streams in CUDA on the Kepler architecture.
#include <stdio.h>
#include <cufft.h>
#define NUM_STREAMS 3
/********************/
/* CUDA ERROR CHECK */
/********************/
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true) {
if (code != cudaSuccess) {
fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code)...