Using OpenMP and CUDA with Visual Studio

To use OpenMP in C++ with Visual Studio 2010 or 2013, two steps are needed: 1) #include <omp.h>; 2) go to Project -> Properties -> Configuration Properties -> C/C++ -> Language -> Open MP Support and select Yes (/openmp). Using OpenMP from CUDA code with Visual Studio 2010 or 2013 takes some more hacking. In particular, you need one additional step: 3) go to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.5\bin, open nvcc.profile as Administrator and...
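Once steps 1) and 2) are in place, a quick way to check that OpenMP is actually active is a small reduction loop. The following is only a minimal sketch: the function names parallel_sum and available_threads are made up for illustration, and the #ifdef _OPENMP guard is an addition so that the same code still compiles when OpenMP support is switched off.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>   // only pulled in when the compiler has OpenMP enabled (/openmp)
#endif

// Hypothetical helper: sums a vector with an OpenMP parallel-for reduction.
// Without OpenMP the pragma is ignored and the loop runs serially.
double parallel_sum(const std::vector<double>& values)
{
    double sum = 0.0;
    // Signed loop index: the OpenMP 2.0 support in Visual Studio requires it.
    const int n = static_cast<int>(values.size());
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += values[i];
    }
    return sum;
}

// Hypothetical helper: number of threads the runtime may use (1 if OpenMP is off).
int available_threads()
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

If available_threads() returns 1 on a multi-core machine, the /openmp switch is probably not set for the active configuration.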

Arduino on Jetson TK1 – Simple serial port reader for Ubuntu with C++

In this post, we will show how to develop a simple serial port reader for Ubuntu using C++; it is the first step if you want a device (your Arduino project, for example) to communicate via software with a Jetson TK1 board. As a demo project, we will use the simple photoresistor test from Arduino Playground. Ubuntu, like any other Linux distribution, provides access to serial ports as device files; in order to access the Arduino serial port, you have to find the corre...
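Since the serial port is exposed as a device file, opening it comes down to POSIX open() plus a termios configuration. The sketch below is not the code from the post: the function name open_serial_port is made up, and the settings (9600 baud, 8 data bits, no parity, 1 stop bit, raw input) are assumptions that must match whatever the Arduino sketch passes to Serial.begin().

```cpp
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

// Hypothetical helper: opens a serial device read-only in raw 9600 8N1 mode.
// Returns a file descriptor, or -1 if the device cannot be opened or configured.
int open_serial_port(const char* device)
{
    const int fd = open(device, O_RDONLY | O_NOCTTY);
    if (fd < 0) return -1;

    termios tty;
    if (tcgetattr(fd, &tty) != 0) { close(fd); return -1; }

    cfsetispeed(&tty, B9600);                        // assumed baud rate
    cfsetospeed(&tty, B9600);
    tty.c_cflag |= (CLOCAL | CREAD);                 // ignore modem lines, enable receiver
    tty.c_cflag &= ~CSIZE;
    tty.c_cflag |= CS8;                              // 8 data bits
    tty.c_cflag &= ~(PARENB | CSTOPB);               // no parity, 1 stop bit
    tty.c_lflag &= ~(ICANON | ECHO | ECHOE | ISIG);  // raw input, no echo
    tty.c_iflag &= ~(IXON | IXOFF | IXANY | ICRNL);  // no flow control or CR translation
    tty.c_oflag &= ~OPOST;                           // raw output
    tty.c_cc[VMIN]  = 1;                             // read() blocks until at least 1 byte
    tty.c_cc[VTIME] = 0;

    if (tcsetattr(fd, TCSANOW, &tty) != 0) { close(fd); return -1; }
    return fd;
}
```

With a valid descriptor, a reader is then a loop around read(fd, buffer, sizeof(buffer)) that prints the received bytes. The device name to pass in (often something like /dev/ttyACM0 for an Arduino) depends on the system and is an assumption here.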

Tricks and Tips: Timing CPU and GPU (CUDA) operations

In this post, the timing of both C/C++ and CUDA operations is briefly discussed. You just need to add the 4 files reported below to your project and #include the two header files as: // --- Timing includes #include "TimingCPU.h" #include "TimingGPU.cuh" The two classes can be used as follows. Timing CPU section TimingCPU timer_CPU; timer_CPU.StartCounter(); CPU operations to be timed std::cout << "CPU Timing = " << timer_CPU.GetCounter() << " ms" << std::endl...
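The TimingCPU and TimingGPU classes live in the files the post refers to and are not reproduced here. For readers who only need the CPU side, a rough stand-in with the same start/read pattern can be sketched with std::chrono. This assumes a C++11 compiler; the class name ChronoTimer and its methods are made up and are not the interface of TimingCPU.

```cpp
#include <chrono>

// Hypothetical stand-in for a CPU timer (not the TimingCPU class from the post).
class ChronoTimer {
public:
    // Record the starting instant.
    void start() { t0_ = std::chrono::steady_clock::now(); }

    // Milliseconds elapsed since the last start() (or since construction).
    double elapsed_ms() const
    {
        const std::chrono::duration<double, std::milli> dt =
            std::chrono::steady_clock::now() - t0_;
        return dt.count();
    }

private:
    std::chrono::steady_clock::time_point t0_ = std::chrono::steady_clock::now();
};
```

Usage mirrors the snippet in the excerpt: call start(), run the CPU operations to be timed, then print elapsed_ms(). This sketch does not cover GPU timing.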

Tricks and Tips: Exchanging data between Matlab and C++ in ASCII format

To save data from C++ in a txt format file, do the following (full worked example to store an array of N double precision real numbers): #include <cstdlib> #include <fstream> int main() { const int N = 6; double *U = (double*)malloc(N * sizeof(double)); for (int i=0; i<N; i++) U[i] = 2.5 * (double)i; std::ofstream outfile; outfile.open("data.txt"); for(int i=0; i<N; i++){ outfile << U[i]; outfile << "\n"; } outfile.close(); r...
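The same idea can be wrapped in a pair of functions, which also gives a way to check the file by reading it back. This is a sketch rather than the post's code: the names save_ascii and load_ascii are made up, and writing 17 significant digits is an added choice so that doubles survive the text round trip unchanged.

```cpp
#include <cstddef>
#include <fstream>
#include <iomanip>
#include <string>
#include <vector>

// Hypothetical helper: writes one value per line in text form.
bool save_ascii(const std::string& path, const std::vector<double>& data)
{
    std::ofstream out(path.c_str());
    if (!out) return false;
    out << std::setprecision(17);   // enough digits to round-trip a double
    for (std::size_t i = 0; i < data.size(); ++i) {
        out << data[i] << "\n";
    }
    return !out.fail();
}

// Hypothetical helper: reads whitespace-separated numbers until the stream fails.
std::vector<double> load_ascii(const std::string& path)
{
    std::vector<double> data;
    std::ifstream in(path.c_str());
    double value;
    while (in >> value) {
        data.push_back(value);
    }
    return data;
}
```

A file written this way is a plain column of numbers, which Matlab's load function can read directly.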

Tricks and Tips: Exchanging data between Matlab and C++ in binary format

To save data from C++ in a binary format file, do the following (full worked example to store an array of N double precision real numbers): #include <cstdlib> #include <fstream> int main() { const int N = 6; double *U = (double*)malloc(N * sizeof(double)); for (int i=0; i<N; i++) U[i] = i; std::ofstream outfile; outfile.open("file.dat", std::ios::out | std::ios::binary); outfile.write((char*)U, N*sizeof(double)); outfile.close(); return 0; } To load data stored in bi...
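For completeness, here is a sketch of the same write paired with a C++ read-back, which is handy for verifying a file before moving it to Matlab. The names save_binary and load_binary are made up; the file holds raw doubles in the machine's native byte order with no header, exactly as in the example above.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: dumps the raw bytes of the array to a file.
bool save_binary(const std::string& path, const std::vector<double>& data)
{
    std::ofstream out(path.c_str(), std::ios::out | std::ios::binary);
    if (!out) return false;
    if (!data.empty()) {
        out.write(reinterpret_cast<const char*>(&data[0]),
                  static_cast<std::streamsize>(data.size() * sizeof(double)));
    }
    return !out.fail();
}

// Hypothetical helper: reads the whole file back as an array of doubles.
std::vector<double> load_binary(const std::string& path)
{
    std::vector<double> data;
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary | std::ios::ate);
    if (!in) return data;
    const std::streamoff bytes = in.tellg();   // opened at the end: position == file size
    if (bytes <= 0) return data;
    in.seekg(0, std::ios::beg);
    data.resize(static_cast<std::size_t>(bytes) / sizeof(double));
    in.read(reinterpret_cast<char*>(&data[0]),
            static_cast<std::streamsize>(data.size() * sizeof(double)));
    if (!in) data.clear();                     // short or failed read
    return data;
}
```

On the Matlab side, a file of this kind is typically read with fopen and fread using the 'double' precision string.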

1D linear interpolation in CUDA

In this post, we present the full implementation of 1D linear interpolation using CUDA. You will see that the code performs the 1D linear interpolation in four different ways: • CPU; • GPU; • GPU using tex1Dfetch; • GPU using tex1D filtering. The code uses one of the newest features of CUDA 6.0 and of recent CUDA cards, namely, unified memory (cudaMallocManaged). // includes, system #include <cstdlib> #include <conio.h> #include <math.h> #include <fstream> #i...
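As a point of reference for the CPU variant, linear interpolation between uniformly spaced samples can be sketched in a few lines. This is not the post's code: the function name linear_interp is made up, samples are assumed to sit at the integer positions 0, 1, 2, ..., and queries outside that range are clamped to the end values.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: linear interpolation of samples located at
// integer positions 0..N-1, evaluated at a fractional position x.
float linear_interp(const std::vector<float>& samples, float x)
{
    if (samples.empty()) return 0.0f;
    const float last = static_cast<float>(samples.size() - 1);
    if (x <= 0.0f) return samples.front();   // clamp on the left
    if (x >= last) return samples.back();    // clamp on the right

    const std::size_t i = static_cast<std::size_t>(std::floor(x)); // left neighbour
    const float frac = x - static_cast<float>(i);                   // weight of the right neighbour
    return samples[i] + frac * (samples[i + 1] - samples[i]);
}
```

A plain GPU kernel would evaluate the same expression with one thread per query point.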

Implementations of the BiConjugate Gradient Stabilized optimizer on CPU and GPU

The BiConjugate Gradient Stabilized (BiCGStab) optimizer, see http://en.wikipedia.org/wiki/Biconjugate_gradient_stabilized_method, can be easily and efficiently implemented on both CPU and GPU by making heavy use of BLAS/cuBLAS, since the algorithm consists of matrix-vector multiplications, scalar products and norms. For the GPU side, it is worth having one's own implementation of the BiCGStab optimizer since the sample contained in the CUDA SDK refers to sparse linear systems, while ...
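To make the building blocks concrete, here is a compact sketch of unpreconditioned BiCGStab for a dense system, written with plain loops where a BLAS-based version would call the corresponding routines. It follows the standard textbook iteration and is not the implementation from the post; the names Vec, Mat, dot, matvec and bicgstab, the tolerance and the iteration cap are all illustrative assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;   // dense matrix stored as a vector of rows

// Scalar product (a BLAS version would use a dot routine).
double dot(const Vec& a, const Vec& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Dense matrix-vector product (a BLAS version would use gemv).
Vec matvec(const Mat& A, const Vec& x)
{
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) y[i] = dot(A[i], x);
    return y;
}

// Solves A x = b starting from the initial guess in x.
// Returns the number of iterations used, or -1 on breakdown / no convergence.
int bicgstab(const Mat& A, const Vec& b, Vec& x, double tol = 1e-12, int max_iter = 1000)
{
    const std::size_t n = b.size();
    Vec r(n), p(n, 0.0), v(n, 0.0), s(n), t;
    const Vec Ax = matvec(A, x);
    for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ax[i];
    if (std::sqrt(dot(r, r)) < tol) return 0;
    const Vec r_hat = r;                       // fixed shadow residual
    double rho = 1.0, alpha = 1.0, omega = 1.0;

    for (int it = 1; it <= max_iter; ++it) {
        const double rho_new = dot(r_hat, r);
        if (rho_new == 0.0) return -1;         // breakdown
        const double beta = (rho_new / rho) * (alpha / omega);
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * (p[i] - omega * v[i]);
        v = matvec(A, p);
        alpha = rho_new / dot(r_hat, v);
        for (std::size_t i = 0; i < n; ++i) s[i] = r[i] - alpha * v[i];
        if (std::sqrt(dot(s, s)) < tol) {      // converged after the half step
            for (std::size_t i = 0; i < n; ++i) x[i] += alpha * p[i];
            return it;
        }
        t = matvec(A, s);
        omega = dot(t, s) / dot(t, t);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i] + omega * s[i];
            r[i] = s[i] - omega * t[i];
        }
        if (std::sqrt(dot(r, r)) < tol) return it;
        rho = rho_new;
    }
    return -1;
}
```

Each iteration costs two matrix-vector products plus a handful of scalar products and vector updates, which is why the method maps so directly onto BLAS/cuBLAS calls.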

Gaussian elimination with CUDA

The VS project Gaussian elimination with CUDA, which you can find in the download section, contains CPU and GPU routines for solving a linear system of equations by Gaussian elimination without pivoting. Besides providing standard CPU Gaussian elimination and solution of an upper triangular system, sequential and parallel codes have been developed based on the paper: Manuel Carcenac, "From tile algorithm to stripe algorithm: a CUBLAS-based parallel implementation on GPUs of Gauss method for the resol...
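The standard CPU path mentioned above (elimination without pivoting followed by the solution of the upper triangular system) can be sketched as follows. This is an illustrative version, not the project's code and not the tile or stripe algorithm from the paper; the name gauss_solve and the zero-pivot check are assumptions.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;   // dense matrix stored as a vector of rows

// Hypothetical helper: solves A x = b by Gaussian elimination WITHOUT pivoting.
// A and b are taken by value so the caller's data is left untouched.
// Returns false if a zero pivot is met (no row exchange is attempted).
bool gauss_solve(Mat A, Vec b, Vec& x)
{
    const std::size_t n = b.size();

    // Forward elimination: reduce A to upper triangular form.
    for (std::size_t k = 0; k < n; ++k) {
        if (A[k][k] == 0.0) return false;
        for (std::size_t i = k + 1; i < n; ++i) {
            const double m = A[i][k] / A[k][k];
            for (std::size_t j = k; j < n; ++j) A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    }

    // Back substitution on the upper triangular system.
    x.assign(n, 0.0);
    for (std::size_t k = n; k-- > 0; ) {
        double sum = b[k];
        for (std::size_t j = k + 1; j < n; ++j) sum -= A[k][j] * x[j];
        x[k] = sum / A[k][k];
    }
    return true;
}
```

Without pivoting the routine is only safe for matrices whose pivots stay away from zero, for example diagonally dominant ones.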

Bluebird Library – New Features: Stand-alone components for Complex Type management

Orange Owl Solutions introduces new stand-alone components for Complex Type management in the Bluebird library. Thanks to the power of CUDA/C++ metaprogramming, you will be able to develop high performance solutions in an easy and intuitive way. Main features The main features of the current beta version (0.5) are: C++/CUDA Metaprogramming; simple (Matlab/Octave-like) ways to manage vectors and matrices; stand-alone components for Complex Type management; Peer-to-Peer (P2P) communication betwee...

Bluebird library

Orange Owl Solutions introduces the new Bluebird library for the fast coding of scientific computing on GPUs and CPUs. Thanks to the power of CUDA/C++ metaprogramming, you will be able to develop high performance solutions in an easy and intuitive way. Main features The main features of the current beta version are: C++/CUDA Metaprogramming; simple (Matlab/Octave-like) ways to manage vectors and matrices; We are working to release an extensive function library (e.g., interpolation, special funct...