Using CUDA dynamic parallelism in Visual Studio

Starting from CUDA 5.0, CUDA enables the use of dynamic parallelism for GPUs with compute capability 3.5 or higher. Dynamic parallelism allows launching kernels directly from other kernels and enables further speedups in those applications which can benefit of a better handling of the computing workloads at runtime directly on the GPU; in many cases, dynamic parallelism avoids CPU/GPU interactions with benefits to mechanisms like recursion. To use dynamic parallelism in Visual Studio 2010 or ...
More

Calling CUDA Thrust primitives from within a kernel

Starting from Thrust 1.8, CUDA Thrust primitives can be combined with the thrust::seq execution policy to run sequentially within a single CUDA thread (or sequentially within a single CPU thread). Starting from Thrust 1.8.1, CUDA Thrust primitives can be combined with the thrust::device execution policy to run in parallel within a single CUDA thread exploiting CUDA dynamic parallelism. An example of both is reported on our GitHub website. The example performs reductions of the rows of a matri...
More