CUDA Timing for Multi-GPU Applications

Consider the example from the previous post, which showed that asynchronous copies are what make true multi-GPU concurrency possible. In particular, consider Test case #8 of that post. The full code of Test case #8 is available on our GitHub website, while the profiler timeline is reported here for the sake of clarity. The full code for the timing example reported here is also available on our GitHub website.

Timing the asynchronous copies - concurrency is destroyed

Now...

Tricks and Tips: Timing CPU and GPU (CUDA) operations

This post briefly discusses how to time both C/C++ and CUDA operations. You just need to add the 4 files reported below to your project and #include the two header files as:

    // --- Timing includes
    #include "TimingCPU.h"
    #include "TimingGPU.cuh"

The two classes can be used as follows.

Timing CPU section

    TimingCPU timer_CPU;
    timer_CPU.StartCounter();
    // CPU operations to be timed
    std::cout << "CPU Timing = " << timer_CPU.GetCounter() << " ms" << std::endl...
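The contents of TimingCPU.h are not shown in this excerpt, so the following is only a minimal sketch of how a CPU timer with the same StartCounter()/GetCounter() interface could be written in portable C++. The class name SketchTimerCPU, the helper TimeBusyLoopMs, and the choice of std::chrono::steady_clock are assumptions made for illustration, not the post's actual implementation.

```cpp
#include <chrono>

// Hypothetical stand-in for the post's TimingCPU class (the real header is
// not shown in the excerpt). It assumes GetCounter() reports elapsed
// milliseconds since the last StartCounter() call, as the usage suggests.
class SketchTimerCPU {
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals

public:
    // Record the current time as the start of the timed region.
    void StartCounter() { start_ = Clock::now(); }

    // Return the time elapsed since StartCounter(), in milliseconds.
    double GetCounter() const {
        return std::chrono::duration<double, std::milli>(Clock::now() - start_)
            .count();
    }

private:
    Clock::time_point start_ = Clock::now();
};

// Hypothetical helper: times a simple busy loop and returns milliseconds.
double TimeBusyLoopMs(long iterations) {
    SketchTimerCPU timer;
    timer.StartCounter();
    volatile double sink = 0.0;  // volatile keeps the loop from being optimized away
    for (long i = 0; i < iterations; ++i) {
        sink = sink + 1.0;
    }
    return timer.GetCounter();
}
```

Calling TimeBusyLoopMs(1000000) and printing the result with std::cout mirrors the "CPU Timing = ... ms" usage above. A monotonic clock is chosen in this sketch because wall-clock time can jump when the system clock is adjusted, which would distort short intervals.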