Programming on the GPU with CUDA

Nowadays the Graphics Processing Unit (GPU) is a mainstream hardware component in high-performance computing. On an affordable budget, anyone can harness supercomputer-class performance. However, achieving efficient parallelism requires knowledge in three areas: first, the architecture and compute capabilities of GPUs; second, the special constructs for programming a GPU-equipped computer; and third, the special algorithms for performing logical and mathematical operations in parallel.

What is CUDA? 

CUDA stands for Compute Unified Device Architecture. It is a software development toolkit for GPU programming, maintained by the mainstream GPU manufacturer Nvidia. CUDA provides language extensions for C, C++, Fortran, and Python, as well as domain-specific libraries. A single source file can then contain the instructions for both the CPU and the GPU. CUDA-extended code also keeps pace closely with the rapid developments in the underlying technology.

Goals and prerequisites

To guide you in this development niche, the Delft Institute for Computational Science and Engineering (DCSE) offers a CUDA course every quarter. We will explain the basic principles and some advanced topics of GPU programming with CUDA. You will apply these notions to hands-on examples in our lab room. After this course you will be able to get simple CUDA programs running on a GPU-equipped computer.
As a prerequisite, a rudimentary understanding of a programming language like C++ or Java is ideal; knowledge of Fortran or Python will be helpful too. Some interest in linear algebra and iterative solvers is a small advantage.

Programme

09:15 - 09:30 Arrival with coffee and tea
09:30 - 10:30 Parallel computing on GPUs, Prof.dr.ir. C. Vuik
10:45 - 11:30 GPU design and architecture, Ir. C.W.J. Lemmens
11:45 - 12:30 Lab 1: CUDA introduction, Ir. C.W.J. Lemmens

12:45 - 13:30 Lunch

13:30 - 14:45 Lab 2: Using cuBLAS and cuFFT, Ir. C.W.J. Lemmens
15:00 - 17:00 Lab 3: Debugging and profiling, Ir. C.W.J. Lemmens

09:15 - 09:30 Arrival with coffee and tea
09:30 - 10:30 Lab 4: Shared memory, streams, atomics, Ir. C.W.J. Lemmens
10:45 - 12:30 Lab 5: Optimising code: inner product, Ir. C.W.J. Lemmens

12:45 - 13:30 Lunch

13:30 - 14:30 Lab 6: Parallel solvers on GPUs, Dr. Matthias Möller
14:45 - 16:30 Lab 7: Solvers on GPU, Dr. Matthias Möller
16:30 - 17:00 Lab 8: Solvers on GPU (continued), Dr. Matthias Möller

09:15 - 09:30 Arrival with coffee and tea
09:30 - 10:30 Lab 4: Shared memory, streams, atomics, Ir. C.W.J. Lemmens
10:45 - 12:30 Lab 5: Optimising code: inner product, Ir. C.W.J. Lemmens

12:45 - 13:30 Lunch

13:30 - 14:30 Lab 6: TensorFlow, PyTorch, Dr. Matthias Möller
14:45 - 16:30 Lab 7: Using tensor cores directly, Dr. Matthias Möller
16:30 - 17:00 Lab 8: Physics machine learning with modules, Dr. Matthias Möller
