Programming with CUDA

Summer Semester 2011

Advances in GPU hardware have made GPUs far superior to CPUs in raw floating-point throughput. State-of-the-art GPUs achieve higher GFLOPS than CPUs at lower temperatures. Thus, while thermal factors may slow down progress in CPU hardware development, no such barrier is currently in sight for GPUs.



The computational power of GPUs has so far been restricted to niche graphics programmers because of hardware and API restrictions. However, many non-graphics applications can also benefit from higher computational capabilities.

CUDA, Compute Unified Device Architecture, is a new technology that lets ordinary programmers harness the computational power of modern GPUs. On the software side, it is a minimal set of extensions to the popular C programming language, and on the hardware side, it comprises the hardware that supports this programming language.
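To give a first impression of what these C extensions look like, here is a minimal sketch of element-wise vector addition. It is not taken from the lecture material. The names vecAdd and vecAddOnGpu and the block size of 256 threads are assumptions chosen for illustration, and the sketch assumes n > 0 and a CUDA-capable device.

```cuda
#include <cuda_runtime.h>

// __global__ marks a kernel: a function that runs on the GPU
// and is launched from host (CPU) code.
__global__ void vecAdd(const float* a, const float* b, float* c, int n)
{
    // Built-in variables identify the thread; each thread handles one element.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host helper: copies the inputs to the device, launches the
// kernel, and copies the result back. Returns the first error encountered.
cudaError_t vecAddOnGpu(const float* a, const float* b, float* c, int n)
{
    float* dA = NULL;
    float* dB = NULL;
    float* dC = NULL;
    size_t bytes = (size_t)n * sizeof(float);

    cudaError_t err = cudaMalloc((void**)&dA, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dB, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dC, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(dA, a, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dB, b, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        int threads = 256;                         // assumed block size
        int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
        // The <<<blocks, threads>>> launch syntax is a CUDA extension to C.
        vecAdd<<<blocks, threads>>>(dA, dB, dC, n);
        err = cudaGetLastError();                  // catches launch errors
    }

    // This blocking copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(c, dC, bytes, cudaMemcpyDeviceToHost);

    // cudaFree accepts NULL, so cleanup is safe after a partial failure.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return err;
}
```

A host program would call vecAddOnGpu(a, b, c, n) with three host arrays of length n and compare the return value against cudaSuccess. Compiling such a file requires nvcc.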

For previous courses about CUDA see here.

 Quick links : Organization | Literature | Lectures | Exercises | Projects | Exams


Lectures will take place on
Monday 4pm-6pm in CZ SR 123
Tuesday 4pm-6pm in AB4 SR 003

Exercises will take place on
Thursday 10am-12pm in AB4 SR 114

The course is planned in two parts.

Part 1 : (until approx. end of May) We will present the CUDA programming language, and the associated execution and memory models.
Part 2 : (after approx. end of May) Project Time.

The people responsible for the course are
Jens K. Müller
Thomas Baumbach
Office hours are on
Wednesdays : 2pm-4pm Ernst-Abbe-Platz 2 (room 3335)



The NVIDIA website provides numerous resources for CUDA. A few starting points are (links to university courses on CUDA) (documentation, programming guide)



Lecture slides borrow heavily from the NVIDIA CUDA C Programming Guide v3.2 and the CUDA book.
The code presented in the lecture is available at

Monday, 4th Apr, 2011 Organization
Tuesday, 5th Apr, 2011 Introduction and Hello World
Monday, 11th Apr, 2011 CUDA Programming Model
Tuesday, 12th Apr, 2011 Error Handling, Managing Global and Shared Memory
Monday, 18th Apr, 2011 Using Global, Shared, and Constant Memory
Tuesday, 19th Apr, 2011 Synchronization and Texture Memory
Monday, 25th Apr, 2011 Holiday
Tuesday, 26th Apr, 2011 Surface Memory, Page-Locked Memory, and OpenGL
Monday, 2nd May, 2011 Hands on Surface and Texture Memory
Tuesday, 3rd May, 2011 Streams, Events, and Scheduling
Monday, 9th May, 2011 Hands on Streams and CUDA Profiler
Tuesday, 10th May, 2011 Device Utilization and Global Memory Optimizations
Monday, 16th May, 2011 Memory and Instruction Optimizations
Tuesday, 17th May, 2011 Kernel Specifics for 2.x
"Optimizing Parallel Reduction in CUDA" by Mark Harris
Monday, 23rd May, 2011
Tuesday, 24th May, 2011 Best Practice
Monday, 30th May, 2011 Analyzing parallel algorithms
Tuesday, 31st May, 2011 Project Organization
Thursday, 7th July, 2011 Introduction to the Projects



Tutorial slides (updated after each tutorial/exercise)

Thur, 7th Apr, 2011 Exercise No. 0 slides: pdf material: n/a
Thur, 14th Apr, 2011 Exercise No. 1 (v4) slides: pdf material: n/a
Thur, 21st Apr, 2011 Exercise No. 2 slides: pdf material: raytracer_v2.tar.gz & result.ppm
Thur, 28th Apr, 2011 Exercise No. 3 slides: pdf material: scene_angle_v2.yaml & squirrel.yaml
Thur, 5th May, 2011 Exercise No. 4 slides: pdf material: tetraeder_v2.yaml t1.yaml t1.ppm t2.yaml t2.ppm
Thur, 12th May, 2011 Exercise No. 5 slides: pdf material: tetraeder_new.yaml double-torus.yaml
Thur, 19th May, 2011 Exercise No. 6 slides: pdf material: n/a
Thur, 26th May, 2011 Exercise No. 7 slides: pdf material: n/a



For an overview of what people have been doing with CUDA, see here.

Weekly meetings and additional resources per project:

Matlab plugin: Tue, 2pm
Shallow water waves: Tue, 3pm (additional resources: swe1.mp4 swe2.mp4)
Longest Common Subsequence: Tue, 4pm
Ant System: Tue, 5pm
Panic Simulator: Thu, 11am

Your project has to be submitted by Friday, 8th of July 2011. We will check out your project from the above repository. Your project has to be ready to compile.

Project presentations will take place on Thursday, 7th of July 2011 in EAP2 3325 from 10am to 12pm. Each project is assigned 10-15 minutes. You should leave some time for audience questions and discussion. As a rough guide, we suggest the following outline for the presentation:

- Problem description
- Challenges: CPU vs GPU implementation
- Results
- Demo

The presentations are scheduled in the following order.

Panic Simulator
Ant System
Longest Common Subsequence
Shallow water waves
Matlab plugin

To ensure a smooth and quick transition between presentations, we advise you to check your setup beforehand. For the presentations there will be a laptop running Linux wired to the university network. To run your demo you can access one of the CUDA machines.



Monday, 18th July 2011

11:00 - 11:30 Matthias Keil
11:30 - 12:00 Manuel Amthor

14:00 - 14:30 Martin Pfeiffer
14:30 - 15:00 Martin Hartmann
15:00 - 15:30 Daniel Kirbst
15:30 - 16:00 Johannes Schmidt