John Mellor-Crummey
Department of Computer Science, Rice University
johnmc@rice.edu
Programming GPUs with CUDA
COMP 422 Lecture 21 12 April 2011
Why GPUs?
• Two major trends
—GPU performance is pulling away from traditional processors
 –~10x memory bandwidth & floating-point ops
—availability of general (non-graphics) programming interfaces
• GPU in every PC and workstation
—massive volume, potentially broad impact
Figure Credit: NVIDIA CUDA Compute Unified Device Architecture Programming Guide 2.0
NVIDIA Tesla GPU
Figure Credit: http://images.nvidia.com/products/tesla_c870/Tesla_C870_F_med.png
Similar Tesla S870 server in badlands.rcsg.rice.edu (installed March 2008)
                           Tesla (G80)    Tesla2 (GT200)
CUDA Cores                 128            240
Processor Clock            1.69 GHz       1.47 GHz
Floating-Point Precision   IEEE 754 SP    IEEE 754 DP
Dedicated Memory           512 MB         1 GB GDDR3
Memory Clock               1.1 GHz        1.2 GHz
Memory Interface Width     256-bit        512-bit
Memory Bandwidth           70.4 GB/s      159 GB/s
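The bandwidth row follows directly from the memory clock, two transfers per clock (GDDR3 is double data rate), and the bus width in bytes; a back-of-the-envelope check for the G80 column:

\[
\underbrace{1.1\ \text{GHz}}_{\text{memory clock}} \times \underbrace{2}_{\text{DDR}} \times \underbrace{\tfrac{256\ \text{bits}}{8}}_{32\ \text{bytes}} = 70.4\ \text{GB/s}
\]

(The same formula with the GT200's 512-bit interface and its exact, unrounded memory clock yields the quoted 159 GB/s.)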
GPGPU?
• General-purpose computation using the GPU
—applications beyond 3D graphics
—typically, data-intensive science and engineering applications
• Data-intensive algorithms leverage GPU attributes (see the sketch after this list)
—large data arrays, streaming throughput
—fine-grain SIMD parallelism
—low-latency floating-point computation
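The pattern these bullets describe, one lightweight thread per element of a large array, is what CUDA exposes directly. Below is a minimal illustrative sketch (not taken from the lecture; the SAXPY example, array size, and launch parameters are arbitrary choices) computing y = a*x + y with one GPU thread per element:

// saxpy.cu -- compile with: nvcc saxpy.cu -o saxpy
#include <cuda_runtime.h>
#include <stdlib.h>

__global__ void saxpy(int n, float a, const float *x, float *y)
{
    // Fine-grain data parallelism: each thread owns exactly one array element.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    int n = 1 << 20;                       // 1M-element arrays
    size_t bytes = n * sizeof(float);

    // Host arrays
    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // Device arrays; data is streamed to dedicated GPU memory
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, dx, dy);

    // Copy the result back and clean up
    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}

The kernel itself contains no loop over the data; the grid of thread blocks supplies the parallelism, which is what lets the GPU hide memory latency with throughput.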