CUDA
CUDA is an architecture for GPUs (graphics processing units) developed by NVIDIA, introduced on June 23, 2007. The name "CUDA" was originally an acronym for "Compute Unified Device Architecture," but the acronym has since been dropped from official use.
CUDA improves the performance of computing tasks that benefit from parallel processing. These workloads, such as rendering 3D images in real time, are often called "embarrassingly parallel" because they split naturally into many independent pieces, each of which can be computed by an individual core. CUDA GPUs feature many of these CUDA cores, often numbering in the thousands, integrated onto a single video card. Software must be written specifically for the architecture using low-level CUDA libraries and APIs (application programming interfaces) provided by NVIDIA. The native programming language of these libraries is C++, but wrappers exist for other languages, enabling CUDA processing in a wide array of applications.
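As a rough sketch of what this looks like in practice (the kernel name `vectorAdd` and the sizes here are arbitrary choices for illustration, not part of any NVIDIA example), the program below launches the same small function across thousands of GPU threads, each handling one element of the data:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: each GPU thread adds one pair of elements.
__global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;                  // one million elements
    size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);           // unified memory, visible to CPU and GPU
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vectorAdd<<<blocks, threads>>>(a, b, c, n);   // launch across many parallel threads
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);            // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

A file like this would typically be built with NVIDIA's `nvcc` compiler, e.g. `nvcc vector_add.cu`.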
Although it was first used for graphics-specific tasks, in 2012 the CUDA architecture transitioned to handling more general types of computation, such as mining cryptocurrencies.
CUDA low-level APIs
Low-level APIs for performing specific tasks on the CUDA architecture include:
API | Description |
---|---|
cuBLAS | Basic linear algebra subroutines, accelerated for image analysis and machine learning. |
cudart | The CUDA runtime API, providing simplified management of device initialization, contexts, and modules for CUDA applications. |
cuFFT | Fast Fourier Transforms, applicable to multiple scientific disciplines, accelerated to run up to 10x as fast as on a CPU (central processing unit). |
cuRAND | Pseudorandom number generation in bulk quantities (see the usage sketch after this table). |
cuSOLVER | Accelerated "direct solvers," efficient algorithms for solving certain linear algebra problems. |
cuSPARSE | Subroutines for working with sparse matrices, which contain many zero-value elements. Accelerated to operate up to 5x faster than CPU implementations. |
NPP | NVIDIA Performance Primitives library for processing images, video, and other digital signals, up to 30x as fast as on a CPU. |
nvGRAPH | Accelerated implementations of graph analytics algorithms, including Google's famous PageRank algorithm. |
NVML | NVIDIA Management Library, enabling supervision and administration of multiple GPUs performing CUDA tasks. |
NVRTC | Runtime compilation library, which compiles CUDA C++ source code, supplied as character strings, into GPU code at runtime. |
PhysX | A scalable physics engine, supporting devices ranging from smartphones to high-end workstations. Integrated with existing third-party game engines such as Unreal Engine, Unity3D, and Stingray. |
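To give a concrete sense of how these libraries are called from C++ host code, here is a minimal sketch using the cuRAND host API to fill a GPU buffer with pseudorandom values; the generator type and seed are arbitrary choices for illustration:

```cpp
#include <cstdio>
#include <cuda_runtime.h>
#include <curand.h>

int main() {
    const size_t n = 1 << 20;
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));                    // buffer in GPU memory

    curandGenerator_t gen;
    curandCreateGenerator(&gen, CURAND_RNG_PSEUDO_DEFAULT);    // default pseudorandom generator
    curandSetPseudoRandomGeneratorSeed(gen, 1234ULL);          // arbitrary seed
    curandGenerateUniform(gen, d_data, n);                     // n uniform floats, generated on the GPU

    float first;
    cudaMemcpy(&first, d_data, sizeof(float), cudaMemcpyDeviceToHost);
    printf("first value: %f\n", first);

    curandDestroyGenerator(gen);
    cudaFree(d_data);
    return 0;
}
```

A program like this is linked against the library when building, e.g. `nvcc rand_example.cu -lcurand`.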
Languages with CUDA wrappers
Programming languages (other than C++) that can create software for CUDA GPUs include:
Examples of CUDA GPUs
The following are examples of a range of NVIDIA GPUs, compared by number of CUDA cores, maximum frequency in MHz, memory in GB, and MSRP (Manufacturer's Suggested Retail Price) when released.
GPU name | CUDA cores | Max Frequency (MHz) | Memory (GB) | MSRP |
---|---|---|---|---|
GeForce GTX TITAN Z | 5760 | 876 | 12 | $1420 |
NVIDIA TITAN Xp | 3840 | 1582 | 12 | $1200 |
GeForce GTX 1080 | 2560 | 1733 | 8 | $499 |
GeForce GTX 980 | 2048 | 1216 | 4 | $550 |
GeForce GTX 960 | 1024 | 1178 | 2 | $230 |
GeForce GTX 750 | 512 | 1085 | 1 | $120 |
GeForce GT 430 | 96 | 700 | 1 | $60 |
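The characteristics compared above can also be queried at runtime through the CUDA runtime API. The following minimal sketch prints similar figures for whatever GPUs are installed; note that the API reports streaming multiprocessors rather than CUDA cores, since the number of cores per multiprocessor varies by architecture:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);                  // number of CUDA-capable GPUs present
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("%s: %d multiprocessors, %d MHz max clock, %zu MB memory\n",
               prop.name,
               prop.multiProcessorCount,
               prop.clockRate / 1000,            // clockRate is reported in kHz
               prop.totalGlobalMem / (1024 * 1024));
    }
    return 0;
}
```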
Related terms: 3D, graphic, hardware terms, library, machine learning, video card, workstation