CUDA Compute Capability
Compute capability is a version number, written as major.minor, that NVIDIA assigns to each GPU architecture. It identifies the set of hardware features and instructions that a particular GPU supports. A higher compute capability generally indicates a newer architecture that supports more recent CUDA instructions and features; it is not, by itself, a measure of processing power.
- A numeric version (e.g., 7.5, 8.0, 8.6, 9.0) representing a GPU’s architecture features and supported instructions.
- Determines which CUDA features your GPU supports.
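To make the major.minor format concrete, here is a minimal Python sketch that parses a compute capability string and derives the `compute_XY` (virtual) and `sm_XY` (real) architecture names that `nvcc` uses. The function names `parse_compute_capability` and `arch_names` are hypothetical helpers introduced for illustration.

```python
def parse_compute_capability(cc: str) -> tuple[int, int]:
    """Split a 'major.minor' string such as '8.6' into the tuple (8, 6)."""
    major, minor = cc.strip().split(".")
    return int(major), int(minor)


def arch_names(cc: str) -> tuple[str, str]:
    """Return the (virtual, real) architecture names for a compute capability."""
    major, minor = parse_compute_capability(cc)
    return f"compute_{major}{minor}", f"sm_{major}{minor}"


print(arch_names("8.6"))  # ('compute_86', 'sm_86')
```

Keeping the version as a tuple of integers avoids a common pitfall: compared as strings, "10.0" sorts before "9.0", whereas `(10, 0) > (9, 0)` gives the expected result.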
Architecture | Compute Capability |
---|---|
Tesla | 1.0 – 1.3 |
Fermi | 2.0 – 2.1 |
Kepler | 3.0 – 3.7 |
Maxwell | 5.0 – 5.3 |
Pascal | 6.0 – 6.2 |
Volta | 7.0 – 7.2 |
Turing | 7.5 |
Ampere | 8.0 – 8.7 |
Ada Lovelace | 8.9 |
Hopper | 9.0 |
Blackwell (data center, e.g. B200) | 10.0 |
Blackwell (GeForce RTX 50 series) | 12.0 |
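The table can be turned into a small lookup. The sketch below is based on the table above; the Jetson values 7.2 (Volta) and 8.7 (Ampere) are included as an assumption from NVIDIA's published GPU list, and `ARCH_RANGES` and `architecture_for` are hypothetical names.

```python
# (lowest CC, highest CC, architecture name), each CC as a (major, minor) tuple.
ARCH_RANGES = [
    ((1, 0), (1, 3), "Tesla"),
    ((2, 0), (2, 1), "Fermi"),
    ((3, 0), (3, 7), "Kepler"),
    ((5, 0), (5, 3), "Maxwell"),
    ((6, 0), (6, 2), "Pascal"),
    ((7, 0), (7, 2), "Volta"),       # 7.2 is an assumption (Jetson Xavier)
    ((7, 5), (7, 5), "Turing"),
    ((8, 0), (8, 7), "Ampere"),      # 8.7 is an assumption (Jetson Orin)
    ((8, 9), (8, 9), "Ada Lovelace"),
    ((9, 0), (9, 0), "Hopper"),
    ((10, 0), (10, 0), "Blackwell"),
    ((12, 0), (12, 0), "Blackwell"),
]


def architecture_for(major: int, minor: int):
    """Return the architecture name for a compute capability, or None if unknown."""
    for low, high, name in ARCH_RANGES:
        if low <= (major, minor) <= high:
            return name
    return None


print(architecture_for(8, 6))  # Ampere
```

Note that the ranges are inclusive bounds, not a list of every shipped value, so the lookup will also accept a minor version inside a range that no GPU actually uses.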
CUDA Driver
- A kernel-level NVIDIA driver installed on the host OS.
- Enables the OS to communicate with the GPU hardware.
- Must be installed on the EC2 instance (or any other host) where the GPU workloads run.
- The driver version must be at least the minimum version required by the CUDA runtime you plan to use.
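The minimum-version rule can be sketched as a simple comparison. The example below assumes the installed driver version is available as a string (for instance from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`). The minimum Linux driver versions in `MIN_LINUX_DRIVER` are quoted from memory of NVIDIA's compatibility table and should be checked against the current release notes; the helper names are hypothetical.

```python
def parse_driver_version(version: str) -> tuple[int, ...]:
    """Turn '535.104.05' into (535, 104, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


# Assumed minimum Linux driver per CUDA major version; verify before relying on it.
MIN_LINUX_DRIVER = {
    11: "450.80.02",
    12: "525.60.13",
}


def driver_is_sufficient(installed: str, cuda_major: int) -> bool:
    """Check whether the installed driver meets the assumed minimum for a CUDA major version."""
    required = MIN_LINUX_DRIVER.get(cuda_major)
    if required is None:
        raise ValueError(f"no minimum driver recorded for CUDA {cuda_major}")
    return parse_driver_version(installed) >= parse_driver_version(required)


print(driver_is_sufficient("535.104.05", 12))  # True
print(driver_is_sufficient("470.57.02", 12))   # False
```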
CUDA Toolkit
The CUDA Toolkit is NVIDIA’s official collection of tools, libraries, and compilers that developers use to write, build, and optimize GPU-accelerated applications. It includes the nvcc compiler for CUDA C/C++, libraries such as cuBLAS and cuFFT, and debugging and profiling tools. (cuDNN is a separate download and is not part of the toolkit.) Essentially, the toolkit provides everything you need to develop software that runs efficiently on NVIDIA GPUs.
1️⃣ Compilers & Runtimes
▪ NVIDIA CUDA Compiler (NVCC)
▪ CUDA Runtime Library (cudart)
▪ CUDA Driver API headers and stub library (the user-mode driver library itself ships with the NVIDIA driver)
▪ PTX (Parallel Thread Execution) Assembler
2️⃣ Core Libraries (Highly Optimized for GPU)
▪ cuBLAS
▪ cuFFT
▪ cuRAND
▪ cuSPARSE
▪ cuSOLVER
3️⃣ Developer Tools
▪ NVIDIA Nsight Systems
▪ NVIDIA Nsight Compute
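▪ CUDA-GDB
▪ CUDA-MEMCHECK (deprecated; superseded by Compute Sanitizer)
▪ NVIDIA Visual Profiler (deprecated; superseded by Nsight Systems and Nsight Compute)
▪ cuda-gdbserver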
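A quick way to see which of these tools are available on a machine is to look for their executables on `PATH`. The sketch below uses only the Python standard library; the executable names in `TOOLKIT_BINARIES` are the usual command names for the tools listed above, and `locate_tools` is a hypothetical helper. Some tools may be installed outside `PATH` or packaged separately, so a missing entry does not prove the toolkit is absent.

```python
import shutil

# Usual executable names for the compilers and developer tools listed above.
TOOLKIT_BINARIES = ["nvcc", "ptxas", "nsys", "ncu", "cuda-gdb", "cuda-gdbserver"]


def locate_tools(names=TOOLKIT_BINARIES):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_tools().items():
    print(f"{name:16} {path or 'not found on PATH'}")
```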
Backward & Forward Compatibility
🟢 GPU Architecture & Compute Capability
✅ Forward Compatibility (YES):
Code compiled for an older compute capability (e.g., 7.5) can generally run on GPUs with a higher compute capability (> 7.5). However, it won’t take advantage of the newer features of the higher CC.
This works across major versions only when the binary embeds PTX, which the driver can JIT-compile for the newer GPU. Native machine code (a cubin) alone runs only on GPUs of the same major version with an equal or higher minor version.
❌ Backward Compatibility (NO):
Code compiled for a newer/higher Compute Capability (e.g., CC 8.6 Ampere) will NOT run on a GPU with an older/lower Compute Capability (e.g., CC 7.5 Turing).
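The two rules above can be written down as a small decision function. This is a sketch under stated assumptions: a binary is described by the compute capabilities it carries native code (cubin) for and the ones it carries PTX for. Native code is usable on a GPU of the same major version with an equal or higher minor version, and PTX can be JIT-compiled for any GPU with an equal or higher compute capability. Architecture-specific targets such as `sm_90a`, which are not forward compatible, are ignored here. `can_run` is a hypothetical name.

```python
def can_run(gpu_cc, cubin_ccs=(), ptx_ccs=()):
    """Return 'native', 'jit', or None for a GPU given the targets in a binary.

    gpu_cc and every entry of cubin_ccs / ptx_ccs are (major, minor) tuples.
    """
    gpu_major, gpu_minor = gpu_cc
    # Native code: same major version, built for an equal or lower minor version.
    for major, minor in cubin_ccs:
        if major == gpu_major and minor <= gpu_minor:
            return "native"
    # PTX: the driver can JIT-compile it for any GPU at or above the PTX target.
    if any(tuple(ptx) <= tuple(gpu_cc) for ptx in ptx_ccs):
        return "jit"
    return None


# Built for Turing (7.5) with PTX embedded, run on an Ampere 8.6 GPU.
print(can_run((8, 6), cubin_ccs=[(7, 5)], ptx_ccs=[(7, 5)]))  # jit
# Built for Ampere 8.6 only, run on a Turing 7.5 GPU.
print(can_run((7, 5), cubin_ccs=[(8, 6)], ptx_ccs=[(8, 6)]))  # None
```

In practice this is why builds often include native code for each target GPU plus PTX for the newest target, so that future GPUs can still JIT-compile the kernels.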
🟢 CUDA Driver & CUDA Toolkit
Driver’s Perspective (Lack of Forward Compatibility):
- An older CUDA Driver generally cannot run CUDA code compiled with a newer CUDA Toolkit.
- The older driver doesn't "know" about new APIs, features, or instruction sets introduced in the newer Toolkit.
Toolkit’s Perspective (Forward Compatibility of Compiled Code):
- Code compiled with an older CUDA Toolkit can generally run on systems with a newer CUDA Driver.
- Newer drivers are designed to be compatible with code generated by previous Toolkit versions. This ensures that existing applications continue to work when users update their drivers.
Driver’s Perspective (Backward Compatibility):
- A newer CUDA Driver can run CUDA code compiled with an older CUDA Toolkit.
- The latest drivers are built to understand and execute the instruction sets and API calls from previous Toolkit versions.
Toolkit’s Perspective (Lack of Backward Compatibility of Compiled Code):
- Code compiled with a newer CUDA Toolkit will generally not run on systems with an older CUDA Driver.
- The newer Toolkit might generate code that requires features or driver APIs that are only present in newer drivers.
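The driver/toolkit rule reduces to one comparison once both versions are in the same form. The CUDA runtime API reports versions as a single integer, `1000 * major + 10 * minor` (for example, 12020 for CUDA 12.2), from `cudaDriverGetVersion` (the newest CUDA version the driver supports) and `cudaRuntimeGetVersion`. The sketch below assumes those two integers have already been obtained and applies the conservative rule described above; it deliberately ignores NVIDIA's minor-version compatibility and forward-compatibility packages, which relax the rule in some cases. The function names are hypothetical.

```python
def decode_cuda_version(value: int) -> tuple[int, int]:
    """Decode 1000 * major + 10 * minor, e.g. 12020 -> (12, 2)."""
    return value // 1000, (value % 1000) // 10


def driver_can_run(driver_version: int, runtime_version: int) -> bool:
    """Conservative check: the driver must support at least the runtime's CUDA version."""
    return decode_cuda_version(driver_version) >= decode_cuda_version(runtime_version)


print(decode_cuda_version(12020))    # (12, 2)
print(driver_can_run(12020, 11080))  # True: newer driver, older toolkit
print(driver_can_run(11080, 12020))  # False: older driver, newer toolkit
```

When the check fails on a real system, the runtime typically reports an insufficient-driver error, and the fix is to upgrade the driver or build against an older toolkit.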