CUDA Compute Capability

Compute capability is a version number that NVIDIA assigns to each of its GPU architectures. It identifies the set of hardware features and instructions that a particular GPU supports. GPUs with a higher compute capability generally belong to newer architectures and can execute newer CUDA instructions and functionality; the number itself describes the feature set and is not a direct measure of performance.

A numeric version (e.g., 7.5, 8.0, 8.6, 9.0) representing a GPU’s architecture features and supported instructions.
Determines which CUDA features your GPU supports.

Architecture	Compute Capability
Tesla	1.0 – 1.3
Fermi	2.0 – 2.1
Kepler	3.0 – 3.7
Maxwell	5.0 – 5.3
Pascal	6.0 – 6.2
Volta	7.0
Turing	7.5
Ampere	8.0 – 8.6
Ada Lovelace	8.9
Hopper	9.0
Blackwell	10.0, 12.0
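As a rough illustration of how the table can be used programmatically, the sketch below maps a compute capability string to an architecture name. The ranges are copied from the table above; the function names `parse_cc` and `architecture_for` are hypothetical helpers, not part of any NVIDIA API.

```python
# Ranges copied from the table above: (lowest CC, highest CC, architecture).
ARCHITECTURES = [
    ((1, 0), (1, 3), "Tesla"),
    ((2, 0), (2, 1), "Fermi"),
    ((3, 0), (3, 7), "Kepler"),
    ((5, 0), (5, 3), "Maxwell"),
    ((6, 0), (6, 2), "Pascal"),
    ((7, 0), (7, 0), "Volta"),
    ((7, 5), (7, 5), "Turing"),
    ((8, 0), (8, 6), "Ampere"),
    ((8, 9), (8, 9), "Ada Lovelace"),
    ((9, 0), (9, 0), "Hopper"),
    ((10, 0), (10, 0), "Blackwell"),
    ((12, 0), (12, 0), "Blackwell"),
]


def parse_cc(text):
    """Turn a string such as '8.6' into the tuple (8, 6)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def architecture_for(cc_text):
    """Return the architecture name for a compute capability, or None if unlisted."""
    cc = parse_cc(cc_text)
    for low, high, name in ARCHITECTURES:
        # Tuples compare element by element, so (10, 0) sorts after (9, 0),
        # which a plain string comparison of "10.0" and "9.0" would get wrong.
        if low <= cc <= high:
            return name
    return None
```

Comparing `(major, minor)` tuples rather than strings or floats avoids ordering mistakes once the major version reaches two digits.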

CUDA Driver

A kernel-level NVIDIA driver installed on the host OS.
Enables the OS to communicate with the GPU hardware.
Must be installed on the EC2 instance or other host where the GPU workloads run.
The driver version must be at least the minimum version required by the CUDA runtime you plan to use.
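The minimum-version rule above can be sketched as a simple comparison. The version numbers in `MIN_DRIVER` are illustrative assumptions for Linux drivers and should be confirmed against NVIDIA's release notes for the toolkit you actually use; `driver_is_sufficient` is a hypothetical helper name.

```python
# Assumed minimum Linux driver releases per CUDA major version.
# Illustrative values only: verify them against NVIDIA's release notes.
MIN_DRIVER = {
    11: "450.80.02",
    12: "525.60.13",
}


def parse_version(text):
    """Turn '525.60.13' into (525, 60, 13) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def driver_is_sufficient(installed_driver, cuda_major):
    """Return True if the installed driver meets the assumed minimum."""
    required = MIN_DRIVER.get(cuda_major)
    if required is None:
        raise ValueError(f"no minimum driver recorded for CUDA {cuda_major}")
    return parse_version(installed_driver) >= parse_version(required)
```

For example, under these assumed minimums a host with driver `470.82.01` would satisfy a CUDA 11 runtime but not a CUDA 12 runtime.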

CUDA Toolkit

The CUDA Toolkit is NVIDIA’s official collection of tools, libraries, and compilers that developers use to write, build, and optimize GPU-accelerated applications. It includes the nvcc compiler for CUDA C/C++, libraries such as cuBLAS and cuFFT, and debugging and profiling tools. (cuDNN is a separate download and is not part of the toolkit itself.) Essentially, the toolkit provides what you need to develop software that runs efficiently on NVIDIA GPUs.

1️⃣ Compilers & Runtimes

▪  NVIDIA CUDA Compiler (NVCC)
▪  CUDA Runtime Library (cudart)
▪  CUDA Driver (User-mode component)
▪  PTX (Parallel Thread Execution) Assembler

2️⃣ Core Libraries (Highly Optimized for GPU)

▪  cuBLAS
▪  cuFFT
▪  cuRAND
▪  cuSPARSE
▪  cuSOLVER

3️⃣ Developer Tools

▪  NVIDIA Nsight Systems
▪  NVIDIA Nsight Compute
▪  CUDA-GDB
▪  CUDA-MEMCHECK (superseded by Compute Sanitizer in recent toolkits)
▪  NVIDIA Visual Profiler (legacy; superseded by the Nsight tools)
▪  cuda-gdbserver

Backward & Forward Compatibility

🟢 GPU Architecture & Compute Capability

✅ Forward Compatibility (YES):

Code compiled for an older compute capability (e.g., 7.5) can generally run on GPUs with a higher compute capability (> 7.5), provided the binary embeds PTX that the driver can JIT-compile for the newer GPU. However, it won’t take advantage of the features introduced by the higher compute capability.

❌ Backward Compatibility (NO):

Code compiled for a newer/higher Compute Capability (e.g., CC 8.6 Ampere) will NOT run on a GPU with an older/lower Compute Capability (e.g., CC 7.5 Turing).
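The two rules above can be modeled in a few lines. This is a minimal sketch that assumes NVIDIA's documented behavior: a compiled binary (cubin) runs only on GPUs with the same major compute capability and an equal or higher minor version, while embedded PTX can be JIT-compiled for any GPU with an equal or higher compute capability. The function `can_run` and its parameters are hypothetical, and architecture-specific targets are ignored.

```python
def parse_cc(text):
    """Turn a string such as '8.6' into the tuple (8, 6)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def can_run(device_cc, cubin_targets=(), ptx_targets=()):
    """Return how the code could run on the device: 'cubin', 'ptx-jit', or None."""
    device = parse_cc(device_cc)

    # Native binaries: same major version, device minor >= build minor.
    for target in cubin_targets:
        built = parse_cc(target)
        if built[0] == device[0] and built[1] <= device[1]:
            return "cubin"

    # PTX: the driver can JIT-compile it for any equal or newer device.
    for target in ptx_targets:
        if parse_cc(target) <= device:
            return "ptx-jit"

    # Nothing in the binary is usable on this (older or mismatched) device.
    return None
```

Under this model, code built only for 8.6 returns `None` on a 7.5 device, while PTX built for 7.5 returns `"ptx-jit"` on an 8.6 device.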

🟢 CUDA Driver & CUDA Toolkit

Driver’s Perspective (Lack of Forward Compatibility):

- An older CUDA Driver generally cannot run CUDA code compiled with a newer CUDA Toolkit.

- The older driver doesn't "know" about new APIs, features, or instruction sets introduced in the newer Toolkit.

Toolkit’s Perspective (Forward Compatibility of Compiled Code):

- Code compiled with an older CUDA Toolkit can generally run on systems with a newer CUDA Driver.

- Newer drivers are designed to be compatible with code generated by previous Toolkit versions. This ensures that existing applications continue to work when users update their drivers.

Driver’s Perspective (Backward Compatibility):

- A newer CUDA Driver can run CUDA code compiled with an older CUDA Toolkit.

- The latest drivers are built to understand and execute the instruction sets and API calls from previous Toolkit versions.

Toolkit’s Perspective (Lack of Backward Compatibility of Compiled Code):

- Code compiled with a newer CUDA Toolkit will generally not run on systems with an older CUDA Driver.

- The newer Toolkit might generate code that requires features or driver APIs present only in the corresponding or later drivers.
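The four perspectives above reduce to one comparison: the CUDA version supported by the driver must not be older than the CUDA version the application was built against. The sketch below assumes the integer encoding used by the CUDA runtime API (`1000 * major + 10 * minor`, so 12.4 is reported as `12040`) and deliberately ignores minor-version compatibility and the compatibility package. Both function names are hypothetical.

```python
def decode_cuda_version(value):
    """Decode an integer such as 12040 into (12, 4)."""
    major = value // 1000
    minor = (value % 1000) // 10
    return major, minor


def driver_supports_runtime(driver_version, runtime_version):
    """Return True if the driver's CUDA version is at least the runtime's.

    Simplified model: a newer driver runs code from an older toolkit,
    and an older driver does not run code from a newer toolkit.
    """
    return decode_cuda_version(driver_version) >= decode_cuda_version(runtime_version)
```

With this model, a driver reporting `12040` accepts an application built against `11080`, while the reverse combination is rejected.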

Important note: Starting with CUDA Toolkit 10.1 and NVIDIA driver R418, forward compatibility was introduced through the CUDA Compatibility Package. This allows systems with older drivers to run applications built with newer CUDA Toolkits, so development teams can adopt updated toolkits (with new libraries, bug fixes, and optimizations) without first upgrading the installed driver. Refer to NVIDIA's CUDA compatibility documentation for details.
