8.7 Instruction Sets

NVIDIA has developed three major architectures: Tesla (SM 1.x), Fermi (SM 2.x), and Kepler (SM 3.x). Within those families, new instructions have been added as NVIDIA updated its products. For example, global atomic operations were not present in the very first Tesla-class processor (the G80, which shipped in 2006 as the GeForce 8800 GTX), but all subsequent Tesla-class GPUs included them. So when querying the SM version via cuDeviceComputeCapability(), the major and minor versions will be 1.0 for G80 and 1.1 (or greater) for all other Tesla-class GPUs. Conversely, if the SM version is 1.1 or greater, the application can use global atomics.
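The version check described above can be sketched in plain C. This is a minimal illustration, not code from the CUDA toolkit: the helper names sm_at_least, has_global_atomics, and sm_family are hypothetical, and the major and minor values are assumed to have been obtained already from cuDeviceComputeCapability().

```c
#include <stdbool.h>

/* Hypothetical helper (not part of the CUDA API): returns true if the
   SM version major.minor is at least reqMajor.reqMinor. */
static bool sm_at_least(int major, int minor, int reqMajor, int reqMinor)
{
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

/* Global atomics are available on SM 1.1 and later; only SM 1.0 (G80)
   lacks them. */
static bool has_global_atomics(int major, int minor)
{
    return sm_at_least(major, minor, 1, 1);
}

/* Maps the major SM version to the architecture family it denotes. */
static const char *sm_family(int major)
{
    switch (major) {
    case 1:  return "Tesla";
    case 2:  return "Fermi";
    case 3:  return "Kepler";
    default: return "unknown";
    }
}
```

In a driver API application, major and minor would be filled in by a call such as cuDeviceComputeCapability(&major, &minor, device) before has_global_atomics(major, minor) is consulted. Comparing the major version first, rather than testing the minor version alone, keeps the check correct for SM 2.0 and 3.0, whose minor version is 0.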

Table 8.16 gives the SASS instructions that may be printed by cuobjdump when disassembling microcode for Tesla-class (SM 1.x) hardware. The Fermi and Kepler instruction sets closely resemble each other, with the exception of the instructions that support surface load/store, so their instruction sets are given together in Table 8.17. In both tables, the middle column specifies the first SM version to support a given instruction.

Table 8.16 SM 1.x Instruction Set

Table 8.17 SM 2.x and SM 3.x Instruction Sets
