
5.5 Texture Memory

In CUDA, the concept of texture memory is realized in two parts: a CUDA array contains the physical memory allocation, and a texture reference or surface reference contains a "view" that can be used to read or write a CUDA array. The CUDA array is just an untyped "bag of bits" with a memory layout optimized for 1D, 2D, or 3D access. A texture reference contains information on how the CUDA array should be addressed and how its contents should be interpreted.
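The split between allocation and view can be sketched with the CUDA runtime API. This is a minimal illustration under stated assumptions, not the approach the text itself prescribes: it uses a texture object (cudaTextureObject_t), the newer runtime mechanism that plays the role of the "view", rather than the texture references described above. The kernel name copyFromTexture, the CHECK macro, and the 4x4 array size are hypothetical choices made for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: bail out of main() on any runtime error.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s failed: %s\n", #call,                  \
                    cudaGetErrorString(err_));                         \
            return 1;                                                  \
        }                                                              \
    } while (0)

// Reads every texel through the texture "view" and writes it to linear memory.
__global__ void copyFromTexture(cudaTextureObject_t tex, float *out,
                                int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // Unnormalized coordinates; adding 0.5f addresses the texel center.
        out[y * width + x] = tex2D<float>(tex, x + 0.5f, y + 0.5f);
    }
}

int main()
{
    const int width = 4, height = 4;
    float host[width * height], result[width * height];
    for (int i = 0; i < width * height; ++i) host[i] = (float)i;

    // Part 1: the CUDA array holds the physical allocation.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t array;
    CHECK(cudaMallocArray(&array, &desc, width, height));
    CHECK(cudaMemcpy2DToArray(array, 0, 0, host, width * sizeof(float),
                              width * sizeof(float), height,
                              cudaMemcpyHostToDevice));

    // Part 2: the view says how to address and interpret the array.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = array;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;   // clamp out-of-range x
    texDesc.addressMode[1] = cudaAddressModeClamp;   // clamp out-of-range y
    texDesc.filterMode = cudaFilterModePoint;        // no interpolation
    texDesc.readMode = cudaReadModeElementType;      // return raw floats
    texDesc.normalizedCoords = 0;                    // coordinates in texels

    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    float *dOut;
    CHECK(cudaMalloc(&dOut, sizeof(result)));
    dim3 block(16, 16), grid(1, 1);
    copyFromTexture<<<grid, block>>>(tex, dOut, width, height);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(result, dOut, sizeof(result), cudaMemcpyDeviceToHost));

    int mismatches = 0;
    for (int i = 0; i < width * height; ++i)
        if (result[i] != host[i]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(array));
    CHECK(cudaFree(dOut));
    return mismatches != 0;
}
```

Note that the addressing mode, filtering mode, and coordinate convention all live in the view (texDesc), not in the array; a second view with different settings could be created over the same allocation.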

When using a texture reference to read from a CUDA array, the hardware uses a separate, read-only cache to resolve the memory references. While the kernel is executing, the texture cache is not kept coherent with respect to the rest of the memory subsystem, so it is important not to use texture references to alias memory that will be operated on by the kernel. (The cache is invalidated between kernel launches.)

On SM 3.5 and later hardware, reads via the texture cache can be explicitly requested by the developer by qualifying a pointer with the const and __restrict__ keywords. The __restrict__ keyword does nothing more than make the just-described "no aliasing" guarantee: a promise that the memory in question won't be referenced by the kernel in any other way. When reading or writing a CUDA array with a surface reference, by contrast, the memory traffic goes through the same memory hierarchy as global loads and stores. Chapter 10 contains a detailed discussion of how to allocate and use textures in CUDA.
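As a small illustration of the no-aliasing promise, the sketch below marks a kernel's input pointer const and __restrict__ (the CUDA C++ spelling of the keywords in question). The kernel name scale, the CHECK macro, and the problem size are hypothetical. Whether the compiler actually routes the loads through the read-only cache is its decision; the qualifiers only make that choice legal.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: bail out of main() on any runtime error.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s failed: %s\n", #call,                  \
                    cudaGetErrorString(err_));                         \
            return 1;                                                  \
        }                                                              \
    } while (0)

// 'in' is const and __restrict__: the kernel only reads it, and promises
// that no other pointer in the kernel refers to the same memory.
__global__ void scale(float *__restrict__ out,
                      const float *__restrict__ in, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a * in[i];
}

int main()
{
    const int n = 1024;
    static float hIn[n], hOut[n];
    for (int i = 0; i < n; ++i) hIn[i] = (float)i;

    // Separate allocations, so 'in' and 'out' genuinely do not alias.
    float *dIn, *dOut;
    CHECK(cudaMalloc(&dIn, n * sizeof(float)));
    CHECK(cudaMalloc(&dOut, n * sizeof(float)));
    CHECK(cudaMemcpy(dIn, hIn, n * sizeof(float), cudaMemcpyHostToDevice));

    scale<<<(n + 255) / 256, 256>>>(dOut, dIn, 2.0f, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(hOut, dOut, n * sizeof(float), cudaMemcpyDeviceToHost));

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (hOut[i] != 2.0f * hIn[i]) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    CHECK(cudaFree(dIn));
    CHECK(cudaFree(dOut));
    return mismatches != 0;
}
```

Passing the same buffer as both in and out would break the promise made by __restrict__, and the results would then be undefined. Where an explicit request is preferred over a compiler decision, the __ldg() intrinsic (also SM 3.5 and later) loads a single value through the read-only path.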