10.7 1D Surface Read/Write
Until SM 2.0 hardware became available, CUDA kernels could access the contents of CUDA arrays only via texturing. Other access to CUDA arrays, including all write access, could be performed only via memcpy functions such as cudaMemcpyToArray(). The only way for CUDA kernels to both texture from and write to a given region of memory was to bind the texture reference to linear device memory.
But with the surface read/write functions newly available in SM 2.x, developers can bind CUDA arrays to surface references and use the surf1Dread() and surf1Dwrite() intrinsics to read and write the CUDA arrays from a kernel. Unlike texture reads, which have dedicated cache hardware, these reads and writes go through the same L2 cache as global loads and stores.
NOTE
In order for a surface reference to be bound to a CUDA array, the CUDA array must have been created with the cudaArraySurfaceLoadStore flag.
The 1D surface read/write intrinsics are declared as follows.
template<class Type> Type surf1Dread(surface<void, 1> surfRef, int x, boundaryMode = cudaBoundaryModeTrap);
template<class Type> void surf1Dwrite(Type data, surface<void, 1> surfRef, int x, boundaryMode = cudaBoundaryModeTrap);
These intrinsics are not type-strong (as you can see, surface references are declared as void), and the size of the memory transaction depends on sizeof(Type) for a given invocation of surf1Dread() or surf1Dwrite(). The x offset is in bytes and must be naturally aligned with respect to sizeof(Type). For 4-byte operands such as int or float, the offset must be evenly divisible by 4; for short, it must be divisible by 2; and so on.
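Because x is a byte offset, code that treats a surface as an array of Type would scale the element index by sizeof(Type) before calling the intrinsics. The following host-side C++ sketch illustrates only that arithmetic and the alignment rule. The helper names surfaceByteOffset and isNaturallyAligned are hypothetical and are not part of CUDA.

```cpp
#include <cassert>

// Hypothetical helper (not a CUDA function): byte offset of element `index`
// when a 1D surface is treated as an array of T.
template <typename T>
int surfaceByteOffset(int index) {
    assert(index >= 0);  // this sketch assumes non-negative element indices
    return index * static_cast<int>(sizeof(T));
}

// Hypothetical helper: true if byte offset x is naturally aligned for T,
// that is, evenly divisible by sizeof(T).
template <typename T>
bool isNaturallyAligned(int x) {
    return x % static_cast<int>(sizeof(T)) == 0;
}
```

Under these assumptions, element 3 of a float surface lives at byte offset 3 * sizeof(float), while a byte offset of 3 would be misaligned for a short.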
Support for surface read/write is far less rich than texturing functionality. Only unformatted reads and writes are supported, with no conversion or interpolation functions, and the border handling is restricted to only two modes.
Boundary conditions are handled differently for surface read/write than for texture reads. For textures, this behavior is controlled by the addressing mode in the texture reference. For surface read/write, the method of handling out-of-range offset values is specified as a parameter of surf1Dread() or surf1Dwrite(). Out-of-range indices either cause a hardware exception (cudaBoundaryModeTrap) or, with cudaBoundaryModeZero, read as 0 for surf1Dread() and are ignored for surf1Dwrite().
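The two behaviors described above can be modeled on the host to make the semantics concrete. This is only a sketch in plain C++, not device code: ModelBoundaryMode, modelSurfRead, and modelSurfWrite are hypothetical names, a byte vector stands in for the CUDA array, and a C++ exception stands in for the hardware exception.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Host-side stand-ins for the two modes in the text (NOT the CUDA enumerants).
enum class ModelBoundaryMode { Trap, Zero };

// True if the sizeof(T)-byte access at byte offset x lies inside the buffer.
template <typename T>
bool inRange(const std::vector<unsigned char>& bytes, int x) {
    return x >= 0 && static_cast<std::size_t>(x) + sizeof(T) <= bytes.size();
}

// Model of a surface read: Trap throws, Zero returns a zero-initialized T.
template <typename T>
T modelSurfRead(const std::vector<unsigned char>& bytes, int x,
                ModelBoundaryMode mode = ModelBoundaryMode::Trap) {
    assert(x % static_cast<int>(sizeof(T)) == 0);  // natural alignment
    if (!inRange<T>(bytes, x)) {
        if (mode == ModelBoundaryMode::Trap)
            throw std::out_of_range("out-of-range surface read");
        return T{};
    }
    T value;
    std::memcpy(&value, bytes.data() + x, sizeof(T));
    return value;
}

// Model of a surface write: Trap throws, Zero silently drops the write.
template <typename T>
void modelSurfWrite(T data, std::vector<unsigned char>& bytes, int x,
                    ModelBoundaryMode mode = ModelBoundaryMode::Trap) {
    assert(x % static_cast<int>(sizeof(T)) == 0);  // natural alignment
    if (!inRange<T>(bytes, x)) {
        if (mode == ModelBoundaryMode::Trap)
            throw std::out_of_range("out-of-range surface write");
        return;  // Zero mode: the write is ignored
    }
    std::memcpy(bytes.data() + x, &data, sizeof(T));
}
```

In this model, the mode is chosen per call rather than stored in the reference, which mirrors how the text describes surface read/write differing from texture addressing modes.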
Because of the untyped character of surface references, it is easy to write a templated 1D memset routine that works for all types.
```c
surface<void, 1> surf1D;