1.1 Our Approach

CUDA is a difficult topic to write about. Parallel programming is complicated even without operating system considerations (Windows, Linux, MacOS), platform considerations (Tesla and Fermi, integrated and discrete GPUs, multiple GPUs), CPU/GPU concurrency considerations, and CUDA-specific considerations, such as choosing between the CUDA runtime and the driver API. Add the complexities of how best to structure CUDA kernels, and the subject may seem overwhelming.

To present this complexity in a manageable way, most topics are explained more than once from different perspectives. "What does the texture mapping hardware do?" is a different question than "How do I write a kernel that does texture mapping?" This book addresses both questions in separate sections. Asynchronous