14.9 References and Further Reading

N-Body and algorithms with similarly high computational density are a source of many high-profile speedups, since they can approach the theoretical limits of the GPU's computing capabilities. The following are just a sample of the numerous papers on compute-intensive methods such as N-Body.

Gravitational Simulation

Burtscher, Martin, and Keshav Pingali. An efficient CUDA implementation of the tree-based Barnes-Hut n-body algorithm. In GPU Computing Gems Emerald Edition, Wen-Mei Hwu, ed., Morgan-Kaufmann, Burlington, MA, 2011, pp. 75-92.

http://cs.txstate.edu/~burtscher/papers/gcg11.pdf

Harris, Mark, Lars Nyland, and Jan Prins. Fast n-body simulation with CUDA. In GPU Gems 3, Addison-Wesley, Boston, MA, 2007, pp. 677-695.
http://developer.nvidia.com/GPUGems3/gpugems3_ch31.html

Molecular Modeling

Götz, Andreas, Mark J. Williamson, Dong Xu, Duncan Poole, Scott Le Grand, and Ross C. Walker. Routine microsecond molecular dynamics simulations with AMBER on GPUs—Part I: Generalized Born, J. Chem. Theory Comput. 8 (5), 2012, pp. 1542-1555.
Hwu, Wen-Mei, and David Kirk. Programming Massively Parallel Processors. Morgan-Kaufmann, 2010, pp. 173-188.
Hardy, David J., John E. Stone, Kirby L. Vandivort, David Gohara, Christopher Rodrigues, and Klaus Schulten. Fast molecular electrostatics algorithms on GPUs. In GPU Computing Gems, Elsevier, Burlington, MA, 2011, pp. 43-58.
Stone, John E., James C. Phillips, Peter L. Freddolino, David J. Hardy, Leonardo G. Trabuco, and Klaus Schulten. Accelerating molecular modeling applications with graphics processors. Journal of Computational Chemistry 28 (2007), pp. 2618-2640.
http://cacs.usc.edu/education/cs653/Stone-MDGPU-JCC07.pdf
Stone, John E., David J. Hardy, Barry Isralewitz, and Klaus Schulten. GPU algorithms for molecular modeling. In Scientific Computing with Multicore and Accelerators, Jack Dongarra, David A. Bader, and Jakub Kurzak, eds. Chapman & Hall/CRC Press, London, UK, 2010, pp. 351-371.

Boids

da Silva, A.R., W.S. Lages, and L. Chaimowicz. Boids that see: Using self-occlusion for simulating large groups on GPUs. ACM Comput. Entertain. 7 (4), 2009.
http://doi.acm.org/10.1145/1658866.1658870

Image Processing: Normalized Correlation

Normalized cross-correlation is a popular template-matching algorithm in image processing and computer vision. The template typically is an image that depicts a sought-after feature; by repeatedly computing a statistic between the template image and corresponding pixels of a subset of an input image, a search algorithm can locate instances of the template that are present in the input image.
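The statistic in question is commonly written as a correlation coefficient between the template and the image window it overlaps. The form below is one standard way to write it, offered as a sketch rather than as this chapter's exact notation: the symbols I_i (window pixels), T_i (template pixels), their means, and N (the number of template pixels) are assumptions introduced here for illustration.

```latex
\gamma = \frac{\sum_{i=1}^{N} (I_i - \bar{I})(T_i - \bar{T})}
              {\sqrt{\sum_{i=1}^{N} (I_i - \bar{I})^2 \; \sum_{i=1}^{N} (T_i - \bar{T})^2}}
```

Under this definition the coefficient lies between -1 and 1, and a search would typically report the window positions where it is closest to 1.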

The popularity of normalized cross-correlation for this application stems from its amplitude independence, which, in the context of image processing, essentially means that the statistic is robust to lighting changes between the image and the template. Normalized correlation is popular enough, and compute-intensive enough, that it has prompted companies to build custom hardware. This chapter develops an optimized implementation of normalized cross-correlation for 8-bit grayscale images, but many of the concepts can be extended to other types of image processing or computer vision algorithms.