We present a CUDA implementation of a complete registration algorithm that is capable of aligning two multimodal images using affine linear transformations and normalized gradient fields. Through extensive use of different memory types, well-handled thread management, and efficient hardware interpolation, we obtained fast-executing code. Contrary to the common practice of reducing the number of kernel calls, we significantly increased performance by rearranging a single kernel into multiple smaller ones. Our GPU implementation achieved a speedup of up to a factor of 11 compared to parallelized CPU code. Matching two 512 × 512 pixel images is performed in 37 milliseconds, thus making state-of-the-art multimodal image registration available in real-time scenarios.
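The abstract names normalized gradient fields as the distance measure but does not state the formula. As a rough illustration only, the sketch below evaluates one common formulation of that measure, the mean over all pixels of 1 − ⟨∇R, ∇T⟩² / (‖∇R‖²_ε ‖∇T‖²_ε), in plain Python rather than CUDA. The function names, the edge parameter `eps`, and the finite-difference scheme are assumptions made for this example and are not taken from the paper's implementation.

```python
def gradient(img, x, y):
    """Finite-difference gradient at (x, y); indices are clamped at the borders."""
    h, w = len(img), len(img[0])
    xm, xp = max(x - 1, 0), min(x + 1, w - 1)
    ym, yp = max(y - 1, 0), min(y + 1, h - 1)
    # Divide by the actual index spacing (2 in the interior, 1 at a border).
    gx = (img[y][xp] - img[y][xm]) / max(xp - xm, 1)
    gy = (img[yp][x] - img[ym][x]) / max(yp - ym, 1)
    return gx, gy


def ngf_distance(ref, tmpl, eps=1e-2):
    """Mean normalized-gradient-field distance between two equally sized images.

    Returns a value near 0 where gradients are parallel or anti-parallel and
    a value of 1 where they are orthogonal or vanish. `eps` (an assumed
    parameter) damps the influence of very small, noise-level gradients.
    """
    h, w = len(ref), len(ref[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            rx, ry = gradient(ref, x, y)
            tx, ty = gradient(tmpl, x, y)
            dot = rx * tx + ry * ty
            norm_r2 = rx * rx + ry * ry + eps * eps
            norm_t2 = tx * tx + ty * ty + eps * eps
            total += 1.0 - (dot * dot) / (norm_r2 * norm_t2)
    return total / (h * w)


# A horizontal ramp, the same ramp with inverted and rescaled contrast,
# and a vertical ramp.
ramp_x = [[float(x) for x in range(8)] for _ in range(8)]
inverted = [[100.0 - 3.0 * x for x in range(8)] for _ in range(8)]
ramp_y = [[float(y) for _ in range(8)] for y in range(8)]

d_same_edges = ngf_distance(ramp_x, inverted)
d_orthogonal = ngf_distance(ramp_x, ramp_y)
```

Because the inner product is squared, an image and its contrast-inverted counterpart yield a distance close to zero, which is what makes a measure of this kind suitable for multimodal data. Orthogonal edge directions yield the maximum value of 1.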
|Number of pages||10|
|Publication status||Published - 01.09.2014|
|Event||MICCAI 2014 Workshop on Deep Brain Stimulation Methodological Challenges - Boston, United States|
Duration: 14.09.2014 → 18.09.2014