Abstract
Hardware accelerators such as FPGAs and GPUs are commonly added to a computing system as separate devices. Consequently, the accelerator and the host system do not share a common memory, and offloading data to the additional hardware introduces a communication penalty. Combining a program's source code with execution profiling, we perform an analysis that uses arithmetic intensity as a cost function to identify the parts of the program most worth offloading to the accelerator. The basic principles of this analysis are introduced and tested with a sample application. The results are discussed and evaluated based on the performance of an FPGA-based and a GPU-based implementation.
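To make the cost function concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common definition of arithmetic intensity as arithmetic operations per byte moved between host and accelerator. The `Region` type, the region names, and all numbers are hypothetical stand-ins for values that source analysis and profiling would provide.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A candidate code region (hypothetical example type)."""
    name: str
    operations: int         # arithmetic operations, e.g. from a profiling run
    bytes_transferred: int  # data copied to and from the accelerator


def arithmetic_intensity(region: Region) -> float:
    """Operations per transferred byte; higher means the transfer cost is better amortized."""
    if region.bytes_transferred == 0:
        return float("inf")  # no communication penalty at all
    return region.operations / region.bytes_transferred


def rank_candidates(regions: list[Region]) -> list[Region]:
    """Order regions from most to least promising for offloading."""
    return sorted(regions, key=arithmetic_intensity, reverse=True)


# Illustrative numbers only, not measurements from the paper.
regions = [
    Region("filter_kernel", operations=8_000_000, bytes_transferred=400_000),
    Region("copy_loop", operations=100_000, bytes_transferred=400_000),
    Region("matrix_multiply", operations=2_000_000_000, bytes_transferred=24_000_000),
]

ranked = rank_candidates(regions)
for region in ranked:
    print(f"{region.name}: {arithmetic_intensity(region):.2f} ops/byte")
```

Under these assumptions, a region that performs many operations on each transferred byte ranks first, while a region that mostly moves data ranks last, since the communication penalty would dominate any speedup.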
Original language | English |
---|---|
Title of host publication | ARCS 2011: Architecture of Computing Systems - ARCS 2011 |
Number of pages | 12 |
Volume | 6566 |
Publisher | Springer Verlag |
Publication date | 02.03.2011 |
Pages | 1-12 |
ISBN (Print) | 978-3-642-19136-7 |
ISBN (Electronic) | 978-3-642-19137-4 |
Publication status | Published - 02.03.2011 |
Event | 24th International Conference on Architecture of Computing Systems, Como, Italy; Duration: 24.02.2011 → 25.02.2011; Conference number: 83943 |