Hardware Efficient PDE Solvers in Quantized Image Processing
Performance and accuracy of scientific computations are competing goals. A close interplay between the design of computational schemes and their implementation can improve both by making better use of the available resources. This thesis describes the design of schemes that remain robust under strong quantization, and their hardware-efficient implementation on data-stream-based architectures for PDE-based image processing. Strong quantization improves execution time but renders traditional error estimates useless: the precision of the number formats is too low to control the quantitative error in iterative schemes. Instead, quantized schemes are constructed that preserve the qualitative behavior of the continuous models. In particular, for the quantized anisotropic diffusion model one can derive a quantized scale-space with properties almost identical to those of the continuous one. Thus the image evolution is accurately reconstructed despite the inability to control the error in the long run, which is difficult even for high-precision computations. All memory-intensive algorithms are nowadays burdened by the memory gap problem, which degrades performance enormously. The instruction-stream-based computing paradigm reinforces this problem, whereas architectures built on data-stream-based computing offer more possibilities for bridging the gap between memory and logic performance. These devices also offer more parallelism. Three architectures of this type are covered: graphics hardware, reconfigurable logic, and reconfigurable computing devices. They make it possible to exploit the parallelism inherent in image processing applications and to use memory efficiently. Their advantages, drawbacks, and future development are discussed. The combination of robust quantized schemes and hardware-efficient implementations delivers an accurate reproduction of the continuous evolution and significant performance gains over standard software solutions.
The devices used are available on affordable AGP/PCI boards, offering a true alternative even to small multiprocessor systems.