An efficient implementation of kernel density estimation for multi-core and many-core architectures
The International Journal of High Performance Computing Applications 29(3): 331-347 (2015)
Abstract
Kernel density estimation (KDE) is a statistical technique used to estimate the probability density function of a sample set whose density function is unknown. It is considered a fundamental data-smoothing problem for use with large datasets, and is widely applied in areas such as climatology and biometry. Because these problems usually process large volumes of data, KDE is computationally challenging. Current HPC platforms with built-in accelerators offer enormous computing power, but they must be programmed efficiently to take advantage of that power. We have developed a novel strategy to compute KDE using bounded kernels that tries to minimize memory accesses, and we have implemented it as a parallel program targeting multi-core and many-core processors. The efficiency of our code has been tested with different datasets, achieving substantial speedups over alternative, state-of-the-art KDE implementations.
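
As an illustration of why bounded kernels reduce work, the following is a minimal, sequential sketch of one-dimensional KDE. It is not the implementation described in this paper: the Epanechnikov kernel, the sorted-array lookup, and the names `epanechnikov` and `kde_bounded` are all assumptions chosen for the example. Because a bounded kernel is zero outside a finite support, each evaluation point only needs the samples that lie within one bandwidth `h` of it, so most of the dataset is never read.

```python
from bisect import bisect_left, bisect_right


def epanechnikov(u):
    """Bounded kernel (illustrative choice): zero for |u| >= 1, integrates to 1."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def kde_bounded(samples, queries, h):
    """Evaluate a 1-D kernel density estimate at each query point.

    samples: iterable of floats (the observed data)
    queries: iterable of floats (points where the density is estimated)
    h:       bandwidth, must be > 0
    """
    xs = sorted(samples)  # sorting lets us locate the kernel support by binary search
    n = len(xs)
    densities = []
    for q in queries:
        # Only samples in [q - h, q + h] can contribute; skip everything else.
        lo = bisect_left(xs, q - h)
        hi = bisect_right(xs, q + h)
        total = sum(epanechnikov((q - xs[i]) / h) for i in range(lo, hi))
        densities.append(total / (n * h))
    return densities
```

Under these assumptions, `kde_bounded([0.0, 1.0], [0.5], 2.0)` evaluates the estimate at a single point using both samples, while a query far from all samples touches none of them and returns zero. A parallel version would distribute the query points (or blocks of samples) across cores, which this sketch does not attempt.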