Tensor Decompositions for Neural Network Compression
Date: 2024-07-09
Author: Sainz de la Maza Gamboa, Unai
Abstract
As the demand for deploying machine learning models on resource-constrained devices grows, neural network compression has become an important area of research. Tensor decomposition is a promising technique for compressing neural networks, as it represents the network weights in a lower-dimensional format while largely preserving accuracy and performance. In this work, we explore the application of tensor decomposition techniques, including Canonical Polyadic decomposition, Tucker decomposition, and Tensor Train decomposition, to neural network compression. We provide a comprehensive overview of these tensor decomposition methods and compare their performance in terms of compression rate and accuracy. We implement and evaluate the different compression methods on the benchmark dataset CIFAR-10, using popular models such as ResNet and VGG. Our results show that tensor decomposition can significantly reduce the number of parameters of a neural network with only a minimal loss in accuracy. Finally, we discuss the challenges and opportunities of using tensor decomposition for neural network compression and highlight some open research questions in this field.
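To make the compression mechanism concrete, the following is a minimal NumPy sketch of one of the methods the abstract names, Tucker decomposition, computed via truncated HOSVD. The function names (`unfold`, `mode_dot`, `tucker_hosvd`), the example layer shape, and the chosen ranks are illustrative assumptions, not the implementation evaluated in this work.

```python
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along the given mode (mode-n unfolding)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_dot(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    Tm = np.moveaxis(T, mode, 0)
    out = M @ Tm.reshape(Tm.shape[0], -1)
    return np.moveaxis(out.reshape((M.shape[0],) + Tm.shape[1:]), 0, mode)

def tucker_hosvd(T, ranks):
    """Truncated HOSVD: returns (core, factors) approximating T."""
    # One orthonormal factor per mode: leading left singular vectors
    # of each mode-n unfolding, truncated to the requested rank.
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    # Core = T contracted with each factor transposed.
    core = T
    for mode, U in enumerate(factors):
        core = mode_dot(core, U.T, mode)
    return core, factors

# Hypothetical conv-layer weight: 64 output channels, 32 input
# channels, 3x3 kernel; ranks (16, 8, 3, 3) chosen for illustration.
W = np.random.default_rng(0).standard_normal((64, 32, 3, 3))
ranks = (16, 8, 3, 3)
core, factors = tucker_hosvd(W, ranks)

orig_params = W.size
comp_params = core.size + sum(U.size for U in factors)
print(f"original: {orig_params}, compressed: {comp_params}, "
      f"ratio: {orig_params / comp_params:.1f}x")
```

Reconstructing the approximate weight (`mode_dot` with each untransposed factor) recovers a tensor of the original shape, so the compressed layer can stand in for the original at inference time; the trade-off between the chosen ranks and accuracy is exactly what the experiments in this work measure.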