Understanding Convolutional Neural Networks With Information Theory: An Initial Exploration

IEEE Trans Neural Netw Learn Syst. 2021 Jan;32(1):435-442. doi: 10.1109/TNNLS.2020.2968509. Epub 2021 Jan 4.

Abstract

A novel functional estimator for Rényi's α-entropy and its multivariate extension was recently proposed in terms of the normalized eigenspectrum of a Hermitian matrix of the projected data in a reproducing kernel Hilbert space (RKHS). However, the utility and possible applications of these estimators remain largely unexplored and mostly unknown to practitioners. In this brief, we first show that this estimator enables straightforward measurement of information flow in realistic convolutional neural networks (CNNs) without any approximation. Then, we introduce the partial information decomposition (PID) framework and develop three quantities to analyze the synergy and redundancy in convolutional layer representations. Our results validate two fundamental data processing inequalities and reveal further internal properties of CNN training.
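
As an illustration of the kind of estimator described above, the following is a minimal sketch of a matrix-based Rényi α-entropy computed from the eigenspectrum of a trace-normalized Gram matrix, with a joint entropy built from the Hadamard product of two such matrices. The Gaussian kernel, the bandwidth `sigma`, the default `alpha=2.0`, and all function names are assumptions made for this example; the abstract does not specify them, and the sketch is not the authors' implementation.

```python
import numpy as np

def gram_matrix(x, sigma=1.0):
    """Trace-normalized Gaussian Gram matrix (kernel choice is an assumption)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                      # treat a 1-D input as n samples of dim 1
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)                # guard against tiny negative distances
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return k / np.trace(k)                  # eigenvalues now sum to 1

def renyi_entropy(a, alpha=2.0):
    """Entropy (in bits) from the eigenvalues of a unit-trace PSD matrix; alpha != 1."""
    eig = np.linalg.eigvalsh(a)             # symmetric/Hermitian eigen-solver
    eig = np.clip(eig, 0.0, None)           # remove numerical negatives
    eig = eig / eig.sum()
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)

def joint_entropy(a, b, alpha=2.0):
    """Joint entropy via the renormalized Hadamard (element-wise) product."""
    ab = a * b
    return renyi_entropy(ab / np.trace(ab), alpha)

def mutual_information(a, b, alpha=2.0):
    """Mutual information as S(A) + S(B) - S(A, B)."""
    return renyi_entropy(a, alpha) + renyi_entropy(b, alpha) - joint_entropy(a, b, alpha)
```

Under these assumptions, one could pass a batch of inputs and the flattened feature maps of a given layer through `gram_matrix` and compare `mutual_information` across layers to probe a data processing inequality empirically. The result depends on the kernel bandwidth, so any such comparison should hold `sigma` selection fixed across layers.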