In the rapidly evolving field of deep learning, Convolutional Neural Networks (CNNs) retain their unique strengths and applicability for processing grid-structured data such as images, despite the surge of Transformer architectures. This paper explores alternatives to the standard convolution with the objective of enhancing its feature extraction capability while maintaining a similar parameter count. We propose novel designs targeting both depthwise separable convolution and standard convolution, culminating in our Multi-scale Progressive Inference Convolution (MPIC). MPIC combines the benefits of large receptive fields, multi-scale processing, and progressive inference. Our alternatives are not only compatible with existing convolutional network variants such as MobileNet, ResNet, and ResNeSt, but also significantly enhance feature extraction while retaining computational efficiency. Comprehensive experiments on several widely used datasets and in-depth comparisons with standard convolution validate the efficacy of our proposals, showing consistent performance gains from the proposed convolutional alternatives. Detailed ablation studies further corroborate their effectiveness across various computer vision tasks, including object detection, class activation mapping, and salient object detection.
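For illustration only, the following Python (PyTorch) sketch shows one way a multi-scale, depthwise-separable block with enlarged receptive fields could be assembled. The class name MultiScaleDWBlock, the choice of kernel sizes, and the 1x1 fusion step are assumptions made for exposition; they do not reproduce the authors' actual MPIC design.

# Illustrative sketch only: NOT the authors' MPIC implementation.
# Branch structure, kernel sizes, and fusion are assumptions chosen to
# convey the "large receptive field + multi-scale" idea from the abstract.
import torch
import torch.nn as nn

class MultiScaleDWBlock(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One depthwise branch per scale; padding keeps the spatial size unchanged.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in kernel_sizes
        ])
        # Pointwise (1x1) convolution fuses the multi-scale responses,
        # mirroring the depthwise-separable factorization.
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.fuse(feats)

if __name__ == "__main__":
    x = torch.randn(1, 32, 56, 56)
    print(MultiScaleDWBlock(32)(x).shape)  # torch.Size([1, 32, 56, 56])

In this sketch, parallel depthwise kernels of increasing size approximate a multi-scale receptive field at roughly depthwise-separable cost; how MPIC realizes progressive inference is described in the body of the paper, not here.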
Keywords: Convolution; Multi-scale; Neural networks; Progressive inference.