Multiscale Spatial-Temporal Feature Fusion Neural Network for Motor Imagery Brain-Computer Interfaces

IEEE J Biomed Health Inform. 2024 Oct 1:PP. doi: 10.1109/JBHI.2024.3472097. Online ahead of print.

Abstract

Motor imagery, one of the main brain-computer interface (BCI) paradigms, has been used extensively in BCI applications such as the interaction between disabled people and external devices. Precise decoding, one of the most significant aspects of realizing efficient and stable interaction, has been researched intensively. However, current deep learning decoding methods are still dominated by single-scale serial convolution, which extracts too little of the rich information contained in motor imagery signals. To overcome this limitation, we propose a new end-to-end convolutional neural network based on multiscale spatial-temporal feature fusion (MSTFNet) for EEG classification of motor imagery. The architecture of MSTFNet consists of four distinct modules: a feature enhancement module, a multiscale temporal feature extraction module, a spatial feature extraction module, and a feature fusion module, the last of which is further divided into a depthwise separable convolution block and an efficient channel attention block. Moreover, we implement a simple yet effective data augmentation strategy that significantly improves the performance of MSTFNet. To validate the performance of MSTFNet, we conduct cross-session experiments and leave-one-subject-out experiments. The cross-session experiment is conducted on two public datasets and one laboratory dataset. On the public datasets BCI Competition IV 2a and BCI Competition IV 2b, MSTFNet achieves classification accuracies of 83.62% and 89.26%, respectively. On the laboratory dataset, MSTFNet achieves 86.68% classification accuracy. In addition, the leave-one-subject-out experiment is performed on the BCI Competition IV 2a dataset, where MSTFNet achieves 66.31% classification accuracy. These results surpass those of several state-of-the-art methods, indicating the robust capability of the proposed MSTFNet in decoding EEG signals associated with motor imagery.
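
The leave-one-subject-out protocol mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names `leave_one_subject_out` and `mean_accuracy` and the toy subject labels are hypothetical, and averaging the per-subject accuracies into one figure is an assumption about how a single reported number would be obtained, which the abstract does not state.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_subject_out(
    subject_ids: List[int],
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    subject_ids[i] is the subject that trial i was recorded from.
    Each fold tests on one subject and trains on all the others.
    """
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def mean_accuracy(per_subject_accuracy: Dict[int, float]) -> float:
    """Average the per-fold accuracies (assumed aggregation, see lead-in)."""
    return sum(per_subject_accuracy.values()) / len(per_subject_accuracy)


# Toy example: six trials from three hypothetical subjects.
toy_subject_ids = [1, 1, 2, 2, 3, 3]
folds = list(leave_one_subject_out(toy_subject_ids))
```

In this setting no trial from the test subject is seen during training, which makes it a stricter test of generalization than the cross-session setting, where training and test data come from the same subject.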