Reconstructing rapid natural vision with fMRI-conditional video generative adversarial network

Chong Wang; Hongmei Yan; Wei Huang; Jiyi Li; Yuting Wang; Yun-Shuang Fan; Wei Sheng; Tao Liu; Rong Li; Huafu Chen

doi:10.1093/cercor/bhab498

Cereb Cortex. 2022 Oct 8;32(20):4502-4511. doi: 10.1093/cercor/bhab498.

Authors

Chong Wang¹,², Hongmei Yan¹,², Wei Huang¹, Jiyi Li¹, Yuting Wang¹, Yun-Shuang Fan¹, Wei Sheng¹, Tao Liu¹, Rong Li¹,², Huafu Chen¹,²,³

Affiliations

¹ The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
² MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, China.
³ The Center of Psychosomatic Medicine, Sichuan Provincial Center for Mental Health, Sichuan Provincial People's Hospital, University of Electronic Science and Technology of China, Chengdu 611731, China.

PMID: 35078227
DOI: 10.1093/cercor/bhab498

Abstract

Recent functional magnetic resonance imaging (fMRI) studies have made significant progress in reconstructing perceived visual content, advancing our understanding of visual mechanisms. However, reconstructing dynamic natural vision remains a challenge because of the limited temporal resolution of fMRI. Here, we developed a novel fMRI-conditional video generative adversarial network (f-CVGAN) to reconstruct rapid video stimuli from evoked fMRI responses. The model uses a generator to produce spatiotemporal reconstructions and two separate discriminators (spatial and temporal) to assess them. We trained and tested the f-CVGAN on two publicly available video-fMRI datasets, and the model produced pixel-level reconstructions of 8 perceived video frames from each fMRI volume. Experimental results showed that the reconstructed videos were fMRI-related and captured important spatial and temporal information of the original stimuli. Moreover, we visualized the cortical importance map and found that the visual cortex is extensively involved in the reconstruction, with the low-level visual areas (V1/V2/V3/V4) showing the largest contribution. Our work suggests that slow blood oxygen level-dependent signals describe neural representations of the fast perceptual process that can be decoded in practice.
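The data flow described in the abstract (one fMRI volume in, 8 frames out, scored by a spatial and a temporal discriminator) can be illustrated with a toy sketch. This is not the authors' implementation: the single linear layers, the toy sizes `N_VOXELS` and `FRAME_PIXELS`, the use of frame-to-frame differences in the temporal discriminator, and the non-saturating generator loss are all assumptions chosen for illustration. Only the 8 frames per volume and the two-discriminator layout come from the abstract.

```python
import math
import random

N_VOXELS = 16      # hypothetical number of fMRI voxels per volume (toy scale)
N_FRAMES = 8       # frames reconstructed per fMRI volume (stated in the abstract)
FRAME_PIXELS = 12  # hypothetical flattened frame size (toy scale)

def make_weights(rows, cols, rng):
    """Random weight matrix standing in for a trained network (assumption)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def linear(weights, vec):
    """Matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def generator(w_gen, fmri):
    """Map one fMRI volume to N_FRAMES frames with pixel values in (-1, 1)."""
    flat = [math.tanh(v) for v in linear(w_gen, fmri)]
    return [flat[i * FRAME_PIXELS:(i + 1) * FRAME_PIXELS] for i in range(N_FRAMES)]

def spatial_discriminator(w_spa, frames, fmri):
    """Score each frame on its own, conditioned on the fMRI volume."""
    return [sigmoid(linear(w_spa, frame + fmri)[0]) for frame in frames]

def temporal_discriminator(w_tmp, frames, fmri):
    """Score the sequence from frame-to-frame differences (an assumed design)."""
    diffs = [b - a for prev, nxt in zip(frames, frames[1:]) for a, b in zip(prev, nxt)]
    return sigmoid(linear(w_tmp, diffs + fmri)[0])

rng = random.Random(0)
w_gen = make_weights(N_FRAMES * FRAME_PIXELS, N_VOXELS, rng)
w_spa = make_weights(1, FRAME_PIXELS + N_VOXELS, rng)
w_tmp = make_weights(1, (N_FRAMES - 1) * FRAME_PIXELS + N_VOXELS, rng)

fmri = [rng.gauss(0.0, 1.0) for _ in range(N_VOXELS)]  # stand-in fMRI response
frames = generator(w_gen, fmri)
spatial_scores = spatial_discriminator(w_spa, frames, fmri)
temporal_score = temporal_discriminator(w_tmp, frames, fmri)

# Assumed non-saturating generator loss: try to fool both discriminators.
gen_loss = -sum(math.log(s) for s in spatial_scores) / N_FRAMES - math.log(temporal_score)
```

In this sketch both discriminators receive the fMRI vector alongside the frames, which is what makes the setup conditional: a reconstruction is judged on whether it is plausible for that particular brain response, not only on whether it looks like a video.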

Keywords: conditional generative adversarial networks; fMRI; visual reconstruction.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Image Processing, Computer-Assisted / methods
Magnetic Resonance Imaging* / methods
Visual Cortex* / diagnostic imaging
Visual Cortex* / physiology