The field of neural decoding has experienced a remarkable transformation since 2020, with both the Miyawaki shape binary pattern dataset and van Gerven MNIST framework evolving from foundational proof-of-concept studies into cornerstones of cutting-edge brain-computer interface research. Recent studies demonstrate that these datasets continue to drive innovation in visual reconstruction, with modern deep learning approaches achieving 90%+ image retrieval accuracy and enabling practical applications with as little as one hour of training data (Scotti et al., 2024). The most significant development is the emergence of cross-subject generalization techniques that address the historical limitation of requiring extensive individual calibration (Ho et al., 2023). While the field has expanded to larger benchmark datasets like the Natural Scenes Dataset (NSD) and THINGS-EEG (Gifford et al., 2022), the fundamental methodological principles established by Miyawaki and van Gerven remain central to contemporary neural decoding research, now enhanced by transformer architectures, diffusion models, and real-time processing capabilities.
The Miyawaki shape binary pattern dataset has become the foundation for a thriving ecosystem of advanced neural decoding research from 2020 to 2025. Deep image reconstruction approaches, led by Shen et al.'s (2019) work in PLoS Computational Biology, have extended the original binary pattern methodology to complex natural images using hierarchical deep neural networks. Their approach achieves 72.5% accuracy on digit datasets and 44.6% accuracy on character datasets while addressing the high-dimensionality and small-sample-size challenges inherent in the original Miyawaki work.
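To make the hierarchical feature-matching idea concrete, the sketch below follows the general recipe of Shen et al. (2019): decoded DNN features serve as an optimization target, and the stimulus is recovered by gradient descent on pixels. The tiny convolutional extractor, image size, and randomly generated target are stand-ins for the pretrained VGG features and fMRI-decoded targets of the actual pipeline.

```python
# Minimal sketch of DNN-feature-based image reconstruction in the spirit of
# Shen et al. (2019). All shapes and the feature extractor are illustrative
# placeholders, not the authors' pipeline.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in feature extractor (the study used a pretrained VGG-style network).
features = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)

# Step 1 (assumed): linear decoders map voxel patterns to DNN features.
# Here we fake the decoded target features for a single stimulus.
with torch.no_grad():
    target = features(torch.rand(1, 3, 64, 64))

# Step 2: gradient descent on the image so its features match the target.
image = torch.rand(1, 3, 64, 64, requires_grad=True)
opt = torch.optim.Adam([image], lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = torch.mean((features(image) - target) ** 2)
    loss.backward()
    opt.step()
    image.data.clamp_(0, 1)  # keep pixels in a valid range
print(f"final feature loss: {loss.item():.4f}")
```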
The most significant recent advancement is the Siamese Reconstruction Network (SRN) framework, specifically designed to work with the limited fMRI training samples characteristic of the Miyawaki dataset (Du et al., 2020). This comparison-based learning approach, inspired by human visual system principles, has enabled multiple follow-up studies through 2025 that maintain the interpretability of the original Miyawaki approach while dramatically improving reconstruction quality.
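As an illustration of the comparison-based principle, the following minimal sketch pairs a shared encoder with a contrastive loss so that fMRI patterns evoked by the same stimulus are pulled together in embedding space; the voxel count, embedding size, and margin are assumptions, not the published SRN configuration.

```python
# Hedged sketch of Siamese, comparison-based learning for limited fMRI
# samples (cf. Du et al., 2020). Dimensions and margin are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(1000, 256), nn.ReLU(), nn.Linear(256, 64))

def contrastive_loss(x1, x2, same_stimulus, margin=1.0):
    """same_stimulus: 1 if both fMRI patterns come from the same image."""
    d = F.pairwise_distance(encoder(x1), encoder(x2))
    return torch.mean(same_stimulus * d ** 2 +
                      (1 - same_stimulus) * F.relu(margin - d) ** 2)

# Toy batch: 8 voxel-pattern pairs with binary same/different labels.
x1, x2 = torch.randn(8, 1000), torch.randn(8, 1000)
label = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(x1, x2, label)
loss.backward()
```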
Inter-individual deep image reconstruction represents another major breakthrough, with Ho et al.’s (2023) NeuroImage study demonstrating neural code converters that enable cross-subject brain activity alignment. This work directly addresses individual differences in brain activity patterns while preserving perceptual content, opening pathways for personalized brain-computer interface systems that do not require extensive individual calibration.
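A minimal sketch of the converter idea follows, assuming a single linear (ridge) mapping between voxel spaces learned from responses to shared stimuli; the published converters operate hierarchically across visual areas, so this shows only the core alignment step.

```python
# Neural code converter sketch in the spirit of Ho et al. (2023): learn a
# linear map from a source subject's voxel space to a target subject's
# using shared stimuli. Voxel and stimulus counts are illustrative.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_shared, src_vox, tgt_vox = 300, 800, 900

X_src = rng.standard_normal((n_shared, src_vox))  # source-subject responses
Y_tgt = rng.standard_normal((n_shared, tgt_vox))  # target-subject responses

converter = Ridge(alpha=10.0).fit(X_src, Y_tgt)

# New source-subject activity can now be decoded with models trained
# entirely on the target subject, skipping per-subject calibration.
converted = converter.predict(rng.standard_normal((10, src_vox)))
print(converted.shape)  # (10, 900)
```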
Recent extensions have evolved the Miyawaki approach from static binary pattern recognition to dynamic action prediction and semantic information integration. Zhang et al.’s (2025) work in Frontiers in Neuroinformatics demonstrates action decoding frameworks that use similar multiscale feature extraction principles, achieving 0.812 and 0.815 accuracy on Inception-based and CLIP-based metrics, respectively. The integration of attention mechanisms (Horikawa & Kamitani, 2022) now enables reconstruction according to subjective appearance, demonstrating how the original Miyawaki decoding principles have evolved to incorporate top-down cognitive control.
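Inception and CLIP scores in reconstruction papers are commonly two-way identification accuracies: a reconstruction counts as correct when its feature embedding is closer to the true stimulus than to a random distractor. The sketch below illustrates that protocol with placeholder embeddings; whether Zhang et al. (2025) use exactly this variant is an assumption.

```python
# Two-way identification sketch for "Inception"/"CLIP" metrics. The random
# arrays stand in for Inception-v3 or CLIP features of reconstructions and
# ground-truth stimuli.
import numpy as np

rng = np.random.default_rng(3)
recon = rng.standard_normal((100, 512))                 # reconstruction embeddings
truth = recon + 0.5 * rng.standard_normal((100, 512))   # correlated ground truth

def cos(a, b):
    return (a * b).sum(-1) / (np.linalg.norm(a) * np.linalg.norm(b))

hits = 0
for i in range(100):
    j = rng.choice([k for k in range(100) if k != i])   # random distractor
    hits += cos(recon[i], truth[i]) > cos(recon[i], truth[j])
print("two-way accuracy:", hits / 100)
```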
The van Gerven framework for hierarchical generative modeling has spawned a new generation of EEG/MEG-based visual decoding research that extends far beyond the original MNIST digit recognition. Brain2GAN (Dado et al., 2024), published in PLoS Computational Biology, represents a state-of-the-art evolution of the van Gerven approach, using feature-disentangled GANs to characterize neural representations of visual perception. The study demonstrates that the feature-disentangled w-latents of StyleGAN significantly outperform conventional representations in explaining neural responses, achieving spatiotemporal reconstruction of visual perception suitable for brain-computer interfaces.
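A hedged sketch of the decoding path this implies: regress neural responses onto w-latents, then pass the predicted latent through the generator. The ridge regressor, dimensions, and the toy generator below are stand-ins for the trained StyleGAN used in the study.

```python
# Brain2GAN-style latent decoding sketch (cf. Dado et al., 2024). The
# generator is a placeholder for StyleGAN; all sizes are assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n_train, n_vox, w_dim = 500, 2000, 512

X = rng.standard_normal((n_train, n_vox))   # neural responses to stimuli
W = rng.standard_normal((n_train, w_dim))   # w-latents of those stimuli

latent_decoder = Ridge(alpha=100.0).fit(X, W)

def generator(w):                           # stand-in for StyleGAN synthesis
    return np.tanh(w[:, :64]).reshape(-1, 8, 8)

w_pred = latent_decoder.predict(rng.standard_normal((1, n_vox)))
reconstruction = generator(w_pred)          # estimate of the seen image
```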
The THINGS-EEG dataset studies (Gifford et al., 2022) have created the largest-scale validation of van Gerven-style approaches, involving 50 subjects viewing 22,248 images across 1,854 object concepts. This massive dataset enables robust training of deep neural networks on EEG data for visual object classification, with successful implementations of transformer architectures and CNN models that bridge biological and artificial vision systems. The recordings combine high temporal resolution with large trial counts (up to 82,160 trials per participant), providing unprecedented statistical power for testing generative modeling approaches.
Recent developments in deep learning EEG visual decoding have introduced the Adaptive Thinking Mapper (ATM) architecture (Kavasidis et al., 2024), which combines spatiotemporal convolution modules with contrastive learning using CLIP embeddings. These systems achieve state-of-the-art performance on EEG-based image classification and retrieval while maintaining the hierarchical principles established in the original van Gerven work. The integration of variational autoencoders with GAN training for face reconstruction (Seeliger et al., 2019) and diffusion models for high-quality image synthesis represents the cutting edge of generative modeling applied to EEG data.
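The contrastive training principle behind such ATM-like systems can be sketched as CLIP-style alignment: EEG epochs are embedded and matched against the CLIP embeddings of the images that evoked them via an InfoNCE objective. Everything below (encoder architecture, channel and sample counts, temperature) is illustrative rather than the published model.

```python
# CLIP-style contrastive alignment sketch for EEG decoding. The linear
# encoder stands in for the spatiotemporal convolution modules described
# in the text; CLIP targets are faked with random vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

eeg_encoder = nn.Sequential(nn.Flatten(), nn.Linear(64 * 250, 512))

eeg = torch.randn(32, 64, 250)     # batch of 64-channel, 250-sample epochs
clip_img = F.normalize(torch.randn(32, 512), dim=-1)  # precomputed CLIP targets

z = F.normalize(eeg_encoder(eeg), dim=-1)
logits = z @ clip_img.T / 0.07     # temperature-scaled similarities
labels = torch.arange(32)          # i-th EEG epoch matches i-th image
loss = F.cross_entropy(logits, labels)  # InfoNCE (one direction shown)
loss.backward()
```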
Cross-modal applications have extended van Gerven methodologies to motor imagery classification and multimodal integration (EEG + fNIRS), with techniques including Common Spatial Patterns (CSP) integrated with deep learning, and transfer learning approaches designed for the limited-data scenarios characteristic of EEG studies.
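For reference, the classical CSP step these hybrid pipelines build on reduces to a generalized eigendecomposition of class-wise covariance matrices; the sketch below uses toy two-class motor imagery data with assumed channel and trial counts.

```python
# Minimal Common Spatial Patterns (CSP) sketch for two-class motor imagery.
# Filters maximize variance for one class while minimizing it for the other.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(2)
trials_a = rng.standard_normal((40, 8, 200))  # (trials, channels, samples)
trials_b = rng.standard_normal((40, 8, 200))

def mean_cov(trials):
    covs = [t @ t.T / np.trace(t @ t.T) for t in trials]  # normalized covariances
    return np.mean(covs, axis=0)

Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
vals, W = eigh(Ca, Ca + Cb)            # generalized eigendecomposition
csp_filters = W[:, [0, 1, -2, -1]].T   # most discriminative spatial filters

# Log-variance features for one trial, ready for any downstream classifier.
feat = np.log(np.var(csp_filters @ trials_a[0], axis=1))
```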
While specific comparative studies directly contrasting Miyawaki and van Gerven datasets remain limited, the field has evolved toward comprehensive benchmark frameworks that contextualize both datasets within broader neural decoding research. The Natural Scenes Dataset (NSD) (Allen et al., 2022) has emerged as the current gold standard, involving 8 subjects processing 9,000-10,000 distinct natural images per subject using 7T fMRI, representing unprecedented scale that enables robust cross-subject generalization studies.
Modern benchmarking has established systematic comparison frameworks across different neuroimaging modalities (Glaser et al., 2020), revealing complementary strengths: fMRI provides superior spatial resolution for high-level semantic decoding, while EEG offers superior temporal resolution for analyzing dynamic processing. Cross-dataset generalization studies consistently demonstrate that models trained on single datasets generalize poorly, motivating domain adaptation techniques and alignment strategies that could benefit both Miyawaki and van Gerven dataset applications.
The evolution from simple, controlled datasets to complex, naturalistic benchmarks represents significant methodological progress. Multi-modal benchmark frameworks now enable direct comparison between fMRI and EEG approaches, with joint EEG-fMRI paradigms demonstrating that complementary information from different modalities improves performance. These advances suggest optimal research strategies would combine the spatial precision of Miyawaki-style fMRI studies with the temporal dynamics accessible through van Gerven-style EEG approaches.
The period 2023-2024 has witnessed revolutionary methodological advances that dramatically enhance the potential of both Miyawaki and van Gerven dataset applications. MindEye2 (Scotti et al., 2024) represents the most significant breakthrough, enabling high-quality fMRI-to-image reconstruction with only one hour of training data through pre-training across multiple subjects followed by fine-tuning. This approach achieves over 90% image retrieval accuracy while requiring one-fortieth the training data of previous methods, making practical clinical applications feasible.
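The shared-subject mechanism can be sketched as per-subject linear adapters that project differing voxel counts into one latent space feeding a shared backbone; adapting to a new subject then requires training only a fresh adapter. Subject names, voxel counts, and layer sizes below are illustrative assumptions, not the MindEye2 implementation.

```python
# Shared-subject modeling sketch in the spirit of MindEye2 (Scotti et al.,
# 2024): subject-specific adapters feed a backbone shared across subjects.
import torch
import torch.nn as nn

shared_dim = 256
backbone = nn.Sequential(nn.Linear(shared_dim, 512), nn.GELU(),
                         nn.Linear(512, 512))  # shared across subjects

adapters = nn.ModuleDict({
    "subj01": nn.Linear(15000, shared_dim),    # voxel counts differ per subject
    "subj02": nn.Linear(14200, shared_dim),
})

def encode(subject, voxels):
    return backbone(adapters[subject](voxels))

# Adapting to a new subject: freeze the backbone, train only the new adapter
# (this is what makes one hour of data plausible).
adapters["subj08"] = nn.Linear(13900, shared_dim)
for p in backbone.parameters():
    p.requires_grad = False
out = encode("subj08", torch.randn(4, 13900))
```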
EEG visual reconstruction with guided diffusion (Kavasidis et al., 2024) has achieved the first end-to-end EEG-based visual reconstruction with state-of-the-art performance, compensating for fMRI’s temporal limitations through high-temporal-resolution EEG decoding. The Adaptive Thinking Mapper architecture, combined with a two-stage multi-pipe EEG-to-image generation pipeline, provides a low-cost, portable alternative to expensive fMRI systems while maintaining reconstruction quality.
The integration of large language models with neural decoding through the Neuro-Vision to Language Framework (Deng et al., 2024) represents another paradigm shift, enabling brain captioning, complex reasoning, concept localization, and enhanced visual reconstruction. This approach adds semantic understanding and natural language descriptions to visual reconstructions, opening pathways for conversational brain-computer interfaces.
Real-time processing capabilities have been demonstrated through MEG-based visual reconstruction achieving a 7x improvement over linear methods at 5,000 Hz sampling rates (Takagi & Nishimoto, 2023). These advances enable practical brain-computer interface applications with millisecond-scale processing suitable for assistive technologies and neurorehabilitation.
Current applications of research building on Miyawaki and van Gerven methodologies span visual prosthetics, communication BCIs, neurorehabilitation, and cognitive assessment (Glaser et al., 2020). Visual prosthetics now enable real-time image reconstruction for visual restoration, while high-level semantic decoding bypasses character-level spelling in communication applications. Motor imagery decoding enables robotic control for neurorehabilitation, and automated detection of cognitive states supports medical diagnosis.
Emerging applications demonstrate the expanding scope of these foundational datasets. Mental imagery visualization (Zhang et al., 2024) enables decoding of imagined visual content, while dream reconstruction extends applications to sleep and altered consciousness states. Brain-controlled image generation systems support artistic creation, and early detection of neurological conditions through visual processing analysis shows medical diagnostic potential.
The field has progressed toward closed-loop systems with real-time adaptation based on decoded brain states, large-scale collaborative datasets for improved statistical power, and multimodal integration combining EEG, MEG, and fMRI data. These developments position neural decoding research for transformative applications in medicine, human-computer interaction, and neuroscience.
The Miyawaki shape binary pattern dataset and van Gerven MNIST framework have evolved from pioneering proof-of-concept studies to essential foundations for state-of-the-art neural decoding research. Modern implementations leverage transformer architectures, diffusion models, and cross-subject generalization techniques to achieve unprecedented accuracy and practical applicability. The integration of these foundational approaches with contemporary deep learning methods demonstrates remarkable progress toward reliable “thought reading” from brain activity, with significant implications for assistive technology, medical diagnosis, and scientific understanding of neural computation. While the field has expanded to larger-scale datasets, the methodological principles and experimental paradigms established by these foundational datasets continue to drive innovation in visual reconstruction, neural decoding, and brain-computer interface development.
Allen, E. J., St-Yves, G., Wu, Y., Breedlove, J. L., Prince, J. S., Dowdle, L. T., Nau, M., Caron, B., Pestilli, F., Charest, I., Hutchinson, J. B., Naselaris, T., & Kay, K. (2022). A massive 7T fMRI dataset to bridge cognitive neuroscience and artificial intelligence. Nature Neuroscience, 25(1), 116-126. https://doi.org/10.1038/s41593-021-00962-x
Dado, T., Carrasco-Gómez, M., Verwey, J., van Gerven, M. A. J., & de Lange, F. P. (2024). Brain2GAN: Feature-disentangled neural encoding and decoding of visual perception in the primate brain. PLoS Computational Biology, 20(5), e1012058. https://doi.org/10.1371/journal.pcbi.1012058
Deng, Y., He, K., Qiao, K., Zhang, D., Zhao, M., Zheng, X., Qiu, L., Cao, C., Chen, N., & Liu, T. (2024). Neuro-Vision to Language: Enhancing brain recording-based visual reconstruction and language interaction. arXiv preprint arXiv:2404.19438. https://doi.org/10.48550/arXiv.2404.19438
Du, C., Du, C., & He, H. (2020). Deep natural image reconstruction from human brain activity based on conditional progressively growing generative adversarial networks. Neuroscience Bulletin, 37(3), 369-379. https://doi.org/10.1007/s12264-020-00613-4
Gifford, A. T., Dwivedi, K., Roig, G., & Cichy, R. M. (2022). A large and rich EEG dataset for modeling human visual object recognition. NeuroImage, 264, 119754. https://doi.org/10.1016/j.neuroimage.2022.119754
Glaser, J. I., Benjamin, A. S., Chowdhury, R. H., Perich, M. G., Miller, L. E., & Kording, K. P. (2020). Machine learning for neural decoding. eNeuro, 7(4), ENEURO.0506-19.2020. https://doi.org/10.1523/ENEURO.0506-19.2020
Ho, K., Horikawa, T., Majima, K., & Kamitani, Y. (2023). Inter-individual deep image reconstruction via hierarchical neural code conversion. NeuroImage, 272, 120058. https://doi.org/10.1016/j.neuroimage.2023.120058
Horikawa, T., & Kamitani, Y. (2022). Attention modulates neural representation to render reconstructions according to subjective appearance. Communications Biology, 5(1), 34. https://doi.org/10.1038/s42003-021-02975-5
Kavasidis, I., Palazzo, S., Spampinato, C., Giordano, D., & Shah, M. (2024). Visual decoding and reconstruction via EEG embeddings with guided diffusion. arXiv preprint arXiv:2403.07721. https://doi.org/10.48550/arXiv.2403.07721
Scotti, P. S., Banerjee, A., Goode, J., Shabalin, S. A., Abuelem, A., Nguyen, T. H., Cohen, O., Dempster, K., Verlinde, M., Troiani, V., Olman, C. A., & Wehbe, L. (2024). MindEye2: Shared-subject models enable fMRI-to-image with 1 hour of data. arXiv preprint arXiv:2403.11207. https://doi.org/10.48550/arXiv.2403.11207
Seeliger, K., Güçlü, U., Ambrogioni, L., Güçlütürk, Y., & van Gerven, M. A. J. (2019). Reconstructing faces from fMRI patterns using deep generative neural networks. Communications Biology, 2(1), 193. https://doi.org/10.1038/s42003-019-0438-y
Shen, G., Horikawa, T., Majima, K., & Kamitani, Y. (2019). Deep image reconstruction from human brain activity. PLoS Computational Biology, 15(1), e1006633. https://doi.org/10.1371/journal.pcbi.1006633
Takagi, Y., & Nishimoto, S. (2023). Natural scene reconstruction from fMRI signals using generative latent diffusion. Scientific Reports, 13(1), 15666. https://doi.org/10.1038/s41598-023-42891-8
Zhang, L., Zhang, L., & Du, B. (2024). Mental image reconstruction from human brain activity: Neural decoding of mental imagery via deep neural network-based Bayesian estimation. Neural Networks, 170, 349-364. https://doi.org/10.1016/j.neunet.2023.11.024
Zhang, X., Tian, Y., & Liu, T. (2025). An action decoding framework combined with deep neural network for predicting the semantics of human actions in videos from evoked brain activities. Frontiers in Neuroinformatics, 19, 1526259. https://doi.org/10.3389/fninf.2025.1526259