Humans struggle to "see" the structure in functional MRI (BOLD) brain maps. Our goal is to train a GPT that understands brain maps better than humans. This kind of "foundation" model should be useful for things like brain activity encoding and decoding. Plus it will hopefully generate neat fake brain maps.
Datasets. We train our models using the NSD-Flat Hugging Face dataset, which is derived from the Natural Scenes Dataset. This is a dataset of paired fMRI BOLD activation maps and natural images from COCO.
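As a quick illustration, the data can be pulled straight from the Hugging Face hub with the `datasets` library. The dataset ID and column names below are assumptions; check the dataset card for the exact values:

```python
from datasets import load_dataset

# Load NSD-Flat from the Hugging Face hub.
# The dataset ID and split are assumptions; see the dataset card for specifics.
ds = load_dataset("clane9/NSD-Flat", split="train")

# Each example is expected to pair a flattened BOLD activation map with its
# corresponding COCO stimulus image.
example = ds[0]
print(example.keys())
```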
Models. All our models use a vanilla ViT (Transformer) architecture adapted from timm. We consider a few different pre-training objectives:
- Auto-regressive next patch prediction with shuffled patch order (IGPT), with either:
  - Discrete k-means token targets (KMeansTokenizer) and cross-entropy loss
  - Continuous patch targets and MSE loss
The shuffling idea is somewhat new, although see also SAIM and RandSAC. Our main innovation compared to these works is the use of a next position query embedding.
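To make the objective concrete, here is a minimal PyTorch sketch of shuffled next-patch prediction with a next-position query embedding. It is illustrative only (module and argument names are ours, not the boldGPT API), and it uses continuous MSE targets; the discrete variant would swap the output head for logits over the KMeansTokenizer vocabulary and use cross-entropy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShuffledNextPatchSketch(nn.Module):
    """Toy autoregressive next-patch predictor with a next-position query embedding.

    Illustrative only: boldGPT wraps a timm ViT and supports both continuous (MSE)
    and discrete k-means token (cross-entropy) targets.
    """

    def __init__(self, num_patches: int, patch_dim: int, dim: int = 256):
        super().__init__()
        self.patch_embed = nn.Linear(patch_dim, dim)
        self.pos_embed = nn.Embedding(num_patches, dim)       # position of each input patch
        self.next_pos_embed = nn.Embedding(num_patches, dim)  # position of the patch to predict
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, patch_dim)  # continuous targets; swap for vocab logits if discrete

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (batch, num_patches, patch_dim)
        B, N, _ = patches.shape
        # Random patch order (shared across the batch for simplicity).
        order = torch.randperm(N, device=patches.device)
        x = patches[:, order]
        pos = order.expand(B, N)
        next_pos = order.roll(-1).expand(B, N)

        # Input token = patch content + its position + the position it must predict next.
        tokens = self.patch_embed(x) + self.pos_embed(pos) + self.next_pos_embed(next_pos)

        # Causal mask: each step attends only to earlier patches in the shuffled order.
        mask = torch.triu(torch.ones(N, N, device=patches.device, dtype=torch.bool), diagonal=1)
        h = self.encoder(tokens, mask=mask)

        # Predict the next patch in the shuffled order; MSE against shifted targets.
        pred = self.head(h[:, :-1])
        target = x[:, 1:]
        return F.mse_loss(pred, target)


# Toy usage: 2 maps, 64 patches of 48 values each.
model = ShuffledNextPatchSketch(num_patches=64, patch_dim=48)
loss = model(torch.randn(2, 64, 48))
```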
Evaluation. We are primarily interested in two downstream tasks:
- Image-to-BOLD, i.e. fMRI encoding. See also the Algonauts 2023 challenge.
- BOLD-to-Image, i.e. fMRI image reconstruction. See also MindEye.
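As a rough illustration of the Image-to-BOLD task (not the boldGPT evaluation pipeline), a simple encoding baseline regresses frozen image features onto flattened BOLD maps. The arrays and backbone below are placeholders:

```python
import numpy as np
import timm
import torch
from sklearn.linear_model import RidgeCV

# Placeholder data standing in for NSD-Flat stimuli and BOLD maps.
images = torch.randn(16, 3, 224, 224)                # stimulus images
bold = np.random.randn(16, 4096).astype("float32")   # flattened BOLD activation maps

# Frozen image backbone from timm (pretrained weights omitted here for brevity).
backbone = timm.create_model("vit_base_patch16_224", pretrained=False, num_classes=0)
backbone.eval()
with torch.no_grad():
    feats = backbone(images).numpy()

# Ridge regression from image features to BOLD activations, i.e. fMRI encoding.
encoder = RidgeCV(alphas=np.logspace(-1, 3, 5)).fit(feats, bold)
print("Encoding R^2 (train):", encoder.score(feats, bold))
```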
Clone the repo and install the package in a new virtual environment:
```bash
git clone https://github.com/clane9/boldGPT.git
cd boldGPT
python3 -m venv --prompt boldgpt .venv
source .venv/bin/activate
pip install -U pip
pip install -r requirements.txt
pip install -e .
```
This project is under active development in collaboration with MedARC, and we welcome contributions and feedback! If you'd like to contribute, please feel free to fork the repo and start a conversation in our issues, or join us on the MedARC discord server.
If you find this repository helpful, please consider citing:
```bibtex
@misc{lane2023boldgpt,
  author       = {Connor Lane},
  title        = {boldGPT: A GPT foundation model for brain activity maps},
  howpublished = {\url{https://github.com/clane9/boldGPT}},
  year         = {2023},
}
```