MIDAS: A Deep Generative Model for Mosaic Integration and Knowledge Transfer of Single-Cell Multimodal Data
MIDAS turns raw mosaic data into both imputed, batch-corrected data and disentangled latent representations, powering robust downstream analysis.
MIDAS is a deep probabilistic framework for mosaic integration and knowledge transfer of single-cell multimodal data. It addresses three key challenges in single-cell analysis: modality alignment, batch effect removal, and data imputation. Using self-supervised modality alignment and information-theoretic latent disentanglement, MIDAS transforms fragmented mosaic data into a complete, harmonized dataset ready for downstream analysis.
Whether you are working with transcriptomics (RNA), proteomics (ADT), or chromatin accessibility (ATAC), MIDAS provides a versatile solution to uncover deeper biological insights from complex, multi-source datasets.
- Documentation: scmidas.readthedocs.io
- Publication: Nature Biotechnology
- Mosaic Data Integration: Seamlessly integrates datasets where different batches measure different sets of modalities (e.g., some samples have RNA and ATAC, while others have only RNA).
- Multi-Modal Support: Natively supports RNA, ADT, and ATAC data, and can be easily configured to incorporate additional modalities.
- Data Imputation: Accurately imputes missing modalities, turning incomplete data into a complete multi-modal matrix.
- Batch Correction: Removes technical variation between batches, enabling consistent and reliable analysis across datasets.
- Knowledge Transfer: Leverages a pre-trained reference atlas to enable flexible and accurate knowledge transfer to new query datasets.
- Efficient and Scalable: Built on PyTorch Lightning for highly efficient model training, with support for advanced strategies like Distributed Data Parallel (DDP).
- Advanced Visualization: Integrates with TensorBoard for real-time monitoring of training loss and UMAP visualizations.
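To make the idea of "mosaic" data concrete, here is a minimal sketch in plain Python. It does not use the MIDAS API; the batch names, the `MODALITIES` list, and both helper functions are hypothetical and exist only for illustration. The sketch records which modalities each batch measured and derives which ones would have to be imputed to obtain a complete multi-modal matrix.

```python
# Illustrative only: a toy description of a mosaic dataset.
# Batch names and helper functions are hypothetical, not part of scmidas.
MODALITIES = ["rna", "adt", "atac"]

# Each batch measures a different subset of modalities.
batches = {
    "batch_1": {"rna", "atac"},
    "batch_2": {"rna"},
    "batch_3": {"rna", "adt"},
}


def modality_mask(batches, modalities=MODALITIES):
    """Return, per batch, a 0/1 flag for each modality (1 = measured)."""
    return {
        name: [int(m in measured) for m in modalities]
        for name, measured in batches.items()
    }


def missing_modalities(batches, modalities=MODALITIES):
    """Return, per batch, the modalities that were not measured."""
    return {
        name: [m for m in modalities if m not in measured]
        for name, measured in batches.items()
    }


mask = modality_mask(batches)
missing = missing_modalities(batches)

for name in batches:
    print(name, mask[name], "to impute:", missing[name])
```

In this toy layout, RNA is shared by every batch, while ADT and ATAC each appear in only one batch, which is the kind of partial overlap that mosaic integration is meant to handle.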
Install MIDAS from PyPI in a fresh conda environment.
# 1. Create and activate a new conda environment
conda create -n scmidas python=3.12
conda activate scmidas
# 2. Install MIDAS from PyPI
pip install scmidas
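After installation, you may want to confirm that the package is visible in the active environment. The following is a small sketch using only the Python standard library; the helper name `installed_version` is hypothetical, and the only assumption taken from the steps above is the distribution name `scmidas`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "scmidas" is the PyPI distribution name used in the install command above.
found = installed_version("scmidas")
if found is None:
    print("scmidas is not installed in this environment")
else:
    print(f"scmidas {found} is installed")
```

If the check reports that the package is missing, make sure the `scmidas` conda environment is activated before running `pip install`.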
For usage instructions, please refer to our documentation at scmidas.readthedocs.io.
To reproduce the results from our publication, please visit the reproducibility branch of this repository:
github.com/labomics/midas/tree/reproducibility
If you use MIDAS in your research, please cite our paper:
He, Z., Hu, S., Chen, Y. et al. Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-023-02040-y
@article{he2024mosaic,
  title={Mosaic integration and knowledge transfer of single-cell multimodal data with MIDAS},
  author={He, Zhen and Hu, Shuofeng and Chen, Yaowen and An, Sijing and Zhou, Jiahao and Liu, Runyan and Shi, Junfeng and Wang, Jing and Dong, Guohua and Shi, Jinhui and others},
  journal={Nature Biotechnology},
  pages={1--12},
  year={2024},
  publisher={Nature Publishing Group US New York},
  doi={10.1038/s41587-023-02040-y}
}
We welcome contributions from the community! If you have a suggestion or bug report, or want to contribute code, please open an issue or submit a pull request.
MIDAS is available under the MIT License.