Scalable optimal transport-based method for multimodal integration of RNA-seq and ATAC-seq data.
Installation
Option 1: pip (recommended)
pip install scSAGA
Option 2: from source
We recommend creating a conda environment first:
conda env create -n scmint -f environment.yml
conda activate scmint
pip install -e .
Note on PyTorch:
pip install scSAGA installs a CPU-only version of PyTorch by default. If you need GPU support, install PyTorch manually with the CUDA version that matches your system (see pytorch.org) before installing scSAGA.
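To confirm which PyTorch build ended up in the environment, a quick check along the following lines may help. This is a minimal sketch and not part of scSAGA: the helper name describe_torch_backend is hypothetical, and the only PyTorch call it relies on is torch.cuda.is_available().

```python
import importlib.util


def describe_torch_backend() -> str:
    """Return 'missing', 'cpu', or 'cuda' for the active environment.

    Hypothetical helper for illustration; not part of the scSAGA API.
    """
    # Avoid an ImportError if PyTorch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "missing"
    import torch

    # True only when a CUDA-enabled build can see a usable GPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"PyTorch backend: {describe_torch_backend()}")
```

If this reports "cpu" on a machine with a GPU, the CPU-only build was most likely installed and PyTorch needs to be reinstalled with CUDA support.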
Usage
Once installed, run the analysis from the command line:
Input YAML format
Create a YAML config file specifying your datasets and parameters. A template is provided in config/input.yml. More datasets can be added in the same format as needed.
anchor: "rna1"
datasets:
  - name: "rna1"
    modality: "rna"
    counts: "/path/to/rna_normalized_counts.mtx"
    barcodes: "/path/to/rna_barcodes.txt"
    features: "/path/to/rna_features.txt"
    pca: "/path/to/rna_pca_50.txt"
  - name: "atac"
    modality: "atac"
    counts: "/path/to/atac_normalized_counts.mtx"
    barcodes: "/path/to/atac_barcodes.txt"
    features: "/path/to/atac_features.txt"
    pca: "/path/to/atac_pca_50.txt"
  # Add more datasets as needed:
  # - name: "rna2"
  #   modality: "rna"
  #   ...
output_dir: "/path/to/output_directory"
# sketch_size:      # Optional: downsample cells via geometric sketching
# --- Hyperparameters ---
s_shared_cells:     # Estimated number of shared cells across modalities
M_samples:          # Anchor pairs sampled per OT iteration
alpha:              # Update step size (0 to 1)
S_iterations:       # Number of SAGA iterations
gw_epsilon:         # Convergence threshold
gw_reg:             # Sinkhorn regularization strength
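Before launching a run, it can be useful to sanity-check the config. The sketch below works on an already-parsed config (for example, the dictionary returned by PyYAML's yaml.safe_load) and reports common mistakes. The function name find_config_problems and the choice of which keys to treat as required are assumptions made for illustration; scSAGA's own validation may differ.

```python
# Keys taken from the template above; treating them as required is an assumption.
REQUIRED_TOP_LEVEL = ("anchor", "datasets", "output_dir")
REQUIRED_DATASET_KEYS = ("name", "modality", "counts", "barcodes", "features", "pca")


def find_config_problems(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if not cfg.get(key):
            problems.append(f"missing top-level key: {key}")

    names = []
    for i, ds in enumerate(cfg.get("datasets") or []):
        for key in REQUIRED_DATASET_KEYS:
            if not ds.get(key):
                problems.append(f"datasets[{i}] is missing '{key}'")
        if ds.get("name"):
            names.append(ds["name"])

    if len(set(names)) != len(names):
        problems.append("dataset names are not unique")

    # The anchor should refer to one of the listed datasets.
    anchor = cfg.get("anchor")
    if anchor and anchor not in names:
        problems.append(f"anchor '{anchor}' does not match any dataset name")

    # The template documents alpha as a step size between 0 and 1.
    alpha = cfg.get("alpha")
    if isinstance(alpha, (int, float)) and not 0 <= alpha <= 1:
        problems.append("alpha must be between 0 and 1")
    return problems


if __name__ == "__main__":
    paths = {"counts": "c.mtx", "barcodes": "b.txt", "features": "f.txt", "pca": "p.txt"}
    example = {
        "anchor": "rna1",
        "datasets": [
            {"name": "rna1", "modality": "rna", **paths},
            {"name": "atac", "modality": "atac", **paths},
        ],
        "output_dir": "/path/to/output_directory",
        "alpha": 0.5,
    }
    print(find_config_problems(example) or "no problems found")
```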
Outputs
Results are saved to the directory specified by output_dir:
T_<dataset>_to_<anchor>.npy: transport plan for each dataset pair
joint_embedding_2d.png: PCA plot of the joint embedding
joint_embedding_2d.csv: 2D coordinates for the joint embedding
saga_runtimes.txt: timing breakdown and alignment scores
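After a run finishes, a small script can confirm that the files listed above were written. The sketch below builds the expected file names from the anchor and dataset names in the config. It assumes one transport plan per non-anchor dataset, which is an interpretation of "each dataset pair" and not something this README confirms; both helper names are hypothetical.

```python
from pathlib import Path

# File names taken from the output list above.
FIXED_OUTPUTS = ("joint_embedding_2d.png", "joint_embedding_2d.csv", "saga_runtimes.txt")


def expected_output_names(anchor, dataset_names):
    """List the output file names expected for a given anchor and set of datasets."""
    # Assumption: one transport plan per dataset other than the anchor itself.
    plans = [f"T_{name}_to_{anchor}.npy" for name in dataset_names if name != anchor]
    return plans + list(FIXED_OUTPUTS)


def missing_outputs(output_dir, anchor, dataset_names):
    """Return the expected output files that are not present in output_dir."""
    root = Path(output_dir)
    return [
        name
        for name in expected_output_names(anchor, dataset_names)
        if not (root / name).is_file()
    ]


if __name__ == "__main__":
    for name in missing_outputs("/path/to/output_directory", "rna1", ["rna1", "atac"]):
        print(f"missing: {name}")
```

The transport plans are saved as .npy files, so they can be loaded with numpy.load for further analysis.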
Development
After cloning the repo:
git clone https://github.com/Swethasree-Bhattaram/scSAGA.git
cd scSAGA
conda env create -n scmint -f environment.yml
conda activate scmint
pip install -e .
The -e flag installs in editable mode, so changes to the source code take effect immediately without reinstalling.