BiocPy

BiocPy: Facilitate Bioconductor Workflows in Python

BiocPy brings Bioconductor's core data structures and analysis tools to the Python ecosystem. Core structures such as BiocFrame and GenomicRanges are the building blocks for larger, more complex representations. For example, container classes like SummarizedExperiment, SingleCellExperiment, and MultiAssayExperiment represent single- or multi-omic experimental data together with its metadata.
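To make the container idea concrete, here is a toy sketch in plain Python. The `ToyExperiment` class and its field names are hypothetical and are not the SummarizedExperiment API. It only illustrates the general pattern of keeping assay matrices, per-feature annotations, per-sample annotations, and free-form metadata in one object whose dimensions are validated together.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyExperiment:
    """Hypothetical stand-in for an experiment container (not the BiocPy API)."""

    assays: Dict[str, List[List[float]]]   # assay name -> features x samples matrix
    row_data: Dict[str, List[str]]         # per-feature annotations (one list per column)
    column_data: Dict[str, List[str]]      # per-sample annotations (one list per column)
    metadata: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every assay must agree with the annotation tables on both dimensions.
        n_rows, n_cols = self.shape
        for name, matrix in self.assays.items():
            if len(matrix) != n_rows or any(len(row) != n_cols for row in matrix):
                raise ValueError(f"assay '{name}' does not match shape {self.shape}")

    @property
    def shape(self) -> Tuple[int, int]:
        # Dimensions are taken from the first annotation column on each axis.
        n_rows = len(next(iter(self.row_data.values())))
        n_cols = len(next(iter(self.column_data.values())))
        return (n_rows, n_cols)


exp = ToyExperiment(
    assays={"counts": [[5, 0, 2], [1, 7, 3]]},
    row_data={"symbol": ["TP53", "BRCA1"]},
    column_data={"condition": ["ctrl", "treated", "treated"]},
    metadata={"organism": "human"},
)
print(exp.shape)  # (2, 3)
```

The real container classes offer far more (subsetting, range-aware rows, interoperability with other formats); see each package's documentation for the actual interfaces.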

Core Packages

For a complete list of packages, visit our GitHub organization.

Data Structures

Package         Description                      PyPI   Links
BiocFrame       Bioconductor-like data frames    PyPI   GitHub | Docs
IRanges         Interval arithmetic operations   PyPI   GitHub | Docs | Bioconductor
GenomicRanges   Genomic location analysis        PyPI   GitHub | Docs | Bioconductor

Containers

Package                    Description                                     PyPI   Links
SummarizedExperiment       Genomic experiments container                   PyPI   GitHub | Docs | Bioconductor
SingleCellExperiment       Single-cell genomics container                  PyPI   GitHub | Docs | Bioconductor
SpatialExperiment          Spatial transcriptomics container               PyPI   GitHub | Docs | Bioconductor
SpatialFeatureExperiment   Extends the spatial transcriptomics container   PyPI   GitHub | Docs | Bioconductor
MultiAssayExperiment       Multi-omics data framework                      PyPI   GitHub | Docs | Bioconductor

R Interoperability

Package           Description                                         PyPI   Links
rds2py            Read RDS files directly in Python                   PyPI   GitHub | Docs
BiocUtils         Common utilities mirroring R's base functionality   PyPI   GitHub | Docs
mopsy             Matrix operations with R-like syntax                PyPI   GitHub | Docs
pyBiocFileCache   Resource caching system                             PyPI   GitHub | Docs | Bioconductor
txdb              Genome annotations from TxDb objects                PyPI   GitHub | Docs
orgdb             Access OrgDb objects                                PyPI   GitHub | Docs

Delayed Operations

Package        Description                    PyPI   Links
DelayedArray   Delayed operations in Python   PyPI   GitHub | Docs | Bioconductor
HDF5Array      HDF5-backed arrays             PyPI   GitHub | Docs | Bioconductor
TileDBArray    TileDB-backed arrays           PyPI   GitHub | Docs | Bioconductor
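As a rough illustration of what "delayed operations" means, the sketch below records element-wise operations and applies them only when a row is requested. `ToyDelayedArray`, `map`, and `realize_row` are hypothetical names chosen for this example and are not the DelayedArray API. The point is only that the underlying data (which could live in an HDF5 or TileDB file) is never copied or modified up front.

```python
from typing import Callable, List, Optional, Sequence


class ToyDelayedArray:
    """Hypothetical illustration of delayed evaluation (not the DelayedArray API)."""

    def __init__(
        self,
        seed: Sequence[Sequence[float]],
        ops: Optional[List[Callable[[float], float]]] = None,
    ) -> None:
        self._seed = seed            # underlying data, left untouched
        self._ops = list(ops or [])  # queued element-wise operations

    def map(self, func: Callable[[float], float]) -> "ToyDelayedArray":
        # Record the operation and return a new wrapper; nothing is computed yet.
        return ToyDelayedArray(self._seed, self._ops + [func])

    def realize_row(self, i: int) -> List[float]:
        # Apply the queued operations, in order, to the requested row only.
        row = list(self._seed[i])
        for func in self._ops:
            row = [func(value) for value in row]
        return row


counts = [[1, 2, 3], [4, 5, 6]]
delayed = ToyDelayedArray(counts).map(lambda v: v + 1).map(lambda v: v * 2)
print(delayed.realize_row(1))  # [10, 12, 14]
```

Because work happens per request, only the rows that are actually read pay the cost of the queued operations, which is what makes the pattern attractive for file-backed arrays.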

Get Started

All BiocPy packages are published under the BiocPy organization on PyPI. The core packages can be installed together through the biocpy wrapper package, for example with pip install biocpy.

Individual packages can be installed separately. See each package's documentation for specific installation instructions.
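After installing, it can be useful to confirm which packages are actually present in the current environment. The sketch below uses only the standard library. The distribution names in `PACKAGES` are assumptions based on the package names listed above, so adjust them to match what you installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Assumed PyPI distribution names for a few core packages; edit as needed.
PACKAGES = ["biocframe", "iranges", "genomicranges", "summarizedexperiment"]


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```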

Environments

We provide conda/mamba configuration files for creating environments that contain most BiocPy (and friends) packages. See the environments repository for more information.


Friends of BiocPy

BiocPy integrates with several analysis tools and frameworks:

Analysis Tools

  • libscran: Multi-modal single-cell analysis in R, Python and JavaScript.
  • SingleR-inc: Cell type annotation for single-cell data.

Data Management

  • ArtifactDB: Language-agnostic access to data across computational environments.
  • tatami-inc: Read various matrix representations through a common interface.

Model Training

  • CellArr: TileDB-based genomic data storage with AI/ML dataloaders.

Contributing

We welcome contributions! Check out our developer guide to get started.