- ⸻ 2026-06-09
Simpler queries for the 2.5B transcriptional profiles of the Arc Virtual Cell Atlas
With 2.5B expression profiles that map to about 600M cells, the Arc Virtual Cell Atlas offers the world’s largest collection of uniformly processed scRNA-seq datasets. Arc Institute distributes the atlas as 460k Parquet and h5ad files totaling 41 TB on Google Cloud Storage. We present a database mirror that offers entity-based queries, a graphical user interface, and zero-copy, lineage-aware sharing of datasets.
- ⸻ 2026-03-02
Interactive visualization of multimodal and spatial data with Vitessce
The open-source tool Vitessce and Lamin now work together to manage and visualize multimodal and spatial single-cell data. It’s simple: define a Vitessce config in code, save it as an artifact, and share the interactive visualization along with your datasets on LaminHub.
- ⸻ 2024-04-03
MappedCollection: Weighted random sampling from large collections of scRNA-seq datasets
A few labs and companies now train models on large-scale scRNA-seq count matrices and related data modalities. But unlike for many other data types, there isn’t yet a playbook for datasets that are too large to fit into memory.