Data Preparation API (prepare/export): The Big One

This section is the exhaustive, no-surprises guide to exporting data for Cellucid using the data preparation API (prepare/export).

Cellucid is a web app. cellucid-python is a helper package you use from Python/CLI to:

This export format is designed to be:

  • easy to host/share (static files + manifests),

  • fast to load in the browser,

  • reproducible (explicit manifests + stable dataset identity),

  • and compatible with future helpers (e.g., a future cellucid-R exporter).

This section focuses on:

  • exact input requirements (shapes, dtypes, ordering),

  • on-disk outputs (what files get written and why),

  • performance knobs (compression, quantization, subsetting),

  • edge cases (NaN/Inf, huge categories, sparse vs dense),

  • and troubleshooting (symptom → diagnosis → fix).

Note

If you just want to visualize something quickly (no export deep dive yet), start with:

“What do I need to export?” (at a glance)

Minimum viable export (fastest to produce):

  • latent_space + obs + at least one embedding (X_umap_2d or X_umap_3d) → interactive viewer with metadata fields.

Common “most useful” export:

  • add gene_expression + var → gene search + gene overlays.

Optional advanced features:

  • add connectivities → graph-based features (KNN edges).

  • add vector_fields → animated velocity/displacement overlays.
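The at-a-glance rules above can be sketched as a small checklist function. This is only an illustration: `viewer_features` and `EMBEDDING_KEYS` are hypothetical names invented here and are not part of cellucid-python. The input names (`latent_space`, `obs`, `var`, `gene_expression`, `connectivities`, `vector_fields`, `X_umap_2d`, `X_umap_3d`) are taken from the lists above. The sketch assumes gene search needs both `gene_expression` and `var`, which is how the "most useful" export is described.

```python
# Hypothetical helper, NOT part of cellucid-python. It only encodes the
# at-a-glance rules from this section as a checklist.

EMBEDDING_KEYS = {"X_umap_2d", "X_umap_3d"}


def viewer_features(provided):
    """Return the viewer features unlocked by a collection of input names."""
    provided = set(provided)

    # Minimum viable export: latent_space + obs + at least one embedding.
    missing = {"latent_space", "obs"} - provided
    if missing:
        raise ValueError(f"missing required inputs: {sorted(missing)}")
    if not provided & EMBEDDING_KEYS:
        raise ValueError("need at least one embedding: X_umap_2d or X_umap_3d")

    features = ["interactive viewer with metadata fields"]

    # Assumption: gene search/overlays need the matrix and its var table together.
    if {"gene_expression", "var"} <= provided:
        features.append("gene search + gene overlays")
    if "connectivities" in provided:
        features.append("graph-based features (KNN edges)")
    if "vector_fields" in provided:
        features.append("animated velocity/displacement overlays")
    return features


minimal = viewer_features({"latent_space", "obs", "X_umap_2d"})
full = viewer_features(
    {
        "latent_space", "obs", "X_umap_3d",
        "gene_expression", "var",
        "connectivities", "vector_fields",
    }
)
print(minimal)
print(full)
```

Running a check like this before a long export can catch a missing embedding or a `gene_expression` matrix supplied without its `var` table. It says nothing about shapes, dtypes, or ordering.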

API reference (when you need exact signatures)