Data Preparation API (prepare/export): The Big One

This section is the exhaustive, no-surprises guide to exporting data for Cellucid using prepare() (Python API).

Cellucid is a web app. cellucid-python is a helper package you use from Python/CLI to:

export datasets into a high-performance folder format (what this section covers),
or serve/view data directly (see Viewing APIs (serve / serve_anndata / show / show_anndata + loading options)).

This export format is designed to be:

easy to host/share (static files + manifests),
fast to load in the browser,
reproducible (explicit manifests + stable dataset identity),
and compatible with future helpers (e.g., a future cellucid-R exporter).

This section focuses on:

exact input requirements (shapes, dtypes, ordering),
on-disk outputs (what files get written and why),
performance knobs (compression, quantization, subsetting),
edge cases (NaN/Inf, huge categories, sparse vs dense),
and troubleshooting (symptom → diagnosis → fix).

Note

If you just want to visualize something quickly (no export deep dive yet), start with:

Quick start (3 levels) (choose your level), or
Viewing methods overview (all viewing modes).

Recommended reading order

“What do I need to export?” (at a glance)

Minimum viable export (fastest to produce):

latent_space + obs + at least one embedding (X_umap_2d or X_umap_3d) → interactive viewer with metadata fields.
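As a rough sketch of what those minimum inputs might look like in memory, the snippet below builds toy arrays and runs a pre-flight check that they are row-aligned. It assumes the common AnnData-style convention that every per-cell array has one row per row of obs, in the same order. The sizes, column names, and the check_minimal_inputs helper are hypothetical, and the commented-out prepare() call only reuses the input names from the line above; it is not a verified signature, so consult the API reference for the exact arguments.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_cells = 500  # hypothetical dataset size

# Toy stand-ins for the three minimum inputs.
latent_space = rng.normal(size=(n_cells, 30)).astype(np.float32)
X_umap_2d = rng.normal(size=(n_cells, 2)).astype(np.float32)
obs = pd.DataFrame(
    {
        "cluster": pd.Categorical(rng.integers(0, 5, size=n_cells).astype(str)),
        "n_counts": rng.integers(500, 5000, size=n_cells),
    },
    index=[f"cell_{i}" for i in range(n_cells)],
)


def check_minimal_inputs(latent_space, obs, embedding, n_dims):
    """Hypothetical pre-flight check (not part of cellucid).

    Verifies that latent_space and the embedding both have one row per
    row of obs, and that the embedding has exactly n_dims columns.
    """
    n = len(obs)
    if latent_space.ndim != 2 or latent_space.shape[0] != n:
        raise ValueError(
            f"latent_space has shape {latent_space.shape}, expected ({n}, k)"
        )
    if embedding.shape != (n, n_dims):
        raise ValueError(
            f"embedding has shape {embedding.shape}, expected ({n}, {n_dims})"
        )
    return n


n_checked = check_minimal_inputs(latent_space, obs, X_umap_2d, n_dims=2)
print(f"{n_checked} cells, inputs are row-aligned")

# Assumed call shape only; other arguments (such as the output location) omitted:
# prepare(latent_space=latent_space, obs=obs, X_umap_2d=X_umap_2d)
```

Running a check like this before exporting surfaces shape or ordering mismatches early, rather than at load time in the browser.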

Common “most useful” export:

add gene_expression + var → gene search + gene overlays.
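A similar sketch for the gene-level inputs, again under assumptions: gene_expression is taken to be a cells × genes matrix whose columns follow the row order of var, and gene identifiers are taken to live in var.index, as in AnnData. The summarize_expression helper and the toy data are hypothetical. The helper only reports what it finds; how the exporter itself treats non-finite values or duplicate identifiers is not specified here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_cells, n_genes = 200, 50  # hypothetical sizes

# Toy count-like matrix (cells x genes) and a matching gene table.
gene_expression = rng.poisson(1.0, size=(n_cells, n_genes)).astype(np.float32)
var = pd.DataFrame(index=[f"gene_{j}" for j in range(n_genes)])


def summarize_expression(gene_expression, var, n_cells):
    """Hypothetical helper (not part of cellucid).

    Checks that the matrix is (n_cells, n_genes) with n_genes == len(var),
    then reports duplicate gene identifiers and non-finite entries.
    """
    expected = (n_cells, len(var))
    if gene_expression.shape != expected:
        raise ValueError(
            f"gene_expression has shape {gene_expression.shape}, expected {expected}"
        )
    return {
        "shape": gene_expression.shape,
        "n_duplicate_gene_ids": int(var.index.duplicated().sum()),
        "n_nonfinite": int((~np.isfinite(gene_expression)).sum()),
    }


summary = summarize_expression(gene_expression, var, n_cells)
print(summary)
```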

Optional advanced features:

add connectivities → graph-based features (KNN edges).
add vector_fields → animated velocity/displacement overlays.

API reference (when you need exact signatures)

Export / Data Preparation (moved) (includes prepare() + docstring/autodoc)