Codebase Architecture
cellucid-r is intentionally small. Most of the implementation lives in one file.
Repository layout (high level)
- cellucid-r/R/cellucid_prepare.R: exports cellucid_prepare() and prepare(); contains the exporter implementation and helper functions
- cellucid-r/man/cellucid_prepare.Rd: R help page generated/maintained for Bioconductor-style docs
- cellucid-r/tests/testthat/: unit tests validating core files, normalization, quantization, connectivity, vector fields
- cellucid-r/vignettes/cellucid.Rmd: minimal vignette showing a small export workflow
- cellucid-r/PUBLISHING.md: release/publishing checklist
Data export pipeline (what happens in cellucid_prepare())
At a high level:
1. Validate embeddings and infer n_cells.
2. Normalize embeddings (center + scale) and write points_*d.bin.
3. Validate/convert latent_space and obs.
4. Export optional vector fields (scaled with embedding normalization).
5. Export obs:
   - continuous values (float32 or quantized)
   - categorical codes (uint8/uint16) + outlier quantiles (latent-space)
   - centroids (embedding-space)
   - write obs_manifest.json
6. Export gene expression (optional):
   - validate var and gene_expression
   - write one dense vector per gene under var/
   - write var_manifest.json
7. Export connectivities (optional):
   - symmetrize and binarize
   - write edge pairs under connectivity/
   - write connectivity_manifest.json
8. Write dataset_identity.json (summary + pointers to files).
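The normalize-and-write step in the list above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names normalize_embedding and write_points are hypothetical, and the single shared scale factor, the row-major layout, the little-endian byte order, and the concrete file name points_2d.bin are all assumptions made for illustration.

```r
# Sketch only: center and scale an embedding, then write it as float32.
# `normalize_embedding` and `write_points` are hypothetical helpers,
# not functions exported by cellucid-r.

normalize_embedding <- function(emb) {
  stopifnot(is.matrix(emb), is.numeric(emb), all(is.finite(emb)))
  center <- colMeans(emb)
  centered <- sweep(emb, 2, center, "-")
  # Assumption: one shared scale for all axes, so the aspect ratio is kept.
  scale <- max(abs(centered))
  if (scale == 0) scale <- 1
  list(points = centered / scale, center = center, scale = scale)
}

write_points <- function(points, path) {
  con <- file(path, "wb")
  on.exit(close(con))
  # Assumption: row-major (cell by cell), little-endian float32 layout.
  writeBin(as.vector(t(points)), con, size = 4L, endian = "little")
  invisible(path)
}

set.seed(1)
emb <- matrix(rnorm(20), ncol = 2)  # 10 cells in a 2D embedding
n_cells <- nrow(emb)                # inferred from the embedding
norm <- normalize_embedding(emb)

out <- file.path(tempdir(), "points_2d.bin")
write_points(norm$points, out)

# The file holds n_cells * n_dims values of 4 bytes each.
stopifnot(file.size(out) == n_cells * ncol(emb) * 4)
```

Keeping center and scale around matters because the pipeline scales vector fields with the same embedding normalization.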
The user-guide docs mirror this pipeline structure.
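As a further illustration of the pipeline, the "symmetrize and binarize" connectivity step might look like the sketch below. It uses a small dense matrix for brevity (real connectivity matrices are typically sparse), and the choices to drop self-loops, keep each undirected edge once, and use zero-based indices are assumptions rather than documented behavior. The on-disk encoding of the edge pairs is not covered here.

```r
# Sketch only: turn a weighted, possibly asymmetric adjacency matrix
# into undirected, unweighted edge pairs.

adj <- matrix(0, nrow = 4, ncol = 4)
adj[1, 2] <- 0.5  # one-directional edge 1 -> 2
adj[3, 1] <- 0.2  # one-directional edge 3 -> 1
adj[4, 4] <- 1.0  # self-loop, dropped below (assumption)

# Binarize: any non-zero weight counts as an edge.
present <- adj != 0

# Symmetrize: keep an edge if it exists in either direction.
sym <- present | t(present)
diag(sym) <- FALSE

# Keep each undirected edge once (i < j), in a stable sorted order.
edges <- which(sym & upper.tri(sym), arr.ind = TRUE)
edges <- edges[order(edges[, 1], edges[, 2]), , drop = FALSE]

# Assumption: indices are zero-based in the exported file.
edges0 <- unname(edges) - 1L
print(edges0)
```

Sorting the pairs before writing keeps the output independent of how the input matrix happened to be constructed.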
Design principles
- Minimal dependencies (only jsonlite required).
- Deterministic, file-based exports.
- Shared format with the Python exporter.
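One way to read "deterministic, file-based" in practice is that exporting the same data twice yields byte-identical files. The sketch below checks that property for a float32 vector using only base R. The helper write_float32 is hypothetical, and the little-endian float32 encoding is an assumption about the shared format, not a statement of the spec.

```r
# Sketch only: two writes of the same data should be byte-identical.
# `write_float32` is a hypothetical helper, not part of cellucid-r.

write_float32 <- function(x, path) {
  con <- file(path, "wb")
  on.exit(close(con))
  # Fixing size and endianness keeps the bytes independent of the platform.
  writeBin(as.numeric(x), con, size = 4L, endian = "little")
  invisible(path)
}

values <- c(0.25, -1.5, 3)
a <- write_float32(values, tempfile(fileext = ".bin"))
b <- write_float32(values, tempfile(fileext = ".bin"))

# Compare checksums of the two files.
same_bytes <- unname(tools::md5sum(a)) == unname(tools::md5sum(b))

# Reading back with the same size and endianness recovers the values
# (exactly here, because each one is representable in float32).
con <- file(a, "rb")
back <- readBin(con, what = "numeric", n = length(values),
                size = 4L, endian = "little")
close(con)

stopifnot(same_bytes, identical(back, values))
```

Under the same assumption, a reader in another language only needs the element type and byte order to load the file, which is what makes a format shared with a Python exporter feasible.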
Adding a new exported artifact (maintainer notes)
If you add a new feature that writes files:
1. Decide where it belongs:
   - dataset_identity.json (top-level discovery)
   - a dedicated manifest JSON
   - a new subdirectory of binaries
2. Add tests under cellucid-r/tests/testthat/.
3. Update the user guide format spec.
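The checklist above could translate into something like the following sketch: a hypothetical exporter that writes a binary into its own subdirectory plus a dedicated manifest, followed by a testthat test of the kind that would live under cellucid-r/tests/testthat/. Everything named here (write_example_artifact, the example_artifact/ directory, the manifest fields) is invented for illustration and does not describe an existing cellucid-r artifact.

```r
library(testthat)

# Hypothetical exporter for a new artifact; names and layout are illustrative.
write_example_artifact <- function(values, out_dir) {
  bin_dir <- file.path(out_dir, "example_artifact")
  dir.create(bin_dir, recursive = TRUE, showWarnings = FALSE)

  # Binary payload in a dedicated subdirectory (float32 is an assumption).
  bin_path <- file.path(bin_dir, "values.bin")
  con <- file(bin_path, "wb")
  writeBin(as.numeric(values), con, size = 4L, endian = "little")
  close(con)

  # Dedicated manifest describing the payload, with a relative path.
  manifest <- list(
    n_values = length(values),
    dtype = "float32",
    path = "example_artifact/values.bin"
  )
  jsonlite::write_json(
    manifest,
    file.path(out_dir, "example_artifact_manifest.json"),
    auto_unbox = TRUE
  )
  invisible(out_dir)
}

test_that("example artifact writes a binary and a matching manifest", {
  out_dir <- tempfile("export_")
  write_example_artifact(c(1, 2, 3), out_dir)

  manifest <- jsonlite::read_json(
    file.path(out_dir, "example_artifact_manifest.json")
  )
  bin_path <- file.path(out_dir, manifest$path)

  expect_true(file.exists(bin_path))
  expect_equal(manifest$n_values, 3)
  expect_equal(file.size(bin_path), 3 * 4)  # three float32 values
})
```

Testing the manifest and the binary together catches the most common drift, where a manifest field no longer matches the bytes on disk.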