Input Requirements (Global)#
Audience: computational users (recommended for everyone exporting real data)
Time: 10–15 minutes
Goal: prevent subtle bugs where an export “works” but loads incorrectly in the viewer.
This page documents the global rules that apply to every input you pass to cellucid_prepare().
Rule 1: cell identity is the row order#
Cellucid’s export format does not store separate cell IDs. Instead:
Cell i is the i-th row in every exported array.
This means you must keep a consistent row order across:
X_umap_1d / X_umap_2d / X_umap_3d
latent_space
obs
gene_expression rows (if provided)
connectivities rows/cols (if provided)
every vector field (if provided)
Practical strategy (recommended)#
If your source data has cell IDs (Seurat/SCE), pick a canonical ordering:
cell_ids <- colnames(seu) # Seurat
# or:
cell_ids <- colnames(sce) # SingleCellExperiment
Then reorder everything to match cell_ids explicitly before exporting.
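A minimal sketch of that reordering in base R, assuming every input carries cell IDs as row names. The toy objects below are made up for illustration; only the indexing-by-name pattern matters.

```r
# Toy inputs whose rows are in different orders (illustrative data only).
cell_ids <- c("cell_a", "cell_b", "cell_c")   # canonical order

X_umap_2d <- matrix(
  c(1, 2, 3, 10, 20, 30), ncol = 2,
  dimnames = list(c("cell_c", "cell_a", "cell_b"), NULL)
)
obs <- data.frame(
  cluster = c("B", "C", "A"),
  row.names = c("cell_b", "cell_c", "cell_a")
)

# Fail early if any canonical ID is missing from an input.
stopifnot(all(cell_ids %in% rownames(X_umap_2d)))
stopifnot(all(cell_ids %in% rownames(obs)))

# Index by name so every object ends up in the canonical order.
X_umap_2d <- X_umap_2d[cell_ids, , drop = FALSE]
obs <- obs[cell_ids, , drop = FALSE]

stopifnot(identical(rownames(X_umap_2d), rownames(obs)))
```

Indexing by name rather than by position means a missing or misspelled ID raises an error instead of silently shifting rows. The same pattern applies to latent_space, gene_expression rows, and any other per-cell input.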
Rule 2: required shapes#
At least one embedding is required:
| Argument | Shape |
|---|---|
| X_umap_1d | (n_cells, 1) |
| X_umap_2d | (n_cells, 2) |
| X_umap_3d | (n_cells, 3) |
And:
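| Argument | Shape |
|---|---|
| latent_space | n_cells rows |
| obs | data.frame with n_cells rows |
| gene_expression (if provided) | (n_cells, n_genes) |
| var (required with gene_expression) | n_genes rows, one per column of gene_expression |
| connectivities (if provided) | (n_cells, n_cells) |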
Warning
If you pass gene_expression in the common “genes × cells” orientation, export will fail (shape mismatch) or silently produce nonsense.
Always ensure cells × genes for Cellucid export.
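As a rough illustration of the orientation check, the sketch below transposes a toy genes × cells matrix when its dimensions show it is the wrong way round. The data are invented for the example, and the dimension test cannot help when the number of genes equals the number of cells; in that case, compare row names against your cell IDs instead.

```r
# Toy "genes x cells" matrix: 2 genes, 3 cells (illustrative data only).
n_cells <- 3
gene_expression <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("GeneA", "GeneB"), c("cell_a", "cell_b", "cell_c"))
)

# If the rows do not match the cells but the columns do, transpose.
if (nrow(gene_expression) != n_cells && ncol(gene_expression) == n_cells) {
  gene_expression <- t(gene_expression)
}

stopifnot(nrow(gene_expression) == n_cells)   # now cells x genes
```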
Rule 3: no missing values in embeddings#
Embeddings are normalized using min()/max() and scaling.
If embeddings contain NA or Inf, normalization will produce invalid results.
Recommendation:
remove cells with missing embedding coordinates, or
impute before export (rarely recommended unless you know what you’re doing).
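The first option can be sketched as follows, using toy data. It shows why a single bad value matters (base R's min() returns NA when any input is NA) and applies one mask to every per-cell input so that the row order stays aligned. Only X_umap_2d and obs are shown; the other inputs would be subset the same way.

```r
# Toy embedding with one NA and one Inf coordinate (illustrative data only).
X_umap_2d <- matrix(
  c(0.1, NA, 0.3, 0.4,
    1.0, 2.0, Inf, 4.0),
  ncol = 2
)
obs <- data.frame(cluster = c("A", "B", "C", "D"))

# min()/max() propagate NA, so one bad value poisons the whole range.
is.na(min(X_umap_2d))   # TRUE

# Keep only cells whose coordinates are all finite (drops NA, NaN, Inf).
keep <- rowSums(!is.finite(X_umap_2d)) == 0

# Apply the same mask to every per-cell input.
X_umap_2d <- X_umap_2d[keep, , drop = FALSE]
obs <- obs[keep, , drop = FALSE]

stopifnot(all(is.finite(X_umap_2d)), nrow(obs) == nrow(X_umap_2d))
```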
Rule 4: be careful with non-numeric obs columns#
obs is a data.frame. cellucid-r classifies columns as:
continuous: is.numeric(x) is TRUE
categorical: factors, logicals, and all other types (including character)
This means:
a character column like "sample_id" becomes categorical (good),
but a Date column becomes categorical (probably not what you want),
and a numeric-looking character column becomes categorical unless you convert it.
Recommendation:
explicitly coerce obs columns you care about (as.numeric, factor, etc.)
Details: obs: Cell Metadata
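A short sketch of explicit coercion, using an invented obs table. The column names (n_counts_chr, collected, days_since_start) are hypothetical, and the is.numeric() test simply mirrors the classification rule described above rather than calling anything from cellucid-r.

```r
# Toy metadata (illustrative data only).
obs <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  n_counts_chr = c("1200", "950", "3100"),   # numeric-looking text
  collected = as.Date(c("2024-01-01", "2024-01-02", "2024-01-04")),
  stringsAsFactors = FALSE
)

# Mirror the rule above: a column is continuous only if is.numeric() is TRUE.
vapply(obs, is.numeric, logical(1))   # all FALSE before coercion

# Coerce the columns that should be continuous or explicitly categorical.
obs$n_counts <- as.numeric(obs$n_counts_chr)
obs$days_since_start <- as.numeric(obs$collected - min(obs$collected))
obs$sample_id <- factor(obs$sample_id)
obs$n_counts_chr <- NULL
obs$collected <- NULL

# as.numeric() turns unparseable text into NA, so check for that.
stopifnot(!anyNA(obs$n_counts))
```

Converting a Date to "days since the first date" is only one possible choice; what matters is that the column is numeric before export if you want it treated as continuous.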
Rule 5: avoid filename collisions#
To write files, cellucid-r turns some keys into filesystem-safe names:
obs field keys → obs/<safe_key>.*
gene IDs → var/<safe_gene_id>.*
Unsupported characters become underscores. This is convenient, but can cause collisions:
Example:
Gene/1 → Gene_1
Gene_1 → Gene_1 (collision!)
Recommendation:
ensure obs column names and gene IDs are unique after sanitization.
cellucid-r aborts export with an error if it detects a collision.
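To find likely collisions before export, you can approximate the sanitization yourself. The safe_name() helper below is hypothetical: it is not cellucid-r's actual rule, only a stand-in that replaces anything outside letters, digits, dot, underscore, and hyphen with an underscore. Treat the result as an early warning, not a guarantee.

```r
# Hypothetical sanitizer: NOT cellucid-r's actual rule, only an approximation.
safe_name <- function(x) gsub("[^A-Za-z0-9._-]", "_", x)

gene_ids <- c("Gene/1", "Gene_1", "Actb")
safe_ids <- safe_name(gene_ids)

# Flag every ID whose sanitized form is shared with another ID.
collides <- duplicated(safe_ids) | duplicated(safe_ids, fromLast = TRUE)
gene_ids[collides]
```

Any IDs reported here should be renamed in the source data so that they stay distinct after sanitization. The same check can be run on names(obs).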
Quick “preflight” checks (copy/paste)#
Use this pattern before you export real data.
stopifnot(is.matrix(latent_space) || is.data.frame(latent_space))
stopifnot(is.data.frame(obs))
stopifnot(is.matrix(X_umap_2d) && ncol(X_umap_2d) == 2)
n_cells <- nrow(X_umap_2d)
stopifnot(nrow(latent_space) == n_cells)
stopifnot(nrow(obs) == n_cells)
if (!is.null(gene_expression)) {
  stopifnot(nrow(gene_expression) == n_cells)
  stopifnot(!is.null(var))
  stopifnot(nrow(var) == ncol(gene_expression))
}
if (!is.null(connectivities)) {
  stopifnot(nrow(connectivities) == n_cells)
  stopifnot(ncol(connectivities) == n_cells)
}
If you have cell IDs, add explicit alignment checks:
stopifnot(!is.null(rownames(X_umap_2d)))
stopifnot(!is.null(rownames(latent_space)))
stopifnot(!is.null(rownames(obs)))
stopifnot(identical(rownames(X_umap_2d), rownames(obs)))
stopifnot(identical(rownames(latent_space), rownames(obs)))
Troubleshooting pointers#
Export fails with “n_cells mismatch” → Troubleshooting: Prepare/Export
Viewer loads but fields look wrong (often a row-order bug) → Export Format and Validation Issues