Overview

We expect a combine_embeddings HDF5 group at the root of the file, describing how embeddings are combined across multiple modalities. The group itself contains the parameters and results subgroups.

Before version 2.0 of the format, only a single embedding was generated, so the combine_embeddings group may be absent from pre-v2.0 files. Earlier versions have no concept of multiple modalities, so downstream steps should use the PCs directly from the pca step.

Definitions:

  • modalities: an array of names of modalities with available embeddings. This is based on the availability of embeddings in the pca and adt_pca steps.
  • num_cells: number of cells remaining after QC filtering. This is typically determined from the cell_filtering step.
  • total_dims: the sum of the number of dimensions of the embedding across all modalities in modalities.
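As a minimal sketch of how these definitions fit together, the expected shape of the combined embedding can be derived from num_cells and the per-modality dimensions. The function name, the modality names ("RNA", "ADT") and the dimension counts below are hypothetical and only serve as an illustration.

```python
def expected_combined_shape(num_cells, dims_per_modality):
    """Return (num_cells, total_dims) for the combined embedding.

    dims_per_modality maps each name in modalities to the number of
    dimensions of that modality's embedding (hypothetical input).
    """
    # total_dims is the sum of the embedding dimensions across modalities.
    total_dims = sum(dims_per_modality.values())
    return (num_cells, total_dims)


# Hypothetical example: 1000 cells, 20 dimensions for RNA and 15 for ADT.
shape = expected_combined_shape(1000, {"RNA": 20, "ADT": 15})
```

Here shape is a (rows, columns) pair with one row per cell, which can be compared against the dimensions of the stored dataset.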

Parameters

parameters should contain:

  • approximate: an integer scalar to be interpreted as a boolean, indicating whether an approximate neighbor search was used to compute the per-embedding scaling factors.
  • weights: a group containing the (scaling) weights to apply to each modality. If empty, weights are implicitly assumed to be equal to unity for all modalities. Otherwise, the group should contain a float scalar dataset named after each modality in modalities, containing the weight to be applied to each modality.
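The rules above could be applied by a reader roughly as follows. This is a sketch under stated assumptions, not a reference implementation: read_parameters and _scalar are hypothetical helper names, and the code only relies on mapping-style access (len, in, []) so that it can accept either an h5py group or a plain dict standing in for one.

```python
def _scalar(entry):
    # Scalar HDF5 datasets (e.g. from h5py) are read with [()];
    # plain Python numbers are passed through unchanged.
    return entry[()] if hasattr(entry, "shape") else entry


def read_parameters(parameters, modalities):
    """Return (approximate, weights) from a parameters group.

    parameters is assumed to behave like a mapping, and modalities is
    the list of modality names defined for this step.
    """
    # Integer scalar to be interpreted as a boolean.
    approximate = bool(_scalar(parameters["approximate"]))

    weights_group = parameters["weights"]
    if len(weights_group) == 0:
        # An empty group implies a weight of unity for every modality.
        weights = {m: 1.0 for m in modalities}
    else:
        weights = {}
        for m in modalities:
            if m not in weights_group:
                raise KeyError(f"no weight stored for modality '{m}'")
            weights[m] = float(_scalar(weights_group[m]))
    return approximate, weights
```

Raising an error when a non-empty weights group lacks an entry for some modality is a design choice of this sketch; the format only states that each modality should have a dataset in that case.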

Results

If multiple modalities are present in modalities, results should contain:

  • combined: a 2-dimensional float dataset containing the combined embeddings in a row-major layout. Each row corresponds to a cell and each column corresponds to a dimension, with num_cells rows and total_dims columns in total.

Otherwise, combined may be missing, in which case it is assumed that only a single embedding exists in the analysis and no combining is necessary. Downstream steps should instead use the PCA coordinates of the available modality, taken from the pca or adt_pca step.
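The fallback logic described in this document might be sketched as below. The function name embedding_source and the step_for_modality mapping (from modality name to the step holding its PCA) are assumptions made for illustration; the format itself does not define them. As with any mapping-based sketch, root can be an opened HDF5 file object that supports in and [] lookups, or a nested dict.

```python
def embedding_source(root, modalities, step_for_modality):
    """Return the location that downstream steps should read from.

    step_for_modality is a hypothetical mapping such as
    {"RNA": "pca", "ADT": "adt_pca"}.
    """
    if "combine_embeddings" not in root:
        # Pre-2.0 file: no multi-modality, use the PCs from the pca step.
        return "pca"

    results = root["combine_embeddings"]["results"]
    if "combined" in results:
        return "combine_embeddings/results/combined"

    # combined may only be missing when a single embedding exists.
    if len(modalities) != 1:
        raise ValueError("combined is missing but there is not exactly one modality")
    return step_for_modality[modalities[0]]


# Hypothetical modality-to-step mapping and a file with only ADT available.
mapping = {"RNA": "pca", "ADT": "adt_pca"}
source = embedding_source({"combine_embeddings": {"results": {}}}, ["ADT"], mapping)
```

The returned value only names the step or dataset to consult; locating the coordinates inside the pca or adt_pca step is left to the reader of that step.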

History

Added in version 2.0.