Export to Seurat
-The query object also contains methods for loading in results as a Seurat object (or any of Seurat’s component classes). As with the to_sparse_matrix() method, we can specify the obs_index and var_index to use for naming the dimensions of the resulting object.
+The query object also contains methods for loading in
+results as a Seurat object (or any of Seurat’s component classes). As
+with the to_sparse_matrix() method, we can specify the
+obs_index and var_index to use for naming the
+dimensions of the resulting object.
query <- SOMAExperimentAxisQuery$new(
experiment = experiment,
diff --git a/articles/soma-experiment-queries_files/accessible-code-block-0.0.1/empty-anchor.js b/articles/soma-experiment-queries_files/accessible-code-block-0.0.1/empty-anchor.js
deleted file mode 100644
index ca349fd6a5..0000000000
--- a/articles/soma-experiment-queries_files/accessible-code-block-0.0.1/empty-anchor.js
+++ /dev/null
@@ -1,15 +0,0 @@
-// Hide empty tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) -->
-// v0.0.1
-// Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020.
-
-document.addEventListener('DOMContentLoaded', function() {
- const codeList = document.getElementsByClassName("sourceCode");
- for (var i = 0; i < codeList.length; i++) {
- var linkList = codeList[i].getElementsByTagName('a');
- for (var j = 0; j < linkList.length; j++) {
- if (linkList[j].innerHTML === "") {
- linkList[j].setAttribute('aria-hidden', 'true');
- }
- }
- }
-});
diff --git a/articles/soma-objects.html b/articles/soma-objects.html
index 5ad66d6ee7..3f283f52c6 100644
--- a/articles/soma-objects.html
+++ b/articles/soma-objects.html
@@ -20,7 +20,7 @@
tiledbsoma
- 1.14.2
+ 1.14.99.1
@@ -118,7 +136,10 @@
SOMADataFrame
-The obs field contains a SOMADataFrame, which is a multi-column table with a user-defined schema. The schema is expressed as an Arrow Schema, and defines the column names and value types.
+The obs field contains a SOMADataFrame,
+which is a multi-column table with a user-defined schema. The schema is
+expressed as an Arrow Schema, and defines the column names and value
+types.
As an example, let’s inspect the schema of obs:
experiment$obs$schema()
@@ -132,8 +153,14 @@
#> groups: string
#> RNA_snn_res.1: string
#> obs_id: string
-Note that soma_joinid is a field that exists in every SOMADataFrame and acts as a join key for other objects in the dataset.
-Again, when a SOMA object is accessed, only a pointer is returned and no data is read into memory. To load the data in memory, we call read()$concat(), which returns an Arrow Table and is easily converted to a data frame by appending $to_data_frame().
+Note that soma_joinid is a field that exists in every
+SOMADataFrame and acts as a join key for other objects in
+the dataset.
+Again, when a SOMA object is accessed, only a pointer is returned and
+no data is read into memory. To load the data in memory, we call
+read()$concat(), which returns an Arrow
+Table and is easily converted to a data frame by appending
+$to_data_frame().
experiment$obs$read()$concat()
#> Table
@@ -147,8 +174,15 @@
#> $groups <large_string>
#> $RNA_snn_res.1 <large_string>
#> $obs_id <large_string>
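The conversion described above can be sketched as follows. This is a minimal illustration rather than the article's own code: it assumes the tiledbsoma package is installed and that the bundled `soma-exp-pbmc-small` example dataset is available through `load_dataset()`; the variable names `obs_table` and `obs_df` are illustrative only.

```r
library(tiledbsoma)

# Assumption: the example dataset bundled with the tiledbsoma package.
experiment <- load_dataset("soma-exp-pbmc-small")

# read() returns an iterator; concat() materializes the result as one Arrow Table.
obs_table <- experiment$obs$read()$concat()

# Arrow Tables provide to_data_frame() to convert to a regular R data frame.
obs_df <- obs_table$to_data_frame()

head(obs_df)
```

Chaining the calls as `experiment$obs$read()$concat()$to_data_frame()` should give the same result in a single expression.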
-The amount of data that can be read at once is determined by the soma.init_buffer_bytes configuration parameter, which, by default, is set to 16MB for each column. If the requested data is larger than this value an error will be thrown.
-If your system has more memory, you can increase this parameter to a larger value to read in more data at once. Alternatively, you can use the iterated reader, which retrieves data in chunks that are smaller than the soma.init_buffer_bytes parameter. The result of which is a list of Arrow Tables.
+The amount of data that can be read at once is determined by the
+soma.init_buffer_bytes configuration parameter, which, by
+default, is set to 16MB for each column. If the requested data is larger
+than this value an error will be thrown.
+If your system has more memory, you can increase this parameter to a
+larger value to read in more data at once. Alternatively, you can use
+the iterated reader, which retrieves data in chunks that are smaller
+than the soma.init_buffer_bytes parameter. The result of
+which is a list of Arrow Tables.
iterator <- experiment$obs$read()
iterator$read_next()
@@ -163,7 +197,10 @@
#> $groups <large_string>
#> $RNA_snn_res.1 <large_string>
#> $obs_id <large_string>
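To consume every chunk rather than only the first, the iterator can be drained in a loop. The sketch below assumes the tiledbsoma package and its bundled `soma-exp-pbmc-small` dataset; `chunks` and `total_rows` are illustrative names, and for a dataset this small the loop may well run only once.

```r
library(tiledbsoma)

# Assumption: the example dataset bundled with the tiledbsoma package.
experiment <- load_dataset("soma-exp-pbmc-small")

iterator <- experiment$obs$read()

# Collect each chunk (an Arrow Table) until the iterator reports completion.
chunks <- list()
while (!iterator$read_complete()) {
  chunks[[length(chunks) + 1]] <- iterator$read_next()
}

# Count the rows across all chunks.
total_rows <- sum(vapply(chunks, function(tbl) as.numeric(tbl$num_rows), numeric(1)))
total_rows
```

Processing each chunk inside the loop, instead of storing all of them, keeps peak memory close to the size of a single chunk.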
-We can also select a subset of rows from the SOMADataFrame using the coords argument. This will retrieve only the required subset from disk to memory. In this example, we will select only the first 10 rows:
+We can also select a subset of rows from the
+SOMADataFrame using the coords argument. This
+will retrieve only the required subset from disk to memory. In this
+example, we will select only the first 10 rows:
NOTE: The coords argument is 0-based.
experiment$obs$read(coords = 0:9)$concat()
@@ -178,14 +215,16 @@
#> $groups <large_string>
#> $RNA_snn_res.1 <large_string>
#> $obs_id <large_string>
-As TileDB is a columnar format, we can also select a subset of the columns:
+As TileDB is a columnar format, we can also select a subset of the
+columns:
experiment$obs$read(0:9, column_names = c("obs_id", "nCount_RNA"))$concat()
#> Table
#> 10 rows x 2 columns
#> $obs_id <large_string>
#> $nCount_RNA <double>
-Finally, we can use value_filter to retrieve a subset of rows that match a certain condition.
+Finally, we can use value_filter to retrieve a subset of
+rows that match a certain condition.
experiment$obs$read(value_filter = "nCount_RNA > 100")$concat()
#> Table
@@ -199,84 +238,118 @@
#> $groups <large_string>
#> $RNA_snn_res.1 <large_string>
#> $obs_id <large_string>
-And of course, you can combine all of these arguments together to get at only the data you need.
+And of course, you can combine all of these arguments together to get
+at only the data you need.
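As a hedged illustration of combining the three arguments, the sketch below restricts the read by row coordinates, column names, and a value filter at once. It assumes the tiledbsoma package and its bundled `soma-exp-pbmc-small` dataset; the coordinate range and threshold are arbitrary choices for demonstration, not values taken from the article.

```r
library(tiledbsoma)

# Assumption: the example dataset bundled with the tiledbsoma package.
experiment <- load_dataset("soma-exp-pbmc-small")

# Restrict rows (0-based coords), columns, and values in a single read.
subset_df <- experiment$obs$read(
  coords = 0:39,
  column_names = c("obs_id", "nCount_RNA"),
  value_filter = "nCount_RNA > 100"
)$concat()$to_data_frame()

subset_df
```

Only rows that fall within the requested coordinates and also satisfy the filter are returned, and only the two named columns are read from disk.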