Add pre-commit hook for prettier and format most files #134

Merged: 1 commit, Feb 16, 2023
2 changes: 1 addition & 1 deletion .github/workflows/python-somacore.yaml
@@ -72,7 +72,7 @@ jobs:
with:
cache: pip
cache-dependency-path: python-spec/requirements-py3.10.txt
python-version: '3.10'
python-version: "3.10"
- name: Set up environment
run: |
pip install --upgrade build pip wheel setuptools setuptools-scm
29 changes: 20 additions & 9 deletions .pre-commit-config.yaml
@@ -1,6 +1,7 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
# Start with the basic pre-commit hooks

- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.2.0
@@ -10,21 +11,31 @@ repos:
- id: check-yaml
- id: check-added-large-files

# This mypy step is not perfect; in the interest of not reproducing the entire
# dependency list here we only install `attrs`. It will catch a useful subset
# of errors but does not replace a full mypy run (either locally or in CI).
# Then others in alphabetical order:

- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: v0.0.243
hooks:
- id: ruff

- repo: https://github.com/pre-commit/mirrors-mypy
# This mypy step is not perfect; in the interest of not reproducing
# the entire dependency list here we only install `attrs`. It will catch
# a useful subset of errors but does not replace a full mypy run
# (either locally or in CI).
rev: v1.0.0
hooks:
- id: mypy
additional_dependencies: [attrs]

- repo: https://github.com/psf/black
rev: '22.12.0'
- repo: https://github.com/pre-commit/mirrors-prettier
rev: v2.7.1
hooks:
- id: black
- id: prettier
# For now we let this act on all files; if need be we can restrict it
# with `types_or` in the future.

- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: v0.0.243
- repo: https://github.com/psf/black
rev: "22.12.0"
hooks:
- id: ruff
- id: black
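
The comment on the `prettier` hook above mentions possibly restricting it with `types_or` later. A minimal sketch of what that might look like, assuming the goal were to limit formatting to YAML and Markdown files (the choice of file types is an illustrative assumption, not something this PR specifies):

```yaml
repos:
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v2.7.1
    hooks:
      - id: prettier
        # Hypothetical restriction (not part of this PR): only run prettier
        # on files that pre-commit tags as YAML or Markdown.
        types_or: [yaml, markdown]
```

With `types_or`, a file is passed to the hook if it carries any one of the listed type tags, so other file types would be left unformatted under this sketch.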
38 changes: 17 additions & 21 deletions README.md
@@ -1,17 +1,16 @@
# SOMA

SOMA – for “Stack Of Matrices, Annotated” – is a flexible, extensible, and open-source API enabling access to data in a variety of formats. SOMA is designed to be general-purpose for data that can be modeled as one or more sets of 2D annotated matrices with measurements of features across observations.
SOMA – for “Stack Of Matrices, Annotated” – is a flexible, extensible, and open-source API enabling access to data in a variety of formats.
SOMA is designed to be general-purpose for data that can be modeled as one or more sets of 2D annotated matrices with measurements of features across observations.
The driving use case of SOMA is for single-cell data in the form of annotated matrices where observations are frequently cells and features are genes, proteins, or genomic regions.



## Motivation

Datasets generated by profiling single cells are rapidly increasing in size and complexity. This has resulted in a need for scalable solutions to accommodate data sizes that no longer fit in memory and flexibility to accommodate the diversity of data being produced.
Datasets generated by profiling single cells are rapidly increasing in size and complexity.
This has resulted in a need for scalable solutions to accommodate data sizes that no longer fit in memory and flexibility to accommodate the diversity of data being produced.

To address these emerging needs in the single cell ecosystem, the Chan Zuckerberg Initiative in partnership with TileDB is:


1. Driving the development of SOMA.
2. Providing its first implementation, [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA) which utilizes the [TileDB Embedded](https://github.com/TileDB-Inc/TileDB) engine.
3. Adopting TileDB-SOMA at [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) to build the [Cell Census](https://github.com/chanzuckerberg/cell-census/) which provides efficient access and querying to a corpus containing nearly 50 million cells, compiled from 700+ datasets.
@@ -20,32 +19,29 @@ The `SOMA` specification and its `TileDB-SOMA` implementation provide the follow

1. An abstract specification with flexibility for data from multiple modalities (e.g. RNA, spatial, epigenomics)
1. A format to store and access datasets larger than memory, as compared to the current paradigm of `.h5ad`/`.mtx`/`.tgz`/`.RData`/`.h5Seurat`/ etc.
1. Eliminates in-memory limitations by providing query-ready data management for reading and writing at low latency and cloud scale.
1. Eliminates in-memory limitations by providing query-ready data management for reading and writing at low latency and cloud scale.
1. R and python APIs with the flexibility to expand to other languages.


## Developer information

* [SOMA abstract specification](https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md) — language-agnostic SOMA API specification.
* [Python SOMA specification](https://github.com/single-cell-data/SOMA/tree/main/python-spec) — persistence-layer–agnostic Python definition of SOMA core types.
* [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA) — Python and R implementation of SOMA specification using [TileDB Embedded](https://github.com/TileDB-Inc/TileDB). R coming soon.
- [SOMA abstract specification](https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md) — language-agnostic SOMA API specification.
- [Python SOMA specification](https://github.com/single-cell-data/SOMA/tree/main/python-spec) — persistence-layer–agnostic Python definition of SOMA core types.
- [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA) — Python and R implementation of SOMA specification using [TileDB Embedded](https://github.com/TileDB-Inc/TileDB). R coming soon.

## Coming soon!

* R SOMA specification and its implementation through TileDB-SOMA.
* End-user documentation for both Python and R TileDB-SOMA APIs, including a getting-started guide, notebooks, and API reference.
- R SOMA specification and its implementation through TileDB-SOMA.
- End-user documentation for both Python and R TileDB-SOMA APIs, including a getting-started guide, notebooks, and API reference.



## Issues and contacts

* We expect the TileDB-SOMA repository to be the front door for reporting and tracking implementation issues [https://github.com/single-cell-data/TileDB-SOMA/issues](https://github.com/single-cell-data/TileDB-SOMA/issues). In addition, for spec-related issues please submit an issue at [https://github.com/single-cell-data/SOMA/issues](https://github.com/single-cell-data/SOMA/issues).
* If you believe you have found a security issue, in lieu of filing an issue please responsibly disclose it by contacting [security@chanzuckerberg.com](mailto:security@chanzuckerberg.com).
* Feedback is appreciated, as this is a community-driven project. If you have well-scoped features/discussions please add them to [https://github.com/single-cell-data/SOMA/issues](https://github.com/single-cell-data/SOMA/issues). For any other inquiries please reach out to [soma@chanzuckerberg.com](mailto:soma@chanzuckerberg.com).
* If you would like to learn more about SOMA or would like to keep up to date with the latest developments, please join our mailing list [here](https://bit.ly/soma-signup).

- We expect the TileDB-SOMA repository to be the front door for reporting and tracking implementation issues [https://github.com/single-cell-data/TileDB-SOMA/issues](https://github.com/single-cell-data/TileDB-SOMA/issues). In addition, for spec-related issues please submit an issue at [https://github.com/single-cell-data/SOMA/issues](https://github.com/single-cell-data/SOMA/issues).
- If you believe you have found a security issue, in lieu of filing an issue please responsibly disclose it by contacting [security@chanzuckerberg.com](mailto:security@chanzuckerberg.com).
- Feedback is appreciated, as this is a community-driven project. If you have well-scoped features/discussions please add them to [https://github.com/single-cell-data/SOMA/issues](https://github.com/single-cell-data/SOMA/issues). For any other inquiries please reach out to [soma@chanzuckerberg.com](mailto:soma@chanzuckerberg.com).
- If you would like to learn more about SOMA or would like to keep up to date with the latest developments, please join our mailing list [here](https://bit.ly/soma-signup).

## Code of Conduct

This project adheres to CZI's Contributor Covenant [code of conduct](https://github.com/chanzuckerberg/.github/blob/master/CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to <opensource@chanzuckerberg.com>.

This project adheres to CZI's Contributor Covenant [code of conduct](https://github.com/chanzuckerberg/.github/blob/master/CODE_OF_CONDUCT.md).
By participating, you are expected to uphold this code.
Please report unacceptable behavior to <opensource@chanzuckerberg.com>.
12 changes: 4 additions & 8 deletions python-spec/README.md
@@ -1,11 +1,7 @@
# `somacore`: the Python version of the SOMA specification

`somacore` is a (currently in-development) Python interpretation of the
[abstract SOMA specification](https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md). Like the abstract
specification, it is still in flux and in progress, and we gladly accept
feedback. This will evolve in tandem with the abstract spec itself as well as
the [initial TileDB-based implementation](
https://github.com/single-cell-data/TileDB-SOMA).
`somacore` is a (currently in-development) Python interpretation of the [abstract SOMA specification](https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md).
Like the abstract specification, it is still in flux and in progress, and we gladly accept feedback.
This will evolve in tandem with the abstract spec itself as well as the [initial TileDB-based implementation](https://github.com/single-cell-data/TileDB-SOMA).

For more information about our development process see [this project
plan](https://docs.google.com/document/d/1e6L36SS-eazG6tHYwwnCfEfUcx_3dTFJUEE-gGxgFM4/edit).
For more information about our development process see [this project plan](https://docs.google.com/document/d/1e6L36SS-eazG6tHYwwnCfEfUcx_3dTFJUEE-gGxgFM4/edit).