DBNascent_build

This repository is intended for building, updating, and querying DBNascent, a MySQL database cataloguing all nascent sequencing experiments in the SRA through 2020. The database has been built and maintained by the DnA Lab at the University of Colorado Boulder.

The database draws on manually curated metadata tables, quality control data, and bidirectional call data from samples. All of the data resides on the Fiji cluster at CU Boulder.

Version 1.2

Version notes (12/20/2023):

The database has been somewhat restructured.
All table names have changed, but each table describes the same fields as before. The old and new table names correspond as follows (`linkIDs` and `searchEquiv` are unchanged):

Old table	New table
`sampleAccum`	`samples`
`exptMetadata`	`papers`
`sampleID`	`sampleEquiv`
`geneticInfo`	`genetics`
`organismInfo`	`organisms`
`tissueDetails`	`tissues`
`bidirSummary`	`bidirs`
`conditionInfo`	`conditions`
`sampleCondition`	`conditionLink`
`nascentflowMetadata`	`nascentflowRuns`
`sampleNascentflow`	`nascentflowLink`
`bidirflowMetadata`	`bidirflowRuns`
`sampleBidirflow`	`bidirflowLink`
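
Queries written against the old schema need their table names updated. The sketch below is one possible way to do that mechanically, using the mapping from the table above. The helper name `rename_tables` is hypothetical and not part of this repository, and the sketch only handles table names: renamed fields and the new `id` primary keys still need to be updated by hand.

```python
import re

# Old -> new table names, copied from the table above.
TABLE_RENAMES = {
    "sampleAccum": "samples",
    "exptMetadata": "papers",
    "sampleID": "sampleEquiv",
    "geneticInfo": "genetics",
    "organismInfo": "organisms",
    "tissueDetails": "tissues",
    "bidirSummary": "bidirs",
    "conditionInfo": "conditions",
    "sampleCondition": "conditionLink",
    "nascentflowMetadata": "nascentflowRuns",
    "sampleNascentflow": "nascentflowLink",
    "bidirflowMetadata": "bidirflowRuns",
    "sampleBidirflow": "bidirflowLink",
}

# Match old names only as whole words so that, for example,
# "sampleID" is not rewritten inside a longer identifier.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, TABLE_RENAMES)) + r")\b"
)


def rename_tables(query: str) -> str:
    """Replace old table names in a query string with the new ones."""
    return _PATTERN.sub(lambda match: TABLE_RENAMES[match.group(1)], query)


if __name__ == "__main__":
    print(rename_tables("SELECT * FROM sampleAccum JOIN exptMetadata"))
```

Because the match is purely textual, the result should be reviewed before it is run against the database.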

A few fields have been renamed. The primary key of every table is now simply `id` rather than a name specific to the table, while tables that link to that key hold it in a field named `<linkedTable>_id` (see the fields and linkages in the schema). This helps Django navigate the database. The other renamed fields are as follows:

Old field	New field
`paper_id`	`paper_name`
`samp_qc_score`	`sample_qc_score`
`samp_data_score`	`sample_nro_score`
`paper_data_score`	`paper_nro_score`

All table linkages on non-integer identifiers have been removed: `paper_name` and `sample_name` are no longer in `linkIDs`, and `organisms` is linked to the `papers` and `genetics` tables by a numeric id instead of the organism name. The same applies to the linkage between `sampleEquiv` and `samples`.
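
To illustrate what an integer-id linkage looks like in practice, here is a minimal sketch that uses an in-memory SQLite database as a stand-in for MySQL. Only the table names, `id`, and `paper_name` come from this README. The `organism_id` and `organism` column names and the inserted values are assumptions made for illustration, so check the schema for the real field names.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL server in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organisms (
        id INTEGER PRIMARY KEY,
        organism TEXT                -- hypothetical column name
    );
    CREATE TABLE papers (
        id INTEGER PRIMARY KEY,
        paper_name TEXT,
        organism_id INTEGER REFERENCES organisms(id)  -- assumed link field
    );
    INSERT INTO organisms (id, organism) VALUES (1, 'example organism');
    INSERT INTO papers (id, paper_name, organism_id)
        VALUES (1, 'Example2020paper', 1);
    """
)

# Join on the numeric id rather than on the organism name.
rows = conn.execute(
    "SELECT papers.paper_name, organisms.organism "
    "FROM papers JOIN organisms ON papers.organism_id = organisms.id"
).fetchall()
print(rows)
```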

Dependencies

The database was built with Python 3.6.3. The following packages are required whether you are building or querying the database:

configparser v5.2.0 or higher
numpy v1.19.2 or higher
yaml v5.4.1 or higher
pymysql v1.0.2 or higher (a different MySQL driver may be substituted)
sqlalchemy v1.4.31 or higher
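
As a convenience, the minimum versions listed above can be checked from Python. The following is only a sketch under several assumptions: it uses `importlib.metadata`, which requires Python 3.8 or later (newer than the 3.6.3 the database was built with); the PyPI distribution names are guesses, with "yaml" taken to mean PyYAML; and `check_dependencies` is a hypothetical helper, not part of this repository. Note also that `configparser` ships with the Python standard library, so it may be reported as missing even though `import configparser` works.

```python
import re
from importlib import metadata  # requires Python 3.8+ (assumption)

# Minimum versions from the list above, keyed by assumed PyPI names.
MINIMUMS = {
    "configparser": "5.2.0",
    "numpy": "1.19.2",
    "PyYAML": "5.4.1",       # assumed to be what "yaml" refers to
    "PyMySQL": "1.0.2",
    "SQLAlchemy": "1.4.31",
}


def version_tuple(version: str) -> tuple:
    """Turn '1.4.31' (or '2.0.0rc1') into a comparable tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def check_dependencies(minimums=MINIMUMS) -> dict:
    """Return {name: status}; status is 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        is_ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = "ok" if is_ok else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(f"{name}: {status}")
```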

Database schema

(Generated with https://github.com/sqlalchemy/sqlalchemy/wiki/SchemaDisplay)

Usage

All database objects and functions are defined in dborm.py and dbutils.py.

Building and maintaining DBNascent:

To integrate seamlessly with the Django website that queries this database, the tables should initially be created through a Django migration within the website repository on GitLab. The schemas specified for Django are the same as those specified here, apart from a few additional tables generated by Django, so the database can be created with this repository alone if necessary.

config_build.py defines file paths and fields, both outside of and within the database. Adding a field to a metadata table requires adding it to config_build.py as well.

organisms.txt, sample_cell_types.txt, and searcheq.txt are manually curated tables defining organisms, tissues, and unique values within the database. Adding data may require adding lines to these files.

The main scripts for building the database are db_global_add_update.py and db_paper_add_update.py, combined in the db_build_full.sbatch script.

Querying DBNascent:

query_printout.py queries the database with defined fields and filtering specifications and prints the results for input into DESeq2 or other applications. The script relies on the config_query.txt config file, as well as dborm.py and dbutils.py. A query too complex for this approach may require a manual MySQL query, which can be passed to the database and printed out with the manual_query_printout.py script.
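
For orientation, here is a rough sketch of what running a manual query and printing the result as tab-delimited text can look like. It is not the code in manual_query_printout.py. The function `print_query` is hypothetical, and it accepts any Python DB-API connection, so the demo uses an in-memory SQLite database in place of a pymysql connection. Placing `sample_qc_score` in the `samples` table, and the score values themselves, are assumptions made for the demo.

```python
import csv
import sqlite3
import sys


def print_query(connection, query, out=sys.stdout):
    """Run `query` and write a header row plus results as tab-delimited text."""
    cursor = connection.cursor()
    cursor.execute(query)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # cursor.description holds one entry per result column; item 0 is its name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor.fetchall())
    cursor.close()


if __name__ == "__main__":
    # Stand-in database; with MySQL this would be a pymysql connection.
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE samples (id INTEGER PRIMARY KEY, sample_qc_score INTEGER)"
    )
    demo.execute("INSERT INTO samples VALUES (1, 2), (2, 4)")
    print_query(
        demo,
        "SELECT id, sample_qc_score FROM samples WHERE sample_qc_score <= 2",
    )
```

Writing to a file instead of the terminal only requires passing an open file handle as `out`.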

Both config files refer to a credentials file that contains your username and password for accessing the database. This should be a one-line, two-column, tab-delimited file: <username><tab><password>
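
The sketch below shows one way such a credentials file could be read and turned into a connection URL for SQLAlchemy with the pymysql driver. The helper names `read_credentials` and `build_url` are hypothetical, and the host and database name must be supplied by the caller since this README does not specify them. The scripts in this repository may handle credentials differently.

```python
from pathlib import Path
from urllib.parse import quote_plus


def read_credentials(path):
    """Return (username, password) from a one-line, tab-delimited file."""
    first_line = Path(path).read_text().splitlines()[0]
    username, password = first_line.split("\t", 1)
    return username, password


def build_url(username, password, host, database):
    """Build a SQLAlchemy-style MySQL URL that uses the pymysql driver."""
    # quote_plus escapes characters such as '@' or ':' that would
    # otherwise break the URL.
    return (
        f"mysql+pymysql://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}/{database}"
    )
```

The resulting string can be passed to `sqlalchemy.create_engine`. Keep the credentials file out of version control and readable only by you.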
