Bgee is a database that integrates gene expression data from both microarray and RNA-seq experiments. The Bgee curators annotate samples by their species, anatomical structure, and developmental stage. Bgee leverages anatomical and developmental ontologies to call whether a gene is present or absent, and whether it is under- or over-expressed, in a given condition.
Here, we process Bgee to generate gene expression profiles for human anatomical structures (tissues). We extract two gene expression measures:
- gene presence: whether a gene is expressed or not in a given anatomy for adult humans. See the results at data/present-in-adult.tsv.gz.
- differential expression: whether a gene is under- or over-expressed in a given anatomy for post-juvenile adults. See the results at data/diffex.tsv.gz.
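
Both result files are gzip-compressed, tab-separated tables. As a minimal sketch of loading one with only the Python standard library: the helper name `read_tsv_gz` is hypothetical, and because this README does not describe the column layout, the example only reports the row count and the header it finds.

```python
import csv
import gzip


def read_tsv_gz(path):
    """Read a gzip-compressed TSV into a list of dicts keyed by the header row."""
    # Text mode ("rt") decompresses and decodes in one step.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


if __name__ == "__main__":
    # Path taken from the README; the columns are whatever the file provides.
    rows = read_tsv_gz("data/present-in-adult.tsv.gz")
    print(f"{len(rows)} rows")
    if rows:
        print("columns:", list(rows[0].keys()))
```

All values come back as strings, so any numeric or boolean interpretation is left to the caller. For large tables a dataframe library may be more convenient.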
See our Thinklab project and discussion on processing Bgee for additional context.
Execute download.sh from the download directory to retrieve the raw Bgee downloads.
Then the notebooks are executed in the following order:
- developmental-stages.ipynb: extract a table of developmental stages (data/stages.tsv). This table is used to filter for adult stages.
- bgee.ipynb: process raw Bgee data and extract gene presence and differential expression datasets.
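
The download step and the notebook order can be scripted. The following is a rough sketch rather than part of this repository. It assumes that `bash` and `jupyter nbconvert` are installed, that the script is run from the repository root, and that both notebooks live there. The function names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess

# Notebook order as stated in this README.
NOTEBOOKS = ["developmental-stages.ipynb", "bgee.ipynb"]


def build_commands(notebooks=NOTEBOOKS):
    """Return (argv, working_directory) pairs in the order they should run."""
    # Step 1: fetch the raw Bgee files from inside the download directory.
    commands = [(["bash", "download.sh"], "download")]
    # Step 2: execute each notebook in place (assumed to sit at the repository root).
    for notebook in notebooks:
        argv = ["jupyter", "nbconvert", "--to", "notebook",
                "--execute", "--inplace", notebook]
        commands.append((argv, "."))
    return commands


def run_pipeline():
    """Run every step, stopping at the first failure."""
    for argv, cwd in build_commands():
        subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    run_pipeline()
```

Because `check=True` raises on a non-zero exit status, bgee.ipynb is not run if the download or developmental-stages.ipynb fails. Long-running notebooks may need a larger nbconvert execution timeout than the default.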
All original content in this repository is released under CC0 1.0. Please refer to Bgee for the licensing and reuse of their data.