Nickname(s) | Full citation and link | Main Takeaways/Comments | Keywords |
---|---|---|---|
The Proteomics Paper | Gillis J, Ballouz S, Pavlidis P. Bias tradeoffs in the creation and analysis of protein-protein interaction networks. Journal of proteomics. 2014;100:44-54. Epub 2014/02/01. doi: 10.1016/j.jprot.2014.01.020. PubMed PMID: 24480284; PubMed Central PMCID: PMC3972268. https://www.sciencedirect.com/science/article/pii/S1874391914000384?via%3Dihub | - Biases in PPI data due to prey/bait selection | Protein–protein interaction, Co-expression, Bias, Gene Ontology, Networks, Multifunctionality |
Wim’s First Paper | Verleyen W, Ballouz S, Gillis J. Measuring the wisdom of the crowds in network-based gene function inference. Bioinformatics. 2015;31(5):745-52. doi: 10.1093/bioinformatics/btu715. PubMed PMID: 25359890. https://academic.oup.com/bioinformatics/article/31/5/745/317877 | - Data is more important than methods | Machine learning |
The Guidance Paper (Sara’s First Paper, RNA-seq Co-expression Paper) | Ballouz S, Verleyen W, Gillis J. Guidance for RNA-seq co-expression network construction and analysis: safety in numbers. Bioinformatics. 2015. doi: 10.1093/bioinformatics/btv118. PubMed PMID: 25717192. https://academic.oup.com/bioinformatics/article/31/13/2123/196230 | - It’s important to have lots of data - Microarray coexpression and RNA-seq coexpression are similar, except that low-expressing genes form strong modules in microarray networks but not in RNA-seq networks | RNA-seq, microarray, coexpression, human, replicability, network analysis |
The Goodhart Paper | Verleyen W, Ballouz S, Gillis J. Positive and negative forms of replicability in gene network analysis. Bioinformatics. 2015. doi: 10.1093/bioinformatics/btv734. PubMed PMID: 26668004. PMC Journal - In Process. https://academic.oup.com/bioinformatics/article/32/7/1065/1744280 | - Replicability can occur for uninteresting reasons (e.g. data re-use) | Machine learning, replicability, network analysis, generalization |
AuPairWise | Ballouz S, Gillis J. AuPairWise: A Method to Estimate RNA-Seq Replicability through Co-expression. PLoS computational biology. 2016;12(4):e1004868. doi: 10.1371/journal.pcbi.1004868. PubMed PMID: 27082953; PubMed Central PMCID: PMC4833304. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004868 | - Higher coexpression of selected gene-pairs over random gene-pairs can be used for RNA-seq quality control | Software, coexpression |
EGAD | Ballouz S, Weber M, Pavlidis P, Gillis J. EGAD: ultra-fast functional analysis of gene networks. Bioinformatics. 2017 Feb 15; 33(4):612-614. PubMed PMID: 27993773. https://academic.oup.com/bioinformatics/article/33/4/612/2664343 | - Bioconductor package for neighbor voting and other assorted functions | Software, network analysis |
ErmineJ | Ballouz S, Pavlidis P, Gillis J. Using predictive specificity to determine when gene set analysis is biologically meaningful. Nucleic Acids Research. 2016. doi: 10.1093/nar/gkw957 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5389513/ | - Specificity and robustness are useful heuristics to identify reliable enrichment results. - We can use multifunctionality as a way of targeting specificity and robustness. | Enrichment analysis, GO |
The Shoichet Paper (The Ligand Paper) | O'Meara MJ, Ballouz S, Shoichet BK, Gillis J. Ligand Similarity Complements Sequence, Physical Interaction, and Co-Expression for Gene Function Prediction. PLoS One. 2016; 11(7):e0160098. PMID: 27467773. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0160098 | - Ligand similarity contains different information than other networks. | Collaboration, coexpression, gene function |
The Single Cell Coexpression Paper (The Genome Biology Paper) | Crow M, Paul A, Ballouz S, Huang ZJ, Gillis J (2016) Exploiting single-cell expression to characterize co-expression replicability. Genome Biology 17, 101. PubMed PMID: 27165153; PubMed Central PMCID: PMC4862082. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0964-6 | - Single cell RNA-seq coexpression aggregation ~ bulk - Coexpression within cell types ~ across cell types - Expression level can predict coexpression, so analyses should test for this effect | Single cell, meta-analysis, coexpression, Brainspan, control experiments, novel data |
The Effect Size Paper (The Genome Medicine Paper) | Ballouz S, Gillis J. Strength of functional signature correlates with effect size in autism. Genome Med. 2017 Jul 7; 9(1):64. PubMed PMID: 28687074; PubMed Central PMCID: PMC5501949. https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-017-0455-8 | - The more strongly a gene is associated with a disease, the more likely it is to show functional convergence. | Expression, functional enrichment, disease, genetics, autism, Brainspan |
Anirban’s Paper | Paul A, Crow M, Raudales R, He M, Gillis J, Huang ZJ. Transcriptional Architecture of Synaptic Communication Delineates GABAergic Neuron Identity. Cell. 2017; 171(3):522-539.e20. NIHMSID: NIHMS927502, PMID: 28942923, PMCID: PMC5772785 http://www.cell.com/cell/abstract/S0092-8674(17)30990-X | - Gene sets related to synaptic function show characteristic expression patterns within interneuron subtypes | Single cell, collaboration, brain, novel data |
MetaNeighbor | Crow M, Paul A, Ballouz S, Huang ZJ, Gillis J. Characterizing the replicability of cell types defined by single cell RNA-sequencing data using MetaNeighbor. Nature communications. 2018; 9(1):884. PMID: 29491377, PMCID: PMC5830442 https://www.nature.com/articles/s41467-018-03282-0 | - Cell type transcriptional profiles are replicable across studies - When predicting cell identity, almost any set of genes can be used to improve performance above chance - Highly variable genes are generally useful, even when cell types are rare or only subtly different from the outgroup | Single cell, meta-analysis, brain, software |
Aligner | Ballouz S, Dobin A, Gingeras TR, Gillis J. The fractured landscape of RNA-seq alignment: The default in our STARs. Nucleic Acids Research. https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gky325/4990636 | - Exact expression is hard to get right; statistical differences are easy - Most parameter choices are fine, but our ways of telling what is fine are overly technical. | RNA-seq, STAR, software, meta-analysis, collaboration |
Maggie's Single-cell Coexpression Opinion | Crow M, Gillis J. Co-expression in single cell analysis: Saving grace or original sin? Trends in Genetics. 2018; 34(11):823-831. PMID: 30146183, PMCID: PMC6195469 https://doi.org/10.1016/j.tig.2018.07.007 | - Single-cell RNA-seq only works because of coexpression. - At some point this will fail. | Single cell, coexpression, marker genes, causality, opinion |
The Current Opinion Piece | Crow M, Gillis J. Single cell RNA-sequencing: Replicability of cell types. Current Opinion in Neurobiology. 2019; 56, 69-77. https://doi.org/10.1016/j.conb.2018.12.002 | - What is a cell type? Transcription alone is not sufficient to establish whether a cluster has a unique function, but replicability of profiles is a good first step. | Single cell, replicability, causality |
The DE Prior Paper (The PNAS Paper) | Crow M, Lim N, Ballouz S, Pavlidis P, Gillis J. Predictability of human differential gene expression. PNAS. 2019. https://doi.org/10.1073/pnas.1802973116 | - Some genes are more likely to be DE than others. - Knowing this can help you interpret the plausibility and specificity of your DE hit list. | Expression, meta-analysis, collaboration, Gemma, functional enrichment |
Consensus (null) opinion | Ballouz S, Dobin A, Gillis J. (2019) Is it time to change the reference genome? Genome Biology. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1774-4 | - The reference genome is idiosyncratic and shouldn't be used as a baseline. - Incorporating the most frequent/common allele into the reference (i.e., converting it into a 'consensus' genome) is a good-enough fix | Consensus genome, Reference genome, mapping, variant-calling, collaboration |
The Multifunctionality Paper | Gillis J, Pavlidis P. The impact of multifunctional genes on "guilt by association" analysis. PLoS One. 2011;6(2):e17258. doi: 10.1371/journal.pone.0017258. PubMed PMID: 21364756; PubMed Central PMCID: PMC3041792. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0017258 | - A single ranked list of genes is a good predictor for lots of gene functions (defined as sets) - This ranked list is embedded in networks via node degree - About half of the performance of sophisticated algorithms can be described as reconstructing this list (focusing on PPI data) | Bias, gene function, machine learning |
The Indirect Associations Paper | Gillis J, Pavlidis P. The role of indirect connections in gene networks in predicting function. Bioinformatics. 2011;27(13):1860-6. doi: 10.1093/bioinformatics/btr288. PubMed PMID: 21551147; PubMed Central PMCID: PMC3117376. https://academic.oup.com/bioinformatics/article/27/13/1860/185863 | - Algorithms look exactly like neighbor-voting if indirect connections are given some fractional value - This means very fast machine learning can be done by pre-propagating the network if it is sparse - Co-expression networks can be aggregated to give a high-performing dense network (no need to make it sparse) | Machine learning, coexpression, network analysis |
The Critical Connections Paper (The Exception Paper) | Gillis J, Pavlidis P. "Guilt by association" is the exception rather than the rule in gene networks. PLoS computational biology. 2012;8(3):e1002444. doi: 10.1371/journal.pcbi.1002444. PubMed PMID: 22479173; PubMed Central PMCID: PMC3315453. http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002444 | - Single “one-off” connections in PPI networks account for a lot of the performance missed by multifunctionality. These connections aren’t “learnable” in any conventional sense | Generalization, protein-protein interaction |
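
The neighbor-voting and node-degree takeaways listed above can be illustrated with a toy example. The following is a minimal sketch in plain Python, not the EGAD API (EGAD is an R/Bioconductor package); the function names, the six-gene network, and the single held-out split are all assumptions made for illustration. It scores unlabeled genes by the fraction of their connection weight going to genes known to have a function, and compares that against ranking by node degree alone.

```python
from typing import List, Sequence


def node_degree(adj: Sequence[Sequence[float]]) -> List[float]:
    """Sum of connection weights for each gene (row sums of the adjacency matrix)."""
    return [sum(row) for row in adj]


def neighbor_voting(adj: Sequence[Sequence[float]], labels: Sequence[int]) -> List[float]:
    """Score each gene by the fraction of its connection weight going to labeled genes."""
    scores = []
    for row, deg in zip(adj, node_degree(adj)):
        votes = sum(w for w, lab in zip(row, labels) if lab)
        scores.append(votes / deg if deg > 0 else 0.0)
    return scores


def auroc(scores: Sequence[float], truth: Sequence[int]) -> float:
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy network: genes 0-2 share a function, genes 3-5 do not.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
train_labels = [1, 1, 0, 0, 0, 0]  # gene 2's annotation is hidden
held_out = [2, 3, 4, 5]            # genes whose labels are not used for voting
truth_eval = [1, 0, 0, 0]          # true labels of the held-out genes

nv_scores = neighbor_voting(adj, train_labels)
nv_eval = [nv_scores[i] for i in held_out]
deg_eval = [node_degree(adj)[i] for i in held_out]

nv_auroc = auroc(nv_eval, truth_eval)
deg_auroc = auroc(deg_eval, truth_eval)
print(f"neighbor voting AUROC: {nv_auroc:.3f}")
print(f"node degree AUROC:     {deg_auroc:.3f}")
```

In this toy network, node degree ranks the hidden gene above chance without using any label information, which is the kind of baseline the multifunctionality takeaway suggests checking before crediting a network algorithm. A real analysis would use cross-validation over many gene sets rather than a single held-out gene.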