2024 Physalia Adaptation Genomics Course

Welcome 👋

This GitHub page includes scripts, input data, and images associated with the practical sessions of the 2024 Physalia Course on Adaptation Genomics, given by Mafalda Ferreira and Angela Fuentes Pardo.

These materials correspond to modified versions of the original files developed (and generously shared) by Anna Tigano, Yann Dorant and Claire Mérot, which are available here.

All tutorials (except for day 1) can be completed using the files provided on this GitHub page. Each tutorial can therefore be run independently, so everyone can start fresh every day (even if they were unable to complete a previous practical session).

Before the course

Install required software

Some exercises will be run using the cloud compute service AWS ("on the server"), and others will be run on your local computer. Please make sure the software listed below is installed on your computer before the course begins:

R
RStudio
R packages listed here
FileZilla

For Windows users:

MobaXterm

(Optional) Refresher on Unix and R

A prerequisite of the course is that you are familiar with Unix and R. If you think you need a quick refresher on either of them, please take a look at the suggested readings available here.

During the course

Schedule

Below you can find the proposed schedule for the week. We will maintain some flexibility in the schedule to allow enough time for questions and discussions.

Log in to the AWS server from your computer

Please follow the instructions shared by Carlo.

Tutorials

Visual overview

Day 1: Handling NGS data, from raw reads to SNP matrix

Data: All exercises will be based on the dataset from Cayuela et al. (2020), Molecular Ecology.
Genome assembly: For this course, we generated a dummy assembly of about 90 MB (instead of about 500 MB) and 5 chromosomes (instead of 24) to expedite analysis running time.
Raw data: Data were generated using a reduced-representation approach (GBS/RADseq) and sequenced with IonTorrent.

Note: The analyses we will learn during the course are scalable to whole-genome resequencing data or other types of genomic data.

1-1: Getting familiar with Unix environment

1-2: From raw sequences to mapped reads

1-3: Calling variants with Stacks

Day 2: Population structure and confounding factors

2-1: F_ST statistics with vcftools (optional: with Stacks, optional: Pairwise-F_ST and Isolation-by-Distance)

2-2: Principal component analysis (PCA)

2-3: Population clustering with LEA

2-4: Discriminant Analysis of Principal Components (DAPC)
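
As a rough illustration of what the F_ST statistic in tutorial 2-1 measures, here is a minimal sketch for a single biallelic SNP and two populations. It uses the textbook heterozygosity-based definition, F_ST = (H_T - H_S) / H_T, with equal population weights. This is an assumption made for clarity: vcftools reports the Weir and Cockerham estimator, which also corrects for sample size, so values from the tutorial will differ. The function name `fst_two_pops` is hypothetical.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Heterozygosity-based F_ST for one biallelic SNP in two populations.

    p1 and p2 are the frequencies of the same allele in each population.
    Both populations are weighted equally (a simplifying assumption).
    """
    # Expected heterozygosity of the pooled (total) population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)

    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2

    # A site that is monomorphic overall carries no information.
    if h_t == 0:
        return 0.0
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst_two_pops(0.5, 0.5))  # identical frequencies
    print(fst_two_pops(0.2, 0.8))  # moderately differentiated
    print(fst_two_pops(0.0, 1.0))  # fixed for different alleles
```

Identical allele frequencies give 0 and alternative fixation gives 1, which is the intuition to keep in mind when reading genome-wide F_ST scans.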

Day 3: Outlier detection and Genome-by-Environment associations

Data: We focus on 12 populations from Canada for which there is almost no geographic structure but great environmental variability.

3-1: Genetic structure and LD-pruning

3-2: Outliers of differentiation with two methods (OutFLANK & BayPass)

3-3: Genotype-Environment Associations with two methods (BayPass & Redundancy Analysis)

Day 4: Accounting for Structural Variants

Data: We focus on 12 populations from Canada. We recommend that you pick one of the two tutorials (haploblocks by local PCA or CNVs from RAD-seq data).

4-1: Investigating haplotype blocks (~inversions?)

This tutorial includes local PCA, as well as the calculation of LD, F_ST and the observed fraction of heterozygotes, which may be useful in other contexts.
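
To make the "observed fraction of heterozygotes" concrete, here is a minimal sketch that computes it from VCF-style diploid genotype strings (for example `0/1`). The function name and the input layout are assumptions for illustration only. The tutorial itself may compute this quantity with other tools.

```python
def observed_heterozygosity(genotypes):
    """Fraction of called diploid genotypes that are heterozygous.

    genotypes: iterable of VCF-style GT strings such as "0/0", "0/1", "1|0".
    Missing calls (containing ".") are skipped.
    """
    called = 0
    het = 0
    for gt in genotypes:
        # Treat phased ("|") and unphased ("/") genotypes the same way.
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # skip missing or non-diploid calls
        called += 1
        if alleles[0] != alleles[1]:
            het += 1
    # With no called genotypes the fraction is undefined.
    return het / called if called else float("nan")


if __name__ == "__main__":
    sample_gts = ["0/0", "0/1", "1/1", "./.", "1|0"]
    print(observed_heterozygosity(sample_gts))
```

Within an inversion, individuals carrying one copy of each arrangement are expected to show elevated heterozygosity across the block, which is why this statistic is useful alongside local PCA.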

4-2: SV calling

Day 5: Functional approaches

5-1: SnpEff annotation of SNPs for coding and regulatory regions

5-2: Intersection between SNPs and genes with bedtools

5-3: Gene ontology enrichment

5-4: (Optional) Intersection between CNVs and repeats/TE
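
The idea behind intersecting SNPs with genes (tutorial 5-2) can be sketched in a few lines. This toy version assumes BED-style coordinates (0-based start, end-exclusive) and uses a simple nested loop. It is only an illustration of the overlap test, not a substitute for bedtools, which handles large files efficiently. All names and records below are hypothetical.

```python
def intersect_snps_with_genes(snps, genes):
    """Return (snp_id, gene_id) pairs for SNPs that fall inside genes.

    snps and genes are lists of (chrom, start, end, name) tuples in
    BED-style coordinates: 0-based start, end-exclusive.
    """
    hits = []
    for chrom, start, end, snp_id in snps:
        for g_chrom, g_start, g_end, gene_id in genes:
            # Two half-open intervals overlap when each one starts
            # before the other one ends.
            if chrom == g_chrom and start < g_end and g_start < end:
                hits.append((snp_id, gene_id))
    return hits


if __name__ == "__main__":
    # Made-up example records.
    snps = [
        ("chr1", 99, 100, "snp1"),
        ("chr1", 200, 201, "snp2"),
        ("chr2", 50, 51, "snp3"),
    ]
    genes = [
        ("chr1", 50, 150, "geneA"),
        ("chr1", 150, 200, "geneB"),
        ("chr2", 0, 100, "geneC"),
    ]
    for snp_id, gene_id in intersect_snps_with_genes(snps, genes):
        print(snp_id, gene_id)
```

Note that `snp2` is not reported: with end-exclusive coordinates, a gene ending at 200 does not contain a SNP whose interval starts at 200.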

Additional resources

Cheat sheet of basic Unix commands.

Cheat sheet of basic R commands.
