Conditions for conservation analysis of syntenic blocks #45

Open
luciaalvarez95 opened this issue May 18, 2021 · 5 comments
Comments

@luciaalvarez95

Hi!

I am analyzing the level of chromatin conformation conservation between different species, and chess looks like the perfect tool for me! I already have the synteny blocks between the species I am working with; however, I am not sure what window size or step I should use. First, I would like to replicate your paper's analysis, but I don't know which conditions were used. Could you help me with this? Do you have any suggestions on which parameters I should take into account when running this type of analysis? Is chess sensitive enough to address TAD variation in interspecies analysis?

Many thanks

Lucía

@liz-is
Collaborator

liz-is commented May 19, 2021

Hi Lucía,

In the paper, the comparison across species was carried out by using the syntenic regions in bedpe format as the input pairs file for chess sim. If you have syntenic regions, you don't need to choose a window size or step to generate a pairs file; those parameters are primarily used when comparing different conditions within the same genome.

Using the syntenic regions as the region pairs will compare the whole syntenic region in one species to the other species, so if your syntenic regions are very large I suppose you might want to create smaller regions that tile across them to perform higher-resolution analysis. I haven't used CHESS for cross-species comparison myself, but I believe @nickmachnik did that part of the analysis for the paper, so he may be able to help if you need more input.
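
A minimal sketch of turning a list of syntenic blocks into such a pairs file. The block coordinates are made up, and the 10-column layout (two regions, then ID, score and two strand columns) follows the generic BEDPE convention; it is an assumption that CHESS expects exactly these trailing columns, so check against a file produced by CHESS itself.

```python
# Hypothetical syntenic blocks:
# (chrom in species A, start, end, chrom in species B, start, end)
blocks = [
    ("chr1", 0, 1000000, "chr2", 500000, 1700000),
    ("chr3", 2000000, 2800000, "chr5", 100000, 950000),
]


def bedpe_lines(blocks):
    """Format each block pair as one tab-separated BEDPE line."""
    lines = []
    for pair_id, (ca, sa, ea, cb, sb, eb) in enumerate(blocks):
        # Score "." and strands "+" are placeholders (assumption).
        fields = [ca, sa, ea, cb, sb, eb, pair_id, ".", "+", "+"]
        lines.append("\t".join(str(f) for f in fields))
    return lines


def write_bedpe(blocks, path):
    """Write the BEDPE lines to a file, one pair per line."""
    with open(path, "w") as out:
        for line in bedpe_lines(blocks):
            out.write(line + "\n")
```

The written file would then be supplied as the pairs file to chess sim, with one Hi-C matrix per species.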

@luciaalvarez95
Author

Hi,

Many thanks for your helpful reply. I wasn't aware that I could use the syntenic regions as a pairs file; I will definitely try that. However, as you have pointed out, the syntenic regions are quite large and I would like to perform a higher-resolution analysis, so any more help would be welcome.

Thanks again,

Lucía

@liz-is
Collaborator

liz-is commented May 25, 2021

Since the syntenic regions likely have different sizes in the different species and the change in size may not be even across the region, I would probably start by taking the syntenic regions in one species as a reference, splitting these into sub-regions (by tiling across them), and then lifting over these sub-regions to the other genome to find the matching syntenic sub-region to use for constructing the pairs file. You would need to write custom code to create these sub-regions, as there isn't anything built into CHESS for this.
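
A minimal sketch of the tiling step only, assuming fixed-size windows and half-open coordinates. The function name, window size and step are hypothetical, and the liftover of each tile to the other genome is not shown; that would need an external tool such as UCSC liftOver.

```python
def tile_region(chrom, start, end, window, step):
    """Tile [start, end) with fixed-size windows, moving by `step`.

    Windows that would run past `end` are dropped, so a region
    shorter than `window` yields no tiles.
    """
    tiles = []
    pos = start
    while pos + window <= end:
        tiles.append((chrom, pos, pos + window))
        pos += step
    return tiles


# Hypothetical 1 Mb syntenic region in the reference species,
# tiled with 500 kb windows and a 250 kb step.
sub_regions = tile_region("chr1", 0, 1000000, 500000, 250000)
```

Each tile would then be lifted over to the second genome, and the reference tile plus its lifted-over counterpart would form one row of the pairs file.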

Note that the size of the regions you use for CHESS analysis needs to be at least 20x the resolution of your Hi-C matrix, and personally I usually have better results using 100x the resolution. That is, for a 5 kb resolution Hi-C matrix, 20x would be 100 kb regions and 100x would be 500 kb regions. So bear this in mind when deciding what size to use for your sub-regions.
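
The arithmetic above can be sketched as a small helper; the function name is hypothetical, and the 20x and 100x factors are the ones quoted in this comment.

```python
def region_size_bounds(resolution_bp, minimum_factor=20, preferred_factor=100):
    """Return (minimum, preferred) region sizes in bp for a Hi-C resolution."""
    return resolution_bp * minimum_factor, resolution_bp * preferred_factor


# 5 kb resolution: 100 kb minimum, 500 kb preferred.
minimum_bp, preferred_bp = region_size_bounds(5000)
```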

@zy041225

Hi both,

I'm also interested in performing a cross-species comparison with chess. I wonder whether whole syntenic blocks can be used in chess, or whether I should split each block into sliding windows similar to those produced by chess pairs with a specific window size and step size. Also, if the sizes of the syntenic blocks differ quite a lot (e.g. 5- to 10-fold), would that affect the chess result?

Many thanks

Yang

@nickmachnik
Collaborator

Hi Yang,

Generally, you should be able to use whole syntenic blocks without splitting.
We recommend matrix sizes of at least 100x100 pixels for meaningful comparisons, so if some of your smallest syntenic regions are below that size, you might want to discard them or try a smaller bin size if your sequencing depth allows.
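
A minimal sketch of discarding region pairs that fall below that size, assuming the check is applied to both species and that one pixel corresponds to one bin at the matrix resolution. The function names and example coordinates are hypothetical.

```python
def n_bins(start, end, resolution_bp):
    """Number of whole bins covered by [start, end) at a given resolution."""
    return (end - start) // resolution_bp


def filter_pairs(pairs, resolution_bp, min_bins=100):
    """Keep pairs whose regions span at least `min_bins` bins in both species."""
    kept = []
    for pair in pairs:
        _, sa, ea, _, sb, eb = pair
        if (n_bins(sa, ea, resolution_bp) >= min_bins
                and n_bins(sb, eb, resolution_bp) >= min_bins):
            kept.append(pair)
    return kept


# Hypothetical pairs at 5 kb resolution: the second pair has a
# 300 kb region (60 bins) in species A and is dropped.
pairs = [
    ("chr1", 0, 1000000, "chr2", 0, 600000),
    ("chr1", 2000000, 2300000, "chr2", 0, 900000),
]
large_enough = filter_pairs(pairs, 5000)
```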
