Flip gff3 annotations to match flipped sequences #248

Closed
oushujun opened this issue May 3, 2022 · 8 comments · Fixed by #257

Comments

@oushujun

oushujun commented May 3, 2022

When aligning one genome to another, we found that some scaffolds were in the reverse orientation. We can easily reverse-complement the sequences of these scaffolds, but "flipping" the annotations turned out to be quite challenging (e.g., off-by-one errors in the coordinates). I searched genometools and AGAT, and neither has a tool for this purpose.

I would like to reverse the start and end coordinates of the annotations on a given set of sequences. For example, if the gff3 file has chr1, chr2, ..., chrN, I might want to reverse the annotations on chr1 and chr2 only. The script would take a file (or a comma-separated string) listing the sequences to reverse, then reverse the annotations within those sequences.

Liftoff tools may do the job, but they make the task too complicated. If the chain file is not created correctly, the remapped annotations will contain unnecessary noise.

Thank you!

Shujun

@Juke34
Collaborator

Juke34 commented May 9, 2022

Sounds like a good task for AGAT. I would keep the selection of sequences to flip outside AGAT: you provide a fasta file and a gff file, nothing else. All annotations on the sequences present in the fasta file will be flipped. (The fasta file is needed to know the length of each sequence.)
I will see what I can do.
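
For readers following along, here is a minimal sketch of why the sequence length is needed and where the off-by-one error usually creeps in. It is written in Python for illustration only and is not the AGAT implementation (AGAT is written in Perl); the function name `flip_interval` is hypothetical. It assumes GFF3's 1-based, inclusive coordinates.

```python
from typing import Tuple


def flip_interval(start: int, end: int, seq_length: int) -> Tuple[int, int]:
    """Map a 1-based, inclusive interval onto the reverse-complemented sequence.

    Hypothetical helper for illustration; not part of AGAT.
    """
    if not 1 <= start <= end <= seq_length:
        raise ValueError("interval must satisfy 1 <= start <= end <= seq_length")
    # The old end becomes the new start and vice versa. The "+ 1" keeps the
    # coordinates 1-based; leaving it out is the classic off-by-one error.
    return seq_length - end + 1, seq_length - start + 1


print(flip_interval(1, 10, 100))    # (91, 100)
print(flip_interval(91, 100, 100))  # (1, 10)
```

Flipping twice returns the original interval, which is a handy sanity check for any implementation of this mapping.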

Juke34 pushed a commit that referenced this issue May 9, 2022
@Juke34
Collaborator

Juke34 commented May 9, 2022

Could you try bin/agat_sq_reverse_complement.pl in branch 248?

@MatthewRobbins-USDA

@Juke34 Thank you so much for developing this tool. I have been looking for a tool that can do this, and I really appreciate that it can reverse some of the annotations and not all. I wonder, though, if this tool is just reversing the annotations or reverse complementing the annotations. As I tried it on my .gff file, I see the start and end positions are reversed, but if the sequence in the fasta file was reverse complemented, wouldn't the strand of the annotation change from + to - or from - to + (or maybe I haven't thought it through properly)? The strand remained the same in my output gff. Maybe there needs to be an option to just reverse the annotation (like it seems it is doing now) or reverse complement the annotation and change the strand?

@Juke34
Collaborator

Juke34 commented May 18, 2022

Hi, you are right, the strand of the annotation should change. I probably forgot that.
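
To make the expected behaviour concrete, here is a hedged Python sketch of reverse-complementing a single GFF3 feature line: start and end are recomputed from the sequence length, and the strand is inverted in its own column. This is an illustration, not the code in agat_sq_reverse_complement.pl; the names `STRAND_FLIP` and `flip_gff3_line` are hypothetical. It assumes the standard nine tab-separated GFF3 columns, with start, end, and strand in columns 4, 5, and 7.

```python
# "+" and "-" swap; "." (unstranded) and "?" (unknown) are left unchanged.
STRAND_FLIP = {"+": "-", "-": "+"}


def flip_gff3_line(line: str, seq_length: int) -> str:
    """Reverse-complement one GFF3 feature line (hypothetical helper)."""
    stripped = line.rstrip("\n")
    cols = stripped.split("\t")
    # Pass comments, directives, and malformed lines through untouched.
    if stripped.startswith("#") or len(cols) != 9:
        return stripped
    start, end = int(cols[3]), int(cols[4])
    cols[3] = str(seq_length - end + 1)    # column 4: new start
    cols[4] = str(seq_length - start + 1)  # column 5: new end
    cols[6] = STRAND_FLIP.get(cols[6], cols[6])  # column 7: strand
    return "\t".join(cols)


gene = "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=gene1"
print(flip_gff3_line(gene, 100))
# chr1	.	gene	91	100	.	-	.	ID=gene1
```

Writing each value back by explicit column index keeps the strand from being stored in a coordinate column by mistake.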

Juke34 pushed a commit that referenced this issue May 18, 2022
@MatthewRobbins-USDA

MatthewRobbins-USDA commented May 18, 2022

Thanks for the update. However, the strand did not change when I tried the new version. I can see you added lines 104-111 to the script to capture the strand value and invert it, but maybe the new inverted value was not set the way the start and end positions are set in lines 101-102? Also, after compiling the new version, running make test gave this error: "# Failed test 'output bin/agat_sq_reverse_complement.pl' # at t/scripts_output.t line 651."

@Juke34
Collaborator

Juke34 commented May 18, 2022

Should be fine now

@MatthewRobbins-USDA

Thanks for the quick response. The make test error is resolved, but this version of the script puts the inverted strand in the end column.

@Juke34
Collaborator

Juke34 commented May 19, 2022

My bad, I did that too quickly... I will fix that.

Juke34 mentioned this issue May 19, 2022
Merged
Juke34 pushed a commit that referenced this issue May 23, 2022
* new script script sq_reverse_complement - fix #248 + doc + test