Subsetting annotation for protein_coding biotypes #387

Closed
kenneditodd opened this issue Aug 16, 2023 · 2 comments
Comments

@kenneditodd

I want to subset my data to look only at protein_coding biotypes. I was originally going to subset my bambu transcript quantification output files. However, I then wondered whether it would be better to subset the original GTF file before running prepareAnnotations(), as the counts may be assigned differently. Which do you recommend?

@cying111
Collaborator

Hi

I think that largely depends on your dataset. If you expect some level of expression in non-protein-coding genes, it's better to do the filtering afterwards, as pre-filtering would affect both transcript discovery and quantification. Otherwise, you can filter before running prepareAnnotations and bambu.

Hope this clarifies your question!
Thank you
Ying
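
For anyone taking the pre-filtering route, here is a minimal sketch of subsetting a GTF to protein_coding records before it is handed to prepareAnnotations(). It is not part of bambu: the function name `filter_gtf_lines` and the demo records are hypothetical, and it assumes the biotype is stored in the attribute column as `gene_biotype` (Ensembl-style) or `gene_type` (GENCODE-style). Check which key your annotation actually uses.

```python
import re

# Assumption: the gene biotype lives in column 9 under one of these keys
# (Ensembl-style "gene_biotype" or GENCODE-style "gene_type").
BIOTYPE_RE = re.compile(r'\b(?:gene_biotype|gene_type) "([^"]+)"')


def filter_gtf_lines(lines, biotype="protein_coding"):
    """Yield comment lines plus records whose gene biotype equals `biotype`."""
    for line in lines:
        if line.startswith("#"):
            # Keep header/comment lines unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            # Not a well-formed GTF record; skip it.
            continue
        match = BIOTYPE_RE.search(fields[8])
        if match and match.group(1) == biotype:
            yield line


# Tiny made-up records for illustration only.
demo = [
    '#!genome-build demo\n',
    'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; gene_biotype "protein_coding";\n',
    'chr1\tsrc\tgene\t200\t300\t.\t+\t.\tgene_id "G2"; gene_biotype "lncRNA";\n',
    'chr1\tsrc\tgene\t400\t500\t.\t+\t.\tgene_id "G3"; gene_type "protein_coding";\n',
]
kept = list(filter_gtf_lines(demo))
```

On a real file, the same generator can be fed an open file handle and its output written to a new GTF. As noted above, this only makes sense if you do not expect expression from non-protein-coding genes; otherwise run bambu on the full annotation and subset the quantification output afterwards.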

@kenneditodd
Author

@cying111 Thank you!
