Subsetting annotation for protein_coding biotypes #387

kenneditodd · 2023-08-16T17:41:49Z

I want to subset my data to only look at protein_coding biotypes. I was originally just going to subset my bambu transcript quant output files. However, I then wondered if it would be better to subset the original gtf file before running prepareAnnotations() as the counts may be assigned differently. Which do you recommend?

cying111 · 2023-08-17T09:44:50Z

Hi

I think that largely depends on your dataset. If you expect some level of expression in non-protein-coding genes, it's better to do the filtering afterwards, as pre-filtering would affect both transcript discovery and quantification. Otherwise, you can do it before running prepareAnnotations and bambu.
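For the pre-filtering route, a minimal sketch of subsetting a GTF by gene biotype before it is passed to prepareAnnotations might look like the following. This is not part of bambu; the function name `filter_gtf_lines` is hypothetical, and it assumes the biotype is stored in the attribute column as `gene_biotype` (Ensembl style) or `gene_type` (GENCODE style), so check which one your annotation uses.

```python
import re

# Assumption: the gene biotype is stored as gene_biotype (Ensembl-style)
# or gene_type (GENCODE-style) in the 9th GTF column.
BIOTYPE_RE = re.compile(r'\b(?:gene_biotype|gene_type) "([^"]+)"')


def filter_gtf_lines(lines, biotype="protein_coding"):
    """Yield comment lines plus feature lines whose gene biotype matches."""
    for line in lines:
        # Keep header/comment lines unchanged.
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines that lack the attribute column.
        if len(fields) < 9:
            continue
        match = BIOTYPE_RE.search(fields[8])
        if match and match.group(1) == biotype:
            yield line


# Example usage (file names are placeholders):
# with open("annotation.gtf") as src, open("protein_coding.gtf", "w") as dst:
#     dst.writelines(filter_gtf_lines(src))
```

Note that this filters on the gene-level biotype, so every transcript of a protein-coding gene is kept, including transcripts whose own transcript-level biotype is something else.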

Hope this clarifies your question!
Thank you
Ying

kenneditodd · 2023-08-17T16:57:39Z

@cying111 Thank you!

andredsim closed this as completed Oct 30, 2023
