feat: sort alignments by read name #159

Open
balajtimate opened this issue Jan 18, 2024 · 0 comments
Labels
future Will not be worked on for now low_priority Not urgent

Comments

@balajtimate
Contributor

Is your feature request related to a problem? Please describe.
The SAM files that STAR writes in mapping.py list the aligned reads in the same order as the input FASTQ files. This can be a problem for library_type inference when two samples are aligned separately and their inputs are ordered differently: the output alignments then cannot be compared, because the read order differs between the files. Currently we assume that the inputs are sorted either by read name or by coordinates, but it would be more robust to sort the STAR output explicitly.

Describe the solution you'd like
Sort the aligned reads by read name. This could be done either in mapping.py right after the alignment step, or in get_library_type.py, since the sorted, separately aligned files are only needed there, for the comparison that calculates concordant pairs. Use pysam's sort with the -n argument (sort by read name) to create the sorted BAM files.
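As a rough illustration of the proposal, here is a minimal sketch of a name-sort helper built on `pysam.sort`, which passes its arguments through to `samtools sort`. The function names `name_sort_args` and `sort_by_read_name` are hypothetical and do not exist in the project. The explicit `-O bam` output format is also an assumption about what the comparison step would want.

```python
from typing import List


def name_sort_args(in_path: str, out_path: str) -> List[str]:
    """Build samtools-sort style arguments for a read-name sort.

    -n      sort by read name instead of by coordinate
    -O bam  write BAM output (assumed format, not confirmed by the project)
    -o      path of the sorted output file
    """
    return ["-n", "-O", "bam", "-o", str(out_path), str(in_path)]


def sort_by_read_name(in_path: str, out_path: str) -> str:
    """Name-sort a SAM/BAM file with pysam and return the output path.

    Hypothetical helper: it could be called after the STAR step in
    mapping.py, or just before the comparison in get_library_type.py.
    """
    # Imported lazily so name_sort_args can be used without pysam installed.
    import pysam

    pysam.sort(*name_sort_args(in_path, out_path))
    return str(out_path)
```

Calling `sort_by_read_name` on each separately aligned file before the concordance comparison would put both files in the same read-name order regardless of how the input FASTQ files were ordered.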
