GroupReadsByUmi failing on one sample #53

Closed
SPPearce opened this issue May 29, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@SPPearce
Contributor

Description of the bug

This may be related to #52, but posting it separately as I'm not sure.

I’m seeing this error from GroupReadsByUmi on one of my 8 duplex samples:

  [2024/05/28 06:12:25 | FgBioMain | Info] GroupReadsByUmi failed. Elapsed time: 0.06 minutes.
  Exception in thread "main" java.lang.IllegalStateException: A01659:139:HT77KDRX3:1:2160:16260:22326 did not have a primary R1 record.

which is odd to me, because that BAM file contains two reads named A01659:139:HT77KDRX3:1:2160:16260:22326:

A01659:139:HT77KDRX3:1:2160:16260:22326	163	chr1	10034	60	9M1D103M1D25M7S	=	10034	139	CCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACAGTACGG	FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFF:FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF:FFFFF:FF:FFFFF:FFFFFFFF:FFFFF:FF::F:FFFFF,FF:F::F,,:F,,FF,,:,,,,	XA:Z:chr3,+10442,9M1D5M1D43M1I71M15S,4;chr4,-190122667,11S35M4I20M1D74M,7;chr1,-248946041,7S16M1D77M1D12M1D32M,7;	MC:Z:7S9M1D103M1D25M	MD:Z:9^T103^C25	NM:i:2	MQ:i:60	AS:i:123	XS:i:102	RX:Z:CAGTA-AATGC	RG:Z:A
A01659:139:HT77KDRX3:2:2214:9299:21261	163	chr1	16440	41	144M	=	16440	144	TCTACAGTTTGAAAACCACTATTTTATGAACCAAGTAGAACAAGATATTTGAAATCGAAACTATTCAAAAAATTGAGAATTTCTGACCACTTAACAAACCCACAGAAAATCCACCCGAGTGCACTGAGCACGCCAGAAATCAGG	FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF	XA:Z:chr9,+16551,144M,2;chr16,+16123,144M,2;chr2,-113596853,144M,2;chr15,-101974377,144M,2;chr12,+16555,144M,2;chrX,-156023484,144M,3;chr1,+186962,144M,4;chr12_GL877875v1_alt,+6555,144M,2;	MC:Z:144M	MD:Z:55G88	NM:i:1	MQ:i:41	AS:i:139	XS:i:134	RX:Z:TGTGC-AAGGA	RG:Z:A-6B738825
A01659:139:HT77KDRX3:1:2160:16260:22326	83	chr1	10034	60	7S9M1D103M1D25M	=	10034	-139	TATGCCTCCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC	,:F:,,FFFFFFFFF,FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF	XA:Z:chr3,-10508,7S137M,7;chr4,+190122662,4M1D35M4I20M1D74M7S,8;chr3,-10458,21S43M1I71M8S,2;chr1,+248946041,16M1D77M1D12M1D26M13S,5;	MC:Z:9M1D103M1D25M7S	MD:Z:9^T103^C25	NM:i:2	MQ:i:60	AS:i:123	XS:i:105	RX:Z:CAGTA-AATGC	RG:Z:A
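
As a quick sanity check on the records above, the FLAG column can be decoded to see whether a primary R1 is present. This is a minimal sketch using the standard FLAG bits from the SAM specification; the helper name `describe_flag` is made up for illustration and is not part of fgbio or samtools.

```python
# Standard SAM FLAG bits (from the SAM specification).
REVERSE = 0x10
READ1 = 0x40
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def describe_flag(flag: int) -> dict:
    """Summarise the FLAG bits that matter when looking for a primary R1.

    Hypothetical helper for illustration only.
    """
    return {
        "primary": not (flag & (SECONDARY | SUPPLEMENTARY)),
        "read1": bool(flag & READ1),
        "read2": bool(flag & READ2),
        "reverse": bool(flag & REVERSE),
    }


# The two FLAG values seen for the read named in the error message.
for flag in (163, 83):
    print(flag, describe_flag(flag))
```

Under this decoding, 163 is a primary R2 and 83 is a primary R1, so the R1 record does exist in the file; the two mates are simply separated by a record from a different template.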

All 8 of these samples were sequenced over two lanes, so the per-lane BAM files are merged together. Curiously, this is the only file that fails in this way; the other 7 samples are fine.

If I manually sort the merged BAM file, GroupReadsByUmi re-sorts it itself and then works correctly.
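
One way to look for this kind of ordering problem is to check whether all records of a template sit next to each other, which is my understanding of what a template-coordinate ordering should provide. The sketch below is only an illustration of that check on a list of read names; `split_templates` is a hypothetical helper, not an fgbio or samtools function.

```python
from typing import Dict, Iterable, List, Optional


def split_templates(read_names: Iterable[str]) -> List[str]:
    """Return read names whose records appear in more than one consecutive run.

    Hypothetical helper: if a name shows up, is interrupted by other names,
    and then shows up again, its records are not adjacent in the file.
    """
    runs: Dict[str, int] = {}
    previous: Optional[str] = None
    for name in read_names:
        if name != previous:
            # A new run of this name starts here.
            runs[name] = runs.get(name, 0) + 1
        previous = name
    return [name for name, count in runs.items() if count > 1]


# Read names in the order they appear in the excerpt above.
names = [
    "A01659:139:HT77KDRX3:1:2160:16260:22326",
    "A01659:139:HT77KDRX3:2:2214:9299:21261",
    "A01659:139:HT77KDRX3:1:2160:16260:22326",
]
print(split_templates(names))
```

For a real file, the names could be taken from the first column of `samtools view` output. Supplying them from a stream rather than a list works too, since the function only iterates once.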

Command used and terminal output

No response

Relevant files

No response

System information

Running fastquorum v1.0.0 on Nextflow 23.10.1 with apptainer as the container engine.

@SPPearce SPPearce added the bug Something isn't working label May 29, 2024
@nh13
Member

nh13 commented May 29, 2024

I think it’s absolutely related. One temporary fix would be to swap in fgbio SortBam for template coordinate merging for now. It isn’t as fast (not multithreaded) but would work when merging lanes. Perhaps even better as a stopgap would be just to use samtools sort, which works, to re-sort after the merge?
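
A rough sketch of what the two re-sort options could look like as command lines, built here in Python so they can be printed or handed to `subprocess.run`. The file names and function names are placeholders, the thread count is arbitrary, and it is an assumption that the installed samtools is recent enough to support `--template-coordinate` in `samtools sort`; this is not the workflow's actual module code.

```python
import shlex
from typing import List


def samtools_resort_cmd(merged_bam: str, sorted_bam: str, threads: int = 1) -> List[str]:
    """Command to re-sort a merged BAM into template-coordinate order with samtools.

    Assumes a samtools build whose `sort` supports --template-coordinate.
    """
    return [
        "samtools", "sort",
        "--template-coordinate",
        "-@", str(threads),
        "-o", sorted_bam,
        merged_bam,
    ]


def fgbio_sortbam_cmd(merged_bam: str, sorted_bam: str) -> List[str]:
    """Equivalent single-threaded re-sort using fgbio SortBam."""
    return [
        "fgbio", "SortBam",
        "--input", merged_bam,
        "--output", sorted_bam,
        "--sort-order", "TemplateCoordinate",
    ]


# Placeholder file names, for illustration only.
print(shlex.join(samtools_resort_cmd("merged.bam", "resorted.bam", threads=4)))
print(shlex.join(fgbio_sortbam_cmd("merged.bam", "resorted.bam")))
```

Either command would run after the lane merge and before GroupReadsByUmi.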

@SPPearce
Contributor Author

Ok, thanks.
Currently samtools merge is being run non-multithreaded anyway, at least the first time (process_low, and it uses task.cpus-1).

@nh13
Member

nh13 commented May 29, 2024

I don’t think we have a merge tool in fgbio, so it’ll have to re-sort.

@SPPearce
Contributor Author

Ok. My surprise is that it worked on 7/8 samples.
I think there is an issue with using igenomes too, but I'll dig into that on Monday.

@nh13
Member

nh13 commented Jul 15, 2024

You're right, this is related to #52 and samtools/samtools#2062.

@lauren-tjoeka

Hi, I'm new to nf-core workflows and I'm also encountering this bug.

  [2024/07/31 08:02:03 | FgBioMain | Info] GroupReadsByUmi failed. Elapsed time: 0.05 minutes.
  Exception in thread "main" java.lang.IllegalStateException: A00232:194:H2LGVDSXC:1:1671:4797:18004 did not have a primary R1 record.

Could you elaborate on how to swap in 'fgbio SortBam'? Is this something I can specify in my config file?

I think it’s absolutely related. One temporary fix would be to swap in fgbio SortBam for template coordinate merging for now. It isn’t as fast (not multithreaded) but would work when merging lanes. Perhaps even better as a stopgap would be just to use samtools sort, which works, to re-sort after the merge?

Thanks!

@SPPearce
Contributor Author

This was fixed in the branch that @nh13 had made, but he seems to have deleted it now.
The upstream fix is in samtools, but samtools haven't made a release yet.
Nils, I think we should release a 1.0.1 version sooner than samtools might actually get round to releasing it.

@SPPearce
Contributor Author

Could you elaborate on how to swap in 'fgbio SortBam'? Is this something I can specify in my config file?

No, you can't do this with a config, it requires an edit to the workflow itself.

@nh13
Member

nh13 commented Aug 1, 2024

@SPPearce here's the closed PR: #54. I was hoping that samtools would be released by now, but it's volunteer-run, so I can relate. I've asked for a release here: samtools/samtools#2090. Perhaps we wait a few days and then do a release?

@SPPearce
Contributor Author

SPPearce commented Aug 3, 2024

Fixed in #68

@SPPearce SPPearce closed this as completed Aug 3, 2024