[User Story] Add adapter trimming to the UMI workflow #1440

Open
3 tasks
mathiasbio opened this issue May 28, 2024 · 1 comment · May be fixed by #1358
Assignees
Labels
User-Story A User-Story describing new functionality
Milestone

Comments

@mathiasbio
Contributor

Need

As a clinician I want high-quality analysis without false-positive variant calls. At the moment the UMI workflow does neither quality trimming nor adapter trimming. Skipping quality trimming is acceptable, because we currently require at least 3 reads for error correction, which should reduce the need for it. Adapter sequences, however, may still have high base quality, so they are not removed; they are retained and reduce the quality of downstream analysis. The adapters should therefore be trimmed out.

Suggested approach

Probably add fastp adapter trimming to the UMI workflow after concatenation and before UMI extraction. Trimming before extraction also accounts for the possibility of adapter dimers pushing the position of the UMI sequence further up the read.
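
As a rough illustration of what such a step could look like, here is a minimal sketch that only builds a paired-end fastp command line. It is not the rule used in BALSAMIC: the function name, output file names and thread count are hypothetical, and disabling fastp's quality filtering here is an assumption based on the note above that quality trimming is not needed at this stage.

```python
from pathlib import Path


def fastp_adapter_trim_cmd(r1, r2, out_dir, threads=4):
    """Build (not run) a fastp command that trims adapters on paired-end reads.

    Output file names and the default thread count are hypothetical.
    """
    out = Path(out_dir)
    return [
        "fastp",
        "--in1", str(r1),
        "--in2", str(r2),
        "--out1", str(out / "trimmed_R1.fastq.gz"),
        "--out2", str(out / "trimmed_R2.fastq.gz"),
        # Adapter trimming is on by default in fastp; this flag additionally
        # enables adapter auto-detection for paired-end data.
        "--detect_adapter_for_pe",
        # Assumption: leave quality filtering out of this step.
        "--disable_quality_filtering",
        "--thread", str(threads),
    ]


if __name__ == "__main__":
    print(" ".join(fastp_adapter_trim_cmd("concat_R1.fastq.gz", "concat_R2.fastq.gz", "out")))
```

The command is returned as a list so it can be passed to `subprocess.run` or joined into a shell line in a workflow rule.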

Considered alternatives

No response

Deviation

No response

System requirements assessed

  • Yes, I have reviewed the system requirements

Requirements affected by this story

No response

Risk assessment needed

  • Needed
  • Not needed

Risk assessment

No response

SOUPs

No response

Can be closed when

No response

Blockers

No response

Anything else?

No response

@mathiasbio mathiasbio added the User-Story A User-Story describing new functionality label May 28, 2024
@mathiasbio mathiasbio self-assigned this May 28, 2024
@mathiasbio mathiasbio added this to the Release 16 milestone May 28, 2024
@mathiasbio mathiasbio linked a pull request Jun 20, 2024 that will close this issue
56 tasks
@mathiasbio
Contributor Author

These features have now been added in the PR: https://github.com/Clinical-Genomics/BALSAMIC/pull/1358/files

Here the UMI workflow is refactored to share the same upstream pre-processing as the TGA workflow (which will now use UMIs).

This means the following will be shared between the two workflows:

  1. Concatenate reads
  2. Trim adapters
  3. Extract UMIs
  4. Quality trim reads
  5. Align reads with UMIs in read-header

After these steps they diverge into separate workflows: TGA uses UMIs only for deduplication, while the UMI workflow requires a minimum of 3,1,1 UMI-groups.
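
The shared pre-processing and the point where the two workflows diverge can be sketched as a simple ordered plan. This is only an illustration of the ordering described above, not code from the PR; the names `SHARED_STEPS`, `WORKFLOW_SPECIFIC` and `plan` are hypothetical.

```python
# Shared upstream pre-processing, in the order listed above.
SHARED_STEPS = [
    "concatenate reads",
    "trim adapters",
    "extract UMIs",
    "quality trim reads",
    "align reads with UMIs in read-header",
]

# Steps that differ once the workflows diverge.
WORKFLOW_SPECIFIC = {
    "TGA": ["deduplicate using UMIs"],
    "UMI": ["require minimum of 3,1,1 UMI-groups"],
}


def plan(workflow):
    """Return the ordered steps for a workflow: shared steps first, then its own."""
    if workflow not in WORKFLOW_SPECIFIC:
        raise ValueError(f"unknown workflow: {workflow!r}")
    return SHARED_STEPS + WORKFLOW_SPECIFIC[workflow]


if __name__ == "__main__":
    for name in WORKFLOW_SPECIFIC:
        print(name, "->", " | ".join(plan(name)))
```

Keeping the shared steps in one list makes the ordering constraint explicit: adapter trimming comes before UMI extraction in both workflows.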

See graph:
image

Projects
Status: In Testing