feat: merge vardict and tnscope #1475
Open: mathiasbio wants to merge 201 commits into cnvkit_to_gens from merge_vardict_tnscope
+1,907
−1,309
Regarding the extra variants in the tumor-only TGA cases: I have emailed our myeloid customer to ask for feedback, as they would be the ones most affected by this change.
Quality Gate passed
Description
This PR is a branch of:
cnvkit_to_gens: #1448
Which is a branch of:
update_cnvkit_pons: #1465
Which is a branch of:
deduplicate_with_umi: #1358
Part of the development work in this PR was originally done in #1429, which was partially reviewed by Vadym.
All upstream branches affect the quality of the analysis, and the full extent of these effects will be assessed in this PR in a sort of mini-validation.
The original plan for release 16.0.0 of balsamic was to replace VarDict with TNscope. All the tests passed and the evaluation showed the change to be an overall improvement in the analysis, but the testing also revealed some confusing and unstable behaviour in TNscope, which led us to keep VarDict and merge in the TNscope results.
This PR in particular
This PR merges TNscope and VarDict results for the TGA workflows and cleans up the snakemake rules in general, for example by removing the rules for TNhaplotyper, which has not been in use for a long time.
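To illustrate what merging the two callers' results means conceptually, here is a minimal sketch that takes the union of two call sets and records which caller reported each variant. It is not the pipeline's actual merge step: the function name `merge_callsets`, the `(chrom, pos, ref, alt)` key and the example calls are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumption: a variant is identified by (chromosome, position, ref, alt).
VariantKey = Tuple[str, int, str, str]


def merge_callsets(callsets: Dict[str, Iterable[VariantKey]]) -> Dict[VariantKey, Set[str]]:
    """Return the union of all call sets, mapping each variant to its callers."""
    merged: Dict[VariantKey, Set[str]] = {}
    for caller, variants in callsets.items():
        for key in variants:
            merged.setdefault(key, set()).add(caller)
    return merged


# Hypothetical example calls, not real data.
vardict_calls = [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")]
tnscope_calls = [("chr1", 100, "A", "T"), ("chr3", 300, "C", "CA")]

merged = merge_callsets({"vardict": vardict_calls, "tnscope": tnscope_calls})
for key in sorted(merged):
    print(key, sorted(merged[key]))
```

Keeping the caller names per variant makes it possible to see afterwards whether a call was supported by one caller or by both.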
For now it also removes the ML model from TNscope, since no ML model is currently available for the new version of Sentieon. They are working on making a new model.
I also added a new filter to VarDict that allows up to 30% tumor-in-normal contamination, to align with what we're doing for TNscope.
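As a rough sketch of how such a tumor-in-normal filter could behave, the snippet below keeps a variant unless its allele frequency in the normal sample exceeds 30% of its allele frequency in the tumor. This ratio-based reading is an assumption, since the exact filter expression is not given in this description. The function name `passes_in_normal_filter` and its parameters are hypothetical.

```python
def passes_in_normal_filter(tumor_af: float, normal_af: float, max_ratio: float = 0.3) -> bool:
    """Keep a variant unless normal AF exceeds max_ratio of the tumor AF.

    Assumption: "30% of tumor in normal" is read as normal_af / tumor_af > 0.3.
    """
    if tumor_af <= 0:
        # No tumor support to compare against, so the call is not kept.
        return False
    return (normal_af / tumor_af) <= max_ratio


# Hypothetical allele frequencies, not real data.
print(passes_in_normal_filter(tumor_af=0.40, normal_af=0.05))  # ratio 0.125, kept
print(passes_in_normal_filter(tumor_af=0.10, normal_af=0.08))  # ratio 0.8, removed
```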
Up to date graph of a TGA UMI T+N workflow below:
Changed
Added
Removed
Documentation
Tests
Feature Tests
Google sheet with results here: https://docs.google.com/spreadsheets/d/13qjetgWKu9rD3hxfTfL6NNv9R_KkXsIOuqntBft6JvM/edit?usp=sharing
Summary of results in google sheet
coverage
number of variants
There are some differences in the number of variants between this workflow and the previous one.
Despite merging the VarDict and TNscope variant calls, the tumor-normal cases have fewer variants in this release than in the previous one. The variants that are filtered out will be assessed in more detail in the validation, but for now it can be noted that in the TWIST pancancer reference samples (which are tumor-normal cases) the sensitivity improved compared to the last version, suggesting that the filtered-out variants are artefacts. One possible explanation is that the UMI consensus collapse performs some degree of error correction, but the primary reason is probably the new tumor-normal filter added to the TGA analysis to match the one already implemented in WGS (removing variants if they had 30% of tumor presence in normal).
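For reference, sensitivity against a reference sample is the fraction of known (truth) variants that the workflow recovers. The sketch below shows the calculation on made-up variant keys; the function name `sensitivity` and the example sets are assumptions for illustration, not values from the Google sheet.

```python
from typing import Set, Tuple

VariantKey = Tuple[str, int, str, str]


def sensitivity(truth: Set[VariantKey], called: Set[VariantKey]) -> float:
    """Fraction of truth variants present in the call set: TP / (TP + FN)."""
    if not truth:
        raise ValueError("truth set must not be empty")
    true_positives = len(truth & called)
    return true_positives / len(truth)


# Hypothetical truth set and calls, not real data.
truth = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "C", "CA"),
    ("chr4", 400, "T", "G"),
}
called = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "C", "CA"),
    ("chr9", 900, "G", "A"),  # extra call, not in the truth set
}

print(sensitivity(truth, called))  # 3 of 4 truth variants found: 0.75
```

Note that extra calls outside the truth set do not change sensitivity, which is why fewer total variants together with a higher sensitivity is consistent with artefacts being removed.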
From earlier version of testing:
After adding new TNscope option: --trim-soft-clip
I will inform the customers that commonly order myeloid analysis to get their opinion on the increase in number of variants.
horizon FLT3 variants
horizon SNV and InDels
TWIST pancancer reference samples
SeraCare variants
Myeloid variants
CNV profile in GENS
Is the CNV profile visualisation in GENS working?
CNV profile links here
CNV profile in GENS with and without using PON
5 PONs were created in this upstream PR, which has not been fully evaluated:
update_cnvkit_pons: #1465
However, we do not possess a set of cases with known CNVs and have not validated the CNV analysis in the past. Even if we had such cases (which would be very nice), the best way to evaluate the PONs would still be to study the CNV profile by eye and determine whether the profile is easier to interpret with the PON or without.
To save time I would suggest that we do this evaluation in the validation itself, which is uniquely possible for this feature because adding a PON does not impact the code itself. If a PON is deemed to produce noisy results, we can simply not add it to the reference directory.
Feature Tests
Pipeline Integrity Tests
.hk file)
Clinical Genomics Stockholm
Documentation
Panel of Normal specific criteria
HOWEVER! We still need approval for 2 of the PoNs #1465
User Changes
Infrastructure Changes
Checklist
Important
Ensure that all checkboxes below are ticked before merging.
For Developers
For Reviewers
conditions where applicable, with satisfactory results.