Issue for VCF conversion when two translocations start from the same position #212

Open
nvnieuwk opened this issue Feb 20, 2024 · 2 comments

@nvnieuwk
Contributor

Hi, I have a VCF that contains two translocations that start at the same position in the genome. This causes problems when converting the TSV to VCF with variantconvert, because both translocations end up with the same ID. The following error is raised:

$ cat 20240220_AnnotSV/annot_test.variantconvert.log

python3 /usr/local/share/python3/variantconvert//variantconvert convert -i 20240220_AnnotSV/annot_test.tsv -o 20240220_AnnotSV/annot_test.vcf -fi annotsv -fo vcf -c /usr/local/share/python3/variantconvert//configs/GRCh38/annotsv3_from_vcf.json

2024-02-20 10:25:52 [INFO] running variantconvert 1.2.2
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/local/share/python3/variantconvert//variantconvert/__main__.py", line 226, in <module>
    main()
  File "/usr/local/share/python3/variantconvert//variantconvert/__main__.py", line 209, in main
    main_convert(args)
  File "/usr/local/share/python3/variantconvert//variantconvert/__main__.py", line 74, in main_convert
    converter.convert(args.inputFile, args.outputFile)
  File "/usr/local/share/python3/variantconvert//variantconvert/converters/vcf_from_annotsv.py", line 463, in convert
    info_dic = self._build_info_dic()
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/share/python3/variantconvert//variantconvert/converters/vcf_from_annotsv.py", line 219, in _build_info_dic
    merged_annots = self._merge_full_and_split(df_variant)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/share/python3/variantconvert//variantconvert/converters/vcf_from_annotsv.py", line 171, in _merge_full_and_split
    raise ValueError(
ValueError: Each variant is assumed to only have one single line of 'full' annotation

This error was really confusing, since these are two different variants. I think it could be solved easily by appending a unique identifier to the ID (a number stating which variant this is should be enough to make all IDs unique). Does this sound like a good/feasible solution to you?
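As a rough illustration of the suffix idea, here is a minimal Python sketch. It is not code from AnnotSV or variantconvert; `make_ids_unique` and the placeholder IDs are hypothetical, and the sketch assumes one ID per input variant (so that the full and split annotation lines of one variant would still share the resulting ID).

```python
from collections import defaultdict


def make_ids_unique(ids):
    """Append a running counter to every ID that occurs more than once.

    Hypothetical helper: IDs that are already unique are left untouched.
    """
    # First pass: count how often each ID occurs.
    totals = defaultdict(int)
    for sv_id in ids:
        totals[sv_id] += 1

    # Second pass: number only the IDs that are duplicated.
    seen = defaultdict(int)
    unique = []
    for sv_id in ids:
        if totals[sv_id] == 1:
            unique.append(sv_id)
        else:
            seen[sv_id] += 1
            unique.append(f"{sv_id}_{seen[sv_id]}")
    return unique


# Placeholder IDs, one per input variant.
print(make_ids_unique(["sv_a", "sv_b", "sv_a"]))  # ['sv_a_1', 'sv_b', 'sv_a_2']
```

A real implementation would also have to make sure that a suffixed ID cannot collide with an ID that already ends in the same number.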

You can find my full logs and input VCF below this issue and reproduce it using the following command:

AnnotSV -outputFile annot_test.tsv -SVinputFile annotsv_issue.vcf -vcf 1

Thanks!
-Nicolas

Input VCF:
annotsv_issue.vcf.gz

Output folder:
20240220_AnnotSV.tar.gz

@lgmgeo
Owner

lgmgeo commented Feb 20, 2024

Hi,

AnnotSV considers a translocation to be a pair of two breakends.
With the angle-bracketed <TRA> notation, AnnotSV returns only one full annotation, for the breakend of the pair described by "#CHROM/POS/ALT" (to be improved).

Here are your 2 TRA:

#CHROM  POS             ID                     REF       ALT   QUAL     FILTER   INFO                                                                           FORMAT          sample1
chrX    83731873        0_delly_TRA_35224       T       <TRA>   46      LowQual CHR2=chr1;CIEND=-563,563;CIPOS=-563,563;END=102454994;SVLEN=1;SVTYPE=TRA        GT:PE:SR        0/1:7,2:0,0
chrX    83731873        0_delly_TRA_35225       T       <TRA>   69      LowQual CHR2=chr1;CIEND=-563,563;CIPOS=-563,563;END=111581087;SVLEN=1;SVTYPE=TRA        GT:PE:SR        0/1:6,3:0,0
  • AnnotSV annotates only the chrX:83731873 breakend (2 times, but it is the same breakend).
    => X_83731310_83732436_TRA_1 (CIEND=-563,563;CIPOS=-563,563)
  • AnnotSV does not annotate the chr1:102454994 and chr1:111581087 breakends.
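The collision can be reproduced with a little arithmetic. The sketch below is an inference from the numbers in this thread, not AnnotSV's actual code: `breakend_key` is a hypothetical helper, and it assumes the interval is POS plus the lower CIPOS bound to POS plus the upper CIEND bound (both intervals are ±563 here, so the data cannot tell which one is really used). The trailing `_1` of the real ID is left out because its meaning is not stated.

```python
def breakend_key(chrom, pos, cipos, ciend, svtype):
    """Build a key from the first breakend only (hypothetical helper)."""
    name = chrom[3:] if chrom.startswith("chr") else chrom
    start = pos + cipos[0]  # lower bound of the confidence interval
    end = pos + ciend[1]    # upper bound of the confidence interval
    return f"{name}_{start}_{end}_{svtype}"


# The two records differ only in INFO/END (102454994 vs 111581087),
# which plays no part in a key built from #CHROM/POS alone.
key_35224 = breakend_key("chrX", 83731873, (-563, 563), (-563, 563), "TRA")
key_35225 = breakend_key("chrX", 83731873, (-563, 563), (-563, 563), "TRA")

print(key_35224)               # X_83731310_83732436_TRA
print(key_35224 == key_35225)  # True
```

Because the second breakend does not contribute to the key, two distinct translocations that share their first breakend cannot be told apart.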

I will keep in mind:

  • to make all IDs unique
  • to add the parsing of CHR2=chr1;END=102454994 with angle-bracketed notation (currently, AnnotSV does not annotate this breakend).
    Unfortunately, this will not be possible in the near future.
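For readers looking for a workaround in the meantime, the second breakend can be read from the INFO column with a few lines of Python. This is only a sketch under the assumption, taken from the records above, that CHR2 and END describe the second breakend; `parse_info` and `mate_breakend` are hypothetical helpers and not part of AnnotSV.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def mate_breakend(info):
    """Return (chrom, pos) of the second breakend from CHR2 and END."""
    fields = parse_info(info)
    return fields["CHR2"], int(fields["END"])


# INFO column of the first record shown above.
info = "CHR2=chr1;CIEND=-563,563;CIPOS=-563,563;END=102454994;SVLEN=1;SVTYPE=TRA"
print(mate_breakend(info))  # ('chr1', 102454994)
```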

Best,

Véronique

@nvnieuwk
Contributor Author

Thank you for the thorough answer. I'll try to find a workaround in the meantime.
