This program performs pairwise alignment of two nucleotide sequences under an affine gap penalty model. It divides the two sequences into substring pairs based on predefined "anchors". Anchor sequences may be generated by pattern searching or taken from prior knowledge. The program then applies a different combination of gap penalties and alignment algorithm, either Needleman–Wunsch (global alignment) or Smith–Waterman (local alignment), to each type of substring pair.
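To make the affine penalty model concrete, here is a minimal sketch of how a single gap could be scored. It assumes the common convention in which the first gap position is charged the gap-open penalty and every further position the gap-extend penalty; this program may use a different convention (for example, open plus extend for the first position). The function name `affine_gap_score` is hypothetical, and the default values are the general gap penalties listed in the usage text.

```python
def affine_gap_score(length, gap_open=-3, gap_extend=-1):
    """Score one contiguous gap of `length` positions.

    Assumed convention (not confirmed by the program's source):
    the first gap position costs `gap_open`, each additional
    position costs `gap_extend`.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0  # no gap, no penalty
    return gap_open + (length - 1) * gap_extend
```

Under this convention, one long gap is cheaper than several short gaps covering the same number of positions, which is the main reason to prefer an affine model over a linear one.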
Scheme:
- Anchor sequence: global alignment, anchor penalty scores (-am, -ap, -ago, -age)
- Sequence between two anchors: global alignment, general penalty scores (-m, -p, -go, -ge)
- Hanging sequence (the sequence before the first anchor and the sequence after the last anchor): local alignment, general penalty scores (-m, -p, -go, -ge)
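The scheme above can be sketched as a function that cuts both sequences at the anchors and labels each substring pair with the alignment mode and scoring set it would receive. This is an illustration only, not the program's actual code: the function name `split_by_anchors` and the anchor layout (a sorted, non-overlapping list of `(start1, end1, start2, end2)` tuples with 0-based, half-open coordinates) are assumptions made for this example.

```python
def split_by_anchors(seq1, seq2, anchors):
    """Split two sequences into labelled substring pairs.

    `anchors` is a hypothetical layout: a sorted list of
    (start1, end1, start2, end2) tuples, 0-based and half-open.
    Returns a list of (kind, sub1, sub2, mode, scoring) tuples.
    """
    segments = []
    pos1 = pos2 = 0  # end of the previous anchor in each sequence
    for i, (s1, e1, s2, e2) in enumerate(anchors):
        if i == 0:
            # Sequence before the first anchor: local, general scores.
            segments.append(("hanging", seq1[:s1], seq2[:s2], "local", "general"))
        else:
            # Sequence between two anchors: global, general scores.
            segments.append(("between", seq1[pos1:s1], seq2[pos2:s2], "global", "general"))
        # The anchor itself: global, anchor scores.
        segments.append(("anchor", seq1[s1:e1], seq2[s2:e2], "global", "anchor"))
        pos1, pos2 = e1, e2
    # Sequence after the last anchor: local, general scores.
    # With no anchors, this is the whole sequence pair.
    segments.append(("hanging", seq1[pos1:], seq2[pos2:], "local", "general"))
    return segments
```

For example, `split_by_anchors("AAACCCGGGTTTAA", "ACCCGTTTA", [(3, 6, 1, 4), (9, 12, 5, 8)])` yields five pairs in order: hanging, anchor, between, anchor, hanging.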
Requirement: Python >= 3.7.1
python3 alignment_with_anchor.py -f ./example/test2.fa -a ./example/anchor.tsv -ap -1 -am 1 -ago -3 -age -2
usage: alignment_with_anchor.py [-h] -f FA -a ANCHOR [-am A_MATCH]
                                [-ap A_MISMATCH] [-ago A_GAPOPEN]
                                [-age A_GAPEXTEND] [-m MATCH] [-p MISMATCH]
                                [-go GAPOPEN] [-ge GAPEXTEND]

optional arguments:
  -h, --help            show this help message and exit
  -f FA, --fasta FA     fasta file contains 2 sequences
  -a ANCHOR, --anchor ANCHOR
                        anchor file (BED format)
  -am A_MATCH, --anchor_match A_MATCH
                        default = 2
  -ap A_MISMATCH, --anchor_mismatch A_MISMATCH
                        default = -3
  -ago A_GAPOPEN, --anchor_gapopen A_GAPOPEN
                        default = -4
  -age A_GAPEXTEND, --anchor_gapextend A_GAPEXTEND
                        default = -3
  -m MATCH, --match MATCH
                        default = 2
  -p MISMATCH, --mismatch MISMATCH
                        default = -3
  -go GAPOPEN, --gapopen GAPOPEN
                        default = -3
  -ge GAPEXTEND, --gapextend GAPEXTEND
                        default = -1
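
For readers who want to see how the defaults and overrides interact, the following is a rough `argparse` sketch that mirrors the options listed above. It is a reconstruction from the usage text, not the script's actual source: the function name `build_parser`, the use of `type=int`, and the resulting attribute names (derived from the long option names) are assumptions.

```python
import argparse


def build_parser():
    """Build a parser resembling the usage text (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(prog="alignment_with_anchor.py")
    parser.add_argument("-f", "--fasta", metavar="FA", required=True,
                        help="fasta file contains 2 sequences")
    parser.add_argument("-a", "--anchor", metavar="ANCHOR", required=True,
                        help="anchor file (BED format)")
    # (short flag, long flag, metavar, default) taken from the usage text.
    # type=int is an assumption; the script may accept other numeric types.
    scoring_options = [
        ("-am", "--anchor_match", "A_MATCH", 2),
        ("-ap", "--anchor_mismatch", "A_MISMATCH", -3),
        ("-ago", "--anchor_gapopen", "A_GAPOPEN", -4),
        ("-age", "--anchor_gapextend", "A_GAPEXTEND", -3),
        ("-m", "--match", "MATCH", 2),
        ("-p", "--mismatch", "MISMATCH", -3),
        ("-go", "--gapopen", "GAPOPEN", -3),
        ("-ge", "--gapextend", "GAPEXTEND", -1),
    ]
    for short, long_name, metavar, default in scoring_options:
        parser.add_argument(short, long_name, metavar=metavar, type=int,
                            default=default, help=f"default = {default}")
    return parser
```

Parsing the arguments of the example command with this sketch overrides only the four anchor scores (`-ap -1 -am 1 -ago -3 -age -2`); the general scores keep their defaults of 2, -3, -3, and -1.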