PanGEA-SNP Manual

PanGEA-SW creates pairwise sequence alignments using either the normal Smith-Waterman algorithm or the homopolymer Smith-Waterman algorithm. The pairwise alignments may be used for SNP identification with PanGEA-SNP.

Each database sequence is aligned with all query sequences that follow it (see the description of the input file below).

Minimal command in Windows:
PanGEA-Sw -i input.txt

Minimal command in Linux (Mono):
mono PanGEA-Sw.exe -i input.txt

For simplicity, only the Windows commands are shown in the following; the Linux commands have to be adjusted as shown above.

If no output file is specified, the default output file "result.aln" is used.

Multiple input files may be specified:
PanGEA-Sw -i input1.txt -i input2.txt -o alignments.aln

The penalties and the hit score may be specified:
PanGEA-Sw -i input.txt -pi 11 -hit 3 -pe 3 -pmm 6

The unmodified Smith-Waterman algorithm may be used instead of the homopolymer SW algorithm by setting the '-normalsw' flag:
PanGEA-Sw -i input.txt -normalsw

Display the help (run the program without any parameters):
PanGEA-Sw
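The command lines above can also be assembled from a script. The following is a minimal Python sketch, not part of PanGEA; the function name build_command and its arguments are hypothetical. It only uses the flags documented in this manual and assumes that the order of the flags on the command line does not matter.

```python
import sys

SCORE_FLAGS = ("pi", "pe", "pt", "pmm", "hit")  # flags documented in this manual


def build_command(inputs, output=None, normalsw=False, use_mono=None, **scores):
    """Return the argument list for one PanGEA-Sw run (hypothetical helper)."""
    if not inputs:
        raise ValueError("at least one input file (-i) is required")
    unknown = set(scores) - set(SCORE_FLAGS)
    if unknown:
        raise ValueError("unknown score flags: " + ", ".join(sorted(unknown)))
    if use_mono is None:
        # Assumption: every non-Windows platform runs the program through Mono.
        use_mono = not sys.platform.startswith("win")
    cmd = ["mono", "PanGEA-Sw.exe"] if use_mono else ["PanGEA-Sw"]
    for path in inputs:
        cmd += ["-i", path]  # -i may be repeated for multiple input files
    if output is not None:
        cmd += ["-o", output]
    for flag in SCORE_FLAGS:
        if flag in scores:
            cmd += ["-" + flag, str(scores[flag])]
    if normalsw:
        cmd.append("-normalsw")
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library to start the alignment.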

Description of all parameters:

	Obligatory parameters:
-i	input file; obligatory parameter

	Optional parameters:
-o	output file; optional parameter; default: result.aln
-pi	gap introducing penalty; optional parameter; default: 11
-pe	gap extend penalty; optional parameter; default: 2
-pt	homopolymer transgression penalty; optional parameter; default: 3
-pmm	mismatch penalty; optional parameter; default: 5
-hit	score for a hit; optional parameter; default: 3
-normalsw	use the normal Smith-Waterman algorithm instead of the homopolymer SW algorithm; optional parameter; default: off
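To illustrate how the hit score and the penalties interact, the following Python sketch computes the score of the best local alignment with the textbook Smith-Waterman recurrence and affine gap costs. It is not PanGEA's implementation. It assumes that the first position of a gap costs -pi and every further position costs -pe; the manual does not state which convention PanGEA-SW uses. The homopolymer variant and the -pt penalty are not modelled, because this manual does not define them.

```python
def sw_score(db, query, hit=3, pmm=5, pi=11, pe=2):
    """Best local alignment score (normal Smith-Waterman, affine gaps).

    Defaults mirror the documented defaults of -hit, -pmm, -pi and -pe.
    Assumption: a gap of length k costs pi + (k - 1) * pe.
    """
    neg = float("-inf")
    cols = len(query) + 1
    h_prev = [0] * cols      # best scores of the previous row
    f_prev = [neg] * cols    # previous row: alignments ending in a gap in the query
    best = 0
    for i in range(1, len(db) + 1):
        h_cur = [0] * cols
        f_cur = [neg] * cols
        e = neg              # current row: alignments ending in a gap in the database
        for j in range(1, cols):
            e = max(h_cur[j - 1] - pi, e - pe)                # open or extend
            f_cur[j] = max(h_prev[j] - pi, f_prev[j] - pe)    # open or extend
            match = hit if db[i - 1] == query[j - 1] else -pmm
            # Local alignment: the score never drops below zero.
            h_cur[j] = max(0, h_prev[j - 1] + match, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(sw_score("AAATTGCAATT", "AAATT"))  # 5 hits * 3 = 15
```

With the default values a single mismatch (-5) is cheaper than opening a gap (-11), so short queries that differ from the database sequence by one base are aligned without gaps under these assumptions.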

The input file has to be a slightly modified multiple FASTA file, in which two types of sequences are distinguished: database sequences and query sequences. The FASTA identifier of a database sequence has to be '>>', whereas query sequences use the common FASTA identifier '>'. A database sequence will be aligned with all of the following query sequences until a new database sequence is encountered.

Example:

>>data1
AAATTGCAATT

>query1.1
AAATT

>query1.2
TTGC

>>data2
TTCGTCTCTCTCGAAGAGAGTC

>query2.1
CGTCT

>query2.2
GAAGAG
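The grouping rule described above can be sketched in Python as follows. This is an illustrative reader, not PanGEA's own parser; the function name parse_pangea_input and the dictionary layout are hypothetical. It assumes that blank lines are ignored and that a sequence may span several lines, which this manual does not confirm.

```python
def parse_pangea_input(lines):
    """Group query sequences under the database sequence that precedes them."""
    groups = []      # one entry per database sequence, in file order
    target = None    # record that the following sequence lines belong to
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">>"):        # database sequence: starts a new group
            target = {"id": line[2:].strip(), "seq": "", "queries": []}
            groups.append(target)
        elif line.startswith(">"):       # query sequence: joins the current group
            if not groups:
                raise ValueError("query sequence found before any database sequence")
            target = {"id": line[1:].strip(), "seq": ""}
            groups[-1]["queries"].append(target)
        else:                            # sequence data for the latest header
            if target is None:
                raise ValueError("sequence data found before any header")
            target["seq"] += line
    return groups


EXAMPLE = """>>data1
AAATTGCAATT

>query1.1
AAATT

>query1.2
TTGC

>>data2
TTCGTCTCTCTCGAAGAGAGTC

>query2.1
CGTCT

>query2.2
GAAGAG
"""

for group in parse_pangea_input(EXAMPLE.splitlines()):
    print(group["id"], [q["id"] for q in group["queries"]])
```

For the example file the sketch yields two groups: data1 with query1.1 and query1.2, and data2 with query2.1 and query2.2.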

The output is written in the PanGEA-BlastN pairwise alignment format. The output of PanGEA-SW may be used for SNP identification with PanGEA-SNP.
