BioTech FYI Center - Resources
ClustalW FAQ (Frequently Asked Questions)
(Continued from previous part...)
What type of sequences can ClustalW align?
ClustalW can align either nucleotide or protein sequences. Nucleotide
sequences are aligned exactly as they are input; the program has no option
for specifying the DNA strand. To reverse and/or complement nucleotide
sequences before alignment, use the EMBOSS tool revseq.
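As a rough illustration of what reversing and complementing a nucleotide sequence means, here is a minimal Python sketch. It is not the EMBOSS revseq implementation; the function name reverse_complement is hypothetical, and the sketch assumes plain DNA letters (A, C, G, T, N), leaving any other character unchanged.

```python
# Illustrative sketch only; this is not how EMBOSS revseq is implemented.
# Assumption: the input uses only the DNA letters A, C, G, T and N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the whole sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Sequences that are on opposite strands usually need this step before alignment, because ClustalW aligns them as given.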
What input formats does ClustalW accept?
The program accepts sequences in the following formats:
NBRF/PIR, EMBL/UniProt, Pearson (Fasta), GDE,
ALN/ClustalW, GCG/MSF and RSF (see the Clustal help pages for details about
formats).
The sequences can either be pasted into the web form or uploaded as a file.
Each sequence must have a unique name; if any names are duplicated, the
program will fail. The program will also fail if there are empty lines,
white space or control characters between sequences or at the top of the
file.
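The input rules above can be checked before submitting. The following is a minimal sketch for Pearson (Fasta) input only; the function name check_fasta is hypothetical, and it assumes the sequence name is the first whitespace-delimited word after ">" (ClustalW's own rule for what counts as the name may differ). It does not check for control characters.

```python
def check_fasta(text: str) -> list[str]:
    """Return a list of problems that could make a submission fail."""
    problems = []
    seen_names = set()
    lines = text.splitlines()

    # The file must begin with a header line, with nothing above it.
    if not lines or not lines[0].startswith(">"):
        problems.append("file does not start with a '>' header line")

    for number, line in enumerate(lines, start=1):
        if line.strip() == "":
            problems.append(f"line {number}: empty or whitespace-only line")
        elif line.startswith(">"):
            # Assumption: the name is the first word after '>'.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if name in seen_names:
                problems.append(f"line {number}: duplicate sequence name {name!r}")
            seen_names.add(name)
    return problems
```

An empty list means none of these particular problems was found; it does not guarantee that the program will accept the file.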
What output formats does ClustalW produce?
The following output formats are available:
aln with numbers (default), aln without numbers, gcg
MSF, phylip, pir and gde.
Select the format you want in the OUTPUT section of the web form. There is
also an option to specify the order in which the sequences appear in the
alignment: aligned (default) or input order. The alignment appears on the
results page along with details of scores and guide trees. To obtain the
alignment on its own, click the alignment file link (.aln) at the top of the
results page. This file can be opened in a separate window and/or saved to
a file.
How can I save my alignment to a file?
The alignment appears on the results page along with details of scores and
guide trees. To obtain the alignment on its own, click the alignment file
link (.aln) at the top of the results page. This file can be opened in a
separate window or saved to a file.
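Once the .aln file is saved, it can be read back with a few lines of code. The sketch below is a minimal reader, assuming the usual Clustal layout: a first line beginning with "CLUSTAL", then blocks of "name sequence" lines, each block followed by a conservation line that starts with white space. The function name parse_clustal is hypothetical. It works for both "aln with numbers" and "aln without numbers", because the trailing residue count is ignored.

```python
def parse_clustal(text: str) -> dict[str, str]:
    """Join the alignment blocks of a Clustal .aln file, keyed by sequence name."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("not a Clustal alignment: missing CLUSTAL header line")

    alignment: dict[str, str] = {}
    for line in lines[1:]:
        # Skip blank lines and the conservation line (it starts with a space).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]
        # A third field, if present, is the cumulative residue count; ignore it.
        alignment[name] = alignment.get(name, "") + chunk
    return alignment
```

The returned dictionary keeps the sequences in the order in which they appear in the file, which is either aligned order or input order depending on the option chosen.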
Is there a limit on the number of sequences or the size of the file that I
submit to ClustalW?
Input to ClustalW is limited to a maximum of 500 sequences or a 10MB file,
whichever limit is reached first. When the input file or the number of
sequences is large, ClustalW can run for days and in some cases may not
finish at all. If you plan to submit a large amount of data, you should use
the "RESULTS: email" option and "CPU MODE: multiple".
Email jobs are allowed to run for more than 24 hours and the results are
kept for a week.
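A quick pre-submission check against these limits might look like the sketch below. It assumes Fasta input (one ">" line per sequence) and reads "10MB" as 10 x 1024 x 1024 bytes; the FAQ does not say which definition of megabyte the server uses, and the names here are hypothetical.

```python
MAX_SEQUENCES = 500
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB


def within_limits(fasta_text: str) -> bool:
    """Return True if the input stays within both stated limits."""
    # Each Fasta record starts with exactly one header line beginning with '>'.
    n_sequences = sum(1 for line in fasta_text.splitlines() if line.startswith(">"))
    n_bytes = len(fasta_text.encode("utf-8"))
    return n_sequences <= MAX_SEQUENCES and n_bytes <= MAX_BYTES
```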
What do the file extensions in my results mean?
When you submit a number of sequences on our ClustalW submission page using
the default parameters, you retrieve a .aln and a .dnd file. The .aln file
is the alignment and the .dnd file is a guide tree; it is not a phylogenetic
tree.
To get an accurate phylogenetic tree, submit the .aln file as input to the
ClustalW form again, this time choosing one of the tree options: nj, phylip
or dist (all methods for making phylogenetic trees). You will then retrieve
a .ph file (always) and a .dst and/or .nj file (depending on the options
chosen), which are phylogenetic trees.
The .input file is your input and the .output file contains the program's
output.
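For scripts that sort downloaded result files, the extensions described above can be kept in a small lookup table. This is only a convenience sketch; the descriptions paraphrase this FAQ, and the names RESULT_FILE_MEANINGS and describe_result_file are hypothetical.

```python
import os

# Descriptions paraphrase the FAQ text; they are not taken from the program itself.
RESULT_FILE_MEANINGS = {
    ".aln": "alignment",
    ".dnd": "guide tree (not a phylogenetic tree)",
    ".ph": "phylogenetic tree output (always produced with a tree option)",
    ".dst": "phylogenetic tree output (depends on the tree option chosen)",
    ".nj": "phylogenetic tree output (depends on the tree option chosen)",
    ".input": "the input you submitted",
    ".output": "the program's output",
}


def describe_result_file(filename: str) -> str:
    """Look up the meaning of a result file by its extension."""
    extension = os.path.splitext(filename)[1].lower()
    return RESULT_FILE_MEANINGS.get(extension, "unknown result file")
```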
(Continued on next part...)