Accession numbers are not present for FASTA sequence files
If you parse a FASTA file with Bio::SeqIO, the resulting sequence objects won't have the accession number set. What to do?
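All the data is in $seq->display_id; it only needs to be parsed out. For a GenBank-style ID, this sets the accession number: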
    my ($gi, $acc, $locus);
    (undef, $gi, undef, $acc, $locus) = split(/\|/, $seq->display_id);
    $seq->accession_number($acc);
Why don't we just go ahead and do this? For one, we don't make any assumptions about the format of the ID part of the sequence. Perhaps the parser code could try to detect whether it is a GenBank-formatted ID and, if so, set the accession number field. It would be trivial to do, but no one has volunteered the time. Put it on the Project priority list if you think it is important and, better yet, volunteer the code patch!
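The detection idea can be sketched in plain Perl. This is a minimal sketch, not BioPerl code: parse_ncbi_id is a hypothetical helper, the gi|<gi>|<db>|<accession>|<locus> layout is an assumption about the input, and the ID string is only an example value. It returns nothing for IDs that do not start with "gi|", so other ID formats are left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split an ID assumed to have the
# form gi|<gi>|<db>|<accession>|<locus> into its parts.
# Returns undef (in scalar context) when the ID does not match that form.
sub parse_ncbi_id {
    my ($id) = @_;
    return unless defined $id && $id =~ /^gi\|/;

    # Field 0 is the literal "gi" tag, so it is discarded.
    my (undef, $gi, $db, $acc, $locus) = split /\|/, $id;
    return unless defined $acc && length $acc;

    # $locus may be undef: split drops trailing empty fields.
    return { gi => $gi, db => $db, accession => $acc, locus => $locus };
}

# Example ID string, used here only for illustration.
my $parts = parse_ncbi_id('gi|129295|sp|P01013|OVAX_CHICK');
print "$parts->{accession}\n" if $parts;
```

With Bio::SeqIO, the same helper could be called on $seq->display_id for each sequence, passing the result's accession to $seq->accession_number only when the helper returns a value.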
Also see http://bioperl.org/pipermail/bioperl-l/2005-August/019579.html