Single Sequence Record in FASTA Format
How to read a Single Sequence Record in FASTA Format?
✍: FYIcenter.com
If you want to store additional information with a DNA or protein
sequence, you can use the SeqRecord class from the Bio.SeqRecord module,
which holds the sequence together with properties such as id, name,
description, and features.
We can download an example of a Sequence Record in FASTA Format.
fyicenter$ wget https://raw.githubusercontent.com/biopython/biopython/master/Tests/GenBank/NC_005816.fna

fyicenter$ ls -l NC_005816.fna
-rw-r--r--. 1 fyicenter staff 9853 Jan 27 23:55 NC_005816.fna
Then we can create a Bio.SeqRecord object with the SeqIO.read() function.
fyicenter$ python
>>> from Bio import SeqIO
>>> record = SeqIO.read("NC_005816.fna", "fasta")
>>> print(record)
ID: gi|45478711|ref|NC_005816.1|
Name: gi|45478711|ref|NC_005816.1|
Description: gi|45478711|ref|NC_005816.1| Yersinia pestis biovar Microtus str. 91001 plasmid pPCP1, complete sequence
Number of features: 0
Seq('TGTAACGAACGGTGCAATAGTGATCCACACCCAACGCCTGAAATCAGATCCAGG...CTG')
As you can see, the FASTA format gives Biopython very little structured information to parse. The ID, name, and description are all taken from the single header line, and no features are recorded.
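To illustrate why the ID and Name come out identical, here is a minimal pure-Python sketch of how a single FASTA record can be split into fields. It is not Biopython's implementation: the function name parse_single_fasta and the returned dictionary are hypothetical, and the sequence in the example is only the leading bases shown in the output above, not the full plasmid.

```python
def parse_single_fasta(text):
    """Parse FASTA text that holds exactly one record (simplified sketch)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    headers = [line for line in lines if line.startswith(">")]
    # Like a single-record reader, refuse empty input or multiple records.
    if not lines or not lines[0].startswith(">") or len(headers) != 1:
        raise ValueError("expected exactly one FASTA record")

    description = lines[0][1:]              # whole header line without '>'
    words = description.split()
    record_id = words[0] if words else ""   # first whitespace-delimited word

    return {
        "id": record_id,
        "name": record_id,           # FASTA has no separate name field
        "description": description,
        "features": [],              # FASTA carries no feature table
        "seq": "".join(lines[1:]),   # sequence lines joined together
    }


# Header copied from the record above; sequence truncated for illustration.
example = (
    ">gi|45478711|ref|NC_005816.1| Yersinia pestis biovar Microtus "
    "str. 91001 plasmid pPCP1, complete sequence\n"
    "TGTAACGAACGGTGCAATAGTGATCCACACCCAACGCCTGAAATCAGATCCAGG\n"
)

record = parse_single_fasta(example)
print("ID:", record["id"])
print("Name:", record["name"])
print("Description:", record["description"])
print("Number of features:", len(record["features"]))
```

Everything except the sequence is derived from the header line, so a richer format is needed if you want annotations or features to survive the round trip.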
2023-04-04