Single Sequence Record in GenBank Format
How to read a Single Sequence Record in GenBank Format?
✍: FYIcenter.com
The GenBank format for DNA or protein sequences contains more properties and has a better structure than the FASTA format. You can follow these steps to download an example GenBank file and create a Bio.SeqRecord object from it.
1. Download an example of a Sequence Record in GenBank Format.
fyicenter$ wget https://raw.githubusercontent.com/biopython/biopython/master/Tests/GenBank/NC_005816.gb

-rw-r--r--. 1 fyicenter staff 31838 Jan 27 23:55 NC_005816.gb
2. View the GenBank sequence file.
fyicenter$ more NC_005816.gb
LOCUS       NC_005816               9609 bp    DNA     circular BCT 21-JUL-2008
DEFINITION  Yersinia pestis biovar Microtus str. 91001 plasmid pPCP1, complete
            sequence.
ACCESSION   NC_005816
VERSION     NC_005816.1  GI:45478711
DBLINK      Project: 58037
KEYWORDS    .
SOURCE      Yersinia pestis biovar Microtus str. 91001
  ORGANISM  Yersinia pestis biovar Microtus str. 91001
            Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;
            Enterobacteriaceae; Yersinia.
REFERENCE   1  (bases 1 to 9609)
  AUTHORS   Zhou,D., Tong,Z., Song,Y., Han,Y., Pei,D., Pang,X., Zhai,J., Li,M.,
            Cui,B., Qi,Z., Jin,L., Dai,R., Du,Z., Wang,J., Guo,Z., Wang,J.,
            Huang,P. and Yang,R.
  TITLE     Genetics of metabolic variations between Yersinia pestis biovars
            and the proposal of a new biovar, microtus
  JOURNAL   J. Bacteriol. 186 (15), 5147-5152 (2004)
   PUBMED   15262951
REFERENCE   2  (bases 1 to 9609)
  AUTHORS   Song,Y., Tong,Z., Wang,J., Wang,L., Guo,Z., Han,Y., Zhang,J.,
            Pei,D., Zhou,D., Qin,H., Pang,X., Han,Y., Zhai,J., Li,M., Cui,B.,
            Qi,Z., Jin,L., Dai,R., Chen,F., Li,S., Ye,C., Du,Z., Lin,W.,
            Wang,J., Yu,J., Yang,H., Wang,J., Huang,P. and Yang,R.
  TITLE     Complete genome sequence of Yersinia pestis strain 91001, an
            isolate avirulent to humans
  JOURNAL   DNA Res. 11 (3), 179-197 (2004)
   PUBMED   15368893
REFERENCE   3  (bases 1 to 9609)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (16-MAR-2004) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   4  (bases 1 to 9609)
  AUTHORS   Song,Y., Tong,Z., Wang,L., Han,Y., Zhang,J., Pei,D., Wang,J.,
            Zhou,D., Han,Y., Pang,X., Zhai,J., Chen,F., Qin,H., Wang,J., Li,S.,
            Guo,Z., Ye,C., Du,Z., Lin,W., Wang,J., Yu,J., Yang,H., Wang,J.,
            Huang,P. and Yang,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (24-APR-2003) The Institute of Microbiology and
            Epidemiology, Academy of Military Medical Sciences, No. 20,
            Dongdajie Street, Fengtai District, Beijing 100071, People's
            Republic of China
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AE017046.
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..9609
                     /organism="Yersinia pestis biovar Microtus str. 91001"
                     /mol_type="genomic DNA"
                     /strain="91001"
                     /db_xref="taxon:229193"
                     /plasmid="pPCP1"
                     /biovar="Microtus"
     repeat_region   1..1954
     gene            87..1109
                     /locus_tag="YP_pPCP01"
                     /db_xref="GeneID:2767718"
...
     variation       8529^8530
                     /note="compared to AL109969"
                     /replace="tt"
ORIGIN
        1 tgtaacgaac ggtgcaatag tgatccacac ccaacgcctg aaatcagatc cagggggtaa
       61 tctgctctcc tgattcagga gagtttatgg tcacttttga gacagttatg gaaattaaaa
...
     9541 aaaataaaaa tgtgacatcg caatgccaga taatattgac gcatgaggga atgcgtaccc
     9601 cgacccctg
//
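To make the structure of the header more concrete, here is a minimal sketch that splits the LOCUS line shown above into named fields using plain Python. The helper name parse_locus_line is hypothetical and is not part of Biopython. It assumes the eight whitespace-separated tokens seen in this particular file; real LOCUS lines are column-based and can omit fields, so Biopython's own parser should be preferred for real work.

```python
def parse_locus_line(line):
    """Split a GenBank LOCUS line into named fields (illustration only).

    Assumes exactly eight whitespace-separated tokens, as in NC_005816.gb.
    """
    tokens = line.split()
    if len(tokens) != 8 or tokens[0] != "LOCUS":
        raise ValueError("unexpected LOCUS line layout")
    _, name, length, unit, molecule, topology, division, date = tokens
    return {
        "name": name,
        "length": int(length),      # sequence length as an integer
        "unit": unit,               # "bp" for nucleotides, "aa" for proteins
        "molecule_type": molecule,
        "topology": topology,
        "division": division,
        "date": date,
    }


# The LOCUS line from the example file, with its padding collapsed.
locus = "LOCUS NC_005816 9609 bp DNA circular BCT 21-JUL-2008"
fields = parse_locus_line(locus)
print(fields["name"], fields["length"], fields["topology"])
```

The same values show up later as record.name and as the topology, data_file_division and date annotations of the Bio.SeqRecord object.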
3. Create a Bio.SeqRecord object with the SeqIO.read() function.
fyicenter$ python
>>> from Bio import SeqIO
>>> record = SeqIO.read("NC_005816.gb", "genbank")
>>> print(record)
ID: NC_005816.1
Name: NC_005816
Description: Yersinia pestis biovar Microtus str. 91001 plasmid pPCP1, complete sequence
Database cross-references: Project:58037
Number of features: 41
/molecule_type=DNA
/topology=circular
/data_file_division=BCT
/date=21-JUL-2008
/accessions=['NC_005816']
/sequence_version=1
/gi=45478711
/keywords=['']
/source=Yersinia pestis biovar Microtus str. 91001
/organism=Yersinia pestis biovar Microtus str. 91001
/taxonomy=['Bacteria', 'Proteobacteria', 'Gammaproteobacteria', 'Enterobacteriales', ....
/references=[Reference(title='Genetics of metabolic variations between Yersinia ...
/comment=PROVISIONAL REFSEQ: This record has not yet been subject to final
NCBI review. The reference sequence was derived from AE017046.
COMPLETENESS: full length.
Seq('TGTAACGAACGGTGCAATAGTGATCCACACCCAACGCCTGAAATCAGATCCAGG...CTG')
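Besides printing the whole record, you can read individual properties from it. The sketch below assumes Biopython is installed. The helper name summarize is hypothetical, and the record is a hand-built stand-in that holds only the first 60 bases from the ORIGIN section, so that the example runs without the downloaded file. With the real file you would pass the object returned by SeqIO.read() instead.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def summarize(record):
    """Collect a few commonly used properties of a SeqRecord in a dict."""
    return {
        "id": record.id,
        "name": record.name,
        "topology": record.annotations.get("topology"),
        "length": len(record.seq),
    }


# Stand-in record: only the first 60 bases of NC_005816, not the full plasmid.
demo = SeqRecord(
    Seq(
        "TGTAACGAAC" "GGTGCAATAG" "TGATCCACAC"
        "CCAACGCCTG" "AAATCAGATC" "CAGGGGGTAA"
    ),
    id="NC_005816.1",
    name="NC_005816",
    description="Yersinia pestis biovar Microtus str. 91001 plasmid pPCP1",
    annotations={"molecule_type": "DNA", "topology": "circular"},
)

print(summarize(demo))
```

For the full record read from NC_005816.gb, the reported length should match the 9609 bp stated on the LOCUS line.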
4. Look at the first 3 features stored in the Bio.SeqRecord object.
>>> print(record.features[0])
type: source
location: [0:9609](+)
qualifiers:
    Key: biovar, Value: ['Microtus']
    Key: db_xref, Value: ['taxon:229193']
    Key: mol_type, Value: ['genomic DNA']
    Key: organism, Value: ['Yersinia pestis biovar Microtus str. 91001']
    Key: plasmid, Value: ['pPCP1']
    Key: strain, Value: ['91001']

>>> print(record.features[1])
type: repeat_region
location: [0:1954](+)
qualifiers:

>>> print(record.features[2])
type: gene
location: [86:1109](+)
qualifiers:
    Key: db_xref, Value: ['GeneID:2767718']
    Key: locus_tag, Value: ['YP_pPCP01']
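Note that the locations printed by Biopython differ from those in the file: the gene at 87..1109 in the GenBank file appears as [86:1109]. GenBank positions are 1-based and include both ends, while Biopython uses Python-style 0-based, end-exclusive coordinates. The following plain-Python sketch shows the conversion for simple start..end locations only. The helper name genbank_range_to_slice is hypothetical, and more complex locations such as 8529^8530 are deliberately rejected.

```python
import re


def genbank_range_to_slice(location):
    """Convert a simple 'start..end' GenBank location to 0-based slice bounds."""
    match = re.fullmatch(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"not a simple start..end location: {location!r}")
    start, end = int(match.group(1)), int(match.group(2))
    # 1-based inclusive start becomes 0-based; the inclusive end stays the
    # same number because Python slices exclude the end index.
    return start - 1, end


for text in ("1..9609", "1..1954", "87..1109"):
    start, end = genbank_range_to_slice(text)
    print(f"{text} -> [{start}:{end}], length {end - start}")
```

The three converted ranges match the locations of the first three features printed above.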
2023-04-04