Use Bio.SearchIO Module to Parse BLAST XML Result
How to Use Bio.SearchIO Module to Parse BLAST XML Result?
✍: FYIcenter.com
The Bio.SearchIO module parses sequence search results stored in a variety of output formats.
1. Try the following code to query the "nt" database with the "blastn" program, using a DNA sequence that was reverse-translated from the protein sequence AEPNAATNYATEAMDSLKTQAIDLISQTWPVVTTVVVAGLVIRLFKKFSSKA in the "PF05371_seed.faa" file.
fyicenter$ python
>>> from Bio.Seq import Seq
>>> query = Seq(
...   "gcggaaccgaacgcggcgaccaactatgcgaccgaagcgatggatagcctgaaaacccag"
...   +"gcgattgatctgattagccagacctggccggtggtgaccaccgtggtggtggcgggcctg"
...   +"gtgattcgcctgtttaaaaaatttagcagcaaagcg"
... )
>>> from Bio.Blast import NCBIWWW
>>> result_handle = NCBIWWW.qblast("blastn", "nt", query)
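The claim that the query DNA encodes the protein above can be checked without any network access. The following is a minimal sketch in plain Python; the `CODONS` table and the `translate` helper are hand-written for this illustration and cover only the codons that occur in this query, so they are not a general translator. With Biopython installed, `Seq.translate()` offers the same check.

```python
# Hand-written subset of the standard genetic code. It lists only the
# codons that occur in the tutorial's query (an assumption of this sketch).
CODONS = {
    "gcg": "A", "gaa": "E", "ccg": "P", "aac": "N", "acc": "T",
    "tat": "Y", "atg": "M", "gat": "D", "agc": "S", "ctg": "L",
    "aaa": "K", "cag": "Q", "att": "I", "tgg": "W", "gtg": "V",
    "ggc": "G", "cgc": "R", "ttt": "F",
}


def translate(dna):
    """Translate a lowercase DNA string codon by codon using CODONS."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


dna = (
    "gcggaaccgaacgcggcgaccaactatgcgaccgaagcgatggatagcctgaaaacccag"
    "gcgattgatctgattagccagacctggccggtggtgaccaccgtggtggtggcgggcctg"
    "gtgattcgcctgtttaaaaaatttagcagcaaagcg"
)
protein = "AEPNAATNYATEAMDSLKTQAIDLISQTWPVVTTVVVAGLVIRLFKKFSSKA"

print(len(dna))                    # query length in nucleotides
print(translate(dna) == protein)   # True if the DNA encodes the protein
```

The query length printed here is the same number that BLAST later reports in parentheses after the query ID.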
2. Write the query result to a local file.
>>> with open("my_blast.xml", "w") as out_handle:
...   out_handle.write(result_handle.read())
...
138648
>>> result_handle.close()
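Because the result now lives in a local file, later sessions do not need to repeat the remote query. One way to express that is a small caching helper, sketched below. The names `load_or_fetch` and `fetch` are hypothetical and not part of Biopython; `fetch` stands for any zero-argument callable that returns the result text.

```python
import os


def load_or_fetch(path, fetch):
    """Return the text stored at path; call fetch() only if the file is missing."""
    if not os.path.exists(path):
        # First run: obtain the result text and save it for later sessions.
        with open(path, "w") as out_handle:
            out_handle.write(fetch())
    with open(path) as in_handle:
        return in_handle.read()


# Hypothetical usage with the session above (needs network access):
#   xml_text = load_or_fetch(
#       "my_blast.xml",
#       lambda: NCBIWWW.qblast("blastn", "nt", query).read(),
#   )
```

The helper never refreshes an existing file, so delete "my_blast.xml" by hand when a fresh search is wanted.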
3. Read the file back and parse the query result with the Bio.SearchIO.parse() function.
>>> result_handle = open("my_blast.xml")
>>> from Bio import SearchIO
>>> blast_qresult = SearchIO.parse(result_handle, "blast-xml")
>>> results = list(blast_qresult)
>>> len(results)
1
>>> results[0]
QueryResult(id='No', 50 hits)
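The QueryResult, Hit and HSP objects mirror the nesting of the BLAST XML file itself: each Iteration element holds Hit elements, and each Hit holds Hsp elements. The sketch below walks that nesting with the standard library only, so it is not how Bio.SearchIO is implemented. `SAMPLE` is a hand-written, heavily reduced fragment for illustration, not the real "my_blast.xml", and `summarize` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written, reduced BLAST XML fragment (illustration only). A real
# result file carries many more elements per Iteration, Hit and Hsp.
SAMPLE = """<BlastOutput>
  <BlastOutput_program>blastn</BlastOutput_program>
  <BlastOutput_db>nt</BlastOutput_db>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-len>156</Iteration_query-len>
      <Iteration_hits>
        <Hit>
          <Hit_id>gi|9625381|ref|NC_001332.1|</Hit_id>
          <Hit_def>Enterobacteria phage I2-2, complete genome</Hit_def>
          <Hit_len>6744</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1.8e-11</Hsp_evalue>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def summarize(xml_text):
    """Return (hit id, hit length, number of HSPs) for every hit in the XML."""
    root = ET.fromstring(xml_text)
    rows = []
    for iteration in root.iter("Iteration"):      # one per query
        for hit in iteration.iter("Hit"):         # one per matched sequence
            hsps = hit.findall("Hit_hsps/Hsp")    # one per alignment
            rows.append(
                (hit.findtext("Hit_id"), int(hit.findtext("Hit_len")), len(hsps))
            )
    return rows


for row in summarize(SAMPLE):
    print(row)
```

In the session above, `len(results)` is 1 because the file contains a single query, and each entry in the hit table that follows corresponds to one Hit element.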
4. Print out the query result.
>>> print(results[0])
Program: blastn (2.13.0+)
  Query: No (156)
         definition line
 Target: nt
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description
         ----  -----  ----------------------------------------------------------
            0      1  gi|9625381|ref|NC_001332.1|  Enterobacteria phage I2-2,...
            1      1  gi|14920|emb|X14336.1|  Filamentous Bacteriophage I2-2 ...
            2      2  gi|1844100906|gb|CP053326.1|  Salmonella enterica subsp...
            3      1  gi|2321361016|gb|CP107717.1|  Xanthomonas campestris pv...
            4      1  gi|2232737947|gb|CP075146.1|  Xanthomonas campestris pv...
            5      1  gi|2095686827|gb|CP066978.1|  Xanthomonas campestris pv...
            6      1  gi|1913269481|gb|CP062066.1|  Xanthomonas campestris st...
            7      1  gi|1864553229|gb|CP058243.1|  Xanthomonas campestris pv...
            8      1  gi|341934791|gb|CP002789.1|  Xanthomonas campestris pv....
            9      1  gi|1860091948|gb|CP054912.1|  Pantoea ananatis strain F...
           10      1  gi|1086024185|emb|LT629791.1|  Jiangella alkaliphila st...
           11      1  gi|2129627975|gb|CP086009.1|  Pantoea ananatis strain V...
           12      1  gi|1057948474|gb|CP015992.1|  Pseudomonas sp. TCU-HL1, ...
           13      1  gi|984699415|gb|CP014207.1|  Pantoea ananatis strain R1...
           14      1  gi|354986417|gb|CP003085.1|  Pantoea ananatis PA13, com...
           15      1  gi|1858692958|gb|CP054803.1|  Acinetobacter lwoffii str...
           16      1  gi|1712751337|gb|CP036319.1|  Crateriforma conspicua st...
           17      1  gi|2317627232|gb|CP083759.1|  Acinetobacter pseudolwoff...
           18      1  gi|2215523186|gb|CP094344.1|  Streptomyces sp. HP-A2021...
           19      1  gi|2086772708|gb|CP080636.1|  Acinetobacter lwoffii str...
           20      1  gi|1482407573|gb|CP032427.1|  Streptomyces griseorubigi...
           21      1  gi|1129998368|ref|XM_019858453.1|  PREDICTED: Hippocamp...
           22      1  gi|1085627832|emb|LT629688.1|  Auraticoccus monumenti s...
           23      1  gi|1052266331|emb|LT607411.1|  Micromonospora viridifac...
           24      1  gi|1033861152|gb|CP015876.1|  Pseudomonas putida SJTE-1...
           25      1  gi|952467264|gb|CP013129.1|  Streptomyces venezuelae st...
           26      1  gi|941153505|gb|CP007213.1|  Burkholderia plantarii str...
           27      1  gi|932864506|emb|LN881739.1|  Streptomyces venezuelae g...
           28      1  gi|2364306959|ref|XM_052380290.1|  PREDICTED: Dreissena...
           29      1  gi|2317440212|emb|OX346715.1|  Hemistola chrysoprasaria...
           ~~~
           47      1  gi|2089568073|dbj|AP024650.1|  Arthrobacter sp. StoSoil...
           48      1  gi|2070123745|gb|CP079095.1|  Methylococcus sp. Mc7 chr...
           49      1  gi|1893325938|gb|CP049017.1|  Xanthomonas theicola stra...
5. Review a single hit.
>>> result = results[0]
>>> hits = result.hits
>>> len(hits)
50
>>> hit = hits[0]
>>> hit
Hit(id='gi|9625381|ref|NC_001332.1|', query_id='No', 1 hsps)
>>> print(hit)
Query: No
       definition line
  Hit: gi|9625381|ref|NC_001332.1| (6744)
       Enterobacteria phage I2-2, complete genome
 HSPs: ----  --------  ---------  ------  ---------------  ---------------------
          #   E-value  Bit score    Span      Query range              Hit range
       ----  --------  ---------  ------  ---------------  ---------------------
          0   1.8e-11      83.34     128         [15:143]            [4760:4888]
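The ranges in this table use Python-style coordinates: zero-based and half-open, so [15:143] can be used directly as a slice and its span is 143 - 15 = 128. The sketch below applies the printed query range to the query string from step 1. The helper name `span` is hypothetical, and the slice equals the aligned query segment only because this HSP is ungapped (128 columns for a span of 128).

```python
def span(start, end):
    """Length of a zero-based, half-open range such as [15:143]."""
    return end - start


dna = (
    "gcggaaccgaacgcggcgaccaactatgcgaccgaagcgatggatagcctgaaaacccag"
    "gcgattgatctgattagccagacctggccggtggtgaccaccgtggtggtggcgggcctg"
    "gtgattcgcctgtttaaaaaatttagcagcaaagcg"
)

# Slice the query with the "Query range" printed for HSP 0.
segment = dna[15:143].upper()

print(span(15, 143), span(4760, 4888))   # query span, hit span
print(segment[:18], segment[-5:])        # first and last aligned query bases
```

The printed start and end of `segment` can be compared with the Query line of the HSP alignment in the next step.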
6. Review an HSP (alignment) in a hit.
>>> hsps = hit.hsps
>>> hsp = hsps[0]
>>> print(hsp)
      Query: No definition line
        Hit: gi|9625381|ref|NC_001332.1| Enterobacteria phage I2-2, complete ...
Query range: [15:143] (1)
  Hit range: [4760:4888] (1)
Quick stats: evalue 1.8e-11; bitscore 83.34
  Fragments: 1 (128 columns)
     Query - GCGACCAACTATGCGACCGAAGCGATGGATAGCCTGAAAACCCAGGCGATTGATCTGAT~~~AAATT
             || |||| ||| || || ||||| ||| | ||||||||||| ||||| | |||||| ||~~~|||||
       Hit - GCTACCAGCTACGCTACTGAAGCAATGAACAGCCTGAAAACTCAGGCAACTGATCTCAT~~~AAATT
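Each "|" in the middle line marks a column where the query and hit bases are identical. As a rough illustration, the sketch below counts identical columns in the part of the alignment printed before the "~~~" elision. The helper `count_matches` is a hypothetical name, and the result covers only those visible columns, not the full 128-column alignment.

```python
def count_matches(query_aln, hit_aln):
    """Return (identical columns, total columns) for two aligned strings."""
    if len(query_aln) != len(hit_aln):
        raise ValueError("aligned strings must have the same length")
    matches = sum(1 for q, h in zip(query_aln, hit_aln) if q == h)
    return matches, len(query_aln)


# Visible alignment prefix, copied from the HSP printout above.
query_prefix = "GCGACCAACTATGCGACCGAAGCGATGGATAGCCTGAAAACCCAGGCGATTGATCTGAT"
hit_prefix = "GCTACCAGCTACGCTACTGAAGCAATGAACAGCCTGAAAACTCAGGCAACTGATCTCAT"

matches, columns = count_matches(query_prefix, hit_prefix)
print(f"{matches}/{columns} identical columns in the visible prefix")
```

The same count can be read off the printout by counting the "|" characters before the "~~~" marker.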
2023-05-09