Read Motif in JASPAR Format with Bio.motifs

Q

How to Read a Motif in JASPAR Format with the Bio.motifs Module?

✍: FYIcenter.com

A

The Bio.motifs.read() function reads motif files in several formats, including JASPAR.

1. Download a motif file in JASPAR format by going to https://jaspar.genereg.net/matrix/MA0080.5/ and clicking the "JASPAR" download button. The file MA0080.5.jaspar is saved on your computer.

2. Read the motif file with the read() function.

fyicenter$ python
>>> from Bio import motifs

>>> with open("MA0080.5.jaspar") as handle:
...     m = motifs.read(handle, "jaspar")
...

>>> len(m)
20
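
To show what read() is parsing, here is a minimal pure-Python sketch of the JASPAR text layout: a ">" header line with the matrix ID and TF name, followed by one bracketed row of counts per nucleotide. This is only an illustration and not how Bio.motifs is implemented. parse_jaspar is a hypothetical helper, and the record is truncated to the first eight columns of the matrix printed in step 3 (the real file has 20).

```python
import re

# Assumption: record truncated to the first eight of the 20 columns.
JASPAR_TEXT = """\
>MA0080.5\tSPI1
A  [ 42201  48240  54154  78831  81904  99739  15301 113087 ]
C  [ 22587  21262  20183  11424  12269   2914  10958   3425 ]
G  [ 38405  34277  37341  25893  25580  13479 100825  12544 ]
T  [ 30010  29424  21525  17055  13450  17071   6119   4147 ]
"""


def parse_jaspar(text):
    """Parse one JASPAR-style record into (matrix_id, name, counts).

    Hypothetical helper for illustration; not part of Biopython.
    """
    matrix_id, name, counts = None, None, {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: ">ID NAME", separated by whitespace.
            fields = line[1:].split(None, 1)
            matrix_id = fields[0]
            name = fields[1] if len(fields) > 1 else ""
        else:
            # Row line: a nucleotide letter followed by bracketed counts.
            letter = line[0]
            counts[letter] = [float(x) for x in re.findall(r"[\d.]+", line[1:])]
    return matrix_id, name, counts


matrix_id, name, counts = parse_jaspar(JASPAR_TEXT)
motif_length = len(counts["A"])  # number of positions, like len(m)
print(matrix_id, name, motif_length)
```

Here motif_length is 8 only because the record was truncated. For the full file, len(m) returns 20, the number of columns in the matrix.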

3. View the structure of the motif object.

>>> type(m)
<class 'Bio.motifs.jaspar.Motif'>

>>> print(m)
TF name  SPI1
Matrix ID  MA0080.5
Matrix:
          0        1        2        3        4        5         6         7 ...
A: 42201.00 48240.00 54154.00 78831.00 81904.00 99739.00  15301.00 113087.00 ...
C: 22587.00 21262.00 20183.00 11424.00 12269.00  2914.00  10958.00   3425.00 ...
G: 38405.00 34277.00 37341.00 25893.00 25580.00 13479.00 100825.00  12544.00 ...
T: 30010.00 29424.00 21525.00 17055.00 13450.00 17071.00   6119.00   4147.00 ...

4. View the consensus and anticonsensus sequences.

>>> m.consensus
Seq('AAAAAAGAGGAAGTGAAAAA')

>>> m.anticonsensus
Seq('CCCCCCTCCCTCTCTTCCCC')
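
As a rough sketch of where these two sequences come from: the consensus takes the most frequent nucleotide at each position, and the anticonsensus takes the least frequent one. The code below recomputes both in plain Python from the first eight columns of the matrix printed in step 3. The name column_pick is hypothetical, and any tie-breaking rule Biopython applies is not reproduced here (these columns contain no ties).

```python
# Counts copied from the first eight columns of the printed matrix.
COUNTS = {
    "A": [42201, 48240, 54154, 78831, 81904, 99739, 15301, 113087],
    "C": [22587, 21262, 20183, 11424, 12269, 2914, 10958, 3425],
    "G": [38405, 34277, 37341, 25893, 25580, 13479, 100825, 12544],
    "T": [30010, 29424, 21525, 17055, 13450, 17071, 6119, 4147],
}


def column_pick(counts, pick):
    """Apply pick (max or min) to each column and join the chosen letters."""
    letters = sorted(counts)
    length = len(counts[letters[0]])
    return "".join(
        pick(letters, key=lambda base: counts[base][i]) for i in range(length)
    )


consensus = column_pick(COUNTS, max)
anticonsensus = column_pick(COUNTS, min)
print(consensus)      # AAAAAAGA
print(anticonsensus)  # CCCCCCTC
```

Both results agree with the first eight letters of m.consensus and m.anticonsensus shown above.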

5. Calculate the total number of sequences used to build the motif by summing the counts at the first position.

>>> m.counts[:, 0]
{'A': 42201.0, 'C': 22587.0, 'G': 38405.0, 'T': 30010.0}

>>> sum(m.counts[:, 0].values())
133203.0

So 133,203 DNA sequences were used to create this motif. Those sequences are not included in the input file, so the instances attribute of the motif is None.

>>> type(m.instances)
<class 'NoneType'>
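
The first position is not special: in a count matrix where every sequence contributes one nucleotide per position, each column should add up to the same total. The plain-Python check below is a sketch over the first eight columns of the matrix printed in step 3. The variable names are illustrative, and the remaining 12 columns are not checked.

```python
# Counts copied from the first eight columns of the printed matrix.
COUNTS = {
    "A": [42201, 48240, 54154, 78831, 81904, 99739, 15301, 113087],
    "C": [22587, 21262, 20183, 11424, 12269, 2914, 10958, 3425],
    "G": [38405, 34277, 37341, 25893, 25580, 13479, 100825, 12544],
    "T": [30010, 29424, 21525, 17055, 13450, 17071, 6119, 4147],
}

# zip(*rows) walks the matrix column by column.
column_totals = [sum(column) for column in zip(*COUNTS.values())]
print(column_totals[0])    # 133203
print(set(column_totals))  # {133203}

# Dividing each count by its column total gives per-position frequencies.
freq_a_first = COUNTS["A"][0] / column_totals[0]
print(round(freq_a_first, 3))  # 0.317
```

Since all eight columns give 133203, the total does not depend on which of these positions is summed.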

 


2023-07-05