Compare Motifs Using PSSM with Bio.motifs

Q

How to Compare Motifs Using PSSM with Bio.motifs?

✍: FYIcenter.com

A

If you have the PSSMs of two motifs, you can compare them with the PSSM's dist_pearson() method. It returns the distance between the two motifs and the position offset of their best alignment, in that order.

1. Create a shorter motif from a given PCM without actual instances.

fyicenter$ python
>>> from io import StringIO
>>> from Bio import motifs
>>> pcm_s = """>Shorter
... A  [ 3.00   7.00   0.00   2.00   1.00 ]
... C  [ 0.00   0.00   5.00   2.00   6.00 ]
... G  [ 0.00   0.00   0.00   3.00   0.00 ]
... T  [ 4.00   0.00   2.00   0.00   0.00 ]
... """
>>> handle = StringIO(pcm_s)
>>> m_s = motifs.read(handle, "jaspar")

2. Create a longer motif from a given PCM without actual instances.

>>> pcm_l = """>Longer
... A  [ 30.00   0.00   0.00 100.00   0.00   0.00   0.00  0.00 15.00 ]
... C  [ 10.00   0.00   0.00   0.00 100.00 100.00 100.00  0.00 15.00 ]
... G  [ 50.00   0.00   0.00   0.00   0.00   0.00   0.00 60.00 55.00 ]
... T  [ 10.00 100.00 100.00   0.00   0.00   0.00   0.00 40.00 15.00 ]
... """
>>> handle = StringIO(pcm_l)
>>> m_l = motifs.read(handle, "jaspar")
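To see what the bracketed PCM text holds without Biopython, here is a minimal sketch of a parser. The helper name parse_pcm is hypothetical and not part of Bio.motifs, and it handles only the simple one-record layout used in this tutorial.

```python
def parse_pcm(text):
    """Parse one bracketed PCM record into (name, counts).

    Hypothetical helper for illustration; not a Bio.motifs function.
    """
    name = None
    counts = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: everything after ">" is the motif name.
            name = line[1:].strip()
            continue
        # Row line: a letter followed by bracketed counts.
        letter, rest = line.split(None, 1)
        values = rest.replace("[", " ").replace("]", " ").split()
        counts[letter] = [float(v) for v in values]
    lengths = {len(row) for row in counts.values()}
    if len(lengths) != 1:
        raise ValueError("rows have different lengths")
    return name, counts


pcm_s = """>Shorter
A  [ 3.00   7.00   0.00   2.00   1.00 ]
C  [ 0.00   0.00   5.00   2.00   6.00 ]
G  [ 0.00   0.00   0.00   3.00   0.00 ]
T  [ 4.00   0.00   2.00   0.00   0.00 ]
"""

name, counts = parse_pcm(pcm_s)
print(name, counts["T"])
```

Each column of the shorter motif sums to 7, which is a quick sanity check that the rows were read in the right order.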

3. Calculate their PSSMs with the same pseudocounts and background.

>>> pseudocounts = {"A": 0.6, "C": 0.4, "G": 0.4, "T": 0.6}
>>> background = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}

>>> pwm_s = m_s.counts.normalize(pseudocounts)
>>> pssm_s = pwm_s.log_odds(background)

>>> pwm_l = m_l.counts.normalize(pseudocounts)
>>> pssm_l = pwm_l.log_odds(background)
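The two calls above can be pictured as plain arithmetic. The following is a rough pure-Python sketch, assuming that normalize() adds the pseudocounts to every column and divides by the column total, and that log_odds() takes the base-2 logarithm of each probability over its background frequency. The function names mirror the Biopython methods only for readability; they are stand-ins, not the library code.

```python
import math


def normalize(counts, pseudocounts):
    """Counts -> per-column probabilities (a stand-in for counts.normalize)."""
    letters = sorted(counts)
    length = len(counts[letters[0]])
    pwm = {letter: [] for letter in letters}
    for i in range(length):
        column = {l: counts[l][i] + pseudocounts[l] for l in letters}
        total = sum(column.values())
        for l in letters:
            pwm[l].append(column[l] / total)
    return pwm


def log_odds(pwm, background):
    """Probabilities -> log2(p / background); needs p > 0, hence pseudocounts."""
    return {
        l: [math.log2(p / background[l]) for p in row]
        for l, row in pwm.items()
    }


counts_s = {
    "A": [3, 7, 0, 2, 1],
    "C": [0, 0, 5, 2, 6],
    "G": [0, 0, 0, 3, 0],
    "T": [4, 0, 2, 0, 0],
}
pseudocounts = {"A": 0.6, "C": 0.4, "G": 0.4, "T": 0.6}
background = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}

pwm_s = normalize(counts_s, pseudocounts)
pssm_s = log_odds(pwm_s, background)
print(pwm_s["A"][0], pssm_s["A"][0])
```

For the first column, A has (3 + 0.6) / (7 + 2) = 0.4, so its log-odds score is log2(0.4 / 0.3). Using the same pseudocounts and background for both motifs keeps their scores on the same scale.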

4. Compare them by calling the dist_pearson() function.

>>> distance, offset = pssm_l.dist_pearson(pssm_s)

>>> distance
0.23924403149343054

>>> offset
2

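The distance is 1.0 − r, where r is the Pearson correlation coefficient (PCC) between the log-odds scores of the two motifs at their best alignment. Positions of the longer motif that the shorter motif does not cover are padded with the background distribution. The offset of 2 means the shorter motif starts at position 2 (counting from 0) of the longer one, which is easy to see from the consensus sequences of the two motifs.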
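The calculation can be sketched in pure Python. This is an illustrative reimplementation under stated assumptions, not the Biopython source: it assumes every offset is tried, that uncovered columns contribute log-odds scores of 0 (background), and that the offset with the highest r wins. The helper names log_odds, pearson_at and dist_pearson are chosen here for illustration.

```python
import math


def log_odds(counts, pseudocounts, background):
    """Counts -> log2-odds scores, column by column (pseudocounts must be > 0)."""
    letters = sorted(counts)
    length = len(counts[letters[0]])
    pssm = {l: [] for l in letters}
    for i in range(length):
        total = sum(counts[l][i] + pseudocounts[l] for l in letters)
        for l in letters:
            p = (counts[l][i] + pseudocounts[l]) / total
            pssm[l].append(math.log2(p / background[l]))
    return pssm


def pearson_at(x, y, offset):
    """PCC with y placed `offset` columns into x; uncovered columns count as 0."""
    letters = sorted(x)
    len_x, len_y = len(x[letters[0]]), len(y[letters[0]])
    # Number of cells in the padded alignment, including background columns.
    norm = max(len_x, offset + len_y) * len(letters)
    sx = sy = sxx = sxy = syy = 0.0
    for pos in range(min(len_x - offset, len_y)):
        for l in letters:
            a, b = x[l][pos + offset], y[l][pos]
            sx += a
            sy += b
            sxx += a * a
            sxy += a * b
            syy += b * b
    sx, sy, sxx, sxy, syy = (v / norm for v in (sx, sy, sxx, sxy, syy))
    return (sxy - sx * sy) / math.sqrt((sxx - sx * sx) * (syy - sy * sy))


def dist_pearson(a, b):
    """Return (1 - best r, offset of b relative to a)."""
    len_a = len(next(iter(a.values())))
    len_b = len(next(iter(b.values())))
    best_r, best_offset = -2.0, 0
    for offset in range(-len_a + 1, len_b):
        if offset < 0:
            r = pearson_at(a, b, -offset)  # b starts inside a
        else:
            r = pearson_at(b, a, offset)   # a starts inside b
        if r > best_r:
            best_r, best_offset = r, -offset
    return 1.0 - best_r, best_offset


pseudocounts = {"A": 0.6, "C": 0.4, "G": 0.4, "T": 0.6}
background = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
counts_s = {"A": [3, 7, 0, 2, 1], "C": [0, 0, 5, 2, 6],
            "G": [0, 0, 0, 3, 0], "T": [4, 0, 2, 0, 0]}
counts_l = {"A": [30, 0, 0, 100, 0, 0, 0, 0, 15],
            "C": [10, 0, 0, 0, 100, 100, 100, 0, 15],
            "G": [50, 0, 0, 0, 0, 0, 0, 60, 55],
            "T": [10, 100, 100, 0, 0, 0, 0, 40, 15]}

pssm_s = log_odds(counts_s, pseudocounts, background)
pssm_l = log_odds(counts_l, pseudocounts, background)
distance, offset = dist_pearson(pssm_l, pssm_s)
print(distance, offset)
```

Under these assumptions the sketch should land on the same distance and offset as the Biopython call in step 4, up to floating-point rounding.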

>>> m_s.consensus 
Seq('TACGC')

>>> m_l.consensus 
Seq('GTTACCCGG')

# alignment at offset 2, using b for the background distribution
m_s: bbTACGCbb
m_l: GTTACCCGG
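The padded view above can be produced from the offset and the two consensus strings. This is a small sketch; render_alignment is a hypothetical helper, and it assumes the sign convention seen in this example, where a positive offset means the shorter motif starts that many positions into the longer one.

```python
def render_alignment(longer, shorter, offset, pad="b"):
    """Return (shorter_row, longer_row) padded to a common width.

    offset is the start of `shorter` relative to position 0 of `longer`;
    a negative value means `shorter` starts before `longer`.
    """
    left_long = max(-offset, 0)
    left_short = max(offset, 0)
    width = max(left_long + len(longer), left_short + len(shorter))
    row_long = (pad * left_long + longer).ljust(width, pad)
    row_short = (pad * left_short + shorter).ljust(width, pad)
    return row_short, row_long


row_s, row_l = render_alignment("GTTACCCGG", "TACGC", 2)
print("m_s:", row_s)
print("m_l:", row_l)
```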

 


2023-06-19