Compare Motifs Using PSSM with Bio.motifs
How to Compare Motifs Using PSSM with Bio.motifs?
✍: FYIcenter.com
If you have the PSSMs (position-specific scoring matrices) of two motifs, you can compare them with the PSSM's dist_pearson() method. It returns the distance between the two motifs and the position offset of their best alignment.
1. Create a shorter motif from a given PCM without actual instances.
fyicenter$ python
>>> from io import StringIO
>>> from Bio import motifs
>>> pcm_s = """>Shorter
... A [ 3.00 7.00 0.00 2.00 1.00 ]
... C [ 0.00 0.00 5.00 2.00 6.00 ]
... G [ 0.00 0.00 0.00 3.00 0.00 ]
... T [ 4.00 0.00 2.00 0.00 0.00 ]
... """
>>> handle = StringIO(pcm_s)
>>> m_s = motifs.read(handle, "jaspar")
2. Create a longer motif from a given PCM without actual instances.
>>> pcm_l = """>Longer
... A [ 30.00 0.00 0.00 100.00 0.00 0.00 0.00 0.00 15.00 ]
... C [ 10.00 0.00 0.00 0.00 100.00 100.00 100.00 0.00 15.00 ]
... G [ 50.00 0.00 0.00 0.00 0.00 0.00 0.00 60.00 55.00 ]
... T [ 10.00 100.00 100.00 0.00 0.00 0.00 0.00 40.00 15.00 ]
... """
>>> handle = StringIO(pcm_l)
>>> m_l = motifs.read(handle, "jaspar")
3. Calculate their PSSMs with the same pseudocounts and background.
>>> pseudocounts = {"A": 0.6, "C": 0.4, "G": 0.4, "T": 0.6}
>>> background = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
>>> pwm_s = m_s.counts.normalize(pseudocounts)
>>> pssm_s = pwm_s.log_odds(background)
>>> pwm_l = m_l.counts.normalize(pseudocounts)
>>> pssm_l = pwm_l.log_odds(background)
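To make the arithmetic of step 3 concrete, here is a minimal pure-Python sketch for a single column. It assumes that normalize() adds the pseudocounts to the counts before dividing by the column total, and that log_odds() takes the base-2 logarithm of each probability divided by its background frequency. The helper name column_log_odds is hypothetical and is not part of Bio.motifs.

```python
import math

def column_log_odds(counts, pseudocounts, background):
    """Return log2 odds scores for one motif column (hypothetical helper)."""
    # Add pseudocounts, then normalize so the column sums to 1.
    adjusted = {b: counts[b] + pseudocounts[b] for b in counts}
    total = sum(adjusted.values())
    probs = {b: adjusted[b] / total for b in adjusted}
    # Score each letter against its background frequency.
    return {b: math.log2(probs[b] / background[b]) for b in probs}

pseudocounts = {"A": 0.6, "C": 0.4, "G": 0.4, "T": 0.6}
background = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
# First column of the shorter motif from step 1.
first_column = {"A": 3.0, "C": 0.0, "G": 0.0, "T": 4.0}
scores = column_log_odds(first_column, pseudocounts, background)
for base in "ACGT":
    print(base, round(scores[base], 3))
```

A positive score means the letter is more frequent in the motif column than in the background; a negative score means it is less frequent. Without pseudocounts, the zero counts for C and G would give a probability of 0, whose logarithm is undefined.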
4. Compare them by calling the dist_pearson() function.
>>> distance, offset = pssm_l.dist_pearson(pssm_s)
>>> distance
0.23924403149343054
>>> offset
2
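The distance is 1.0 − r, where r is the Pearson correlation coefficient (PCC) between the two motifs at their best alignment. The offset of 2 means the shorter motif starts two positions into the longer one. The alignment is shown below using the consensus sequences, with the shorter motif padded by the background distribution.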
>>> m_s.consensus
Seq('TACGC')
>>> m_l.consensus
Seq('GTTACCCGG')

# alignment using b as background distribution
m_s: bbTACGCbb
m_l: GTTACCCGG
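The idea behind a Pearson-based distance can be sketched in plain Python. The version below is a simplification and not the Bio.motifs implementation: it slides the shorter matrix along the longer one without overhang, computes r over the overlapping columns only, and applies no background padding, so its numbers will generally differ from those returned by dist_pearson(). The function names and the toy matrices are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def best_alignment(long_cols, short_cols):
    """Return (distance, offset) of the best ungapped placement.

    distance is 1 - r, so 0 means perfectly correlated scores.
    """
    ys = [v for col in short_cols for v in col]
    best = None
    for offset in range(len(long_cols) - len(short_cols) + 1):
        window = long_cols[offset:offset + len(short_cols)]
        xs = [v for col in window for v in col]
        d = 1.0 - pearson(xs, ys)
        if best is None or d < best[0]:
            best = (d, offset)
    return best

# Toy score matrices: one [A, C, G, T] list per column.
long_cols = [[1, -1, -1, 0], [-1, -1, -1, 2], [2, -1, -1, -1], [-1, 2, -1, -1]]
short_cols = [[2, -1, -1, -1], [-1, 2, -1, -1]]
distance, offset = best_alignment(long_cols, short_cols)
print(distance, offset)
```

In this toy data the short matrix is an exact copy of the last two columns of the long one, so the best placement is at offset 2 with a distance of essentially zero.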
2023-06-19