Motif Counts and Consensus with Bio.motifs
How to Get Motif Counts and Consensus with Bio.motifs Module?
✍: FYIcenter.com
Motif counts represent how often each letter appears at each position in a set of motif samples. The motif counts matrix is also called a PFM (Position Frequency Matrix).
Motif consensus is the sequence formed by taking, at each position of the motif, the letter with the largest value in the corresponding column of the motif counts. In other words, the motif consensus is the most probable sequence given the motif sample set, and so an estimate of the sequence most likely to appear in the entire population.
Motif anticonsensus is the sequence formed by taking, at each position of the motif, the letter with the smallest value in the corresponding column of the motif counts. In other words, the motif anticonsensus is the least probable sequence given the motif sample set, and so an estimate of the sequence least likely to appear in the entire population.
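The three definitions above can be sketched in plain Python, without Biopython. This is only an illustration and not Biopython's implementation: the helper names count_matrix and pick_per_column are hypothetical, the alphabet is assumed to be "ACGT", and ties are resolved by taking the first tied letter in alphabetical order (an assumption that happens to agree with the outputs shown in the steps below).

```python
# Plain-Python sketch of the definitions; not Biopython's own code.
samples = ["TACAA", "TACGC", "TACAC", "TACCC", "AACCC", "AATGC", "AATGC"]
ALPHABET = "ACGT"  # assumed DNA alphabet, in alphabetical order


def count_matrix(seqs, alphabet=ALPHABET):
    """Return {letter: [count at position 0, 1, ...]} for equal-length sequences."""
    length = len(seqs[0])
    counts = {letter: [0] * length for letter in alphabet}
    for seq in seqs:
        for pos, letter in enumerate(seq):
            counts[letter][pos] += 1
    return counts


def pick_per_column(counts, chooser, alphabet=ALPHABET):
    """Build a sequence by applying chooser (max or min) to every column."""
    length = len(next(iter(counts.values())))
    # max() and min() return the first qualifying letter, so ties go to
    # the letter that comes first in the alphabet (assumption, see above).
    return "".join(
        chooser(alphabet, key=lambda letter: counts[letter][pos])
        for pos in range(length)
    )


counts = count_matrix(samples)
consensus = pick_per_column(counts, max)      # most frequent letter per column
anticonsensus = pick_per_column(counts, min)  # least frequent letter per column
print(consensus)      # TACGC
print(anticonsensus)  # CCATG
```

Storing the counts as one list per letter mirrors the row-per-letter layout that the counts matrix is printed in.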
1. Create a motif object from 7 sequences that match the motif pattern "[AT]A[CT][ACG][AC]".
fyicenter$ python
>>> from Bio import motifs
>>> samples = [
...   "TACAA",
...   "TACGC",
...   "TACAC",
...   "TACCC",
...   "AACCC",
...   "AATGC",
...   "AATGC"
... ]
>>> m = motifs.create(samples)
2. View motif counts.
>>> print(m.counts)
        0      1      2      3      4
A:   3.00   7.00   0.00   2.00   1.00
C:   0.00   0.00   5.00   2.00   6.00
G:   0.00   0.00   0.00   3.00   0.00
T:   4.00   0.00   2.00   0.00   0.00
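One way to read the matrix: each column describes one position of the motif, so the four counts in a column should add up to the number of samples (7 here). A minimal check in plain Python, with the values copied by hand from the output above:

```python
# Counts copied from the printed matrix (one list per letter, positions 0-4).
counts = {
    "A": [3, 7, 0, 2, 1],
    "C": [0, 0, 5, 2, 6],
    "G": [0, 0, 0, 3, 0],
    "T": [4, 0, 2, 0, 0],
}

# zip(*rows) turns the per-letter rows into per-position columns.
column_totals = [sum(column) for column in zip(*counts.values())]
print(column_totals)  # [7, 7, 7, 7, 7]
```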
3. View motif consensus.
>>> print(m.consensus)
TACGC
4. View motif anticonsensus.
>>> print(m.anticonsensus)
CCATG
5. If a position has multiple letters tied for the highest count, Biopython selects one of those letters.
>>> samples = [
...   "TACAA",
...   "TACGC",
...   "TACAC",
...   "TACCC",
...   "AACCC",
...   "AATGC",
...   "AATGC",
...   "AACGC"
... ]
>>> m = motifs.create(samples)
>>> print(m.consensus)
AACGC
As you can see, the first position (index 0 in the motif counts) has both A and T with the highest count of 4. Biopython selects A.
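To see where such ties occur in a sample set, the columns can be tallied by hand. The sketch below is plain Python and does not use Biopython; tied_positions is a hypothetical helper name, and the "ACGT" alphabet is an assumption.

```python
# The 8 samples from step 5.
samples = [
    "TACAA", "TACGC", "TACAC", "TACCC",
    "AACCC", "AATGC", "AATGC", "AACGC",
]
ALPHABET = "ACGT"  # assumed DNA alphabet


def tied_positions(seqs, alphabet=ALPHABET):
    """Return {position: [letters sharing the highest count]} for tied columns only."""
    ties = {}
    for pos in range(len(seqs[0])):
        column = [seq[pos] for seq in seqs]
        tallies = {letter: column.count(letter) for letter in alphabet}
        top = max(tallies.values())
        leaders = [letter for letter in alphabet if tallies[letter] == top]
        if len(leaders) > 1:  # more than one letter reaches the top count
            ties[pos] = leaders
    return ties


ties = tied_positions(samples)
print(ties)  # {0: ['A', 'T']}
```

Only index 0 is reported, which matches the consensus output above: every other position has a single most frequent letter.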
2023-07-05