What does the domain prediction confidence mean?

The meaning of the domain prediction confidence depends on the method used to detect the region. For regions with a detected parent PDB structure, the confidence functions are designed to follow a similar trend across methods. The value is derived from the detection method as follows:

PDB-BLAST (homology): conf = -log10(e-value), e.g. e-value = 0.001 -> conf = 3.0 (strong detection threshold)

HHSEARCH (homology): conf = hhsearch_prob / 42.5, e.g. prob = 85.0 -> conf = 2.0 (strong detection threshold)

Pfam (de novo): conf = -log10(e-value), e.g. e-value = 0.001 -> conf = 3.0 (strong detection threshold)

msa (de novo): conf = block_depth +
    0.001 * block_occ +
    0.000001 * e_val_pref +
    0.000000001 * block_len
    Note: dominated by nr50 block depth.

cutpref (de novo): conf = 0
    Note: domain boundaries are determined solely by sequence transitions, strongly predicted loops, occupancy, and distance from the nearest block or terminus.
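The formulas above can be sketched in Python as follows. This is only an illustration of the arithmetic, not Robetta's actual code: the function names are hypothetical, and the logarithm is taken to be base 10 because that is what the example e-value = 0.001 -> conf = 3.0 implies.

```python
import math


def evalue_confidence(e_value: float) -> float:
    """PDB-BLAST and Pfam: conf = -log10(e-value).

    Assumption: base-10 logarithm, inferred from the example
    e-value = 0.001 -> conf = 3.0.
    """
    if e_value <= 0:
        raise ValueError("e-value must be positive to take its logarithm")
    return -math.log10(e_value)


def hhsearch_confidence(prob: float) -> float:
    """HHSEARCH: conf = hhsearch_prob / 42.5 (prob given in percent)."""
    return prob / 42.5


def msa_confidence(block_depth: float, block_occ: float,
                   e_val_pref: float, block_len: float) -> float:
    """msa: weighted sum in which block depth dominates.

    Each later term is scaled down by a further factor of 1000, so it
    only matters when the earlier terms are (nearly) equal.
    """
    return (block_depth
            + 1e-3 * block_occ
            + 1e-6 * e_val_pref
            + 1e-9 * block_len)


def cutpref_confidence() -> float:
    """cutpref: the confidence is always 0."""
    return 0.0


if __name__ == "__main__":
    print(evalue_confidence(0.001))      # approximately 3.0
    print(hhsearch_confidence(85.0))     # 2.0
    print(msa_confidence(5, 10, 2, 100))
```

Because the msa coefficients are spaced by factors of 1000, the smaller terms act roughly as tie-breakers between regions with the same block depth, provided each of those quantities stays well below 1000. That reading is an inference from the formula, not something the FAQ states.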

For homology modeling detections, the general trend lets one discriminate likely correct parents from improbable ones. If the confidence for a parent PDB is >= 3.0, it is almost certainly the right fold, and the model itself probably captures the features of the structure well. Between 2.0 and 3.0, it is usually the right fold, but model quality is likely to be reduced. Between 1.0 and 2.0, the fold is still right more than half the time, but even when the fold is correct the models are often not as good as they could be, because homology modeling becomes harder as the target grows more distant from its parent. In this extreme twilight-zone regime, Robetta therefore also provides de novo models for such domains.
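As a rough aid for reading these ranges, the thresholds can be written as a small lookup. This is a sketch under stated assumptions: the function name and the wording of the labels are hypothetical, each boundary value is assigned to the higher range, and the FAQ says nothing about confidences below 1.0, so that case is reported as undescribed.

```python
def interpret_homology_confidence(conf: float) -> str:
    """Map a homology confidence value to the ranges described above.

    Assumptions: boundary values (2.0, 3.0) fall into the higher range,
    and values below 1.0 are not covered by the FAQ.
    """
    if conf >= 3.0:
        return "almost certainly the right fold; model probably good"
    if conf >= 2.0:
        return "usually the right fold; model quality likely reduced"
    if conf >= 1.0:
        return ("right fold more than half the time; "
                "de novo models are also provided")
    return "below the ranges described in the FAQ"


if __name__ == "__main__":
    for value in (3.5, 2.4, 1.2, 0.5):
        print(value, "->", interpret_homology_confidence(value))
```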
