Benchmarking is the process of comparing one method's results against another.
CASP (Critical Assessment of Techniques for Protein Structure Prediction) is a community-wide experiment held every two years by NIH. All participating 3D protein structure prediction servers are assessed in the experiment. RAPTOR has been an active participant since CASP5 in 2002.
The targets used in CASP have been classified into two groups: Homology Modeling (for easy targets) and Fold Recognition (for hard targets). Servers are evaluated for each group.
Lindahl's and Fischer et al.'s Benchmarks
For those unfamiliar with CASP, here is a more straightforward set of benchmark results, with explanation.
The Fischer et al. benchmark set consists of 68 target sequences and 301 templates. RAPTOR ranks 56 of the 68 pairs as top 1, a prediction rate of about 82%. The fold recognition performance of RAPTOR was further tested on Lindahl's benchmark set, which consists of 976 protein sequences. Threading them all against all gives 976 × 975 threading pairs. We measured RAPTOR's performance at three similarity levels: fold, superfamily and family. Results are shown in Table 1. The results of other methods are taken from Shi et al.'s paper (footnote 1). Prediction correctness is assessed based on the SCOP classification.
| Method | Family Top 1 | Family Top 5 | Superfamily Top 1 | Superfamily Top 5 | Fold Top 1 | Fold Top 5 |
|---|---|---|---|---|---|---|
| RAPTOR | 83.7 | 86.4 | 55.0 | 67.7 | 39.6 | 61.9 |
| FUGUE | 82.2 | 85.8 | 41.9 | 53.2 | 12.5 | 26.8 |
| PSI-BLAST | 71.2 | 72.3 | 27.4 | 27.9 | 4.0 | 4.7 |
| HMMER-PSIBLAST | 67.7 | 73.5 | 20.7 | 31.3 | 4.4 | 14.6 |
| SAMT98-PSIBLAST | 70.1 | 75.4 | 28.3 | 38.9 | 3.4 | 18.7 |
| BLASTLINK | 74.6 | 78.9 | 29.3 | 40.6 | 6.9 | 16.5 |
| SSEARCH | 68.6 | 75.7 | 20.7 | 32.5 | 5.6 | 15.6 |
| THREADER | 49.2 | 58.9 | 10.8 | 24.7 | 14.6 | 37.7 |
Table 1. The performance of RAPTOR at three different similarity levels.
As shown in Table 1, RAPTOR performs better than the other methods at all similarity levels, especially the fold level. At the family level, RAPTOR's recognition performance is comparable to that of FUGUE, which is the best method at the family and superfamily levels other than RAPTOR. We may conclude that a strict treatment of pairwise interactions is necessary for fold and superfamily level recognition. At the family level, sequence (or profile) alignment can attain satisfactory results.
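The Top 1 and Top 5 columns can be read as the percentage of query sequences for which a correct template appears among the first one or five ranked templates. The following is a minimal sketch of that calculation on toy data. The function name `top_k_rate`, the query and template labels, and the data layout are assumptions made for illustration, and the sketch does not reproduce the exact evaluation protocol of Lindahl's benchmark.

```python
from typing import Dict, List, Set


def top_k_rate(rankings: Dict[str, List[str]],
               correct: Dict[str, Set[str]],
               k: int) -> float:
    """Percentage of queries with at least one correct template in the top k."""
    if not rankings:
        raise ValueError("no queries to score")
    hits = sum(
        1
        for query, ranked in rankings.items()
        if any(t in correct.get(query, set()) for t in ranked[:k])
    )
    return 100.0 * hits / len(rankings)


# Toy rankings (hypothetical labels): templates sorted best-first per query.
rankings = {
    "q1": ["tA", "tB", "tC"],
    "q2": ["tD", "tE", "tF"],
    "q3": ["tG", "tH", "tI"],
    "q4": ["tJ", "tK", "tL"],
}
# Templates considered correct for each query (e.g. same SCOP fold).
correct = {
    "q1": {"tA"},
    "q2": {"tF"},
    "q3": {"tZ"},
    "q4": {"tJ", "tK"},
}

top1 = top_k_rate(rankings, correct, 1)
top3 = top_k_rate(rankings, correct, 3)

# Figures quoted in the text: 56 of 68 Fischer pairs, and all-against-all
# threading of 976 sequences.
fischer_rate = 100.0 * 56 / 68
lindahl_pairs = 976 * 975
```

On the toy data, `top1` counts only q1 and q4 as hits, while `top3` also counts q2, which is why the Top 5 columns in Table 1 are never lower than the Top 1 columns.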
Specific Examples
We now present several structure prediction examples generated by RAPTOR in CAFASP3 and LiveBench6. The experimental structures of most CAFASP3 targets have not yet been released for publication, so we also chose some targets from LiveBench.
Figure 1 (taken from the CAFASP3 website, generated by RasMol and MaxSub) presents the superimposition of the experimental structure (grey) and RAPTOR's predicted structure (black) of T0136 1. According to MaxSub's evaluation, 17 of 54 servers generated correct fold recognitions for this target, and RAPTOR produced the best alignment among them. MaxSub could superimpose a segment of 118 residues (the sequence length is 144) of the predicted structure onto the experimental structure with an RMSD of only 1.9 Å.

Fig. 1. The superimposition of experimental structure (grey color) and prediction structure (black color) of CAFASP3 target T0136 1.
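For readers unfamiliar with the RMSD figure quoted above, the sketch below shows how a root-mean-square deviation over a set of matched residues is computed, together with the fraction of the sequence that was superimposed. It assumes the two coordinate sets are already superimposed and matched one to one. It does not perform the superposition itself, nor the subset search that MaxSub carries out. The function names and the toy coordinates are illustrative assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """RMSD between two equally long, already superimposed coordinate sets."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


def coverage(n_superimposed: int, sequence_length: int) -> float:
    """Fraction of the target sequence included in the superimposed segment."""
    return n_superimposed / sequence_length


# Toy coordinates (hypothetical): every atom is displaced by 1 unit along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

toy_rmsd = rmsd(model, reference)

# Segment size and sequence length quoted in the text for this target.
segment_coverage = coverage(118, 144)
```

With the numbers quoted in the text, the superimposed segment covers roughly 82% of the target sequence.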
The following two figures were generated by RasMol based on the evaluation results of LiveBench6. Figure 2 shows an almost perfect prediction for target 1ll8A. The alignment accuracy score measured by MaxSub is more than 9 (on a scale of 10). Figure 3 presents a good structure prediction for target 1j53A, with an alignment accuracy score of more than 6. Considering the length of the target sequence, this prediction is very successful.
 
Fig. 2. The experimental structure (left) and the predicted structure (right) of 1kvzA.
 
Fig. 3. The experimental structure (left) and the predicted structure (right) of 1j53A.
Computing Efficiency Issues
A key advantage of our algorithm is that the memory requirement is only about O(|A| n²), where A is the edge set of the contact graph of the protein template structure and n is the query sequence length. The observed memory usage is 100-200 MB for most threading pairs. In practice, the computing time does not increase exponentially with target sequence size. Figure 4 shows the CPU time of threading 100 sequences (chosen randomly from Lindahl's benchmark), with lengths ranging from 25 to 572, to a typical template, 119l, of length 162. CPU time was measured on a single 400 MHz MIPS R12000 CPU of a Silicon Graphics Origin 3800 system with 20 GB of RAM. The computing time of our algorithm increases very slowly with sequence size. In fact, we found that for real protein data, our relaxed linear programs directly output integral solutions 99% of the time and generated only a few branch nodes when the solution was fractional.
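To make the O(|A| n²) bound concrete, the following sketch counts the |A| · n² entries implied by the bound and converts them to megabytes. The number of contact edges (50) and the storage cost of 8 bytes per entry are hypothetical values chosen for illustration only. They are not taken from RAPTOR, and the constant factor hidden by the O-notation is unknown.

```python
def pairwise_entries(num_contact_edges: int, query_length: int) -> int:
    """Upper-bound count |A| * n^2 implied by the O(|A| n^2) memory bound."""
    return num_contact_edges * query_length ** 2


def estimate_megabytes(num_contact_edges: int,
                       query_length: int,
                       bytes_per_entry: int = 8) -> float:
    """Rough memory estimate; bytes_per_entry is an assumed constant."""
    entries = pairwise_entries(num_contact_edges, query_length)
    return entries * bytes_per_entry / 1e6


# Hypothetical template with 50 contact edges and the longest query
# length mentioned in the text (572 residues).
entries = pairwise_entries(50, 572)
megabytes = estimate_megabytes(50, 572)

# Doubling the query length quadruples the estimate, as expected for n^2.
growth = estimate_megabytes(50, 1144) / megabytes
```

The useful point of the sketch is the scaling behaviour: memory grows linearly with the number of contact edges and quadratically with query length, so it stays polynomial even for long sequences.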
Figure 5 shows the CPU time used for the prediction of each CAFASP3/CASP5 target sequence. There were 62 targets in total and 3236 protein templates in our template database. CPU time increased very slowly with sequence size, except for one target (t0174) that took about 45 hours. After careful inspection, we found that there were 30 templates, each of which took about 15 hours of threading time. These templates are up for further examination.
Conclusions
In this paper, we have presented performance benchmarks of the software package RAPTOR, which adopts a novel integer programming approach to treat pairwise interactions rigorously in protein threading. Experimental results show that RAPTOR performs very well in terms of alignment accuracy and fold recognition for FR targets. As for computational efficiency, RAPTOR is also much better than algorithms that treat the pairwise potentials strictly when dealing with templates with complex interaction topology and long sequences.

Fig. 4. CPU time of threading 100 sequences to template 119l (lus=0.01s).

Fig. 5. CPU time of threading 62 CAFASP3 target sequences to 3236 templates.
Footnotes:
- Shi J, Blundell TL, Mizuguchi K. FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol. 2001 Jun 29;310(1):243-57.
- Fischer D, Elofsson A, Rice D, Eisenberg D. Assessing the performance of fold recognition methods by means of a comprehensive benchmark. Pac Symp Biocomput. 1996:300-18.
- Lindahl E, Elofsson A. Identification of related proteins on family, superfamily and fold level. J Mol Biol. 2000 Jan 21;295(3):613-25.
- Xu J, Li M, Kim D, Xu Y. RAPTOR: optimal protein threading by linear programming. J Bioinform Comput Biol. 2003;1(1):95-117.
- Xu J, Li M. Assessment of RAPTOR's linear programming approach in CAFASP3. Proteins. 2003;53(S6):579-84.