Overview
Structure-based drug design has become an integral part of the drug discovery process and has made a profound contribution at many pharmaceutical companies. As more structural information becomes available, a variety of practical structure-based techniques have been developed to accelerate every stage of drug discovery. However, structure-based drug design depends on a large supply of structures determined by experimental methods. The recent explosion in genomic data has produced millions of protein sequences, and researchers cannot afford to perform X-ray or NMR analysis on every protein. Currently, only 40,000 structures are stored in the PDB database. Consequently, pharmaceutical companies are increasingly turning to bioinformatics technologies that can reduce drug discovery and development costs. The practical role of protein structure prediction is now more important than ever.
Given a target protein sequence, if there is a homologous protein with a known 3D structure, it can be found by using sequence search tools such as PSI-BLAST or BLASTP. The target's structure is then built from the known structure. However, when the sequence homology is not significant, i.e. less than 25% identity, PSI-BLAST and BLASTP cannot come up with a confident hit. Unlike PSI-BLAST or BLASTP, which simply perform a sequence search, protein threading (fold recognition) makes use of both homology and structure information. It scans a protein sequence of unknown structure against a database of known structures. Using a scoring function to assess the compatibility between three-dimensional structures and the linear protein sequence, it identifies the best structural template from which to build the sequence's structure. As a result, protein threading gives a better prediction than homology modeling when there is only marginal sequence homology. A comparison between RAPTOR and PSI-BLAST in CASP competitions can be found here.
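As a rough illustration of the 25% cutoff mentioned above, the sketch below computes percent identity from an already-aligned pair of sequences and suggests which kind of search to try. The function names are hypothetical and are not part of RAPTOR or BLAST. It assumes identity is measured over columns where both sequences have a residue, which is only one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences contribute a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def suggest_method(identity: float, threshold: float = 25.0) -> str:
    """Illustrative rule of thumb: below the threshold, try threading."""
    return "threading" if identity < threshold else "homology search"


identity = percent_identity("ACDE-FG", "ACQEWF-")
print(identity, suggest_method(identity))
```

The threshold is exposed as a parameter because the 25% figure is a guideline rather than a hard boundary.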
RAPTOR is an innovative software tool designed for accurate protein structure prediction. It combines advanced analysis tools in one integrated software solution and provides three different threading methods. RAPTOR's unique integer programming optimization approach is most effective for finding structure templates for targets with low sequence homology and is able to generate high-quality models. The easy-to-use interface and easy-to-understand E-values suit both beginners and experts. Above all, the intuitive display of the output enables users to understand the results at a glance.
Features
Unique Threading Algorithm
- RAPTOR offers three threading algorithms: two based on dynamic programming (DP) and one based on integer programming (IP). The DP algorithms effectively handle most easy sequences (high homology). For hard sequences (low homology), the IP algorithm gives a confident prediction.
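To give a feel for what a dynamic programming alignment does, here is a minimal sketch of a generic global alignment score (Needleman-Wunsch style) between a target sequence and a template sequence. It is not RAPTOR's DP algorithm. The match, mismatch and gap values are arbitrary assumptions, and a real threading score would also use profile and structural terms.

```python
def global_alignment_score(seq: str, template: str,
                           match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best global alignment score between seq and template (toy scoring)."""
    rows, cols = len(seq) + 1, len(template) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq[i - 1] == template[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in the template
                score[i][j - 1] + gap,       # gap in the sequence
            )
    return score[-1][-1]


print(global_alignment_score("ACD", "AD"))
```

Each cell depends only on its three neighbours, which is what makes this kind of scoring fast when alignment columns can be scored independently of one another.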
Supreme Accuracy
- RAPTOR has superior recognition of structural similarities at the fold, family and superfamily levels. It delivers high-accuracy alignments and models, as demonstrated in recent community-wide tests.
Conservation Discovery
- RAPTOR has proven its ability to find structural conservation that goes unnoticed by most threading algorithms. RAPTOR’s unique integer programming algorithm takes pairwise contact potentials into consideration during threading, which greatly enhances prediction accuracy.
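The idea of a pairwise contact term can be sketched as follows: once residues from the target are assigned to template positions, every pair of positions in contact in the template contributes an energy that depends on both residues placed there. This is an illustration only, not RAPTOR's scoring function. The data layout, the function name and the toy potential values are all invented for the example.

```python
from typing import Dict, FrozenSet, List, Tuple


def pairwise_contact_energy(assignment: Dict[int, str],
                            contacts: List[Tuple[int, int]],
                            potential: Dict[FrozenSet[str], float]) -> float:
    """Sum the pair energies over template contacts whose positions are both filled."""
    total = 0.0
    for i, j in contacts:
        if i in assignment and j in assignment:
            # Unordered residue pair, so (K, E) and (E, K) share one entry.
            pair = frozenset((assignment[i], assignment[j]))
            total += potential.get(pair, 0.0)  # unknown pairs contribute nothing
    return total


# Toy values, invented for illustration (lower energy = more favourable).
toy_potential = {frozenset({"L"}): -1.0, frozenset({"K", "E"}): -0.5}
assignment = {0: "L", 1: "K", 2: "L", 3: "E"}   # template position -> residue
contacts = [(0, 2), (1, 3), (0, 3)]             # template positions in contact
print(pairwise_contact_energy(assignment, contacts, toy_potential))
```

Because each term depends on two assigned positions at once, alignment columns can no longer be scored independently, which is why pairwise terms call for an optimization method beyond a simple column-by-column dynamic program.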
Up-To-Date Template Library
- The structure template library used in RAPTOR is a representative subset of the PDB database. As the PDB is constantly updated, BSI provides template updates every three months.
Easy to Install and Use
- Installation of RAPTOR is fully automated and users just need to specify the installation path. A friendly GUI enables even beginners to start quickly. With the preset configuration, users simply need to click the Run button to obtain final results.
Intuitive Output
- The intuitive output makes it possible to understand relevant data at a glance. E-values enable users to easily judge the significance of a prediction. The color matrix representing sequence profiles helps users to identify conserved residues easily. In the PIR format alignment, colors representing the different types of secondary structure allow users to compare the target and template comprehensively. RasMol is used to help users examine the predicted structures in 3D space.
Flexible Interface
- RAPTOR works seamlessly with Modeller and can be configured to call it automatically. If you use ICM Pro, RAPTOR can export ICM Pro compatible input files so you can easily pipeline RAPTOR with ICM Pro.
Extensive Platform Support
- RAPTOR supports both Windows and Linux. The latest versions of RAPTOR have been fully tested on Windows XP and 2000. For Linux, current versions of RAPTOR have been tested on Red Hat, Fedora, Debian, and SUSE.
Parallel Computing
- The threading process is parallelized in RAPTOR. RAPTOR can run in a distributed environment, on clusters, or on SMP machines, which substantially reduces run time.
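Scoring one sequence against many templates parallelizes naturally, because each template can be scored independently. The sketch below shows the general pattern with Python's standard `concurrent.futures` module. It does not describe how RAPTOR distributes its work. `score_against_library` and `toy_score` are hypothetical names, and the toy score (count of identical positions, higher is better) stands in for a real threading score.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def score_against_library(sequence: str,
                          templates: Dict[str, str],
                          score_fn: Callable[[str, str], float],
                          executor: Executor) -> List[Tuple[str, float]]:
    """Score the sequence against every template and rank hits, best first."""
    names = list(templates)
    # One independent task per template; map() preserves the input order.
    scores = executor.map(score_fn,
                          [sequence] * len(names),
                          [templates[name] for name in names])
    return sorted(zip(names, scores), key=lambda hit: hit[1], reverse=True)


def toy_score(sequence: str, template: str) -> int:
    """Placeholder score: number of identical residues at the same position."""
    return sum(a == b for a, b in zip(sequence, template))


library = {"t1": "ACDEFG", "t2": "ACDQQQ", "t3": "WWWWWW"}
with ThreadPoolExecutor(max_workers=2) as pool:
    hits = score_against_library("ACDEFG", library, toy_score, pool)
print(hits)
```

A thread pool keeps the example short. For CPU-bound scoring in Python, passing a `ProcessPoolExecutor` instead would be the more usual choice, since the function accepts any executor.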
Case & Scenario Application
When you have a protein sequence with an unknown structure, RAPTOR can predict its structure in hours, whereas NMR or X-ray techniques are costly and take several months. Overall, RAPTOR will save you and your lab a lot of time, money and effort. This is especially useful for academic users who want to publish as soon as possible.
The structural information predicted by RAPTOR can be used in many different areas. Here are some examples from our current customers. They do not cover every use of RAPTOR, but they show how it can help your research.
Protein Threading
- Protein threading is the most fundamental, everyday use of RAPTOR. When there is only marginal sequence homology, your homology search tool (e.g. BLAST) may fail. In that case, you should try RAPTOR. When the sequence homology is low (below 25%), RAPTOR consistently gives a constructive prediction.
Functional Annotation
- This is based on the observation that proteins with similar structures often have similar functions. If you have a protein sequence with an unknown structure and want to know its function, you can run RAPTOR on the sequence. The top template returned by RAPTOR (whose function is already known) is expected to have a structure similar to that of the query sequence. Consequently, the sequence may have the same function (active site) as the template. For more information about functional annotation, see the RAPTOR Pipeline.
Identifying Putative Distant Relationships
- Suppose you have several protein families that you have already studied thoroughly, and you now want to identify putative distant members of those families. You can create your own template library by turning the proteins in your families into templates. You can then run RAPTOR to scan a sequence database against your own template library. In this way, you will pick out the sequences that have distant relationships with your protein families.
Secondary Structure Prediction
- RAPTOR uses secondary structure prediction results when threading. Once a 3D structure prediction has been obtained, it can in turn be used to further improve the accuracy of the secondary structure prediction.
Crystallography
- When crystallographers try to solve the X-ray crystal structure of a protein, they need to collect as much 3D information as possible, as early as possible. The plausible structural template identified by RAPTOR can be used to model the structure of the protein. For example, it can be used in molecular replacement phasing.