Database Searching: PEAKS DB
Molecular & Cellular Proteomics published an article by Zhang et al. entitled “PEAKS DB: De Novo Sequencing Assisted Database Search for Sensitive and Accurate Peptide Identification” (DOI: 10.1074/mcp.M111.010587).
The paper describes the PEAKS DB search algorithm, which uses de novo sequencing to assist database search. This hybrid approach not only significantly increases the accuracy and sensitivity of the database search, but can also report a list of peptides that are found only by de novo sequencing. On a standard benchmark dataset, PEAKS DB found as many as 35% more peptide-spectrum matches than Mascot+Percolator. The paper also introduces the -10lgP score of PEAKS DB and how it is converted to an FDR (false discovery rate).
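To make the relationship between a -10lgP score and an FDR more concrete, here is a minimal Python sketch. It assumes that -10lgP is simply -10 times the base-10 logarithm of a match P-value, and it estimates the FDR with a generic target-decoy count (decoy matches divided by target matches at or above a score threshold). The function names and the sample data are hypothetical, and this is not necessarily the exact procedure described in the paper.

```python
import math


def score_from_p(p_value):
    """Convert a P-value into a -10lgP style score: -10 * log10(P)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return -10.0 * math.log10(p_value)


def estimate_fdr(psms, threshold):
    """Estimate FDR at a score threshold from (score, is_decoy) pairs.

    Assumption: a plain target-decoy estimate, decoys / targets,
    counted over matches scoring at or above the threshold.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


def threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score threshold whose estimated FDR is <= max_fdr.

    Returns None if no threshold satisfies the requested FDR.
    """
    best = None
    for score in sorted({s for s, _ in psms}, reverse=True):
        if estimate_fdr(psms, score) <= max_fdr:
            best = score
    return best


# Hypothetical peptide-spectrum matches: (score, is_decoy)
example_psms = [(50.0, False), (40.0, False), (30.0, True), (20.0, False)]
chosen = threshold_for_fdr(example_psms, max_fdr=0.01)
```

Under these assumptions, a P-value of 0.01 corresponds to a score of 20, and a higher score threshold trades fewer reported matches for a lower estimated FDR.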
ABRF 2011
The iPRG study (via ABRF) lets researchers benchmark against one another in their ability to accurately identify peptides. The regular study allows multiple search engines to be used to identify the maximum number of peptides. While using multiple engines is encouraged for daily analysis, it obscures the contribution made by each individual database search engine. The charts below present a streamlined version of the charts produced by the iPRG study, restricted to cases where researchers employed only one peptide identification method, in order to give a clear view of independent findings.
Researchers using PEAKS software performed excellently, identifying a large number of peptides at a very low false discovery rate (FDR). This study is another illustration of the ability of PEAKS software to combine a high peptide identification rate with a low FDR in database searching.
The ESR or FDR chart above shows the “Extraordinary Skill Rate or High False Discovery Rate” of each user's method in the study. The red bars represent results that differ from the consensus of other engines; yellow bars represent results without consensus. The red and yellow bars can be regarded as “soft” lower and upper bounds of the FDR. iPRG requested a 1% FDR, so the chart indicates how accurately each method reports a low false positive rate. These results were obtained from iPRG slide 32.

The sample was spiked with the Sigma 48 proteins to act as the true positive estimator for each methodology. The ability to accurately identify the spike, along with overall performance, indicates the user's true performance, as shown in the chart to the right. The results presented here were obtained from iPRG slide 36.

A number of official conclusions were made by the organizers; of particular interest were the following:
- People were generally over-optimistic about how reliable their results were (FDR underestimation).
- Experience with tools was thought to have contributed significantly to the success of a lab's results. To make sure you are getting the most out of PEAKS, we offer FREE regularly scheduled weekly training webinars. Click here for more details about PEAKS' free training webinars.
To view the complete results, visit this link to the original slides presented by the iPRG: iPRG2011_slides_ABRF_20110309.pdf.
We wish to be clear in presenting these findings. iPRG studies are not competitions. Leftmost positions are not meant to imply best; they simply reflect the sorting criterion, which is the total number of confident IDs. This was chosen as a convenient means of sorting, and the same sort is used throughout for consistency.