Database Search

PEAKS DB: Database Searching

Database searching is a standard way to identify peptides whose sequences are present in a database, and it is a core function of the PEAKS software. Although database searching cannot discover new peptides, it is an effective way to confirm the presence of expected peptides and post-translational modifications (PTMs). It also serves as the foundation for many other proteomics analyses, such as protein quantification and deep replication.

Accuracy and Sensitivity

Because many database search engines are available, false discovery rate (FDR) curves have been used to compare their accuracy and sensitivity systematically and simultaneously [1,2]. The PEAKS DB database search algorithm included in PEAKS has been shown to achieve a significantly better FDR curve than more traditional database search software [2] (Figure 1). This means that more peptides can be identified at the same or a lower FDR.


PEAKS DB FDR Benchmark

Figure 1: PEAKS FDR Curve Benchmark

This significant improvement in search accuracy and sensitivity comes from the unique approach that the PEAKS DB algorithm employs: it uses de novo sequences to validate the database search results. Because de novo sequencing does not consult the database, a match between a de novo sequence and a database search result is unlikely to be a random event and usually indicates that the peptide was identified correctly. By using PEAKS' high-quality de novo sequencing to assist the database search, together with other improvements, PEAKS DB produces a scoring function that better separates true identifications from false ones (Figure 2). The outcome is a significantly improved FDR curve.
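The idea that agreement between two independent identifications is evidence of correctness can be illustrated with a small sketch. It is not the PEAKS DB scoring function, which is not described here. The function name `residue_agreement` and the position-by-position comparison are assumptions made only for illustration.

```python
def residue_agreement(de_novo: str, db_peptide: str) -> float:
    """Return the fraction of positions at which two peptide sequences agree.

    Illustrative only: a naive position-by-position comparison,
    not the scoring used by PEAKS DB.
    """
    # Leucine (L) and isoleucine (I) have identical masses, so a mass
    # spectrum alone cannot distinguish them; treat them as the same residue.
    a = de_novo.upper().replace("I", "L")
    b = db_peptide.upper().replace("I", "L")
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    # Divide by the longer length so that a length mismatch lowers the score.
    return matches / max(len(a), len(b))


if __name__ == "__main__":
    # Hypothetical sequences, chosen only to exercise the function.
    print(residue_agreement("PEPTIDEK", "PEPTLDEK"))
    print(residue_agreement("PEPTIDEK", "PEPTIDER"))
```

A high agreement value between a de novo sequence and a database hit for the same spectrum would support the database identification, while a low value would suggest treating it with more caution.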


PEAKS Better Separates True and False Identifications

Figure 2: Using de novo assisted database search results to produce a better scoring function


Result Validation and Filtration

For each PEAKS DB search, the software calculates result statistics and shows them in a summary view. Figure 3 shows a few examples of the charts produced. From these charts, users can easily validate the overall quality of the identifications and the data. Filtering the results is equally easy: it takes only a few mouse clicks to select the desired FDR.

PEAKS DB uses an enhanced target-decoy method, called "decoy fusion" [2], for FDR estimation and result validation. The decoy fusion method avoids some pitfalls in the standard target-decoy method, and is more conservative.
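For readers unfamiliar with target-decoy FDR estimation, the following minimal sketch shows the standard method, not the decoy fusion variant used by PEAKS DB. It assumes each peptide-spectrum match (PSM) is a `(score, is_decoy)` pair with distinct scores, and it estimates the FDR at a score cutoff as the number of decoy hits divided by the number of target hits above that cutoff. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

PSM = Tuple[float, bool]  # (score, is_decoy); higher scores are better


def score_threshold_for_fdr(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    Standard target-decoy estimate: FDR ~ decoys / targets among all PSMs
    scoring at or above the cutoff. Returns None if no cutoff qualifies.
    Assumes distinct scores; ties would need extra handling.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    threshold = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the cutoff downward while the estimate stays acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            threshold = score
    return threshold


def filter_psms(psms: List[PSM], max_fdr: float) -> List[PSM]:
    """Keep the target PSMs that score at or above the cutoff for max_fdr."""
    threshold = score_threshold_for_fdr(psms, max_fdr)
    if threshold is None:
        return []
    return [(s, d) for s, d in psms if s >= threshold and not d]


if __name__ == "__main__":
    # Hypothetical scores, invented for illustration only.
    example = [(9.0, False), (8.0, False), (7.0, True),
               (6.0, False), (5.0, False), (4.0, True)]
    print(score_threshold_for_fdr(example, 0.25))
    print(len(filter_psms(example, 0.25)))
```

Repeating the calculation over a range of `max_fdr` values and recording the number of retained target PSMs gives the kind of FDR curve used to compare search engines.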


Figure 3: Summary View Statistics

Result Visualization

PEAKS is well known for its result visualization. Besides the summary view described above, users can examine the results in several convenient ways and from different angles. In particular, the protein coverage view provides a starting point for examining all identified peptides of a protein, with PTMs and mutations highlighted. Clicking a peptide brings up the annotated peptide-spectrum match. Users can even examine the peaks that support an individual amino acid by simply hovering the mouse over it.



Figure 4: Interactive PEAKS DB Protein Coverage Pane