1. Open an Existing Project
Instructions for installing PEAKS can be found in “Chapter 2, Installation and Activation” of the User Manual. After installing and launching PEAKS, you can open the sample project in either of two ways (see screenshot below):
- If this is a fresh installation, click the “Sample Project” in the “Recent Projects” list of the Start Page.
- Click the open project button, browse to the directory where PEAKS was installed, select “SampleProject”, and click the open button in the file browser.
2. Main GUI
The main graphical user interface (GUI) of PEAKS is divided into several areas (see screenshot below):
- The project tree shows all the opened projects. Each project may include multiple samples and each sample may include multiple fractions (LC-MS runs). The analysis results are also displayed as result nodes under the project.
- The menu and toolbar. Selecting a node (project, sample, fraction, or result) in an opened project will highlight the common analysis tool icons available to the selected node.
- A result node in a project can be opened by double clicking the node. All opened result nodes are shown here as different tabs.
- Each opened result node provides several different “views” as different tabs. In particular, the summary view shows the result statistics. The summary view is also the central place to filter and export the results.
- The information pane shows useful information such as node properties and the progress of running tasks.
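The project hierarchy described above (project, samples, fractions, result nodes) can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, and file names are hypothetical and do not reflect how PEAKS stores projects internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fraction:
    """One LC-MS run, backed by a single data file (hypothetical field)."""
    data_file: str


@dataclass
class Sample:
    name: str
    fractions: List[Fraction] = field(default_factory=list)


@dataclass
class Project:
    name: str
    samples: List[Sample] = field(default_factory=list)
    # Result nodes hang directly under the project; labels are illustrative.
    result_nodes: List[str] = field(default_factory=list)


def count_fractions(project: Project) -> int:
    """Total number of LC-MS runs across all samples of a project."""
    return sum(len(sample.fractions) for sample in project.samples)


# Hypothetical project with one sample made of two fractions.
project = Project(
    name="SampleProject",
    samples=[Sample("Sample 1", [Fraction("run1.raw"), Fraction("run2.raw")])],
    result_nodes=["PEAKS DB", "PEAKS PTM", "SPIDER"],
)
print(count_fractions(project))
```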
3. Result Summary and Filtration
When you open a result node by double clicking it (for example, the “SPIDER” node in the sample project), the summary view is shown by default. The summary view provides three main functions:
- Specify score thresholds to filter the results
- Examine the result statistics
- Export results
The top region of the summary view is a control pane and the bottom region is a statistics report page. The result filtration is controlled at the top control pane (see screenshot below):
- Peptide identifications are filtered by the -10lgP score of the peptide-spectrum match (PSM). Alternatively, you can specify the desired FDR (false discovery rate) by clicking the FDR button.
- The protein identification is filtered by the protein’s -10lgP score and the number of unique peptides the protein contains.
- De novo only peptides are those that have confident de novo sequence tags but cannot be identified by the database search algorithms. For a de novo only peptide to be reported, its TLC (total local confidence) and ALC (average local confidence) scores must be greater than or equal to the specified thresholds. In addition, the score of the spectrum’s best database search result must be no greater than the specified -10lgP threshold.
- TLC measures the approximate number of correct amino acids in the de novo sequence, and ALC measures the approximate percentage of correct amino acids in the de novo sequence.
- By default, the -10lgP threshold used for de novo only is locked to the same value as the -10lgP threshold used for filtering peptides. To specify a different value, first click the lock icon to unlock it.
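The de novo only rule above can be sketched in a few lines. This sketch assumes that each residue carries a local confidence between 0 and 1, that TLC is the sum of those values, and that ALC is their average expressed as a percentage. These definitions are assumptions consistent with the description above, not a statement of the exact PEAKS computation, and all function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def tlc(local_conf: Sequence[float]) -> float:
    """Assumed TLC: sum of per-residue local confidences (each in 0..1)."""
    return sum(local_conf)


def alc_percent(local_conf: Sequence[float]) -> float:
    """Assumed ALC: average local confidence, as a percentage."""
    if not local_conf:
        raise ValueError("peptide must contain at least one residue")
    return 100.0 * tlc(local_conf) / len(local_conf)


def is_de_novo_only(
    local_conf: Sequence[float],
    best_db_score: Optional[float],  # best -10lgP from database search, if any
    min_tlc: float,
    min_alc: float,
    max_db_score: float,
) -> bool:
    """Apply the two conditions described in the filter list above."""
    # Condition 1: TLC and ALC are both at or above their thresholds.
    confident = tlc(local_conf) >= min_tlc and alc_percent(local_conf) >= min_alc
    # Condition 2: the best database match scores no higher than the threshold.
    unidentified = best_db_score is None or best_db_score <= max_db_score
    return confident and unidentified


# Hypothetical four-residue peptide.
conf = [1.0, 0.75, 0.5, 0.75]
print(tlc(conf), alc_percent(conf))
print(is_de_novo_only(conf, 12.0, min_tlc=3, min_alc=50, max_db_score=20))
```

With the lock icon in its default state, `max_db_score` would simply be the same value as the peptide -10lgP threshold.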
After the filtration criteria are changed, the “Apply Filters” button turns red. Click it to apply the new criteria.
The top control pane has two additional buttons: “Export” and “Notes”. The results can be exported by clicking the “Export” button. The “Notes” button lets you enter a text note about the project, which is displayed in the result summary report.
After the filters are applied, the statistics report page at the bottom of the summary view is updated accordingly. Only two of the statistical charts are explained here (see screenshot below). Figure 2 (a) shows the PSM score distribution. If the search results and the peptide -10lgP score threshold are of high confidence, you should observe very few decoy matches (brown) in the high-score region. Additionally, if the FDR estimation method (decoy fusion) worked properly, you should observe a similar or larger number of decoy matches (brown) than target matches (blue) in the low-score region.
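The relationship between a score threshold and the FDR can be illustrated with a small sketch. It uses the common target-decoy estimate (decoy matches divided by target matches above the threshold) as an assumption; the exact formula PEAKS applies with decoy fusion may differ. The function names and the PSM list are hypothetical.

```python
from typing import List, Optional, Tuple

# Each PSM is (-10lgP score, is_decoy). The values below are made up.
Psm = Tuple[float, bool]


def estimate_fdr(psms: List[Psm], threshold: float) -> float:
    """Assumed estimate: decoy matches / target matches at or above threshold."""
    targets = sum(1 for score, decoy in psms if score >= threshold and not decoy)
    decoys = sum(1 for score, decoy in psms if score >= threshold and decoy)
    return decoys / targets if targets else 0.0


def threshold_for_fdr(psms: List[Psm], max_fdr: float) -> Optional[float]:
    """Lowest observed score whose estimated FDR does not exceed max_fdr."""
    best = None
    for score in sorted({s for s, _ in psms}, reverse=True):
        if estimate_fdr(psms, score) <= max_fdr:
            best = score  # keep lowering the cutoff while the FDR is acceptable
    return best


psms = [
    (60.0, False), (50.0, False), (40.0, False), (30.0, False),
    (25.0, True), (20.0, False), (15.0, True), (10.0, True),
]
print(estimate_fdr(psms, 25.0))
print(threshold_for_fdr(psms, 0.2))
```

In this toy data the decoys sit in the low-score region, which is the pattern the score distribution chart is expected to show.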
Figure 2 (b) plots the precursor mass error vs. score for all PSMs above the -10lgP score threshold. This figure is most useful for high-resolution instruments. Generally, the high-scoring points should be centered around a mass error of 0, and the data points should start to scatter toward larger mass errors only below a certain score.
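As a rough illustration of what this chart conveys, the sketch below computes a precursor mass error and checks whether high-scoring PSMs are centered near zero. It assumes the error is expressed in parts per million (ppm), which is the usual unit for high-resolution instruments; the function names and example values are hypothetical.

```python
from statistics import median
from typing import List, Optional, Tuple


def mass_error_ppm(observed_mass: float, theoretical_mass: float) -> float:
    """Relative precursor mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def median_error_above(
    psms: List[Tuple[float, float]], min_score: float
) -> Optional[float]:
    """Median ppm error of PSMs, given as (score, ppm error), at or above min_score."""
    errors = [err for score, err in psms if score >= min_score]
    return median(errors) if errors else None


# Made-up PSMs: high scores cluster near 0 ppm, low scores scatter.
scored_errors = [(80.0, 0.5), (70.0, -0.5), (60.0, 0.0), (20.0, 8.0), (15.0, -12.0)]
print(mass_error_ppm(1000.005, 1000.0))
print(median_error_above(scored_errors, 50.0))
```

A median far from 0 among the high-scoring PSMs would suggest a systematic mass shift rather than random scatter.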
4. Result Visualization
Besides the summary view, there are three other views, “protein”, “peptide”, and “de novo only”, for visualizing the results in different ways:
- The protein view contains a list of proteins passing the filtration. Proteins identified by the same set (or a subset) of peptides are grouped together.
- The peptide view shows all the peptide identifications passing the filtration. Multiple spectra that identify the same peptide sequence are grouped together.
- The de novo only view shows all the peptides identified exclusively by de novo sequencing.
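The grouping of proteins that share the same set (or a subset) of peptides can be sketched as follows. This is a simple greedy illustration, not the grouping algorithm PEAKS actually uses; the function name, protein labels, and peptide sequences are hypothetical.

```python
from typing import Dict, List, Set


def group_proteins(protein_peptides: Dict[str, Set[str]]) -> List[List[str]]:
    """Group each protein under a lead protein whose peptide set contains its own."""
    # Visit proteins with the most peptides first; break ties by name so the
    # result is deterministic.
    ordered = sorted(protein_peptides, key=lambda p: (-len(protein_peptides[p]), p))
    groups: List[List[str]] = []
    for protein in ordered:
        peptides = protein_peptides[protein]
        for group in groups:
            lead = group[0]
            if peptides <= protein_peptides[lead]:  # same set or a subset
                group.append(protein)
                break
        else:
            groups.append([protein])  # no lead covers it: start a new group
    return groups


proteins = {
    "P1": {"AAK", "CDR", "EFK"},
    "P2": {"AAK", "CDR"},
    "P3": {"AAK", "CDR", "EFK"},
    "P4": {"GHK"},
}
print(group_proteins(proteins))
```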
Here we focus on the new protein coverage view in PEAKS. Click the protein view tab and select a protein. The protein coverage is then shown at the bottom of the protein view. The coverage view maps all peptide identifications of the selected protein onto the protein sequence, which makes it easy to examine every PTM and mutation on each amino acid. The most commonly used operations on the protein coverage view are as follows (see screenshot below):
- Each blue bar indicates an identified peptide sequence. A gray bar indicates a de novo only tag match.
- Peptide identifications with the same amino acid sequence and the same “interesting” PTMs are grouped together and displayed as a single bar. A PTM is “interesting” if it’s checked in the display option (see item 5).
- PTMs and mutations are highlighted with colored icons and white letter boxes. Highly confident PTMs and mutations are displayed on top of the protein sequence.
A PTM or mutation is regarded as confident if the fragment ions on both sides of the modified residue have a relative intensity higher than the user-specified threshold in the display option (see item 5).
- Click a peptide to show the spectrum annotation.
- Mouse over an amino acid to show the supporting fragment ion peaks.
- Options to control the coverage view display.
- The “coverage/outline” choice turns on/off the peptide bars.
- The “de novo only tag” specifies the minimum number of consecutive amino acids that must match between a de novo only sequence and the protein before the match is displayed as a gray bar.
- The “confident PTM” specifies the minimum fragment ion relative intensity in one of the MS/MS spectra before a PTM location is regarded as confident, and displayed on top of the protein sequence.
- The check boxes in the PTM list specify which PTMs are “interesting”. Click a color box to change its color. Double click a PTM name to see the PTM details.
- The full screen button and tool box button.
Full screen provides a larger view of the coverage. The tool box provides some common tools, such as exporting the coverage as a high-resolution image file.
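The two ideas behind the coverage view, mapping identified peptides onto the protein sequence and requiring a minimum run of consecutive matching residues for a de novo only tag, can be sketched as below. This is a simplified illustration using exact string matching, with hypothetical function names and sequences; it is not the matching procedure PEAKS implements.

```python
from typing import List, Sequence


def coverage_mask(protein: str, peptides: Sequence[str]) -> List[bool]:
    """Mark every residue covered by at least one exact peptide occurrence."""
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return covered


def coverage_percent(protein: str, peptides: Sequence[str]) -> float:
    """Percentage of residues covered by the given peptides."""
    mask = coverage_mask(protein, peptides)
    return 100.0 * sum(mask) / len(mask) if mask else 0.0


def longest_shared_run(tag: str, protein: str) -> int:
    """Length of the longest run of consecutive residues shared by tag and protein."""
    best = 0
    prev = [0] * (len(protein) + 1)
    for a in tag:
        cur = [0] * (len(protein) + 1)
        for j, b in enumerate(protein, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1  # extend the run ending one residue earlier
                best = max(best, cur[j])
        prev = cur
    return best


protein_seq = "MKTAYIAKQRQISFVK"  # made-up 16-residue sequence
print(coverage_percent(protein_seq, ["TAYIAK", "QISFVK"]))
print(longest_shared_run("GGYIAKQW", protein_seq))
```

Under this sketch, a de novo only tag would be drawn as a gray bar only when `longest_shared_run` is at least the value set in the “de novo only tag” option.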
5. Creating a PEAKS Project
To create a new PEAKS project from raw data files, follow these steps (see screenshot below):
- Click the new project button on the toolbar.
- Click the “Add sample” and “Add data file” buttons to add samples to the project and data files to each sample.
- For each sample, specify the sample details.
6. Conduct an Identification Analysis
To conduct a complete identification analysis using the PEAKS workflow:
- Select a project, sample, or result node from the project tree
- Click the desired analysis tool button
A search parameter pane will pop up. Most search options are standard and straightforward. Details on the less obvious options are given below (see screenshot below):
- If the proteolysis enzyme was specified for each sample at the project creation step, you can choose to use the enzyme specified for each sample. This makes it possible to use multiple enzymes in a single project and a single search.
- Specify the fixed PTMs and a few common variable PTMs expected in the sample.
- Select a protein sequence database, or copy and paste the protein sequences for the database search.
- Conduct de novo sequencing using the same parameters, or base the search on an existing de novo sequencing result node.
- Estimate the false discovery rate (FDR) with the decoy fusion method. Decoy fusion is an enhanced target-decoy method for result validation with FDR. Decoy fusion appends a decoy sequence to each protein as the “negative control” for the search. See BSI’s FDR tutorial for more details.
- Include the PEAKS PTM and SPIDER algorithms in the search.
By default, PEAKS PTM performs a blind search for additional PTMs in the data. Users can also limit the PEAKS PTM search to a large number of selected PTMs by clicking the “Advanced Setting” button.
SPIDER performs a homology search based on de novo sequencing tags. If selected, the SPIDER algorithm is run on every confident de novo tag (ALC > 30%) whose spectrum is not identified by PEAKS DB with high confidence (-10lgP < 30). SPIDER constructs new peptide sequences by altering amino acids of database peptides. For each spectrum, the better of the sequence constructed by SPIDER and the sequence found by PEAKS DB is used as the identified peptide. SPIDER is well suited to cross-species searches and to finding point mutations in a protein. Invoking SPIDER through this workflow is equivalent to clicking the SPIDER icon in the toolbar.
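The rule for deciding which spectra are passed to SPIDER can be written out as a short sketch. The thresholds (ALC > 30% and -10lgP < 30) come from the description above; the record type, field names, and the treatment of spectra without any PEAKS DB match are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class SpectrumResult(NamedTuple):
    spectrum_id: str
    alc_percent: float          # ALC of the de novo tag, in percent
    db_score: Optional[float]   # best PEAKS DB -10lgP, or None if no match


def spider_candidates(
    results: List[SpectrumResult],
    min_alc: float = 30.0,
    max_db_score: float = 30.0,
) -> List[str]:
    """Spectra with a confident de novo tag and no confident PEAKS DB match."""
    selected = []
    for r in results:
        confident_tag = r.alc_percent > min_alc
        # Assumption: a spectrum with no database match counts as not identified.
        weak_db_match = r.db_score is None or r.db_score < max_db_score
        if confident_tag and weak_db_match:
            selected.append(r.spectrum_id)
    return selected


# Hypothetical per-spectrum results.
results = [
    SpectrumResult("s1", 85.0, 12.0),   # confident tag, weak DB match
    SpectrumResult("s2", 90.0, 55.0),   # already identified by PEAKS DB
    SpectrumResult("s3", 25.0, 10.0),   # de novo tag not confident
    SpectrumResult("s4", 60.0, None),   # confident tag, no DB match
    SpectrumResult("s5", 30.0, 5.0),    # ALC equals the cutoff, so excluded
]
print(spider_candidates(results))
```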