Mass spectrometry based peptidomics has become an attracted approach for developing peptide-based vaccines and immunotherapies. This field of research has also led to many discoveries of non-canonical peptide expression and has changed our understanding of the genomic regions that are read and translated. Assigning the correct amino acid sequence to an MS/MS spectrum is critical in peptidomics, since the positions and identity of each residue can influence peptide-protein interactions and define whether it is a good candidate for downstream analysis. With gaining popularity of peptidomic and immunopeptidomic analyses, it is essential to have a software tool that efficiently processes this type of data in an accurate and intuitive manner.

In the mass spectrometry analysis of peptidomics data, PEAKS Software has become the preferred data analysis solution and, as part of PEAKS Studio 11 and PEAKS Online 11, we have a workflow dedicated for this named DeepNovo Peptidome.

Key Features

  • Integration of de novo sequencing, database search, and homology search in one workflow
  • FDR control for de novo peptides
  • Deep learning model trained with large HLA peptide dataset for retention time, fragmentation ion, and ion mobility predictions
  • Considered three different categories of PTMs to predict the retention time and MS2 spectra of modified peptides more accurately

Previously, false discovery rate estimation was not an option for de novo sequenced peptides, therefore the user would choose an average local confidence score threshold to filter out poor quality. By appending de novo sequences to a protein sequence database and performing a second-round database search, FDR is estimated for de novo peptides to increase the accuracy of identification.

DeepNovo Peptidome provides a finalised list of peptides that can be sorted based on the way they were identified: database search, homology search (peptides with mutations), or de novo sequencing alone. The summary page of the results allows the user to evaluate how well the retention times of peptides from different classifications fit to the retention time predictions. By selecting any particular peptide in the result list, the m/z and intensity values for fragmentation ions in each spectrum can be compared to the predicted values. For ion mobility data, collisional cross section (CCS) values are also predicted for each peptide.

A new result filter, called confident amino acid percentage (CAA%), was added in DeepNovo Peptidome to allow the user to assess the percentage of amino acids in a peptide spectrum that are supported by fragmentation ions with intensity values above a specified threshold. This provides increased confidence in peptide sequence validation. The final peptide list can be directly exported and used for downstream analysis, such as binding affinity and immunogenicity predictions.


DeepNovo Peptidome PTM analysis

PEAKS 11 DeepNovo Peptidome workflow now considered three different categories of PTMs to predict the retention time and MS2 spectra of modified peptides more accurately. These categories include standard PTMs, isotopic labels, and artificial modifications (i.e., TMT tags). Isotopically labelled peptides are often spiked into immunopeptidomics samples prior to LC-MS/MS analysis to validate the identification of immunopeptides. However, amino acid isotopes do not affect retention time of peptides as do most biologically relevant PTMs or artificial tags. Therefore, the deep learning model in DeepNovo Peptidome workflow is trained to predict the retention time and MS2 spectra of peptides that fall into these three different categories to improve peptide scoring and identification.


DeepNovo Peptidome for DIA Data (PEAKS Online Only)

The new PEAKS Online features a novel DIA peptidome workflow for immunopeptidomics that integrates library-free database search and de novo sequencing by stream-lined DIA data analysis [3]. Peptides with high confidence were collected and re-scored based on the prediction of peptide physic-chemical properties (including the fragment ion pattern, retention time, and ion mobility) and finally were reported with a false discovery rate. Furthermore, the positional confidence scores are associated with each peptide to evaluate the accuracy of each amino acid. The workflow presents a computational pipeline to aid identification of peptides generated by noncanonical translation of mRNA or by genome variants with LC-MS/MS DIA data.

DeepNovo Peptidome DIA Workflow

References & Resources

References

  1. Tran NH, Qiao R, Xin L, Chen X, Liu C, Zhang X, Shan B, Ghodsi A, Li M, Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry, Nat. Methods, 16, 63–66 (2019). https://doi.org/10.1038/s41592-018-0260-3
  2. Tran NH, Zhang X, Xin L, Shan B, Li M. De novo peptide sequencing by deep learning. Proc. Natl. Acad. Sci. U.S.A. 114, 8247-8252 (2017). https://doi.org/10.1073/pnas.1705691114
  3. Xin, L. Qiao R, Chen X, Tran NH, Pan S, Robinoviz S, Bian H, He X, Morse B, Shan B, Li M. A streamlined platform for analyzing tera-scale DDA and DIA mass spectrometry data enables highly sensitive immunopeptidomics. Nat. Commun. 13, 3108 (2022). https://doi.org/10.1038/s41467-022-30867-7

Resources