Integrating Database Search and De Novo Sequencing for Immunopeptidomics with DIA Approach

Shan, Paul, and Hieu Tran. “Integrating database search and de novo sequencing for immunopeptidomics with DIA approach.” Journal of Biomolecular Techniques: JBT 30.Suppl (2019): S23. PMCID: PMC6936894.

Abstract

Identifying tumor-specific antigens (neoantigens) is necessary for developing effective cancer immunotherapy, and a good source of such antigens is the pool of HLA-bound peptides presented exclusively by tumor cells. Mass spectrometry (MS) has evolved into the method of choice for exploring the human immunopeptidome (HLA class I and class II peptides). The key challenge is the low abundance of these peptides. Data-independent acquisition (DIA) promises to capture data from low-abundance peptides. However, the large number of fragment ions generated from multiple peptide precursors in the same selection window complicates data analysis with a classical database search strategy. This problem is circumvented by using a peptide reference spectral library, generated beforehand by extensive data-dependent acquisition (DDA) analysis of similar samples. An alternative is to create a pseudo-DDA dataset from the DIA data and search it in a way similar to the classical DDA strategy. Both approaches share a shortcoming: peptides in the samples that are absent from the spectral library or sequence database cannot, in principle, be identified. To overcome this limitation, de novo peptide sequencing is essential for immunopeptidomics. We recently reported that deep learning enables de novo sequencing with DIA data. In this work, we have developed a new integrative peptide identification method that integrates de novo sequencing more efficiently into protein sequence database searching or peptide spectral library searching. Evaluated on large real datasets, our method outperforms current identification methods.