Our Research
Here's a list of the research that we wrote up along the way to building the software. Please share your research with us.
|   |
Xiaowen Liu, Yonghua Han, Denis Yuen, Bin Ma. Automated protein (re)sequencing with MS/MS and a homologous database yields almost full coverage and accuracy. Bioinformatics 2009; doi: 10.1093/bioinformatics/btp366.
Oxford Journal Link.
Motivation: The bottom-up tandem mass spectrometry (MS/MS) is regularly used in proteomics nowadays for identifying proteins from a sequence database. De novo sequencing software is also available for sequencing novel peptides with relatively short sequence lengths. However, automated sequencing of novel proteins from MS/MS remains a challenging problem.
Results: Very often, although the target protein is novel, it has a homologous protein included in a known database. When this happens, we propose a novel algorithm and automated software tool, named Champs, for sequencing the complete protein from MS/MS data of a few enzymatic digestions of the purified protein. Validation with two standard proteins showed that our automated method yields greater than 99% sequence coverage and 100% sequence accuracy on these two proteins. Our method is useful to sequence novel proteins or "re-sequence" a protein that has mutations compared with the database protein sequence.
|
|   |
Chris Hughes, Brad Doble, Lei Xin, Clark Chen, Baozhen Shan, Bin Ma, Gilles Lajoie. SILAC Quantitation to a Depth of 3000 Proteins from a Double Knockout GSK-3 Line of Mouse Embryonic Stem Cells. Session MPB: Bioinformatics: Quantification, ASMS 2009 Poster # 056. [download 1850.555 Kb] The use of SILAC permits the monitoring of pathways on a global level. However, without the ability to quantitate on large sample sets, there is a limitation on how much information can be extracted. As is shown here, PEAKS permits large scale analysis in a streamlined fashion, comparable to that of the TPP. The dataset has recently been expanded to >800000 MS/MS spectra from ~810 gigabytes of data. Further analysis on the global phosphorylation states of detected proteins using PEAKS is ongoing in an attempt to map modification changes as a result of the GSK-3 knockout.
|
|   |
Weiwu Chen, Baozhen Shan, Jing Zhang, Eric Bonneil, Janine Voyer, Gilles Lajoie, Pierre Thibault, Bin Ma. New Algorithm for Label-Free Protein Quantification. Session MPB: Bioinformatics: Quantification, ASMS 2009 Poster # 043. [download 2184.314 Kb] Label free quantitative proteomics analysis is a flexible approach enabling the profiling of protein expression across different datasets. The success of this approach relies not only on the efficient detection of peptides over a wide range of ion abundance but also on the capability of correlating their precise coordinates in different LC-MS runs. Several approaches have been previously studied to achieve these goals including the use of normalized LC retention time for data acquired on high resolution mass spectrometry instruments [1,2]. PEAKS Q offers this new algorithm as its approach for label-free quantification. We report a new approach termed “feature vector” that analyzes multiple samples simultaneously to increase the accuracy of feature detection and the protein coverage.
|
|   |
Lei Xin, Baozhen Shan, Mingjie Xie, Gilles Lajoie, Bin Ma. PTMFinder Based on PEAKS De Novo Sequencing Result. ASMS 2009. Session MPL: Proteomics: PTM Determination (Method Development), Poster # 295. [download 944.58 Kb] Identification of post-translational modification (PTM) by tandem mass spectrometry is still a major challenge in proteomics, especially if the PTMs are unknown. In typical existing software, tandem mass spectra are searched against an enlarged-database that includes all possible combinations of modified peptides. Because the search time grows exponentially with the number of allowed modifications, only a small number of known variable modifications can be included in each search. We propose a new approach based on de novo sequencing results to identify unknown variable PTMs from an MS/MS dataset.
|
|   |
Xiaowen Liu, Baozhen Shan, Bin Ma. Modeling ETD Fragmentation with Bayesian Network for Peptide Identification. ASMS 2009. Thursday, Session ThPA: Bioinformatics IV, Poster #024. [download 886.698 Kb] For each test spectrum-peptide, we randomly mutate the peptide sequence by replacing three consecutive residues with three other residues with the same total mass. If our model is good, then it should give the mutated sequence a lower score than the real sequence. By using the score function described above, 97.3% of the mutated sequences have scores lower than or equal to the real sequence. We also compared our score function with that of PEAKS Studio 5.0. We used PEAKS Studio 5.0 to do de novo sequencing for each test spectrum. From the resulting peptide of PEAKS Studio 5.0, we use a local search method to find a better peptide based on our score function. PEAKS Studio 5.0 was able to correctly compute 40.6% of all the amino acids in the test peptides. Our strategy improved this to 48.4%.
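The mutation test described above can be sketched roughly as follows. This is only an illustration, not the authors' code: the function names, the 0.01 Da tolerance, and the choice to accept any different residue triple (including reorderings and I/L swaps) are assumptions. The residue masses are standard monoisotopic values.

```python
import itertools
import random

# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def peptide_mass(residues):
    """Sum of residue masses (water and terminal groups are ignored)."""
    return sum(RESIDUE_MASS[a] for a in residues)


def same_mass_triples(triple, tol=0.01):
    """All residue triples other than `triple` whose total mass matches within tol."""
    target = peptide_mass(triple)
    return [
        "".join(t)
        for t in itertools.product(RESIDUE_MASS, repeat=3)
        if "".join(t) != triple and abs(peptide_mass(t) - target) <= tol
    ]


def mutate(peptide, rng, tol=0.01):
    """Replace three consecutive residues with a different triple of equal mass.

    Window positions are tried in random order; returns None if no window
    has an equal-mass alternative (hypothetical fallback behaviour).
    """
    starts = list(range(len(peptide) - 2))
    rng.shuffle(starts)
    for i in starts:
        options = same_mass_triples(peptide[i:i + 3], tol)
        if options:
            return peptide[:i] + rng.choice(options) + peptide[i + 3:]
    return None
```

A scoring model would then be checked by counting how often the mutated sequence scores no higher than the original one against the same spectrum.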
|
|   |
Baozhen Shan, Lei Xin, Weijie Yang, Gilles Lajoie, Bin Ma. Automated Multiple Round Searches to Increase Coverage of Peptide-Protein Identification. ASMS 2009. Session ThPA: Bioinformatics IV, Poster #003. [download 1589.118 Kb] One of the challenges researchers face in mass spectrometry-based proteomics investigations is that a significant number of high-quality spectra often remain uninterpreted due to PTMs and errors in MS/MS data and protein sequence databases. Specifying many variable PTMs in the protein identification software can increase the coverage, but also drastically slows down the search. This dilemma can be partially solved with a two-round search approach: the first round searches a large database with only a few PTMs, followed by a second round on only the identified proteins but with many variable PTMs specified. However, this still requires human knowledge of the variable PTMs in the sample in order to specify them correctly in the second-round search. We propose to use PEAKS de novo sequencing [1] results to automatically discover the variable PTMs existing in the sample. In addition, we propose a workflow for multi-round searches which results in higher protein coverage.
|
|   |
Mingjie Xie, Weiming Zhang, Weijie Yang, Weiwu Chen, Gilles Lajoie, Bin Ma. PEAKSOnline: A Free MS/MS de novo Sequencing and Protein ID Online Public Server. ASMS 2008. WP 629. [download 550.769 Kb] By distributing the computation to multiple computers, de novo sequencing and database search throughput are increased remarkably. We describe a free server for high-throughput MS data interpretation supporting both de novo sequencing and database search approaches.
|
|   |
Bin Ma, Denis Yuen. SPIDER: Novel Scoring Function Improves Homology Searches using MS/MS de novo Sequencing Results. ASMS 2008. ThP 648. [download 832.061 Kb] Proteomic MS/MS database search algorithms rely upon existing databases and are vulnerable to mutation differences between the protein sample and the database used. The process of de novo sequencing can result in mass segment replacement errors. In a case where both of these would typically yield low confirmation, our previously introduced algorithm, SPIDER [1], finds database sequences that are homologous to the real peptide by using the partially correct sequence tag [2] (Han et al., 2005), and has proven accurate for correct peptide reconstruction from the partially correct tag and the homologous database sequence [3] (ASMS 2007 poster 269). The primary objective is to develop a new score that is statistically meaningful and can be compared across different spectra, experiments, or instruments. When the correctness probability of each amino acid in a de novo sequencing result is known, the score should also take advantage of it. The secondary objective is to develop an efficient algorithm, based on the new score, to search for homologous peptides and reconstruct the real peptides from the partially correct de novo sequencing result.
|
|   |
Lei Xin, Gilles Lajoie, Chris Hughes, Bin Ma, Derek Smith. New Quantitation Software Package Based on PEAKS Protein ID. ASMS 2008. TP 653. [download 3719.714 Kb] Isotopic labeling for protein expression analysis has become routine for quantitative proteomics studies. Reagents such as iTRAQ, ExacTag and ICAT are common tools used in this area. Label-free techniques can also be used in cases where isotopic labeling is impractical to perform. As a subsequent step to protein identification, some search engines provide modules for quantitation analysis based on these techniques. Here, we present a new software package designed to automatically quantify proteins from experiments using isotopic labeling or label-free techniques based on PEAKS[1] protein identification results.
|
|   |
Lei Xin, Gilles Lajoie, Bin Ma. New Method for the Validation of de novo Sequencing Results. ASMS 2008, WP 645. [download 1813.76 Kb] Since de novo sequencing does not depend on protein databases, the validation and confidence methods developed in the database search approach such as the reverse-database query cannot be applied. Here we present a general validation algorithm which uses any de novo sequencing scores to calculate the correctness probabilities of each amino acid in the de novo sequencing results. In addition to result validation, these probabilities can also be used in other protein identification software such as SPIDER [1].
|
|   |
Carla M.R. Lacerda, Lei Xin, Iain Rogers and Kenneth F. Reardon. Analysis of iTRAQ data using Mascot and Peaks quantification algorithms. Briefings in Functional Genomics and Proteomics Advance Access published April 4, 2008. [download 207.511 Kb] The field of proteomics has been developing rapidly toward quantification of proteins. Despite the variety of experimental techniques available for peptide and protein labelling, there are few commercially available analytical tools with the ability to interpret data from any mass spectrometer. In this study, we compare two software packages, Mascot and Peaks, for the analysis of iTRAQ data from ESI-Q/TOF mass spectrometry. In the case of a six-protein mixture combined in a known proportion, the output of the Peaks algorithm deviated from the correct result by 14% on average, while the error of the Mascot quantification was nearly 200%. When the software packages were used to analyze iTRAQ data from a complex protein sample, the quantification results agreed within 20% for only 26% of the quantified proteins, showing significant differences in the two quantification algorithms. This comparison and analysis revealed major intricacies in peptide and protein quantification that must be taken into consideration for software development.
|
|   |
Denis Yuen, Bin Ma, Iain Rogers. Improving de novo Sequencing Accuracy for Ion Trap data in PEAKS Software. ASMS 2007 poster MPK 175. [download 164.817 Kb] De novo sequencing from MS/MS data is a widely used method for sequencing peptides from organisms of unknown sequences, directly from their MS/MS spectra, or identifying peptides that vary from their database equivalents by some modification or mutation. De novo sequencing programs typically require a scoring function that evaluates the fitness between a peptide sequence and the spectrum. Ma et al. demonstrated that two scoring functions, used together, can improve de novo sequencing accuracy [1], but the relative importance of each scoring function was not thoroughly evaluated. In this work, the optimal weighting between multiple de novo sequencing score components is trained on a large dataset, and is demonstrated to provide a significant accuracy improvement in PEAKS Studio [2].
|
|   |
Weijie Yang, Denis Yuen, Bin Ma, Iain Rogers. Improving Protein Coverage by de novo Sequence Homology Searching with SPIDER. ASMS 2007 poster MPK 176. [download 160.881 Kb] Database search of tandem-MS spectra has been a widely used technique for protein identification. But several proteomics problems require more coverage and more closely scrutinized results than this technique can provide. Sequence homology searching based on peptide de novo sequences allows us to identify peptides that are not present in a database. This approach, when coupled with standard search techniques, means we can better explain the data and improve coverage on the identified proteins. Alternatively, we can better explain peptides from organisms that are not present in any database [1]. In this work we build and evaluate a workflow involving PEAKS auto de novo sequencing [2] and SPIDER [3], a unique tool for peptide sequence tag based homology searching.
|
|   |
Iain Rogers, Michaela Scigelova, Gary Woffendin. Optimizing Data Acquisition for Automated de novo Sequencing. ASMS 2007 poster. [download 930.304 Kb]
De novo sequencing enables identification of peptides and proteins from unsequenced genomes [1,2] or validation of the results of a database search [3]. To be of practical use this process must be automated, with a throughput matching that of data acquisition.
The LTQ Orbitrap™ delivers routine mass measurements with deviations of less than 3 ppm (external calibration). In the context of proteomics experiments measuring the precursor ion highly accurately means fewer false positive identifications [4, 5].
There is, however, no clear consensus regarding the benefit of MS/MS mass accuracy. This is because the accurate mass detection in the Orbitrap analyser takes longer than the fragment detection in a linear ion trap, resulting in potentially fewer peptides being fragmented and identified. Also, the LTQ Orbitrap can fragment peptides in the linear ion trap or in the C-trap, each method being characterised by particular spectra qualities (Figure 1).
We performed a detailed comparison of data acquisition methods on LTQ Orbitrap with respect to their suitability for automated de novo sequencing with PEAKS™ Studio 4.2 software. As this package combines de novo sequencing with BLAST searches in databases, we were also interested in indications of amino acid substitutions or unexpected modifications.
|
|   |
Denis Yuen, Bin Ma, Iain Rogers. Peptide Sequence Reconstruction from de novo Sequences and their Homologues. ASMS 2007 poster ThPP 269. [download 290.377 Kb] Because protein sequence databases will never be complete, contain gene prediction errors, and can’t account for mutations between individuals, it is often necessary to derive a peptide sequence from MS/MS data where no exact match can be found in the database. De novo sequencing provides a useful technique for sequencing peptides without a database, but completely correct sequences are difficult to find. However, when coupled with a sequence tag homology search like SPIDER [1], similar peptides can be returned from a protein sequence database. Here we present a technique for constructing the real peptide sequences from de novo sequences derived by PEAKS Studio [2] and homologous entries from a database.
|
|   |
Jiaxi Wang, Bin Ma, Weiwu Chen. Disulfide bonded Dipeptide Analysis with PEAKS and Q-TOF Mass Spectrometry. ASMS 2007 poster MPK 171. [download 710.406 Kb] Proteins and peptides are commonly studied using mass spectrometry; however, the most commonly used tools for MS data analysis are built with the assumption that peptides are linear. Disulfide bonds, creating complexes involving two or more peptides bonded together, cause problems for this kind of analysis. Chemical reduction, using 1,4-dithiothreitol (DTT), can break the disulfide bonds, making the peptides acceptable for standard analysis. But since this makes determination of the disulfide bond location more ambiguous, analysis of intact dipeptides becomes necessary. Also, since chemical reduction can be incomplete, even reduced samples can benefit from this analysis. Here we present an algorithmic solution for the analysis of MS/MS data of disulfide bonded dipeptides.
|
|   |
Bin Ma, Iain Rogers. Search for the Undiscovered Peptide; Using de novo sequencing and sequence tag homology search to improve protein characterization. Biotechniques Journal, Vol. 42, No. 5, 2007. [download 255.786 Kb] A new tool, SPIDER, is used to discover hidden peptides. Using a de novo sequence and a homologous sequence from the database, SPIDER reconstructs the real peptide, highlighting mutations and allowing for de novo sequencing error.
|
|   |
Ma, B., Rogers, I. Application Note: PEAKS de novo performance on LTQ Orbitrap data. Unpublished, June 2006. [download 68.583 Kb] A demonstration of the accuracy of PEAKS de novo sequencing on a Thermo LTQ Orbitrap mass spectrometer. 97% accuracy is achieved!
|
|   |
Rogers, I., Haskins, W. Drastically increased coverage by using four search engines for Protein Identification. (Bioinformatics Solutions Inc, Genentech), ASMS 2006 poster MP328. [download 151.157 Kb] This poster demonstrates the improvement in coverage by using more than one search engine. It should not be viewed as a benchmark comparison of search engines, as the performance shown is dependent on arbitrary score filter values. More important is the low error and high sensitivity when using a sequence tag hybrid approach (PEAKS) and a pure peptide fragment fingerprinting approach (like SEQUEST or MASCOT) together -- regardless of score!
|
|   |
Clark Chen, John Morey, Iain Rogers. Filtering out MS/MS spectra of insufficient quality before database searching. ASMS 2006 poster MP329. [download 226.125 Kb] In studying proteins using liquid chromatography coupled tandem mass spectrometry (LC-MS/MS), researchers are often faced with very large data sets. Since each data set may contain thousands of spectra, a manual inspection of each one becomes impossible. Compounding the problem, electrical noise, poor detection and contaminants scanned by the MS mean that only a small portion of these data are quality MS/MS spectra representing peptides. The following presents a method of filtering out the poor quality spectra prior to de novo sequencing or database searching for protein identification. Database search engines and de novo sequencing tools are adequate in discarding the bad spectra; nevertheless, false positives abound, and plenty of time is wasted analyzing nothing.
|
|   |
Bin Ma, Gilles Lajoie. Improved positional confidence score in MS/MS peptide de novo sequencing. ASMS 2006 poster MP348. [download 139.391 Kb] De novo sequencing from MS/MS data is used widely for peptide and protein identification. However, due to the imperfections of the data and/or software, the results are not always reliable. Very often, only partially correct sequences can be obtained by de novo sequencing. If the correct portions of the sequences are known, they can be used as sequence tags to identify the proteins through a homology search. It is therefore very useful for de novo sequencing software to give a positional confidence for each individual amino acid in the peptide it computes from the MS/MS data. We describe here a new method to perform this task.
|
|   |
Clark Chen, Iain Rogers. Intact Peptide Charge Determination from Ion Trap MS/MS. ASMS 2006 poster MP327. [download 744.094 Kb]
In identifying proteins using tandem mass spectrometry, researchers can match measured masses of peptides, and fragments of peptides, to theoretical masses calculated from a protein sequence database. Because a mass spectrometer measures mass-to-charge ratio (m/z), the peptide’s charge (z) must be known to determine the mass used for database searching.
When using an ion trap however, a peptide’s charge state is often difficult to determine by the usual method: examination of the initial MS survey scan of a peptide. It has become common practice, then, to allow a database search engine to determine the charge on a peptide by choosing the charge that allows the best match to the database. This is a poor practice since, instead of inferring results from the data, we are determining what data will best fit the results.
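As a small illustration of why the charge matters, the sketch below converts one measured m/z value into the neutral peptide mass implied by each candidate charge, using the standard relation M = z × (m/z − proton mass). It is not the method from the poster; the function names and the default candidate charges 1 to 3 are assumptions made for the example.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton (rounded)


def neutral_mass(mz, z):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON_MASS)


def candidate_masses(mz, charges=(1, 2, 3)):
    """Map each candidate charge state to the neutral mass it would imply."""
    return {z: neutral_mass(mz, z) for z in charges}


if __name__ == "__main__":
    # One precursor m/z yields very different search masses per charge.
    for z, mass in candidate_masses(500.0).items():
        print(f"z={z}: {mass:.4f} Da")
```

Because each candidate charge gives a different mass to search with, letting the search engine pick the charge effectively multiplies the number of hypotheses tested per spectrum.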
|
|   |
Yang, W., Chen, W., Rogers, I., Ma, B., Bendall, S., Lajoie, G., Smith, D. PEAKS Q: Software for MS-based quantification of stable isotope labeled peptides. (Bioinformatics Solutions Inc., Genome BC Proteomics Centre, University of Western Ontario) ASMS 2006 poster WP531. [download 725.654 Kb] Several mass spectrometry-based stable isotope labeling technologies have been developed for global proteome profiling. These include methods for in vivo labeling, such as 14N/15N and SILAC (Stable Isotope Labeling with Amino Acids in Cell Culture), and in vitro isotope labeling of target peptides at their N/C terminal or at specific residues. In this work we describe a new software, PEAKS-Q, designed to automatically identify and quantify proteins from these isotope labeling experiments. The software is written in Java and includes an intuitive graphical user interface.
|
|   |
Y. Han, B. Ma, and K. Zhang: SPIDER: Software for Protein Identification from Sequence Tags Containing De Novo Sequencing Error. Journal of Bioinformatics and Computational Biology 3(3):697-716. 2005. [download 145.316 Kb] In order to identify the protein by searching the de novo sequencing results in a protein database, the database search software must handle the mass gaps and the de novo sequencing errors. Accounting for the de novo sequencing errors and the mass gaps, we developed a software system, SPIDER (Software Protein Identifier), for the rapid identification of proteins that contain peptides best matching the given tags. SPIDER is different from and superior to the MS Blast system (Altschul et al.) as the latter does not account for the de novo sequencing errors and mass gaps.
|
|   |
Iain Rogers. Assessment of an Amalgamative Approach to Protein Identification. ASMS 2005. [download 188.04 Kb] When studying proteins using mass spectrometry, researchers can identify which proteins are in a sample by matching measured masses to the calculated masses of peptides and sequence tags in a protein sequence database. Because of large databases and experimental data sets, this process is necessarily automated using protein identification software. However, because of instrumental and experimental limitations, analysis is made difficult by noise, contamination and inconclusive data. The problem becomes one of validation. The researcher must accept the software’s suggestion and scoring scheme, or spend countless hours manually validating the results. Conclusions based on imperfect data, processed by imperfect software and inferred from non-validated results will always be suspect.
|
|   |
Bin Ma; Gilles Lajoie (Departments of Computer Science and Biochemistry at the University of Western Ontario). Improving the de novo Sequencing Accuracy by Combining Two Independent Scoring Functions in PEAKS Software. ASMS 2005. [download 221.184 Kb]
De novo sequencing from MS/MS data is a standard method for peptide sequencing that does not require the sequences to be in a database, and therefore is best for novel proteins. De novo sequencing is also better at finding PTMs. In addition, when homologues of the novel proteins are in the database, they can be found by sequence homology search after de novo sequencing. Even for proteins in a database, if de novo sequencing computes the correct sequence without looking at the database, the confidence is much higher than simply finding the sequence from the database.
A de novo sequencing program typically requires a scoring function that evaluates the fitness between a peptide sequence and the spectrum. The choice of scoring functions affects the program’s accuracy significantly. In this abstract we demonstrate that better accuracy can be achieved by combining two independent scoring functions.
|
|   |
Jennifer Locke, Jason Rogalski, Lei Guo, Bin Ma, Juergen Kast, Gilles Lajoie (University of British Columbia, Bioinformatics Solutions Inc. & University of Western Ontario). Automated de novo Sequencing Using ToF-ToF MS/MS Data. ASMS 2005. [download 223.744 Kb] Peptide de novo sequencing using tandem mass spectrometry data is a standard approach in proteomics. However, most of the automated de novo sequencing software including the software provided by the mass spectrometer manufacturers, is designed for de novo sequencing using Q-ToF and ion-trap generated data. Due to the different peptide fragmentation in ToF-ToF instruments, it was uncertain whether or not the de novo sequencing software PEAKS, using parameters optimized for other instruments, will work for ToF-ToF data. In this poster we demonstrate that the Q-ToF based internal parameters of the PEAKS software work very well for both de novo sequencing and protein identification even on the average quality of ToF-ToF MS/MS data.
|
|   |
Bin Ma, Amanda Doherty-Kirby, Aaron Booy, Bob Olafson, Gilles Lajoie. A Comprehensive Comparison of the de novo Sequencing Accuracies of PEAKS and Other Software. ASMS Poster, 2004. [download 33.834 Kb] We compared three commonly used de novo sequencing programs, PEAKS, BioAnalyst and PLGS. The result showed that PEAKS has the best accuracy.
|
|   |
Iain Rogers, Christopher Hendrie, Ming Li. Protein ID: Comparing De Novo Based and Database Search Methods. ASMS Poster, 2004. [download 3504.529 Kb] Using the correct tool for the job is as important in proteomics as it is in any other discipline. When identifying proteins from MS/MS data there are a number of tools to choose from. In the case where the data comes from a well studied organism, the researcher may choose a standard database search tool. In the case where the results from a database search are questionable, some validation is necessary. In a situation where no database program turns up a hit, the researcher must rely on de novo sequencing, be it manual or using automatic de novo software. PEAKS is a powerful and intuitive software package, combining remarkably accurate de novo sequencing with a new approach to protein identification. In this poster we show that PEAKS’ new method is able to identify proteins just as well as standard database search software. In this light, we compare Mascot and PEAKS. Further, we show PEAKS to be the validation tool for standard database search software. Finally, and perhaps most importantly, we show PEAKS to be the best automatic de novo sequencing software.
|
|   |
Chengzhi Liang, Jeffrey C. Smith, Christopher Hendrie, Ming Li, K. W. Michael Siu. A Comparative Study of Peptide Sequencing Software Tools for MS/MS. ASMS Poster, 2003. [download 64.155 Kb] A current bottleneck in proteomics is automated and accurate sequencing of enzymatically cleaved peptides. It is estimated that over two thirds of the MS/MS spectra produced by high end quadrupole-TOF and TOF-TOF instruments in proteomics-research based corporations do not provide useful information [1]. An important contributing factor in this is the lack of high-quality software. The software currently available for MS/MS peptide sequencing mainly falls into two categories: (1) database searching by assigning a peptide sequence based on scoring against a protein (or peptide) database; and (2) de novo sequencing by deriving a (partial) sequence directly from an MS/MS spectrum. This study compares several programs representative of these two categories.
|
|   |
Bin Ma, Kaizhong Zhang, Christopher Hendrie, Chengzhi Liang, Ming Li, Amanda Doherty-Kirby, Gilles Lajoie. PEAKS: Powerful Software for Peptide De Novo Sequencing by MS/MS. Rapid Communications in Mass Spectrometry, 17(20):2337-2342. 2003. Early version appeared in 50th ASMS Conference 2002. [download 131.379 Kb]
If you plan to cite PEAKS in your research, please refer to this paper. PEAKS has come a long way since the original version, but the principles are the same.
A number of different approaches have been described to identify proteins from tandem mass spectrometry (MS/MS) data. The most common approaches rely on the available databases to match experimental MS/MS data. These methods suffer from several drawbacks and cannot be used for the identification of proteins from unknown genomes. In this communication, we describe a new de novo sequencing software package, PEAKS, to extract amino acid sequence information without the use of databases. PEAKS uses a new model and a new algorithm to efficiently compute the best peptide sequence whose fragment ions can best interpret the peaks in the MS/MS spectrum. The output of the software gives amino acid sequences with confidence scores for the entire sequence as well as an additional novel positional scoring scheme for portions of the sequence. The performance of PEAKS is compared with Lutefisk, a well known de novo sequencing software, using quadrupole-time-of-flight (Q-TOF) data obtained for several tryptic peptides from standard proteins.