Automated protein (re)sequencing with MS/MS and a homologous database yields almost full coverage and accuracy.

Xiaowen Liu1, Yonghua Han2, Denis Yuen3 and Bin Ma1
1David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
2Department of Computer Science, University of Western Ontario, London, Canada
3Bioinformatics Solutions, Inc., Waterloo, Canada

Software:

Complete Homology-Assisted MS/MS Protein Sequencing (Champs)
To download, click here.

Datasets:

Each data set contains three files. The first file is the original MS/MS data. The second is a text file containing the spider tags; each spider tag is described by four lines: the de novo tag, the spider tag, the matched reference sequence, and the similarity score. The third is an HTML file containing the alignment of the spider tags to the reference sequence. In the HTML file, the first line contains numbers indicating positions in the reference sequence, the second line is the reference sequence, and the following lines are the aligned spider tags. In the spider tags, residues with a red background are matched; residues with a green background lie in blocks whose mass is similar to that of the corresponding residues in the reference sequence; residues with a white background are not matched.
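
The four-line record layout of the spider tag text file can be read with a few lines of Python. The following is only a minimal sketch, not part of Champs: the names `SpiderTagRecord` and `parse_spider_tags` are hypothetical, and it assumes that blank lines between records can be ignored, that each line holds only the bare value, and that the similarity score parses as a number. The sample strings in the usage section are made up for illustration and are not taken from the data sets.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SpiderTagRecord:
    """One spider tag, as described by four consecutive lines of the text file."""
    de_novo_tag: str       # line 1: de novo tag
    spider_tag: str        # line 2: spider tag
    reference_match: str   # line 3: matched reference sequence
    score: float           # line 4: similarity score (assumed numeric)


def parse_spider_tags(lines: Iterable[str]) -> List[SpiderTagRecord]:
    """Group the non-blank lines into records of four lines each."""
    # Assumption: blank lines are separators only and carry no information.
    content = [ln.strip() for ln in lines if ln.strip()]
    if len(content) % 4 != 0:
        raise ValueError("expected a multiple of four non-blank lines")

    records = []
    for i in range(0, len(content), 4):
        de_novo, spider, reference, score = content[i:i + 4]
        records.append(SpiderTagRecord(de_novo, spider, reference, float(score)))
    return records


if __name__ == "__main__":
    # Made-up example input; real files would be opened with open(path).
    sample = [
        "PEPTLDEK",
        "PEPTIDEK",
        "PEPTIDER",
        "12.5",
    ]
    for record in parse_spider_tags(sample):
        print(record.spider_tag, record.reference_match, record.score)
```

If the real files carry labels or extra fields on these lines, the splitting step would need to be adjusted accordingly.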

1. BSA with reference sequence ALBU_SHEEP: MS/MS data, spider tags and alignment.

2. BSA with reference sequence ALBU_CANFA: MS/MS data, spider tags and alignment.

3. LysC with reference sequence LYSC_COTJA: MS/MS data, spider tags and alignment.