Characterizing combinations of coding polymorphisms (cSNPs), alternative splicing and post-translational modifications (PTMs) on a single protein by standard peptide-based proteomics is challenging, owing to <100% sequence coverage and the uncoupling effect of proteolysis on variations more than 10-20 residues apart. Because top down MS measures the whole protein, all the variations affecting primary sequence can be detected as they occur in combination. The protein form generated by all types of variation is here termed the "proteotype", akin to a haplotype at the DNA level. Analysis of proteins from human primary leukocytes harvested from leukoreduction filters using a dual on-line/off-line top down MS strategy produced >600 unique intact masses, 133 of which were identified from 67 unique genes. Utilizing a two-dimensional platform, termed multidimensional protein characterization by automated top down (MudCAT), 108 of the above protein forms were subsequently identified in the absence of MS/MS in 4 days. Additionally, MudCAT enables the quantitation of allele ratios for heterozygotes and PTM occupancies for phosphorylated species. The diversity of the human proteome is embodied in the fact that 32 of the identified proteins harbored cSNPs or PTMs, or were detected as proteolysis products. Among these were three partially phosphorylated proteins and three proteins heterozygous at known cSNP loci, with evidence for non-1:1 expression ratios obtained for different alleles.
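The quantitation of PTM occupancies and allele ratios mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each protein form contributes a single summed intact-mass signal intensity and that the forms ionize comparably; the abstract does not describe how MudCAT computes these values. The function names, the 0.05 Da tolerance and the example numbers are hypothetical. The only fixed quantity is the monoisotopic mass shift of one phosphorylation (HPO3, about 79.966 Da).

```python
# Hypothetical helpers; not the MudCAT implementation.
PHOSPHO_SHIFT_DA = 79.96633  # monoisotopic mass added by one phosphorylation (HPO3)


def is_phospho_pair(mass_low, mass_high, tol_da=0.05):
    """Return True if two intact masses differ by one phosphorylation.

    tol_da is an assumed tolerance in daltons.
    """
    return abs((mass_high - mass_low) - PHOSPHO_SHIFT_DA) <= tol_da


def ptm_occupancy(modified_intensity, unmodified_intensity):
    """Fraction of the protein population carrying the modification."""
    total = modified_intensity + unmodified_intensity
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return modified_intensity / total


def allele_ratio(allele_a_intensity, allele_b_intensity):
    """Expression ratio of allele A to allele B for a heterozygote."""
    if allele_b_intensity <= 0:
        raise ValueError("allele B intensity must be positive")
    return allele_a_intensity / allele_b_intensity


if __name__ == "__main__":
    # Made-up intensities, for illustration only.
    print(is_phospho_pair(10000.000, 10079.966))
    print(ptm_occupancy(30.0, 70.0))
    print(allele_ratio(60.0, 40.0))
```

Under these assumptions, an allele ratio that departs from 1.0 corresponds to the non-1:1 expression described above, and an occupancy strictly between 0 and 1 corresponds to a partially phosphorylated protein.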
ASJC Scopus subject areas
- Analytical Chemistry