Small open reading frames (smORFs) encode previously unannotated polypeptides or short proteins that regulate translation in cis (eukaryotes) and/or are independently functional (prokaryotes and eukaryotes). Ongoing efforts toward complete annotation and functional characterization of smORF-encoded proteins have yielded novel regulators and therapeutic targets. However, because they are excluded from protein databases, can initiate at non-AUG start codons, and produce few unique tryptic peptides, unannotated small proteins cannot be detected with standard proteomic methods. Here, we outline a procedure for mass spectrometry-based detection of translated smORFs in cultured human cells, from protein extraction, digestion, and LC-MS/MS to database preparation and data analysis. Following proteomic detection, translation from a unique smORF may be validated via siRNA-based silencing or via overexpression and epitope tagging. This validation is necessary to unambiguously assign a peptide to a smORF within a specific transcript isoform or genomic locus. Provided that sufficient starting material is available, this workflow can be applied to any cell type or organism and adjusted to study specific (patho)physiological contexts including, but not limited to, development, stress, and disease. © 2019 by John Wiley & Sons, Inc.
Basic Protocol 1: Protein extraction, size selection, and trypsin digestion
Alternate Protocol 1: In-solution C8 column size selection
Support Protocol 1: Chloroform/methanol precipitation
Support Protocol 2: Reduction, alkylation, and in-solution protease digestion
Support Protocol 3: Peptide de-salting
Basic Protocol 2: Two-dimensional LC-MS/MS with ERLIC fractionation
Basic Protocol 3: Transcriptomic database construction
Alternate Protocol 2: Transcriptomics database generation with gffread
Basic Protocol 4: Non-annotated peptide identification from LC-MS/MS data
Basic Protocol 5: Validation using isotopically labeled synthetic peptide standards and siRNA
Basic Protocol 6: Transcript validation using transient overexpression
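The idea behind transcriptomic database construction, and the reason small proteins yield few tryptic peptides, can be illustrated with a minimal Python sketch. It is not the software used in the protocols. The function names, the 10 to 150 amino acid size window, the 7 to 30 residue peptide window, and the choice to write every initiator residue as Met (including at non-AUG starts) are all assumptions made for illustration.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def find_smorfs(seq, start_codons=("ATG",), min_aa=10, max_aa=150):
    """Three-frame scan of one transcript for start-to-stop ORFs.

    Returns (frame, start_offset, protein) tuples. Only the first start
    codon upstream of each stop is kept. The size limits are
    illustrative assumptions, not values from the protocol.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        start = None
        for idx, codon in enumerate(codons):
            if start is None and codon in start_codons:
                start = idx
            elif start is not None and CODON_TABLE.get(codon, "X") == "*":
                # Assumption: the initiator is written as Met even for
                # non-AUG start codons.
                body = "".join(CODON_TABLE.get(c, "X") for c in codons[start + 1:idx])
                protein = "M" + body
                if min_aa <= len(protein) <= max_aa:
                    orfs.append((frame, frame + 3 * start, protein))
                start = None
    return orfs


def tryptic_peptides(protein, min_len=7, max_len=30):
    """In silico trypsin digest: cleave after K or R unless followed by P.

    The length window is an assumed range of peptides amenable to
    LC-MS/MS detection.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in pieces if min_len <= len(p) <= max_len]


# Hypothetical transcript containing one 12-residue smORF in frame 2.
transcript = "GG" + "ATG" + "GCT" * 10 + "AAA" + "TAA" + "CC"
for frame, offset, protein in find_smorfs(transcript):
    print(frame, offset, protein, tryptic_peptides(protein))
```

Passing additional codons, for example `start_codons=("ATG", "CTG")`, extends the scan to non-AUG initiation. In this toy transcript the smORF yields a single peptide within the assumed length window, which illustrates why each small protein offers few opportunities for detection.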
Keywords: genomics; mass spectrometry; microprotein; peptidomics; proteogenomics; proteomics; short open reading frame; small open reading frame; small protein; transcriptomics; upstream open reading frame.