Automated Sample Preparation
An Overview for the Proteomics Laboratory
Automation - The genomes of many species, including humans, have been decoded with modern genomic methods such as "second-generation sequencing". Transcriptomics allows the transcription of thousands of genes to be determined simultaneously, while proteomics determines the expression of proteins on a large scale. In contrast to the genome, both the proteome and the transcriptome vary over time and between tissues. The global analysis of complex protein samples, i.e. decoding the proteome, is more difficult than sequencing a genome or measuring a transcriptome.
In essence, there are two reasons for this difficulty: firstly, multiple protein variants can be produced from a single gene and, secondly, the dynamic range of the mass spectrometer used for shotgun analysis of the proteome is limited. In theory, one gene encodes one protein; however, alternative splicing and alternative translation initiation sites may result in different protein variants. In addition, post-translational modifications of amino acid residues, such as phosphorylation, methylation or ubiquitination, may occur at various positions. Some proteins, such as ribosomal or proteasomal subunits or histones, are present in large quantities, whereas many candidates that could be of greater interest, such as transcription factors, are present only in low copy numbers. The range from highly abundant to low-copy proteins can span around 10 orders of magnitude, whereas the dynamic range of modern mass spectrometers covers only about three to five orders of magnitude.
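The mismatch between abundance range and instrument range can be illustrated with a few lines of Python. This is only a rough sketch: the copy numbers and the assumed instrument range of four orders of magnitude are hypothetical illustration values, and real detectability depends on many further factors.

```python
import math

def within_dynamic_range(copies_per_cell, instrument_orders=4.0):
    """Flag proteins whose abundance lies within `instrument_orders` orders of
    magnitude of the most abundant protein (a deliberately crude model)."""
    most_abundant = max(copies_per_cell.values())
    return {
        name: math.log10(most_abundant / copies) <= instrument_orders
        for name, copies in copies_per_cell.items()
    }

# Hypothetical copy numbers per cell, chosen only for illustration.
example = {
    "histone": 10_000_000,
    "proteasome subunit": 100_000,
    "transcription factor": 100,
}
print(within_dynamic_range(example))
```

In this toy example the transcription factor lies five orders of magnitude below the histone and therefore falls outside the assumed window, which is why complexity reduction is needed.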
At present, the greatest challenge is to reduce the complexity of the samples so that interesting proteins expressed only in small quantities can be detected. This can be achieved by various chromatography and enrichment methods. Cutting-edge mass-spectrometry-based proteomics research is currently peptide-based. Proteins therefore have to be converted into peptides before the mass-spectrometric analysis, which is designed to identify as many peptides as possible. In a subsequent bioinformatics analysis, the peptides are reassembled into proteins (bottom-up principle).
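The bottom-up principle can be sketched as a simple mapping of identified peptides back to protein sequences. The function name, the protein names and the sequences below are hypothetical, and the naive substring matching is an assumption made for illustration; real search engines score spectra and handle shared peptides statistically.

```python
def infer_proteins(identified_peptides, protein_db):
    """Map each protein to the identified peptides found in its sequence."""
    evidence = {}
    for name, sequence in protein_db.items():
        matches = sorted(p for p in set(identified_peptides) if p in sequence)
        if matches:
            evidence[name] = matches
    return evidence

# Hypothetical sequences and peptide identifications.
protein_db = {"protein_A": "MKTAYRAGKLLE", "protein_B": "MSSGKTAYRPLV"}
peptides = ["TAYR", "AGK", "MSSGK"]

for protein, hits in infer_proteins(peptides, protein_db).items():
    print(protein, hits)
```

The peptide TAYR occurs in both hypothetical proteins, which shows why peptides shared between proteins make the reassembly step ambiguous.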
Mass spectrometers directly coupled to a nano-HPLC system (LC-MS/MS) are widely used and enable the detection of up to 50,000 peptides per analysis. With present state-of-the-art instruments it is possible to identify up to 8,000 proteins in a sample within a few hours of measurement [2,3].
Efficient digestion of proteins is a main prerequisite for such deep analysis. Typically, digestion is performed with trypsin, a specific endopeptidase that cleaves proteins after the amino acids lysine and arginine. It is particularly important to achieve complete and reproducible digestion. Digestion efficiency varies with the conditions, in particular the buffer composition, duration and temperature.
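The cleavage rule can be made concrete with a minimal in-silico digestion. The sketch below is a simplification under stated assumptions: it cleaves after every lysine (K) and arginine (R), ignores any sequence-context exceptions, and uses a hypothetical sequence; the `missed_cleavages` parameter is an illustrative way to model incomplete digestion.

```python
def digest(sequence, cleave_after="KR", missed_cleavages=0):
    """Return the peptides produced by cleaving after the given residues."""
    # Cut the sequence into fully cleaved pieces.
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])  # C-terminal remainder

    # Join neighbouring pieces to model incompletely digested peptides.
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides

protein = "MKTAYRAGKLLE"  # hypothetical sequence
print(digest(protein))                      # complete digestion
print(digest(protein, missed_cleavages=1))  # allows one missed site per peptide
```

With complete digestion the hypothetical sequence yields MK, TAYR, AGK and LLE. Passing `cleave_after="K"` models an enzyme that cleaves only after lysine.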
A typical proteomics workflow consists of three parts: 1. sample preparation, 2. LC-MS, 3. data analysis. All three parts can be automated. Sample preparation can be subdivided into the preparation of cell lysates, digestion of the proteins into peptides and, where required, fractionation. In this article we focus on the automation of protease digestion, one of the keystones of bottom-up proteomics applications.
Protein Digestion in Solution
Tryptic digestion of proteins in solution can be performed under native or denaturing conditions. Protocols based on the denaturing principle are usually more efficient. Under these conditions, all structures with the exception of disulfide bonds are lost. In the first step, the disulfide bonds between cysteine residues are broken by reduction (Fig. 1A); this is followed by alkylation to prevent re-formation of the bonds. Some enzymes, such as the endopeptidase LysC, are active under denaturing conditions and can be used for pre-digestion. LysC cleaves only after the amino acid lysine and cuts the proteins in the mixture into shorter, non-functional fragments. For the tryptic digestion, the mixture is diluted, as trypsin is inactive under strongly denaturing conditions. Owing to the partial pre-digestion, the proteins can no longer refold and are therefore more accessible to trypsin.
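The dilution step before trypsin addition is simple arithmetic that a liquid-handling script has to carry out for every sample. The sketch below assumes a generic denaturant and hypothetical concentrations (8 M diluted to 2 M); the text above does not specify these values, so they serve only as an illustration.

```python
def diluent_volume(sample_volume_ul, start_conc_m, target_conc_m):
    """Volume of buffer (in µl) to add so that the denaturant concentration
    drops from `start_conc_m` to `target_conc_m` (from C1 * V1 = C2 * V2)."""
    if not 0 < target_conc_m <= start_conc_m:
        raise ValueError("target must be positive and not above the start concentration")
    return sample_volume_ul * (start_conc_m / target_conc_m - 1)

# Hypothetical example: 50 µl sample, denaturant diluted from 8 M to 2 M.
print(diluent_volume(50, 8.0, 2.0))  # 150.0
```

A fourfold dilution of 50 µl therefore requires 150 µl of buffer, a volume that has to fit into the reaction vessel used on the robot.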
Protease Digestion in Polyacrylamide Gels or on Membranes
For mass-spectrometry-based proteomics it is essential that no interfering substances are present in the samples, as these can drastically impair the measurement. However, many biological samples contain detergents or other substances that have to be removed. For this purpose, polyacrylamide gel electrophoresis (PAGE) or FASP (filter-aided sample preparation) filters have been used successfully. For samples separated by PAGE, the individual gel bands are cut out of the gel and subsequently washed with aqueous and organic solvents in a large number of washing steps. Organic solvents such as ethanol or acetonitrile are used to dehydrate the gel bands, and aqueous buffers are used to rehydrate them (Fig. 1B). Repeated dehydration and rehydration enhances diffusion into the gel and thereby allows more efficient washing. Afterwards, the proteins in the gel are reduced and alkylated with the appropriate reagents, as described for in-solution digestion, and then digested. The resulting peptides are extracted from the gel matrix using organic solvents such as acetonitrile.
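The in-gel procedure is a fixed sequence of liquid additions and removals, which is what makes it suitable for a robot. The following sketch generates such a step list; the function name, the default of three wash cycles and the generic reagent labels are assumptions for illustration and not a validated protocol.

```python
def in_gel_steps(wash_cycles=3):
    """Return (step, liquid) pairs for washing, reduction, alkylation,
    digestion and peptide extraction of one excised gel band."""
    steps = []
    for cycle in range(1, wash_cycles + 1):
        # Each wash cycle shrinks the gel piece, then swells it again.
        steps.append((f"wash {cycle}: dehydrate", "organic solvent"))
        steps.append((f"wash {cycle}: rehydrate", "aqueous buffer"))
    steps += [
        ("reduce disulfide bonds", "reducing agent"),
        ("alkylate cysteines", "alkylating agent"),
        ("digest", "protease in buffer"),
        ("extract peptides", "organic solvent"),
    ]
    return steps

for number, (step, liquid) in enumerate(in_gel_steps(), start=1):
    print(f"{number:2d}. {step} ({liquid})")
```

Even with only three wash cycles the list already contains ten liquid-handling steps per gel band, which illustrates how quickly the manual workload grows with the number of samples.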
Automation of in-solution protease digestion essentially requires a robot that is able to transfer liquids without causing contamination. Contamination is prevented by changing the tips on the transfer unit or by extensive washing steps. Suitable robots for this purpose are the Freedom Evo (Tecan), the Progest Automat (Digilab) and the PALs (CTC). Throughput and flexibility are often mutually exclusive, and both correlate with the purchase price. A Tecan device is advisable for high-throughput processing, whereas the much cheaper CTC PALs are a more flexible but slower platform. Sample digestion with an Agilent 1100 or 1200 HPLC system coupled online to the MS has also been described. Such a system can be used if digestion, peptide purification and the subsequent workflow are to be automated in sequence, as is also possible with the AssayMAP Bravo (Agilent). After digestion, these systems are able to purify the peptides by reversed-phase chromatography and then to fractionate them efficiently on a strong cation exchanger. Figure 2 shows a scheme for such an in-solution setup.
Unlike gel spots from 2D electrophoresis, bands from one-dimensional polyacrylamide gels usually contain much more complex samples, and the excised gel bands are larger. In-gel digestion removes interfering substances well, but requires a large number of washing steps. In addition, the gel bands must be cut out. Automation is therefore highly desirable. Equipment for excising bands has been developed, primarily for cutting spots out of 2D gels in combination with MALDI analysis. Automated gel-band cutting also has the advantage that typical contaminants such as keratin, which is introduced via human skin and hair, are avoided.
The excised gel bands then have to be washed, the disulfide bridges of the proteins reduced and alkylated and, finally, the proteins subjected to enzymatic digestion. A vacuum chamber setup has been developed for this purpose. The gel bands are placed in the wells of a 96-well plate fitted with a PVDF filter at the base. The filter allows liquid to be aspirated from the wells. In a comparison of the manual and the automated procedure, comparable sequence coverage was achieved for the recombinant proteins used in that study. To test more complex samples, an automated iTRAQ labeling approach following in-gel digestion was used; it demonstrated that the quantification of more complex protein samples is possible with the automated protocol. Automation increased reproducibility in two respects: it reduced the potential variability in protein digestion efficiency and increased the labeling efficiency. Figure 3 shows a possible robotic setup for in-gel digestion using a PAL-based system.
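When gel bands are distributed over a 96-well plate, each band has to be tracked by its well position. A minimal sketch of such bookkeeping is shown below; it assumes the common layout of 8 rows (A to H) and 12 columns filled row by row, and the function name and band labels are hypothetical.

```python
import string

def well_name(index, rows=8, cols=12):
    """Convert a zero-based sample index into a well name such as 'A1',
    filling the plate row by row."""
    if not 0 <= index < rows * cols:
        raise ValueError("index outside the plate")
    row, col = divmod(index, cols)
    return f"{string.ascii_uppercase[row]}{col + 1}"

# Hypothetical band labels assigned to consecutive wells.
bands = ["band_01", "band_02", "band_03"]
layout = {band: well_name(i) for i, band in enumerate(bands)}
print(layout)  # {'band_01': 'A1', 'band_02': 'A2', 'band_03': 'A3'}
```

Keeping this mapping alongside the robot method makes it possible to trace each LC-MS result back to the excised band.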
Summary and Outlook
LC-MS-based proteomics relies on the efficient and reproducible digestion of protein samples into peptides. Sample throughput must be maximized and sample-handling errors reduced. These requirements can be met by various automation systems. Some of these systems are able to extend the digestion workflow with peptide purification and fractionation. In future, the existing workflows will need to be optimized, adapted to specific requirements and implemented for further applications such as labeling techniques.
Hebert A. S. et al.: Mol. Cell. Proteomics 13, 339-347 (2014).
Mann M. M. et al.: Molecular Cell 49, 583-590 (2013).
Meissner F. and Mann M.: Nat. Immunol. 15, 112-117 (2014).
Lowenthal M. S. et al.: Anal. Chem. 86, 551-558 (2014).
Wisniewski J. R. et al.: J. Proteome Res. 8, 5674-5678 (2009).
Richardson J. et al.: Analytical Biochemistry 411, 8-8 (2011).
Russell J. D. et al.: Agilent Technologies, Inc. (2014).
Schmidt C. and Urlaub H.: Methods in Molecular Biology 564, 207-226 (Humana Press, 2009).