1 Arizona State University, Tempe, Arizona;
2 University of Maryland Baltimore, School of Medicine, Baltimore, Maryland;
3 University of Texas MD Anderson Cancer Center, Houston, Texas;
4 Pfizer Inc., St. Louis, Missouri;
5 University of Texas, Austin, Texas;
6 Emory University, Atlanta, Georgia;
7 Monsanto Co., St. Louis, Missouri;
8 University of Florida, Gainesville, Florida
Correspondence: Daniel C. Brune, Dept. of Chemistry and Biochemistry, Arizona State University, Tempe, AZ 85287-1604 (phone: 480-965-0795; fax: 480-965-2747; email: dbrune{at}asu.edu).
| ABSTRACT |
|---|
-acetyl lysine [Lys(Ac)], present within one of the peptides. The ESRG 2006 peptide mixture consisted of three synthetic peptides. The Peptide Standards Research Group (PSRG) provided two peptides, with the following sequences: KAQYARSVLLEKDAEPDILELATGYR (peptide B), and RQAKVLLYSGR (peptide C). The third peptide, peptide C*, synthesized and characterized by the ESRG, was identical to peptide C but with acetyl lysine in position 4. The mixture consisted of 20% peptide B and 40% each of peptide C and its acetylated form, peptide C*. Participating laboratories were provided with two tubes, each containing 100 picomoles of the peptide mixture (as determined by quantitative amino acid analysis), and were asked to provide amino acid assignments, peak areas, and retention times at each cycle, as well as initial and repetitive yield estimates for each peptide in the mixture. Details about instruments and parameters used in the analysis were also collected. Participants in the study with access to a mass spectrometer (MALDI-TOF or ESI) were asked to provide information about the relative peak areas of the peptides in the mixture as a comparison with the peptide quantitation results from Edman sequencing. Positive amino acid assignments were 88% correct for peptide C and 93% correct for peptide B. The absolute initial sequencing yields averaged 67% for peptides (C+C*) and 65.6% for peptide B. The relative molar ratios determined by Edman sequencing averaged 4.27 (expected ratio of 4) for peptides (C+C*)/B, and 1.49 for peptide C*/C (expected ratio of 1); the seemingly high 49% error in quantification of Lys(Ac) in peptide C* can be attributed to the commercial unavailability of its PTH standard. These values compare very favorably with the values obtained by mass spectrometry.
Key Words: Edman sequencing; phenylthiohydantoin (PTH) amino acid; polypeptide quantitation; quantitative analysis
To evaluate quantitative aspects of Edman sequencing in contemporary instruments, we provided study participants with a sample containing three well-characterized polypeptides. Two of the polypeptides were provided by the ABRF Peptide Standards Research Group. The third was a homologue of one of these peptides. The homologous peptide was synthesized by a member of the ESRG and differed in a single amino acid modification. The homologous peptide was quantified from its absorbance at 280 nm and by amino acid analysis. Including this homologous peptide gave participants the additional challenge of identifying the modified amino acid. Participants were asked to provide sequencing data used to determine both the absolute and relative amounts of the different polypeptides in the mixture.
The ESRG 2006 study is the eighteenth in a series on Edman sequencing conducted for the ABRF by the Edman Sequencing Research Group. The objectives and results of the 17 previous studies are summarized in Table 1 (refs. 1–17).
| MATERIALS AND METHODS |
|---|
The peptides used in this study were therefore as follows:
Peptide B (20 pmol): KAQYARSVLLEKDAEPDILELATGYR
Peptide C (40 pmol): RQAKVLLYSGR
Peptide C* (40 pmol): RQAK(acetyl)VLLYSGR
Samples for distribution to study participants were prepared as follows. Peptide standards B and C (1 mg) were dissolved separately in 5 mL of 30% acetonitrile in water containing 10 mM trifluoroacetic acid (TFA), resulting in stock solutions containing 67.8 µM Peptide B and 155 µM Peptide C. The final peptide mixture was then prepared by adding 15.25 µL C*, 59 µL B, and 51.6 µL C to 1874 µL of 30% acetonitrile in water with 10 mM TFA to give a stock solution containing 4.0 µM C*, 4.0 µM C, and 2.0 µM B. Ten-microliter samples of this mixture were placed in 0.6-mL Eppendorf tubes and dried in a Speed-Vac. These sample tubes, each containing 40 pmol each of peptides C and C* and 20 pmol of peptide B, were stored at –20°C until mailing to study participants.
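The dilution arithmetic in the paragraph above can be checked with a short sketch. The stock concentrations and volumes are taken from the text; the function names are illustrative, and the stock concentration of peptide C* is not stated in the text, so the value computed here is only back-calculated from the 4.0 µM target.

```python
# Sketch of the dilution arithmetic for the peptide mixture.
# Concentrations and volumes come from the text; names are illustrative.

STOCK_UM = {"B": 67.8, "C": 155.0}  # stock concentrations in µM
VOLUMES_UL = {"C*": 15.25, "B": 59.0, "C": 51.6, "diluent": 1874.0}

total_ul = sum(VOLUMES_UL.values())  # about 2000 µL


def final_conc_um(stock_um, added_ul, total_volume_ul):
    """Solve C1 * V1 = C2 * V2 for the final concentration C2."""
    return stock_um * added_ul / total_volume_ul


def pmol_in_aliquot(conc_um, aliquot_ul):
    """1 µM x 1 µL = 1 pmol, so the product is already in pmol."""
    return conc_um * aliquot_ul


conc_b = final_conc_um(STOCK_UM["B"], VOLUMES_UL["B"], total_ul)
conc_c = final_conc_um(STOCK_UM["C"], VOLUMES_UL["C"], total_ul)

# Assumption: the C* stock concentration is not given in the text, so it is
# back-calculated from the stated 4.0 µM final concentration.
implied_cstar_stock_um = 4.0 * total_ul / VOLUMES_UL["C*"]

pmol_b = pmol_in_aliquot(conc_b, 10.0)
pmol_c = pmol_in_aliquot(conc_c, 10.0)

print(f"B: {conc_b:.2f} µM, {pmol_b:.1f} pmol per 10 µL")
print(f"C: {conc_c:.2f} µM, {pmol_c:.1f} pmol per 10 µL")
print(f"implied C* stock: {implied_cstar_stock_um:.0f} µM")
```

With the stated volumes, the 10-µL aliquots come out at about 20 pmol of peptide B and 40 pmol of peptide C, matching the amounts quoted for the sample tubes.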
Sample Distribution
The ESRG announced the 2006 study by e-mail via the ABRF discussion board, as well as on the main ABRF Web page under "Open Research Studies" and on the ESRG Web page. A total of 34 requests for samples were received. Each person requesting samples received duplicate tubes containing the peptide mixture via regular mail.
Sample Analysis
Preliminary Edman degradation analysis by members of the ESRG showed that the sample contained peptides with the expected sequences in approximately the expected amounts. Data from the ESRG are included in the summary of quantitative results from sequencing this sample.
An amino acid analysis was performed on this peptide mixture by a member of the ESRG to confirm that its amino acid composition corresponded quantitatively to the amounts of the peptides that were added. For this measurement, 100 µL of the dried sample was dissolved in 20% acetonitrile containing 0.1% TFA and transferred to a hydrolysis tube. This sample was taken to dryness and subjected to vapor-phase acid hydrolysis (6 N HCl containing 1% (v/v) phenol) at 150°C for 90 min. The resulting amino acids were separated over an ion-exchange HPLC column and subjected to post-column ninhydrin derivatization using a Hitachi L-8800 amino acid analyzer. Amino acids in 0.1 mol/L hydrochloric acid (NIST Standard Reference Material 2389) were used to calibrate each amino acid peak area and to show chromatographic reproducibility. A bovine serum albumin control (7% solution; NIST Standard Reference Material 927c) was used to verify that a proper hydrolysis had occurred. An internal standard, norvaline (Sigma Cat. No. N7627), was added to account for injection-to-injection variability on the analyzer. This analysis showed an average recovery of 92.5% for the amino acids present in this mixture (Table 2).
Data Reporting
The sequencing data were reported electronically in an anonymously submitted Excel spreadsheet. The spreadsheet included cells for reporting the fraction of the sample sequenced, the amino acids observed on each cycle of sequencing, the retention time and peak area for each amino acid, as well as the peak areas, retention times, and picomolar amounts of a set of PTH-AA standards. Three separate areas were included for arranging the amino acids in the three peptides in order according to the peptide sequences. Laboratories were asked to report known amino acids using the common three-letter amino acid code, to report "X" for unidentified modified amino acids, and to report "–" for no observed amino acid peaks for each cycle. They were asked to indicate their confidence level in the call by placing parentheses around tentative calls. The spreadsheet also included entry slots for the mass spectrometry data, including masses and peak areas for the three peptides and information about the type of instrument and analysis mode.
In order to facilitate accurate quantitation, participants were encouraged to use freshly prepared standards of the best quality available for this analysis. They were also encouraged, but not required, to calculate and report the relative and absolute amounts of the peptides in the mixture. In order to ensure that quantitative determinations were done in a uniform manner, the relative and absolute amounts of the peptides were also calculated by the ESRG using the data provided.
Finally, an instrument and analytical conditions survey was included on the spreadsheet to determine how the participating laboratories conduct Edman degradation. Each laboratory was asked to provide information on their instruments, including HPLC gradient conditions, buffers and solvents, chemistry cycles, and other parameters that could affect the results of the study.
| RESULTS AND DISCUSSION |
|---|
Instrumentation used by members of the ESRG was generally similar. Although ESRG data are excluded from the part of the study dealing with sequencing accuracy, these data are included in the study on peptide quantitation. ESRG member facilities providing quantitative data used the following instruments (included in Table 4 below): (1) Porton 2090e, (2) ABI Procise 494HT, (3) ABI 494 HT, (4) ABI Procise cLC, and (5) ABI 494 HT.
Thus picomolar quantities were obtained by dividing reported peak areas from the appropriate cycles by the peak area per pmol of the corresponding amino acid standard.
Because only one facility (No. 110) actually ran a PTH standard for Lys(Ac), an approximate value for the picomolar quantity of this residue for the other facilities was calculated by assuming that its peak area/pmol was 1.11 times the average area/pmol for Ala and Tyr. This assumption is based on the ESRG 2004 study (ref. 16), in which the observed Lys(Ac) peak area averaged 1.11 times the area whose log lay on the trendline of a plot of log area vs. sequencing cycle constructed from regularly spaced Ala and Tyr residues. Although different sequencers in that study gave different ratios between the Lys(Ac) peak area and the observed area on the sequencing cycle where that residue occurred, the factor of 1.11 was the average for both HT and cLC sequencers from ABI. Lys(Ac) values obtained in this way are included in Table 4A.
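The conversion from peak area to picomoles, and the approximation used for Lys(Ac), can be sketched as follows. The factor of 1.11 is taken from the text; the function names and all peak areas are hypothetical values chosen only to illustrate the calculation.

```python
# Sketch: converting peak areas to pmol and approximating the Lys(Ac)
# response. Only the 1.11 factor comes from the text; all areas below
# are hypothetical.

LYS_AC_FACTOR = 1.11  # average factor from the ESRG 2004 study, per the text


def area_per_pmol(standard_area, standard_pmol):
    """Detector response of a PTH-amino acid standard (area units per pmol)."""
    return standard_area / standard_pmol


def pmol_from_area(peak_area, response):
    """Picomoles of a residue: reported peak area divided by area/pmol."""
    return peak_area / response


def lys_ac_response(ala_response, tyr_response, factor=LYS_AC_FACTOR):
    """Approximate Lys(Ac) area/pmol as factor x mean of Ala and Tyr."""
    return factor * (ala_response + tyr_response) / 2.0


# Hypothetical standards: 10 pmol of each PTH-amino acid injected.
ala_resp = area_per_pmol(20000.0, 10.0)  # 2000 area units per pmol
tyr_resp = area_per_pmol(18000.0, 10.0)  # 1800 area units per pmol

lys_ac_resp = lys_ac_response(ala_resp, tyr_resp)
lys_ac_pmol = pmol_from_area(42180.0, lys_ac_resp)  # hypothetical cycle-4 area

print(f"Lys(Ac) response: {lys_ac_resp:.0f} area units/pmol")
print(f"Lys(Ac) amount:   {lys_ac_pmol:.1f} pmol")
```

A facility that ran its own Lys(Ac) standard, as facility 110 did, would use the measured response instead of `lys_ac_response`.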
Initial and repetitive yields were determined using the Excel trendline function to plot logs of the picomolar amounts of Ala, Val, Leu, and Tyr as a function of the sequencing cycle (residues 3, 5, 6, 7, and 8 in peptides C+C*, and residues 2, 4, 5, 8, 9, and 10 in peptide B; cells containing these values are shaded in Table 4). These residues were selected as being likely to give stable PTH derivatives with well-resolved peaks during analysis by HPLC. The antilog of the y-intercept of the log plot is the initial yield, and thus the picomolar amount theoretically present in the sample loaded. This corresponds to the yield from sequencing cycle 0, and eliminates declines in yield due to lag and other repetitive losses (ref. 18). The antilog of the slope of the log plot is the repetitive yield. Figure 2 shows an example of how these values were calculated.
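The trendline calculation described above amounts to an ordinary least-squares fit of log(pmol) against cycle number. The following is a minimal sketch of that fit, not the Excel procedure itself; the function name and the synthetic data (an initial yield of 50 pmol and a repetitive yield of 94%) are assumptions chosen for illustration.

```python
import math


def fit_yields(cycles, pmol):
    """Least-squares line through (cycle, log10(pmol)).

    Returns (initial_yield, repetitive_yield): the antilog of the
    y-intercept and the antilog of the slope, respectively.
    """
    logs = [math.log10(p) for p in pmol]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** intercept, 10 ** slope


# Cycles used for peptides C+C* in the text (Ala 3, Val 5, Leu 6, Leu 7, Tyr 8).
cycles = [3, 5, 6, 7, 8]

# Synthetic, noise-free data: hypothetical I.Y. = 50 pmol, R.Y. = 0.94.
pmol = [50.0 * 0.94 ** c for c in cycles]

initial_yield, repetitive_yield = fit_yields(cycles, pmol)
print(f"initial yield:    {initial_yield:.1f} pmol")
print(f"repetitive yield: {repetitive_yield:.1%}")
```

Because the fit pools several different amino acids, an error in the response of a single standard shifts both the slope and the intercept, which is the effect discussed for facility 110 below.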
Most of the repetitive yield values were in the range between 90% and 98%, which is reasonable for contemporary sequencers performing automated Edman degradation. A few values lay outside this range. In one instance (Facility 110) a low repetitive yield (76%) was primarily due to an anomalously high peak area for the Tyr standard (residue 8), resulting in a low yield for Tyr relative to other amino acids in the sequence. Leaving out this residue in the calculations (for facility 110) changes the initial and repetitive yields to more reasonable values of 92.5 pmol and 84%, respectively. This shows the potentially large effect of an error in the calculated amount of one amino acid when several different amino acids are used for repetitive and initial yield calculations.
One way to avoid problems of this sort is to base repetitive yields on a single type of amino acid that occurs repeatedly throughout the sequence. This procedure has been used in several previous ESRG studies (e.g., refs. 16, 17) and was also used in the careful analysis of repetitive yields by Smithies et al. (ref. 18). It is also commonly used to check sequencer performance with β-lactoglobulin, which has well-spaced pairs of Leu, Ile, and Val residues, as a standard. We could not use this procedure in this study, partly because we chose to use the well-characterized peptide standards prepared by the ABRF Peptide Standards Research Group and partly because we wanted our results to apply to actual samples that might not contain a regularly recurring amino acid residue. As a result, this study represents a fairly severe test of the accuracy of Edman degradation in obtaining absolute and relative concentrations of polypeptides in a mixture.
Absolute Amounts of the Peptides in the Mixture
Absolute amounts of peptides C+C* and B in the samples provided were calculated by dividing the initial yields by the fraction of the peptide loaded, and are also shown in Tables 4A and 4B. Figure 3 shows the picomolar amounts of each of the peptides determined from the data of participating facilities in graphic form. As shown in the figure, the absolute picomolar amounts of the peptides determined from data provided by the participating facilities were, on average, about two-thirds of the actual amounts supplied in the samples, and, with a few outliers, the absolute amounts determined tended to cluster fairly closely around the average values. The amino acid analysis of the peptide mixture (Table 2) indicated that the recoverable amount of the peptides in the mixture may have been only 92.5% of the expected amount. Taking this into account improves the absolute sequencing yield from about 66.3% (the average of the 67.0% yield for peptides C+C* and 65.6% for peptide B) to 66.3/92.5, or about 72%. This suggests a loss of about 30% for reasons such as sample washout, partial inaccessibility of the N-terminal residue to sequencing reagents, or incomplete extraction and transfer of the cleaved amino acid to the conversion flask. Examining the data used to calculate the initial yields did not reveal any correlation with either the fraction of the sample loaded into the sequencer or the R² value of the plot used to determine repetitive and initial yields.
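The yield arithmetic in the paragraph above can be reproduced in a few lines. The percentages (67.0, 65.6, and 92.5) are from the text; the function name and the example initial yield and loaded fraction are hypothetical.

```python
# Sketch of the absolute-yield arithmetic. Percentages are from the text;
# the example initial yield and loaded fraction are hypothetical.


def absolute_amount(initial_yield_pmol, fraction_loaded):
    """Amount in the whole sample: initial yield divided by fraction loaded."""
    return initial_yield_pmol / fraction_loaded


# Hypothetical facility: half the sample loaded, 26.8 pmol initial yield
# for peptides C+C* (80 pmol expected in the tube).
found_pmol = absolute_amount(26.8, 0.5)
percent_of_expected = 100.0 * found_pmol / 80.0

# Averages reported in the text.
yield_c_total = 67.0  # % for peptides C+C*
yield_b = 65.6        # % for peptide B
aaa_recovery = 92.5   # % recovery from amino acid analysis

mean_yield = (yield_c_total + yield_b) / 2.0
corrected_yield = 100.0 * mean_yield / aaa_recovery
unexplained_loss = 100.0 - corrected_yield

print(f"found: {found_pmol:.1f} pmol ({percent_of_expected:.0f}% of expected)")
print(f"mean yield: {mean_yield:.1f}%, corrected: {corrected_yield:.1f}%")
print(f"remaining loss: {unexplained_loss:.1f}%")
```

The corrected yield works out to roughly 71 to 72%, leaving a loss of a little under 30% to be explained by washout, inaccessible N-termini, or incomplete transfer.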
There is no direct way to calculate the picomolar amounts of the individual peptides C and C* from initial yields because these peptides differed only at position 4. However, picomolar amounts of the individual peptides can be obtained by determining the ratio between unmodified Lys and Lys(Ac) on sequencing cycle 4. Picomoles of Lys(Ac) were calculated as described above by assuming that its peak area/pmol was 1.11 times the average area/pmol for Ala and Tyr. Picomolar amounts of Lys on cycles 1, 4, and 12 were calculated from the Lys area/pmol obtained from the data on standards supplied by each facility.
The phenylthiohydantoin derivative of Lys (PTH-Lys) is known to be less stable than other PTH amino acids, particularly in the presence of peroxide impurities in the sequencing reagents or chromatography solvents (ref. 19). Possibly because of this, the pmol amounts of Lys reported for peptide B on cycles 1 and 12 were frequently below the trendline values for cycles 1 and 12 on the log plot used to determine initial and repetitive yields for peptide B. The log of the trendline amount (in pmol) of an amino acid on sequencing cycle x can be calculated from the following equation
$$\log(\text{pmol AA}_x) = \log(\text{I.Y.}) + x \cdot \log(\text{R.Y.})$$
where AAx is the amino acid released on sequencing cycle x, I.Y. is the initial yield in picomoles, and R.Y. is the repetitive yield. (See Figure 2 for an illustration of how initial and repetitive yields are calculated from the sequencing data.) Therefore, a Lys correction factor for each facility was calculated by averaging the numbers by which the Lys 1 and Lys 12 amounts in peptide B needed to be multiplied to give values whose logs lay on the trendline. The picomolar amount of Lys 4 was multiplied by this correction factor, and the picomolar amount of Lys(Ac) was then divided by the corrected Lys 4 value to obtain a corrected peptide C*/C value. Table 5 shows the picomolar amounts of Lys at position 4 in the sequence of the peptide mixture, as well as the peptide C*/C ratio, calculated both from the peak area of Lys on cycle 4 relative to the Lys standard (C4 Lys obs) and from the corrected value obtained after multiplying by the correction factor as described above (C4 Lys corr). As shown in the table, use of the correction factor seems to be justified in that the standard deviation of the values for the C*/C ratio decreased from 2.59 to 0.56. Applying this correction factor caused the average C*/C ratio to decrease from 2.37 to 1.49, which is somewhat greater than the expected ratio of 1.0.
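The Lys correction described above can be sketched as follows, assuming the trendline amount on cycle x is I.Y. × R.Y.^x, as implied by the log-linear plot. All function names and numerical values are hypothetical and are chosen only so that the arithmetic is easy to follow.

```python
# Sketch of the Lys correction factor and the corrected C*/C ratio.
# All numerical inputs are hypothetical.


def trendline_pmol(initial_yield, repetitive_yield, cycle):
    """Trendline amount on a cycle: I.Y. x R.Y.**cycle."""
    return initial_yield * repetitive_yield ** cycle


def lys_correction_factor(iy_b, ry_b, lys1_obs, lys12_obs):
    """Average of the multipliers that bring Lys 1 and Lys 12 of
    peptide B onto the peptide B trendline."""
    f1 = trendline_pmol(iy_b, ry_b, 1) / lys1_obs
    f12 = trendline_pmol(iy_b, ry_b, 12) / lys12_obs
    return (f1 + f12) / 2.0


def cstar_to_c_ratio(lys_ac_pmol, lys4_obs_pmol, factor=1.0):
    """C*/C ratio from cycle 4: Lys(Ac) divided by (corrected) Lys."""
    return lys_ac_pmol / (lys4_obs_pmol * factor)


# Hypothetical peptide B fit and Lys amounts that are 20% below the trendline.
iy_b, ry_b = 14.0, 0.95
lys1_obs = trendline_pmol(iy_b, ry_b, 1) / 1.25
lys12_obs = trendline_pmol(iy_b, ry_b, 12) / 1.25

factor = lys_correction_factor(iy_b, ry_b, lys1_obs, lys12_obs)

# Hypothetical cycle-4 amounts in the mixture.
ratio_obs = cstar_to_c_ratio(30.0, 16.0)
ratio_corr = cstar_to_c_ratio(30.0, 16.0, factor)

print(f"correction factor: {factor:.2f}")
print(f"C*/C observed: {ratio_obs:.2f}, corrected: {ratio_corr:.2f}")
```

Because PTH-Lys tends to be under-recovered, the factor is usually greater than 1, which raises the Lys 4 amount and lowers the C*/C ratio, the direction of change reported in Table 5.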
$$\text{pmol C} = \frac{\text{pmol (C+C*)}}{1 + \text{C*}/\text{C}}$$
The absolute picomolar amount of peptide C* is then simply the amount of peptide C times the C*/C ratio.
Relative Yields
Relative yields of peptides (C+C*)/B were calculated from the initial yield data shown in Table 5 and are summarized graphically in Figure 4A. As shown here, the average peptide (C+C*)/B ratio found experimentally was 4.27, which corresponds to an average difference of only 6.8% from the expected ratio of 4.0. Twenty of the 23 laboratories analyzing the sample obtained ratios between 3.0 and 5.0. This result indicates that Edman sequencing is a reliable method for determining ratios among peptides in a mixture. Comparing Figures 3 and 4 reveals that there is less scatter in the relative yield than in the initial yield data. Thus factors causing errors in initial yield determinations tended to affect the component peptides of the mixture similarly, so that there was a smaller effect on relative than absolute yields.
Three of the participating facilities (220, 500, and 800) reported relative amounts of peptides C and C*, and their reported values are also included in Table 5. Generally, their reported values are in good agreement with those calculated by the ESRG. Facility 220 reported a value of 80 picomoles for peptides C* and C combined and a C*/C ratio of 1, thus implying that each was present at the expected value of 40 picomoles. However, this result must be regarded as somewhat fortuitous, since that facility identified residue 4 in C* as trimethyl Lys instead of Lys(Ac).
Mass Spectrometry Data on the Peptide Mixture
Participants with access to mass spectrometry equipment were encouraged to report masses and peak areas obtained for the peptides using this technique. Fifteen of the 18 facilities participating in this study were able to obtain masses for the three peptides used in this study. As shown in Table 6, most of those 15 participants obtained correct masses for the three peptides. Reported masses for peptides C and C* ranged from 1289.70 to 1291.57 and 1331.40 to 1333.55, respectively, while masses for peptide B ranged from 2946.01 to 2955.86, with one outlier at 3107.30. Most of the differences between reported mass values are undoubtedly due to differences in whether the reported value was that of the uncharged peptide or the protonated peptide ion, or of the monoisotopic or average mass. The more extreme values given for peptide B imply imprecise calibration, or, in the case of the 3107.30-Da outlier, either an adduct that was specific for that peptide (C and C* masses reported by the same facility were correct) or an impurity introduced in sample handling.
Several of the participants providing mass spectrometry data may have reported only the areas of the monoisotopic peaks, while others reported the total area for the entire envelope of isotopic peaks for each peptide. Participants were not asked to specify whether the reported areas were those of all of the peaks or only of the monoisotopic peak. This can make a significant difference. Calculations show that for peptide C, the smallest peptide in the mixture, the monoisotopic peak accounts for 47.5% of the total peak area, while for peptide B, the monoisotopic peak accounts for only 18.3%. The effect of this difference in reporting peak areas is illustrated by the MALDI-TOF data from the ESRG member labs, where two group members (facilities 1 and 2) reported the total area of all of the isotopic peaks for each peptide while two others (facilities 3 and 4) reported only monoisotopic peak areas. In the former case, the area due to peptide B accounted for 14±3% of the total peptide peak area, while in the latter case, peptide B accounted for only 4±1% of the total area.
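The direction of this reporting effect can be illustrated with a rough sketch. It uses the monoisotopic fractions quoted above (47.5% for peptide C, 18.3% for peptide B) and the molar composition of the mixture, and it makes two simplifying assumptions that are not from the study: that all three peptides give the same total signal per mole, and that peptide C* has the same monoisotopic fraction as peptide C.

```python
# Rough sketch of how reporting only monoisotopic peak areas shrinks the
# apparent share of peptide B. Assumptions (not from the study): equal
# total signal per mole for all peptides, and the same monoisotopic
# fraction for C* as for C.

MOLE_FRACTION = {"B": 0.20, "C": 0.40, "C*": 0.40}     # from the text
MONO_FRACTION = {"B": 0.183, "C": 0.475, "C*": 0.475}  # C* value assumed


def share_of_b(areas):
    """Fraction of the summed peak area that belongs to peptide B."""
    return areas["B"] / sum(areas.values())


# Total isotopic envelope: area proportional to moles under the assumption.
envelope_areas = dict(MOLE_FRACTION)

# Monoisotopic peak only: scale each envelope by its monoisotopic fraction.
mono_areas = {p: MOLE_FRACTION[p] * MONO_FRACTION[p] for p in MOLE_FRACTION}

share_envelope = share_of_b(envelope_areas)
share_mono = share_of_b(mono_areas)

print(f"B share, full envelope:     {share_envelope:.1%}")
print(f"B share, monoisotopic only: {share_mono:.1%}")
```

Under these assumptions the apparent share of peptide B falls by more than half when only monoisotopic peaks are reported. The sketch does not reproduce the observed 14% and 4% values, which also reflect differences in ionization response between the peptides.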
The raw peak areas from mass spectrometry show less correlation with the relative amounts of these peptides than do the Edman sequencing data. Mass spectrometry is capable of yielding reasonably accurate quantitative data on polypeptides in a mixture, but only when the response of the mass spectrometer to each peptide is measured relative to an internal standard of known concentration (refs. 20, 21). An inherent advantage of Edman sequencing for polypeptide quantitation is that quantitation is based on the uniform response of the online HPLC system (UV absorbance) to the individual PTH amino acids.
| CONCLUSIONS |
|---|
The absolute picomolar amounts determined were lower than expected by about 30%. This implies losses due to reaction inefficiencies or side reactions in the sequencer reaction cartridge that specifically affected the first sequencing cycle, and thus the initial yield, and/or due to inefficiencies during transfer of the ATZ–amino acid derivatives to the conversion flask. Because repetitive yields were typically in the 90 to 98% range, these losses cannot be attributed to generally low yields for the overall coupling and cleavage reactions. However, it is possible that coupling and cleavage steps on the first cycle are affected by sample impurities that do not affect subsequent cycles, as the sample becomes cleaner due to washing and extraction steps as sequencing progresses. Other factors that could specifically lower the yield on the first cycle might include adsorption of the peptide on the support in such a way that a fraction of the peptide becomes permanently inaccessible to the sequencing reagents, or poor binding of a portion of the peptide to the support, resulting in washout. In an earlier study on repetitive and initial yields of proteins adsorbed on three different types of sample supports, Lavin et al. (ref. 22) found that initial yields were different on different types of supports, and speculated that the differences were due to differing degrees of sample washout. A consistent failure to transfer 100% of the cleaved ATZ–amino acid to the conversion flask, either because some of the cleaved amino acid remains adsorbed on the support after this transfer, or because some of it cleaves prematurely during coupling and is lost in washing steps prior to transfer to the conversion flask, would also contribute to these losses. Without further experimental data, it is not possible to assess the contributions of these factors, and possibly others not yet considered, to the lower than expected initial yield.
The relative amounts of peptides in the mixture, determined from their initial yields, were highly accurate, the average error being 6.8%. Determining the relative amounts of Lys and Lys(Ac) in the two homologous peptides proved more difficult, differing from the expected value by about 50%. This larger error is attributed mainly to the fact that a synthetic PTH standard for Lys(Ac) was not available to the participants. Analysis of the peptide mixture by mass spectrometry yielded accurate masses for the different components of the mixture; however, the relative peak areas from the peptides in the mass spectra did not accurately reflect the relative amounts of the peptides in the mixture.
| ACKNOWLEDGMENTS |
|---|
| REFERENCES |
|---|