HIGH-RESOLUTION MASS SPECTROMETRY – A Map to Biologics


INTRODUCTION

Biopharmaceutical products represent up to 20% of the total pharmaceutical market and are growing at a rate of nearly 8% annually. To keep up with this growth, many CDMOs are making strategic investments in equipment and expertise to support analytical development and structural characterization of biopharmaceuticals in more economical and efficient ways (1). Unlike small molecule active pharmaceutical ingredients that exist as a single chemical entity, biologics nearly always exist as a mixture of molecules. Different molecules can arise from numerous sources, including N-terminal variants, post-translational modifications (PTMs), glycoforms, and degradation products. Because of this potentially high heterogeneity, it is critical to demonstrate control over the drug substance (DS) manufacturing process from fermentation to purification and protein re-folding. Characterization of the various molecules is a first step in demonstrating process control. High-resolution mass spectrometry is a key component of the characterization of novel biologics and biosimilars.

INTACT MASS ANALYSIS

Intact mass analysis of biologics is an important tool for verifying that the drug substance was successfully expressed and purified. Intact mass analysis of monoclonal antibodies (mAbs) verifies that the molecule has been assembled correctly and that all expected post-translational modifications are present. While this provides a high-level confirmation of protein molecular weight, it cannot provide the location of any modifications that might be present. A different tool is required to elucidate the location(s) of modifications. This site-specific information is required to demonstrate that the entire process, from protein expression and purification to placing the drug product on stability, is under control. One such tool is peptide mapping with high-resolution mass spectrometry.

PEPTIDE MAPPING

Methods that target specific impurities that are known to reduce the activity, binding, or efficacy of the biologics are required to fully de-risk the impact of potential impurities generated during the manufacture of biologics. Peptide mapping is one technique that can provide the necessary resolution to target the specific impurities where traditional chromatographic or electrophoretic techniques prove to be difficult. Coupled with high-resolution mass spectrometry, peptide mapping can provide acceptable specificity, repeatability, and accuracy to quantitate known degradation peptides within the digested sample. Peptide mapping generally consists of the following steps:

-Denaturation of the target protein

-Reduction of disulfide bonds

-Alkylation of the free thiol groups within the side chains of cysteine residues to ensure disulfide bonds are not reformed

-Buffer exchange into digestion buffer

-Digestion with an appropriate endoproteinase

-Analysis by HPLC/UV/MS or MS/MS

Glycoproteins can be deglycosylated prior to performing peptide mapping to reduce the heterogeneity introduced by various glycoforms. Identification of peptides generated during the peptide mapping procedure can be performed using advanced mass spectrometry software tools based on intact mass and MS/MS fragmentation patterns of the peptides.

The use of peptide mapping to characterize biologics has the advantage that the protein is divided into a large number of smaller peptides. The generated peptides can then be chromatographically separated using HPLC or UPLC technology. The intact masses of the peptides can provide information regarding the type of modification that was made. MS/MS sequencing of the peptides can identify the exact amino acid residue containing the modification. This level of resolution at the individual amino acid level is simply not possible with mass spectrometric analysis of intact proteins or mAbs. Examples of the use of peptide mapping to characterize and quantitate specific post-translational modifications or degradation products are provided in the sections that follow.

N-TERMINAL VARIANTS

Quantitation of N-terminal variants from bacterial expression systems (formyl-Met, Met, or another amino acid), eukaryotic expression systems (eg, pyro-Glu, Gln), or synthetic peptides (acetylation) is possible when an appropriate endoproteinase is chosen for digestion. Chromatographic separation of peptides that differ by a single amino acid is generally easier than separating intact proteins that differ by only a single amino acid. Given that this type of peptide mapping targets just one peptide from the entire protein, the chromatography can be optimized for the N-terminal peptide and its variants rather than for separating all peptides generated from the peptide map. Due to the relatively large mass differences between the different N-terminal variants, quantitation can be performed using multiple reaction monitoring (MRM) detection on triple-quadrupole mass spectrometers, allowing for maximum specificity and adequate sensitivity to observe even trace levels of minor components.
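To illustrate why these variants are well suited to unit-resolution MRM, the short Python sketch below lists precursor m/z values for a set of N-terminal variants of one peptide. The mass shifts are standard monoisotopic values; the 1200.6 Da peptide mass, the 2+ charge state, and all function names are assumptions chosen only for illustration.

```python
PROTON = 1.007276  # mass of a proton, Da

# Monoisotopic mass shifts (Da) relative to a reference N-terminal peptide.
N_TERMINAL_SHIFTS = {
    "unmodified": 0.0,
    "pyro-Glu (from N-terminal Gln, loss of NH3)": -17.026549,
    "formylated": 27.994915,
    "acetylated": 42.010565,
    "extra N-terminal Met": 131.040485,
    "extra N-terminal formyl-Met": 159.035400,
}


def precursor_mz(neutral_mass, charge):
    """m/z of a protonated ion with the given neutral mass and charge."""
    return (neutral_mass + charge * PROTON) / charge


def variant_precursors(peptide_mass, charge=2):
    """Precursor m/z for every variant in N_TERMINAL_SHIFTS."""
    return {
        name: precursor_mz(peptide_mass + shift, charge)
        for name, shift in N_TERMINAL_SHIFTS.items()
    }


def smallest_spacing(mz_values):
    """Smallest gap between neighbouring m/z values."""
    ordered = sorted(mz_values)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


# Hypothetical 1200.6 Da N-terminal peptide observed as a 2+ ion.
precursors = variant_precursors(1200.6, charge=2)
for name, value in precursors.items():
    print(f"{name:45s} {value:10.4f}")
print(f"smallest spacing: {smallest_spacing(precursors.values()):.3f} m/z")
```

In this example, the closest pair of precursors (formylated and acetylated) is still about 7 m/z apart at 2+, far wider than the isolation window of a typical quadrupole.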

DEGRADATION PRODUCTS

Quantitation of degradation products generated during protein purification and re-folding steps or during stability assessment of the drug product, such as methionine oxidation or deamidation of asparagine and glutamine residues, can be more challenging given the potentially larger number of degradation locations and their distribution in the protein. Oxidation of methionine residues is one of the most common degradation products found in biologics. Methionine can be oxidized through oxygen dissolved in the buffer or through the formation of hydroxyl radicals upon exposure to UV light. Deamidation of asparagine residues can occur during protein purification and re-folding steps or during the shelf-life of the drug product. Deamidation occurs faster at higher pH and higher temperature and is more likely for asparagine than it is for glutamine. The amino acid following the asparagine or glutamine residue can also affect the rate of deamidation.

Selection of the proper endoproteinase is a key step in developing a method that can quantitate such a large number of potential impurities. In silico protein digestions should be performed with the goal of generating the largest number of peptides that contain only a single potential modification site. When a single endoproteinase will not generate an acceptable peptide map for the characterization and quantitation of degradation products, sequential or simultaneous digestions with multiple endoproteinases may need to be performed. Given the small mass difference (approximately 1 Da) between asparagine and aspartate residues that results from deamidation, quantitation is best done using high-resolution mass spectrometry. Scanning a mass range allows multiple degradation products to be detected in a single injection using extracted ion chromatograms (XICs) of the native and modified peptides.
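The in silico digestion described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the cleavage rules are simplified (trypsin cuts after K or R but not before P; Lys-C cuts after K), missed cleavages are ignored, the example sequence is invented, and only Met, Asn, and Gln are counted as potential modification sites.

```python
import re

# Simplified cleavage rules written as zero-width regex patterns
# (assumptions for illustration; real enzymes show missed cleavages).
CLEAVAGE_RULES = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K or R, but not before P
    "lys-c": r"(?<=K)",            # after K
}


def digest(sequence, enzyme):
    """Return the peptides from a complete digestion (requires Python 3.7+)."""
    return [p for p in re.split(CLEAVAGE_RULES[enzyme], sequence) if p]


def count_sites(peptide, residues="MNQ"):
    """Count potential oxidation (M) and deamidation (N, Q) sites."""
    return sum(peptide.count(residue) for residue in residues)


def single_site_peptides(sequence, enzyme):
    """Peptides that contain exactly one potential modification site."""
    return [p for p in digest(sequence, enzyme) if count_sites(p) == 1]


# Invented sequence, used only to compare the two enzymes.
EXAMPLE = "GSHMASKTNLDRAAQMLVKEFGPR"

for enzyme in CLEAVAGE_RULES:
    peptides = digest(EXAMPLE, enzyme)
    singles = single_site_peptides(EXAMPLE, enzyme)
    print(f"{enzyme}: peptides={peptides} single-site={singles}")
```

For this invented sequence, trypsin yields two single-site peptides and Lys-C yields one, which is the kind of comparison used to choose an enzyme or to decide that a second enzyme is needed.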

While mass spectrometric analysis of intact proteins can generally detect oxidation of methionine residues, it cannot determine the location of the oxidation within the protein. Further, mass spectrometric analysis of intact proteins generally cannot determine whether a single deamidation has occurred because the mass change (approximately +1 Da) is small relative to the error in the deconvolution calculation used to convert the charge state envelope into the intact, deconvoluted mass. Peptide mapping with high-resolution mass spectrometry can provide both the peptide-specific location of the modification and the ability to observe small mass changes.
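A rough calculation shows the scale of the problem. The Python sketch below compares the roughly +1 Da deamidation shift for an intact antibody and for a peptide. The 148,000 Da mass, the 1,500 Da mass, and the charge states are illustrative assumptions, and the monoisotopic shift is used for both cases for simplicity (intact proteins are normally reported as average masses).

```python
PROTON = 1.007276        # mass of a proton, Da
DEAMIDATION = 0.984016   # monoisotopic shift for Asn -> Asp, Da


def mz(neutral_mass, charge):
    """m/z of a protonated ion with the given neutral mass and charge."""
    return (neutral_mass + charge * PROTON) / charge


def shift_ppm(neutral_mass, delta):
    """Mass shift expressed in parts per million of the neutral mass."""
    return delta / neutral_mass * 1e6


# Illustrative (assumed) masses and charge states.
CASES = [("intact mAb", 148000.0, 50), ("tryptic peptide", 1500.0, 2)]

for label, mass, charge in CASES:
    delta_mz = mz(mass + DEAMIDATION, charge) - mz(mass, charge)
    print(
        f"{label:16s} shift = {shift_ppm(mass, DEAMIDATION):8.1f} ppm, "
        f"{delta_mz:.4f} m/z at {charge}+"
    )
```

Under these assumptions the shift is under 7 ppm of the antibody mass (about 0.02 m/z at 50+), but more than 650 ppm of the peptide mass (about 0.49 m/z at 2+), which is why the change is readily observed at the peptide level.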

DISULFIDE BOND FORMATION

Correct disulfide bond formation can be critical to protein folding, enzyme activity, and proper binding in biologics. Formation of incorrect intra-molecular disulfide bonds can lead to mis-folded proteins and reduced activity or binding. Formation of incorrect inter-molecular disulfide bonds can lead to protein aggregation, which raises significant immunogenicity concerns for patients. A schematic of a typical mAb showing intra-molecular and inter-molecular disulfide bonds is provided in Figure 1.

During peptide mapping, disulfide bonds are typically reduced (using DTT, TCEP, or β-ME) and then alkylated (using iodoacetamide or iodoacetic acid) to prevent reformation of disulfide bonds during subsequent steps. Therefore, information regarding which disulfide bonds were present in the protein prior to peptide mapping is lost. However, a non-reducing peptide map can be performed to retain this information. In the non-reducing peptide map, the protein is denatured and alkylated without first reducing the disulfide bonds (or denatured without alkylation), which preserves the existing disulfide bonds. The non-reducing peptide map should be performed in parallel with the traditional reducing peptide map and analyzed using the same HPLC or UPLC method with MS detection. Any peptide that is not involved in a disulfide bond will have the same retention time and observed mass in the non-reduced peptide map as it does in the reduced peptide map. Peptides that are involved in disulfide bonds will elute at different retention times in the non-reduced peptide map than the corresponding individual peptides in the reduced peptide map. Comparison of the chromatograms from the reduced and non-reduced peptide maps should quickly reveal which peaks were involved in disulfide bonds. A high-resolution mass spectrometer can then determine which peptides are linked through disulfide bonds using intact mass.
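The intact-mass assignment of linked peptides can be sketched as follows. The code assumes one inter-peptide disulfide bond per pair (loss of two hydrogen atoms relative to the two reduced peptides) and iodoacetamide alkylation in the reduced map (+57.021 Da per cysteine). The peptide names, their masses, and the 10 ppm tolerance are hypothetical values chosen for illustration.

```python
H_ATOM = 1.007825            # monoisotopic mass of a hydrogen atom, Da
CARBAMIDOMETHYL = 57.021464  # added per cysteine by iodoacetamide, Da


def alkylated_mass(reduced_mass, n_cys):
    """Mass of a peptide in the reduced, iodoacetamide-alkylated map."""
    return reduced_mass + n_cys * CARBAMIDOMETHYL


def disulfide_linked_mass(reduced_mass_a, reduced_mass_b):
    """Mass of two peptides joined by one disulfide bond (two H lost)."""
    return reduced_mass_a + reduced_mass_b - 2 * H_ATOM


def match_linked_pairs(observed_mass, candidates, tol_ppm=10.0):
    """Return peptide pairs whose linked mass matches the observed mass."""
    hits = []
    names = sorted(candidates)
    for i, name_a in enumerate(names):
        for name_b in names[i:]:  # includes a peptide paired with itself
            expected = disulfide_linked_mass(
                candidates[name_a], candidates[name_b]
            )
            if abs(observed_mass - expected) / expected * 1e6 <= tol_ppm:
                hits.append((name_a, name_b))
    return hits


# Hypothetical cysteine-containing peptides (reduced, unmodified masses in Da).
CANDIDATES = {"P1": 435.17876, "P2": 334.14232, "P3": 520.25000}

print(match_linked_pairs(767.3054, CANDIDATES))
print(f"P1 in the reduced map: {alkylated_mass(CANDIDATES['P1'], 1):.4f} Da")
```

A peak that appears only in the non-reduced map would be searched against all candidate pairs in this way, and the assignment would then normally be confirmed by MS/MS.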

CONFIRMATION OF PROTEIN SEQUENCE BY MS/MS

To gain amino acid residue-specific information from peptide mapping, sequencing of peptides by tandem mass spectrometry (MS/MS) is required. Peptides are sequenced using collision-induced dissociation (CID) to generate product ions from a single precursor ion or a set of precursor ions. Peptides fragment in predictable ways along the peptide backbone to yield a series of product ions (Figure 2). The a, b, and c-series product ions provide information toward the N-terminal side of the peptide, while the x, y, and z-series ions provide information toward the C-terminal side of the peptide. The b-series and y-series ions can be used to sequence the peptide, confirming the presence of each amino acid.
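As a minimal sketch of how the b-series and y-series ions are predicted, the Python code below computes singly charged monoisotopic b and y ions for an unmodified peptide. The residue masses are standard monoisotopic values; the example peptide and the function name are assumptions for illustration, and modifications and higher charge states are ignored.

```python
PROTON = 1.007276  # mass of a proton, Da
WATER = 18.010565  # monoisotopic mass of water, Da

# Monoisotopic residue masses of the 20 common amino acids, Da.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] is b(i+1), read from the N-terminus;
    y_ions[i] is y(i+1), read from the C-terminus.
    """
    masses = [RESIDUE[aa] for aa in peptide]
    n = len(masses)
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y_ions = [sum(masses[-i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


# Example peptide chosen only for illustration.
b_ions, y_ions = fragment_ions("TNLDR")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:9.4f}    y{i} = {y:9.4f}")
```

The difference between consecutive ions in either series equals one residue mass, which is what allows the sequence to be read from the spectrum; a modified residue shows up as a shifted difference at that position.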

For ideal peptide sequencing by MS/MS, peptides should be 5-20 amino acids in length. Peptides that are shorter than five amino acids are generally not retained on reversed-phase HPLC columns, while peptides longer than 20 amino acids may not yield full sequence coverage. To observe most of the b-series and y-series ions, it is helpful to have a positive charge at both the N-terminus and the C-terminus of the peptide. The amino group provides the charge at the N-terminus, while an arginine or lysine residue can provide the charge at the C-terminus. Therefore, trypsin or endoproteinase Lys-C are generally the most useful enzymes when MS/MS sequencing is required. For these reasons, selection of an appropriate endoproteinase is essential for MS/MS sequencing.

MS/MS sequencing can be used to determine the residue-specific location of a post-translational modification or degradation product. It can also be used as an orthogonal technique to N-terminal sequencing using Edman degradation. Its advantages over Edman degradation are that 1) MS/MS sequencing works even if the N-terminus is blocked (eg, acetylated or pyro-Glu), and 2) it can provide sequencing information for most of a protein, while Edman degradation is typically limited to < 40 residues. MS/MS sequencing can even be used to identify unknown proteins by comparing peptide sequences to a protein library database.

GLYCOFORMS

One of the most complex post-translational modifications is glycosylation of proteins and mAbs. There are dozens of glycans, ranging from high mannose glycans to hybrid glycans to complex glycans. Glycosylation can be important for protein function but may also play a role in immunogenicity. N-linked glycans are attached to the side chain of an asparagine residue within the sequence Asn-X-Ser or Asn-X-Thr, where X is any amino acid except proline. O-linked glycans are attached to the side chains of serine and threonine residues, and there is no known consensus sequence.
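The N-linked consensus sequence lends itself to a simple sequence search. The Python sketch below, which uses an invented sequence and a hypothetical function name, reports every Asn-X-Ser/Thr motif where X is not proline. A match indicates only a potential site; whether it is actually occupied must be determined experimentally.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead allows
# overlapping motifs (for example, the two sites in "NNTS") to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glycosylation_sites(sequence):
    """Return (1-based Asn position, motif) for each potential N-linked site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


# Invented sequence, used only for illustration.
print(find_n_glycosylation_sites("MKNGSAANPTLLNNTSK"))
# [(3, 'NGS'), (13, 'NNT'), (14, 'NTS')]
```

The Asn at position 8 is not reported because it is followed by proline.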

Monoclonal antibodies contain a single N-linked glycosylation site in each heavy chain while other proteins may contain multiple glycosylation sites. Glycan profiles are largely determined by culture conditions and the genotypes of host cells. Characterization of glycosylation is a critical step to demonstrating control over the bioprocess used to express proteins and mAbs.

Most N-linked glycans can be removed using PNGase F. The released glycans may be analyzed by MALDI-TOF or they can be labeled with a fluorescent tag and analyzed by hydrophilic interaction chromatography (HILIC) with mass spectrometry detection. Labeling of glycans with a fluorescent tag has several advantages. The labeled glycans can be chromatographically separated using HILIC and quantitated using highly sensitive fluorescence detection. The HILIC method should be compatible with mass spectrometry, allowing for the characterization of the labeled glycan by intact mass. Product ions resulting from the in-source fragmentation of glycosidic bonds are often observed. However, given the potential variation in stereochemistry, linkage site within the glycan, and the anomeric configuration of each monosaccharide, the product ions themselves are often not enough to determine the full structure of the glycan. Additional experiments are required to define the monosaccharides and their anomeric configurations. Toward this end, the glycans are treated with exoglycosidases that remove monosaccharides from the non-reducing terminus of the glycan. The exoglycosidases are specific to the stereochemistry, the anomeric configuration of the monosaccharide being released, and its linkage site within the remaining glycan. A full structural characterization of the various glycoforms is possible through this process.

Characterization of the glycosylation site within the protein can be performed using the aforementioned peptide mapping procedures. The glycosylated peptide should contain multiple masses corresponding to the various glycoforms. Deglycosylation of the protein prior to peptide mapping should result in a single peptide mass in place of the multiple masses observed for the glycosylated peptide. It should be noted that glycosylation may not change the retention time of the peptide as retention on a reversed-phase HPLC column is driven by hydrophobic interactions, to which the glycans contribute very little. The peptide map can also be analyzed using HILIC. Because the glycans are well retained in this separation technique, the glycopeptides are well separated from the non-glycosylated peptides.
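The set of masses expected for a glycosylated peptide can be estimated by adding monosaccharide residue masses to the peptide mass, as in the Python sketch below. The glycan compositions listed are common examples rather than a complete library, and the example peptide (EEQYNSTYR, the tryptic Fc glycopeptide of human IgG1, about 1188.5047 Da) is included only for illustration. The deglycosylated value assumes PNGase F treatment, which converts the formerly glycosylated asparagine to aspartate (+0.984 Da).

```python
# Monoisotopic monosaccharide residue masses, Da.
MONOSACCHARIDE = {"Hex": 162.052824, "HexNAc": 203.079373, "Fuc": 146.057909}

# A few example N-glycan compositions (not an exhaustive library).
GLYCANS = {
    "G0F": {"HexNAc": 4, "Hex": 3, "Fuc": 1},
    "G1F": {"HexNAc": 4, "Hex": 4, "Fuc": 1},
    "G2F": {"HexNAc": 4, "Hex": 5, "Fuc": 1},
    "Man5": {"HexNAc": 2, "Hex": 5},
}

ASN_TO_ASP = 0.984016  # shift from PNGase F converting Asn to Asp, Da


def glycan_mass(composition):
    """Mass added to a peptide by a glycan with the given composition."""
    return sum(MONOSACCHARIDE[sugar] * n for sugar, n in composition.items())


def glycopeptide_masses(peptide_mass):
    """Expected neutral mass of the peptide carrying each example glycan."""
    return {
        name: peptide_mass + glycan_mass(composition)
        for name, composition in GLYCANS.items()
    }


def deglycosylated_mass(peptide_mass):
    """Expected neutral mass after PNGase F treatment (assumed Asn -> Asp)."""
    return peptide_mass + ASN_TO_ASP


PEPTIDE_MASS = 1188.5047  # EEQYNSTYR, monoisotopic neutral mass, Da

for name, mass in glycopeptide_masses(PEPTIDE_MASS).items():
    print(f"{name:5s} {mass:10.4f}")
print(f"deglycosylated {deglycosylated_mass(PEPTIDE_MASS):10.4f}")
```

The regular 162.05 Da spacing between G0F, G1F, and G2F (one hexose each) is a convenient signature when looking for the glycopeptide cluster in the peptide map.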

SUMMARY

High-resolution mass spectrometry is a core technique for the characterization of biologics. It provides a full range of characterization capabilities, from high-level confirmation by intact mass analysis down to residue-specific information from MS/MS sequencing.

REFERENCES

1. McKinsey & Company 2014. Rapid growth in biopharma: challenges and opportunities. Website visited: https://www.mckinsey.com/industries/pharmaceuticals-and-medical-products/our-insights/rapid-growth-in-biopharma.

 To view this issue and all back issues online, please visit www.drug-dev.com.

Dr. William Boomershine is a Principal Scientist at Alcami Corporation. He earned his PhD in Biochemistry from The Ohio State University for his research on the solution structures of archaeal Ribonuclease P proteins using NMR. He has more than 11 years of experience in the pharmaceutical development and manufacturing industry characterizing small molecule APIs, peptides, and proteins using high-resolution mass spectrometry. He possesses a strong knowledge base in peptide and protein chemistry and development of analytical methods for peptides and proteins.