Almost 60% of commercialized pharmaceutical proteins are glycosylated. Glycosylation is considered a critical quality attribute because it affects the stability, bioactivity, and safety of proteins. Hence, the development of analytical methods to characterise the composition and structure of glycoproteins is crucial. Existing methods are time-consuming and expensive, and they require significant sample preparation, which can compromise the robustness of the analyses. In this work, we propose fast, direct, and simple Fourier transform infrared (FT-IR) spectroscopy combined with a chemometric strategy to address this challenge. A database of FT-IR spectra of glycoproteins was built, and the glycoproteins were characterised by reference methods (MALDI-TOF, LC-ESI-QTOF, and LC-FLR-MS) to estimate the carbohydrate-to-protein mass ratio and to determine the monosaccharide composition. The FT-IR spectra were processed first by Partial Least Squares Regression (PLSR), one of the most widely used regression algorithms in spectroscopy, and second by Support Vector Regression (SVR). SVR has emerged in recent years and is now considered a powerful alternative to PLSR because it can flexibly model nonlinear relationships. The results provide clear evidence that FT-IR spectroscopy combined with SVR modelling can efficiently characterise glycosylation in therapeutic proteins. The SVR models showed better predictive performance than the PLSR models in terms of RMSECV, RMSEP, R²CV, R²Pred, and RPD. This tool has several potential applications, such as comparing the glycosylation of a biosimilar with that of the original molecule, monitoring batch-to-batch homogeneity, and in-process control.
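The figures of merit named above (RMSECV, RMSEP, R²CV, R²Pred, RPD) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the usual chemometric conventions: RMSECV and RMSEP share one formula and differ only in whether the predictions come from cross-validation or from an independent test set, and RPD is taken as the sample standard deviation of the reference values divided by the RMSE. Definitions vary slightly between sources (for example n versus n - 1 in the standard deviation). The function names and the numbers are hypothetical and are not data from the study.

```python
import math

def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values.

    With cross-validated predictions this is RMSECV; with predictions
    on an independent test set it is RMSEP.
    """
    n = len(y_ref)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / n)

def r_squared(y_ref, y_pred):
    """Coefficient of determination (R2CV or R2Pred, depending on the predictions)."""
    mean_ref = sum(y_ref) / len(y_ref)
    ss_res = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot

def rpd(y_ref, y_pred):
    """Ratio of performance to deviation: SD of reference values / RMSE.

    Assumption: sample standard deviation with n - 1 in the denominator.
    """
    n = len(y_ref)
    mean_ref = sum(y_ref) / n
    sd = math.sqrt(sum((r - mean_ref) ** 2 for r in y_ref) / (n - 1))
    return sd / rmse(y_ref, y_pred)

# Made-up illustrative values (NOT measurements from the paper).
y_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]

print(f"RMSE = {rmse(y_ref, y_pred):.4f}")
print(f"R2   = {r_squared(y_ref, y_pred):.4f}")
print(f"RPD  = {rpd(y_ref, y_pred):.2f}")
```

Applying the same three functions to the predictions of each model makes a PLSR versus SVR comparison direct: lower RMSE and higher R² and RPD indicate better predictive performance.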
Glycosylation is considered a critical quality attribute of therapeutic proteins because it affects their stability, bioactivity, and safety. Hence, the development of analytical methods able to characterize the composition and structure of glycoproteins is crucial. Existing methods are time-consuming and expensive, and they require significant sample preparation, which can compromise the robustness of the analyses. In this context, we developed a fast, direct, and simple drop-coating deposition Raman imaging (DCDR) method combined with multivariate curve resolution alternating least squares (MCR-ALS) to analyze glycosylation in monoclonal antibodies (mAbs). A database of hyperspectral Raman imaging data of glycoproteins was built, and the glycoproteins were characterized by LC-FLR-MS as a reference method to determine their glycan and monosaccharide composition. The DCDR method separated excipient from protein through the formation of a "coffee ring". MCR-ALS analysis was performed to visualize the distribution of the compounds in the drop and to extract the pure spectral components. Singular value decomposition (SVD) truncation was used to select the number of components to be resolved by MCR-ALS. The Raman spectra were then processed by support vector regression (SVR). The SVR models showed good predictive performance in terms of RMSECV and R²CV.
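The SVD step mentioned above can be illustrated on synthetic data. The sketch below is an assumption-laden example, not the authors' procedure. It builds a hypothetical bilinear data matrix (pixels × spectral channels) from three made-up Gaussian "pure spectra", then picks the number of components at the largest ratio between consecutive singular values. That is only one common heuristic; the abstract does not state which criterion was actually used. The function names `estimate_n_components` and `truncate` are illustrative.

```python
import numpy as np

def estimate_n_components(D, max_components=10):
    """Pick the rank at the largest ratio between consecutive singular values.

    Assumption: one simple heuristic among several, not the paper's criterion.
    """
    s = np.linalg.svd(D, compute_uv=False)[: max_components + 1]
    ratios = s[:-1] / s[1:]
    return int(np.argmax(ratios)) + 1, s

def truncate(D, k):
    """Rank-k reconstruction of D from its k leading singular triplets."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Synthetic example (NOT data from the study): 200 pixels, 150 channels.
rng = np.random.default_rng(0)
channels = np.arange(150)
centres = [30, 75, 120]  # hypothetical band positions
S = np.stack([np.exp(-0.5 * ((channels - c) / 8.0) ** 2) for c in centres])
C = rng.random((200, 3))  # non-negative "concentrations" per pixel
D = C @ S + 0.01 * rng.standard_normal((200, 150))  # bilinear model plus noise

k, singular_values = estimate_n_components(D)
D_k = truncate(D, k)
lack_of_fit = np.linalg.norm(D - D_k) / np.linalg.norm(D)

print("estimated number of components:", k)
print("relative lack of fit after truncation:", round(float(lack_of_fit), 4))
```

In an MCR-ALS workflow, the value of `k` obtained this way would set the number of concentration and spectral profiles that the alternating least squares step is asked to resolve.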