How Can Mathematical Modelling Support the Development of Next Generation Bioprocesses?

Cleo Kontoravdi - Department of Chemical Engineering, Imperial College London

The adoption of high-throughput experimentation systems and advances in multi-omics analysis methods are generating large volumes of bioprocessing data that can support the design of next generation cell culture systems. The multiscale nature of these datasets mandates the use of mathematical modelling for their interpretation. Models can be mechanistic (e.g., kinetic or stoichiometric), statistical/machine learning (e.g., multivariate analysis or neural networks), or hybrids of the two. The application of model-based analysis aims to advance our understanding of what drives cellular behavior, design process optimization strategies or suggest cell engineering targets.

Culture phenotype is typically a synthesis of different intracellular processes, which can have the same or opposing objectives. Modelling can be used to decipher and quantify the contribution of each pathway towards the observed population behavior. For example, mechanistic modelling has been employed to dissect the pathways that contribute to an overall reduction in monoclonal antibody (mAb) galactosylation under mild hypothermic Chinese hamster ovary (CHO) cell culture conditions compared to culture at physiological temperature (Sou et al., 2017). The model correctly identified that reduced metabolic function and increased mAb productivity, both of which are common measurements, were insufficient to explain the magnitude of the observed drop in galactosylation levels. It pointed to a reduced availability of glycosyltransferases in the Golgi apparatus, the expression of which is not typically monitored, which was confirmed experimentally.  

Process design and optimization have long been the subject of model-based studies. Although earlier studies focused on computationally tractable, simple models based on Monod kinetics, computational strategies now involve two main directions. The first is a purely data-driven approach that links process conditions to protein yield and quality attributes. For example, partial least squares analysis has been used to describe the relationship between media composition and mAb glycosylation, and the resulting models were then used with a genetic algorithm to improve process performance (Sokolov et al., 2017). Sequential multivariate cell culture modelling has further been used to describe the effect of process conditions on mAb glycosylation profile, aggregation level, low molecular weight forms and charge isoforms across four different scales, from deep-well plates to pilot-scale bioreactors (Sokolov et al., 2018). This approach has been demonstrated for large datasets (e.g., Le et al., 2012; Zürcher et al., 2020) and is therefore well suited for the analysis of historical datasets or of data from parallel bioreactor systems such as the Ambr®. Additional benefits are that statistical modelling is accessible to a wider range of biopharma professionals and amenable to any operating mode, including continuous cell cultivation.

An alternative approach is the use of mechanistic modelling. Kinetic models can be used for process performance prediction and process optimization. For example, Kotidis et al. (2019a) employed a kinetic model of CHO cell growth, metabolism and mAb glycosylation to optimize the feeding schedule of metabolic precursors linked to increased galactosylation. The same model was used to explore the operating parameter space and identify the range of feeding conditions that met both yield and quality criteria (Kotidis et al., 2019b). However, such models require considerable formulation and parameterization effort and have to be tailored to each cell line and process. In addition, they cannot describe unmeasured intracellular phenomena, such as enzyme regulation events, which affect the model output, leading to reduced accuracy under certain conditions. This kind of information is captured in input-output data but not explicitly included in kinetic models if there is no prior knowledge of such events taking place. An option for overcoming this limitation is to use hybrid kinetic and machine learning/statistical models (e.g., Kotidis et al., 2020) that harness the full information content of data. This approach is particularly suitable for describing poorly understood phenomena that lack first-principles understanding or systems for which intracellular measurements are scarce or unavailable. 
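To make the idea of a kinetic model concrete, the following is a minimal sketch of the simplest case: unstructured Monod growth on a single limiting substrate in batch culture, integrated with explicit Euler steps. It is far simpler than the growth, metabolism and glycosylation models cited above, and the function names, units and parameter values are illustrative assumptions rather than values fitted to any cell line.

```python
# Minimal unstructured kinetic model: Monod growth on one limiting substrate.
# All names and parameter values are illustrative assumptions.

def monod_rate(s, mu_max, k_s):
    """Specific growth rate (1/h) at substrate concentration s."""
    return mu_max * s / (k_s + s)


def simulate_batch(x0, s0, mu_max, k_s, y_xs, t_end, dt=0.01):
    """Integrate dX/dt = mu*X and dS/dt = -mu*X/Y_xs with explicit Euler steps.

    x0, s0 : initial biomass and substrate concentrations
    y_xs   : biomass yield on substrate (biomass formed per substrate consumed)
    Returns the biomass and substrate concentrations at t_end.
    """
    x, s = x0, s0
    for _ in range(int(round(t_end / dt))):
        mu = monod_rate(s, mu_max, k_s)
        ds = min(mu * x * dt / y_xs, s)  # never consume more than remains
        x += ds * y_xs                   # growth tied to substrate consumed
        s -= ds
    return x, s


# Hypothetical batch: 0.3 units of biomass, 25 units of substrate, 120 h.
x_final, s_final = simulate_batch(
    x0=0.3, s0=25.0, mu_max=0.04, k_s=1.0, y_xs=0.2, t_end=120.0
)
print(f"final biomass: {x_final:.3f}, residual substrate: {s_final:.3f}")
```

Even this toy model has three parameters (mu_max, k_s, y_xs) that must be estimated from data for each cell line and process, which hints at the parameterization effort required by larger kinetic models.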

The industry’s ability to characterize candidate cell lines in-depth using transcriptomic, metabolomic and, in some cases, proteomic data calls for a comprehensive analysis tool that mechanistically describes a broad range of intracellular functions: genome-scale modelling. There are several genome-scale models of CHO cells (e.g., Hefzi et al., 2016; Calmels et al., 2019; Yeo et al., 2020) that consist of stoichiometric reaction network representations and gene-protein-reaction associations. They can be used for analyzing intracellular flux distribution under different process conditions, comparing different clones to identify traits of high-producers and guiding the synthesis of designer cells. Crucially, genome-scale models come with an arsenal of tools for their curation using metabolomic and transcriptomic data (e.g., Opdam et al., 2017; Becker and Palsson, 2008; Schultz and Qutub, 2016). This is essential for creating bespoke models that correctly describe the cell line at hand, but are also of reduced size compared to the original genome-scale network of reactions. Beyond the typical metabolic flux balance analysis, of particular note is the study of Kol et al. (2020), in which a genome-scale model of CHO cells was extended to include the protein secretory pathway (Gutierrez et al., 2020) and used to identify burdensome host cell proteins. These were subsequently knocked out to create a cleaner downstream feedstock.

Genome-scale models have two important mathematical characteristics: (i) they are underdetermined and therefore need to be solved as an optimization problem (known as flux balance analysis), and (ii) they are applied under a quasi-steady-state assumption. These, in turn, create certain limitations. Solving genome-scale models through optimization involves the choice of an objective function. Although maximization of biomass is an objective function favored in microbial cell systems, it is not representative of the priorities of mammalian cells across all culture phases. The identification of suitable objective function(s) is hindered by the fact that validation of predictions is difficult. Although validation is possible using, for example, carbon labelling experiments, the number of metabolites that can be readily and reliably measured is too low to reach a unique solution for intracellular fluxes.
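In its generic form, the optimization problem behind flux balance analysis can be written as the linear program below. The notation is introduced here for illustration and is not taken from the studies cited: S is the m × n stoichiometric matrix (m balanced metabolites, n reactions), v is the vector of reaction fluxes, c holds the weights that define the objective (for example, a single 1 on the biomass reaction), and v^lb and v^ub are lower and upper flux bounds.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{n}} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0 \quad \text{(quasi-steady state)} \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
```

Because n typically exceeds m, the constraint S v = 0 alone leaves n − rank(S) degrees of freedom. This is why an objective function is required, and why the optimal flux distribution is often not unique even once an objective has been chosen.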

The second limitation, namely that the quasi-steady-state assumption restricts genome-scale models to piece-wise application over culture intervals rather than the whole culture duration, remains an open research question. Smaller metabolic networks, however, have been made dynamic. For example, in a seminal paper, Nolan and Lee (2011) presented a hybrid stoichiometric-kinetic model that accurately generates time-course predictions of the effect of temperature shift, seed density, specific productivity, and nutrient concentrations on key culture performance indicators. Similarly, Martínez et al. (2015) developed a dynamic metabolic flux analysis framework using B-splines to successfully analyze the effect of culture temperature on cell growth and mAb productivity. The approach quantitatively captures extracellular concentration profiles of key process parameters.

An interesting recent advance is the development of hybrid stoichiometric/data-driven models. Schinn et al. (2021) analyzed amino acid consumption rates using a genome-scale metabolic model and used the results to train a statistical model. The resulting hybrid model successfully predicted time-course amino acid concentrations, paving the way for process optimization and control. Similarly, Antonakoudis et al. (2021) selected a small-scale metabolic model (presented separately in del Val et al., 2021) to predict intracellular metabolic fluxes, which were used to train a machine learning model of mAb glycosylation. Their hybrid formulation accurately predicted changes in glycosylation profile across cell culture time for an independent dataset. 
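The two-step structure of such hybrid models (a stoichiometric step that converts measurements into fluxes, followed by a data-driven step trained on those fluxes) can be sketched as follows. This is a toy illustration under stated assumptions and not the method of either study: a single pyruvate balance stands in for the metabolic model, a straight-line least squares fit stands in for the machine learning model, and all numbers are made up.

```python
# Hybrid stoichiometric/data-driven sketch. Everything here is hypothetical:
# the one-metabolite pyruvate balance, the numbers, and the use of a
# straight-line fit in place of a neural network or other learned model.

def tca_flux(q_glc, q_lac):
    """Stoichiometric step: pyruvate balance at quasi-steady state.

    Glycolysis yields 2 pyruvate per glucose; pyruvate either leaves as
    lactate or enters the TCA cycle, so v_tca = 2*q_glc - q_lac
    (all other fates of pyruvate are ignored in this toy network).
    """
    return 2.0 * q_glc - q_lac


def fit_line(xs, ys):
    """Data-driven step: ordinary least squares fit of y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up training data: measured exchange rates and a measured response.
q_glc_train = [1.0, 1.5, 2.0, 2.5]      # glucose uptake rates
q_lac_train = [1.5, 2.0, 2.5, 3.0]      # lactate secretion rates
response_train = [2.0, 3.5, 5.0, 6.5]   # e.g. a product quality attribute

# Step 1: turn measurements into intracellular flux estimates.
fluxes = [tca_flux(g, l) for g, l in zip(q_glc_train, q_lac_train)]
# Step 2: train the statistical model on the flux estimates.
slope, intercept = fit_line(fluxes, response_train)


def predict(q_glc, q_lac):
    """Hybrid prediction: stoichiometry first, then the fitted statistical map."""
    return slope * tca_flux(q_glc, q_lac) + intercept


print(predict(3.0, 3.5))
```

The design choice this illustrates is that the statistical model is trained on mechanistically derived fluxes rather than on raw measurements, so the stoichiometry constrains what the data-driven part has to learn.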

The above studies illustrate how individual modelling strategies and their hybrid formulations advance our understanding of intracellular bottlenecks for recombinant protein synthesis, provide a framework for analyzing differences between clones and culture conditions, and offer clues for cell and process engineering. The next frontier is to demonstrate how modelling can underpin the industry’s efforts to tackle two topical challenges: the shift to continuous manufacturing and repeating the success of antibody manufacturing platforms for new modalities. In both cases, modelling can be used to ascertain whether knowledge is transferable across culture modes and different products, and which key process and product characteristics determine the manufacturability of a new molecule.

References 

  1. Antonakoudis A, Strain B, Barbosa R, del Val IJ, Kontoravdi C (2021). Synergising stoichiometric modelling with artificial neural networks to predict antibody glycosylation patterns in Chinese hamster ovary cells. Computers & Chemical Engineering 154, 107471.
  2. Becker SA, Palsson BO (2008). Context-specific metabolic networks are consistent with experiments. PLoS Computational Biology 4, e1000082.
  3. Calmels C, McCann A, Malphettes L, Andersen MR (2019). Application of a curated genome-scale metabolic model of CHO DG44 to an industrial fed-batch process. Metabolic Engineering 51, 9-19.
  4. del Val IJ, Kyriakopoulos S, Albrecht S, Stockmann H, Rudd PM, Kontoravdi C (2021). CHOmpact: a reduced metabolic model of Chinese hamster ovary cells with enhanced interpretability. bioRxiv https://doi.org/10.1101/2021.07.19.452953
  5. Gutierrez JM, Feizi A, Li S, et al. (2020). Genome-scale reconstructions of the mammalian secretory pathway predict metabolic costs and limitations of protein secretion. Nature Communications 11(1), 1-10.
  6. Hefzi H, Ang KS, Hanscho M, et al. (2016). A consensus genome-scale reconstruction of Chinese hamster ovary cell metabolism. Cell Systems 3(5), 434-443.e8.
  7. Kol S, Ley D, Wulff T, Decker M, Arnsdorf J, Schoffelen S, Hansen AH, Gutierrez JM, Chiang AWT, Masson HO, Palsson BO, Voldborg BG, Pedersen LE, Kildegaard HF, Lee GM, Lewis NE (2020). Multiplex secretome engineering enhances recombinant protein production and purity. Nature Communications 11, 1908.
  8. Kotidis P, Jedrzejewski P, Sou SN, Sellick C, Polizzi KM, Del Val IJ, Kontoravdi C (2019a). Model‐based optimization of antibody galactosylation in CHO cell culture. Biotechnology and Bioengineering 116(7), 1612-1626.
  9. Kotidis P, Demis P, Goey CH, Correa E, McIntosh CM, Trepekli S, Shah N, Klymenko OV, Kontoravdi C (2019b). Constrained global sensitivity analysis for bioprocess design space identification. Computers & Chemical Engineering 125, 558-568.
  10. Kotidis P, Kontoravdi C (2020). Harnessing the potential of artificial neural networks for predicting protein glycosylation. Metabolic Engineering Communications 10, e00131.
  11. Le H, Kabbur S, Pollastrini L, Sun Z, Mills K, Johnson K, Karypis G, Hu WS (2012). Multivariate analysis of cell culture bioprocess data--lactate consumption as process indicator. Journal of Biotechnology 162(2-3), 210-223.
  12. Martínez VS, Buchsteiner M, Gray P, Nielsen LK, Quek LE (2015). Dynamic metabolic flux analysis using B-splines to study the effects of temperature shift on CHO cell metabolism. Metabolic Engineering Communications 2, 46-57.
  13. Nolan RP, Lee K (2011). Dynamic model of CHO cell metabolism. Metabolic Engineering 13(1), 108-124.
  14. Opdam S, Richelle A, Kellman B, Li S, Zielinski DC, Lewis NE (2017). A systematic evaluation of methods for tailoring genome-scale metabolic models. Cell Systems 4(3), 318-329.e6.
  15. Schinn SM, Morrison C, Wei W, Zhang L, Lewis NE (2021). A genome-scale metabolic network model and machine learning predict amino acid concentrations in Chinese Hamster Ovary cell cultures. Biotechnology and Bioengineering 118, 2118-2123.
  16. Schultz A, Qutub AA (2016). Reconstruction of Tissue-Specific Metabolic Networks Using CORDA. PLoS Computational Biology 12(3), e1004808.
  17. Sokolov M, Ritscher J, Mackinnon N, Souquet J, Broly H, Morbidelli M, Butté A (2017). Enhanced process understanding and multivariate prediction of the relationship between cell culture process and monoclonal antibody quality. Biotechnology Progress 33(5), 1368-1380.
  18. Sokolov M, Morbidelli M, Butté A, Souquet J, Broly H (2018). Sequential multivariate cell culture modelling at multiple scales supports systematic shaping of a monoclonal antibody toward a quality target. Biotechnology Journal 13, 1700461.
  19. Sou SN, Jedrzejewski PM, Lee K, Sellick C, Polizzi KM, Kontoravdi C (2017). Model‐based investigation of intracellular processes determining antibody Fc‐glycosylation under mild hypothermia. Biotechnology and Bioengineering 114(7), 1570-1582.
  20. Yeo HC, Hong J, Lakshmanan M, Lee DY (2020). Enzyme capacity-based genome scale modelling of CHO cells. Metabolic Engineering 60, 138-147.
  21. Zürcher P, Sokolov M, Brühlmann D, Ducommun R, Stettler M, Souquet J, Jordan M, Broly H, Morbidelli M, Butté A (2020). Cell culture process metabolomics together with multivariate data analysis tools opens new routes for bioprocess development and glycosylation prediction. Biotechnology Progress 36(5), e3012.
