0151722 March 18,14 / Multiscale Metabolic Modeling of C4 Plants

Model Reconstruction

Our model is the fourth published genome-scale metabolic reconstruction of the major crop plant Zea mays, and the first such reconstruction developed solely from maize data sources, rather than as a direct or indirect adaptation of the Arabidopsis thaliana model AraGEM [21]. Direct reaction-to-reaction comparison of iEB5204 with C4GEM [45], iRS1563 [22], and its successor model [46] is difficult because those models use a naming scheme for compounds and reactions ultimately based on KEGG [47, 48], while this model, like its parent database, uses the nomenclature of MetaCyc and the BioCyc database collection. The models are broadly similar in size and biological scope. As published, C4GEM included 1588 reactions associated with 11623 maize genes; iRS1563, 1985 reactions associated with 1563 genes; the model of Simons et al. [46], 3892 unique reactions and 5824 genes; and iEB5204, 2720 reactions with 5204 genes. All models can simulate the production of similar sets of basic biomass constituents (including amino acids, carbohydrates, nucleic acids, lipids and fatty acids, and cell wall components) under photosynthetic and non-photosynthetic conditions and include key reactions of the C4 cycle. The model of Simons et al. [46] also offers extensive coverage of secondary metabolism. Our computational methods, discussed below, should allow the incorporation of realistic Rubisco kinetics into any of the prior genome-scale models of C4 plant metabolism.
However, for the specific goal of integration with transcriptomics data from the leaf developmental gradient, we found it useful to develop the present model, which has several advantages:

Gene associations
The gene associations included in iEB5204 are those presented in CornCyc [26], which are generated by the PMN Ensemble Enzyme Prediction Pipeline (E2P2) [49], a homology-based protein sequence annotation algorithm trained on a reference dataset of experimentally validated enzyme sequences. The E2P2 approach is more comprehensive and scalable than the development procedures of the previous maize reconstructions (which involve, for example, obtaining gene associations by transferring annotations from Arabidopsis genes to their best maize BLAST hits and manually selecting annotations for remaining maize genes from among BLAST hits in other species). The entire set of gene associations in the FBA model may be readily updated based on improvements in the E2P2 prediction algorithm.

High-confidence submodel
In developing the fitting algorithm we found that, to obtain plausible metabolic state predictions, a conservative reconstruction was preferable to a comprehensive one. For example, in early tests with the comprehensive version of the model, the fitting algorithm often found low-cost solutions involving high fluxes through reactions which, on investigation, we determined were unlikely to be active in maize. Because of the model’s connection to the CornCyc database, it was straightforward to create a reduced, high-confidence version of the model by preferentially excluding reactions not included in any manually curated plant metabolic pathway, even if candidate associated genes had been identified computationally. This reduced model gave more realistic results.
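The filtering step behind the high-confidence submodel can be illustrated with a minimal sketch. This is not the authors' implementation: the `Reaction` record, its field names, and the `keep_always` exception list are hypothetical, introduced here only to show the idea of dropping reactions that belong to no manually curated pathway, regardless of whether a gene was computationally assigned to them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Hypothetical minimal reaction record (not the CornCyc schema)."""
    rid: str
    curated_pathways: frozenset = frozenset()  # manually curated pathways containing it
    genes: frozenset = frozenset()             # computationally predicted gene associations


def high_confidence_submodel(reactions, keep_always=()):
    """Keep reactions found in at least one curated pathway.

    Predicted gene associations alone do not rescue a reaction.
    `keep_always` is an assumed escape hatch for reactions that must be
    retained anyway (the source says exclusion was "preferential").
    """
    keep_always = set(keep_always)
    return [r for r in reactions if r.curated_pathways or r.rid in keep_always]


# Toy data: identifiers are placeholders, not real MetaCyc IDs.
reactions = [
    Reaction("RXN-A", curated_pathways=frozenset({"PWY-1"}), genes=frozenset({"g1"})),
    Reaction("RXN-B", genes=frozenset({"g2"})),  # gene predicted, but no curated pathway
    Reaction("RXN-C"),                           # no pathway, no gene
]

submodel = high_confidence_submodel(reactions, keep_always={"RXN-C"})
kept_ids = [r.rid for r in submodel]
```

In this toy case `RXN-B` is dropped even though it has a predicted gene, which mirrors the criterion described above. In a real reconstruction, removing reactions would also require checking that biomass production remains feasible.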
Reproducibility
In an effort to improve the reusability of the model and encourage its application to other data sets, we have provided the full source code (S1.