Enzyme cost minimization


Workflow description


The algorithm for enzyme cost minimization consists of two main phases:

Kinetics phase

In the kinetics phase, we collect and adjust the model parameters and construct a model with energetically consistent fluxes (exclusion of infeasible cycles) and rate constants (satisfying Haldane relationships and Wegscheider conditions). To determine consistent model parameters, the collected rate constants and equilibrium constants are adjusted and completed by parameter balancing.

  1. Collect thermodynamic and kinetic data: standard chemical potentials μ°, equilibrium constants Keq, Michaelis-Menten constants KM, and forward and reverse catalytic constants kcat+ and kcat-.
  2. Set some of these quantities to fixed values (if desired).
  3. Run parameter balancing (with priors, pseudo values, and upper and lower bounds) to obtain a complete, consistent set of rate constants.
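
For reference, the consistency conditions named above can be written out explicitly. This is a sketch that assumes reversible rate laws of the modular type without cooperativity; the index conventions are assumptions made here (n_il is the stoichiometric coefficient of metabolite i in reaction l, negative for substrates and positive for products).

```latex
% Equilibrium constants derived from standard chemical potentials
% satisfy the Wegscheider conditions automatically:
\ln K^{\mathrm{eq}}_l = -\frac{1}{RT} \sum_i n_{il}\, \mu^{\circ}_i

% Haldane relationship: the ratio of forward and reverse catalytic constants
% is fixed by the equilibrium constant and the Michaelis-Menten constants:
K^{\mathrm{eq}}_l = \frac{k^{\mathrm{cat},+}_l}{k^{\mathrm{cat},-}_l}
                    \prod_i \left( K^{\mathrm{M}}_{li} \right)^{n_{il}}
```

Parameter balancing adjusts the collected values so that relations of this kind hold for all reactions simultaneously.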

Optimization phase

In the optimization phase, the desired pathway flux is realized by optimal enzyme and metabolite profiles.

  1. Set up the kinetic model (based on the given network, flux profile v, and model parameters). Redefine the reaction directions such that fluxes are positive, and update all parameters.
  2. Choose the bounds for metabolite concentrations (tight bounds or fixed values for metabolites with fixed concentrations, lower and upper bounds for the others).
  3. Determine a feasible metabolite profile s = ln c (a profile within the metabolite polytope) as a starting point for numerical optimization. We consider three alternatives:
     (i) Use linear programming to construct a set of extreme points in the polytope (with maximal and minimal metabolite levels s_i); the center of mass of these points is then taken as the starting point.
     (ii) Use the point in the polytope that is closest to the center of the predefined metabolite bounds (solution of a quadratic programming problem) as the starting point.
     (iii) Use the solution of the MDF problem (linear programming problem) as the starting point.
  4. Choose an EMC function and minimize it numerically with respect to s under the constraints defining the metabolite polytope.
  5. Compute the corresponding enzyme levels and cost.
  6. Based on the optimal enzyme cost, define a maximal tolerable cost (e.g., one percent higher than the optimal total cost) and compute individual tolerances for metabolite and enzyme levels as described in Methods.
  7. Validate the predicted enzyme and metabolite levels with experimental data.

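In theory, any local minimum of a convex optimization problem is also a global minimum, so the optimization should converge regardless of the feasible starting point. As a check, we can repeat the calculation with different starting points.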
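
The optimization phase can be illustrated with a deliberately small toy problem. The sketch below is not the ECM code provided on this website: it assumes a two-step pathway A -> B -> C with fixed concentrations of A and C, a simplified cost in which each enzyme demand is v / (kcat * eta) with only a thermodynamic efficiency factor eta, and made-up parameter values. All function and variable names are hypothetical. It minimizes the total enzyme demand over s = ln b, then computes the tolerance range of s for a cost at most one percent above the optimum.

```python
import math

# Toy pathway A -> B -> C; all numbers are made up for illustration.
V = 1.0                   # desired pathway flux (mM/s)
KCAT = (100.0, 100.0)     # forward catalytic constants (1/s)
KEQ = (10.0, 10.0)        # equilibrium constants (dimensionless)
A_CONC, C_CONC = 1.0, 0.1  # fixed concentrations of A and C (mM)


def enzyme_cost(s):
    """Total enzyme demand (mM) at s = ln b, where b is the level of B."""
    b = math.exp(s)
    eta1 = 1.0 - (b / A_CONC) / KEQ[0]   # thermodynamic efficiency, reaction 1
    eta2 = 1.0 - (C_CONC / b) / KEQ[1]   # thermodynamic efficiency, reaction 2
    if eta1 <= 0.0 or eta2 <= 0.0:
        return math.inf                  # outside the feasible interval
    return V / (KCAT[0] * eta1) + V / (KCAT[1] * eta2)


def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                      # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:                            # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)


def bisect(f, lo, hi, tol=1e-10):
    """Find a root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Feasible interval for s: both reactions must run in the forward direction.
s_lo = math.log(C_CONC / KEQ[1])
s_hi = math.log(A_CONC * KEQ[0])

# Minimize the cost with respect to s, then evaluate the optimal cost.
s_opt = golden_section_minimize(enzyme_cost, s_lo, s_hi)
cost_opt = enzyme_cost(s_opt)

# Tolerance range: all s whose cost is at most one percent above the optimum.
max_cost = 1.01 * cost_opt
s_min = bisect(lambda s: enzyme_cost(s) - max_cost, s_lo, s_opt)
s_max = bisect(lambda s: enzyme_cost(s) - max_cost, s_opt, s_hi)

print("optimal ln b:", s_opt, "optimal cost:", cost_opt)
print("tolerance range for ln b:", s_min, s_max)
```

Because the two reactions have identical parameters in this toy setting, the optimum lies at the midpoint of the feasible interval, which makes the result easy to check by hand. A real ECM problem has many metabolites and requires a multivariate convex solver instead of a one-dimensional search.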

How to run the workflow in practice: usage examples

Different software tools for Enzyme Cost Minimization (in Matlab, Python, and for the NEOS online optimization server) are provided on this website. To get used to the method and the file formats, you may run ECM for our example model as described below. To run ECM for your own models and data, you just need to create input files in the same format. For further options of the software tools that are not mentioned here, please refer to the code documentation.

Data formats

Our ECM code can handle different data formats for models and numerical data. For simplicity, we refer here only to one format, in which model and data are stored in a single SBtab data file. Below we will call this the "Model and Data file". An example, which is also used as a running example below, is the file ecoli_ccm_ProteinUniform_Haverkorn_ModelData.tsv, which can be found as the "Model and Data [SBtab]" file on the E. coli model page. To run ECM for your own models, you just need to prepare all information in the same file format.
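
To get a feeling for the file layout before preparing your own input, it can help to look at how several tables sit in one tab-separated SBtab file. The following sketch is not the reader used by the ECM tools. It assumes the general SBtab conventions (a declaration line starting with `!!SBtab`, a column header line whose cells start with `!`, comment lines starting with `%`) and uses an invented two-table sample; the table IDs, column names, and values in the sample are hypothetical and may differ from those in the example file.

```python
import csv
import io
import re


def split_sbtab_tables(text):
    """Split SBtab text into a dict {table_id: list of row dicts}."""
    tables = {}
    table_id, columns = None, None
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue                                   # skip blank lines
        if row[0].startswith("!!"):                    # table declaration line
            match = re.search(r"TableID='([^']*)'", "\t".join(row))
            table_id = match.group(1) if match else f"table_{len(tables) + 1}"
            tables[table_id], columns = [], None
        elif row[0].startswith("%"):
            continue                                   # comment line
        elif row[0].startswith("!"):                   # column header line
            columns = [cell.lstrip("!") for cell in row]
        elif table_id is not None and columns is not None:
            tables[table_id].append(dict(zip(columns, row)))
    return tables


# Invented sample with two tables (hypothetical IDs, columns, and values).
sample = "\n".join([
    "!!SBtab TableID='Flux' TableType='Quantity'",
    "!QuantityType\t!Reaction\t!Value",
    "rate of reaction\tR1\t1.0",
    "",
    "!!SBtab TableID='ConcentrationConstraint' TableType='Quantity'",
    "!QuantityType\t!Compound\t!Min\t!Max",
    "concentration\tC1\t0.001\t10",
])

tables = split_sbtab_tables(sample)
for name, rows in tables.items():
    print(name, rows)
```

To inspect a real file, read its contents into a string and pass it to `split_sbtab_tables`; all values are returned as strings and are not converted to numbers.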

Matlab example

This is how you can run our example ECM task using the Matlab code. After installing the Matlab functions for ECM and downloading the Model and Data file ecoli_ccm_ProteinUniform_Haverkorn_ModelData.tsv, run the following commands:

% This creates a temporary file directory; you can also choose a different directory path.

tmp_dir = '/tmp/emc'; mkdir(tmp_dir);

% This sets the file location of your Model and Data file; you can choose a different location.

filename = 'ecoli_ccm_ProteinUniform_Haverkorn_ModelData.tsv';

% This loads the model and data from the input file and translates them into
% Matlab data structures (see the documentation of the Metabolic Network Toolbox for details)

[network,v,c_data,u_data, conc_min, conc_max, met_fix, conc_fix,positions, enzyme_cost_weights, warnings] = ecm_load_model_and_data_sbtab(filename, tmp_dir);

% This defines some default options for ECM; to change the options, refer to the documentation

ecm_options = ecm_default_options(network, 'My example model');

ecm_options.c_data = c_data;
ecm_options.u_data = u_data;

ecm_options = ecm_update_options(network, ecm_options);

% Now ECM is run

[c, u, u_cost, up, A_forward, mca_info, c_min, c_max, u_min, u_max, r, u_capacity, eta_energetic, eta_saturation] = ecm_enzyme_cost_minimization(network, network.kinetics, v, ecm_options);

% You may use this command to save all results as SBtab files (again, the file path can be changed)

document_name = 'E. coli central carbon metabolism - ECM result';
outfile_name = 'ecoli_ccm_ProteinUniform_Haverkorn_ECM_results';
opt = struct('r', network.kinetics, 'method', 'emc4cm', 'document_name', document_name, 'save_tolerance_ranges', 1);
ecm_save_result_sbtab(outfile_name, network, c, u, A_forward, opt, c_min, c_max, u_min, u_max, u_capacity, eta_energetic, eta_saturation);

% To display graphical output, use the following lines:

kinetic_data = [];
ecm_options.show_graphics = 1;
graphics_options.print_graphics = 1;
graphics_options.few_graphics = 1;
graphics_options.metabolite_order_file = [];
graphics_options.reaction_order_file = [];
graphics_options.enzyme_colors = sunrise_colors(length(ecm_options.ind_scored_enzymes));
ecm_display(ecm_options, graphics_options, network,v,c,u,u_cost,up,A_forward,r,kinetic_data,c_min,c_max,u_min,u_max,u_capacity,eta_energetic,eta_saturation);

You can find the same commands in the demo script.

Python example

  1. Install the Python functions for the Component Contribution Method (GitHub project component-contribution).
  2. Install the Python functions for ECM (GitHub project enzyme-cost).
  3. Run ecoli_ccm_aerobic.py.

NEOS Optimization Server example

The NEOS Optimization Server requires input files in one of the following two formats:

  1. A number of separate comma-separated (csv) files ("NEOS files") describing model and data
  2. A single table file describing model and data. This is NOT the "Model and Data" SBtab format mentioned before. For the file syntax, please refer to the instructions on the NEOS ECM website.

Depending on the input file format used, you may run NEOS ECM in three different places:

  1. Run NEOS ECM with separate .csv NEOS files
  2. Run NEOS ECM with one zip-file containing all .csv NEOS files
  3. Run NEOS ECM with single-file table format

Usage example: To run ECM on the NEOS server, download the zipped .csv NEOS files ("Model and data [NEOS files]") from the example page. Now you can proceed in one of the following two ways:

  1. Directly upload the file to https://proto.neos-server.org/neos/solvers/application:MER/zip.html and run ECM
  2. Unzip the file, upload the individual .csv files to https://proto.neos-server.org/neos/solvers/application:MER/csv.html, and run ECM

If you already prepared a "Model and Data" SBtab file for your model and data, you can build the .csv files automatically by using the following Matlab commands:

% Path of temporary files directory

tmp_dir = '/tmp/emc'; mkdir(tmp_dir);

% Directory location for output files; you can change this location

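neos_directory = '~/Desktop/';

% Location of your Model and Data file (here, the example file used above)

filename = 'ecoli_ccm_ProteinUniform_Haverkorn_ModelData.tsv';

% Load the model and data from the input file

[network,v,c_data,u_data, conc_min, conc_max, met_fix, conc_fix, positions, enzyme_cost_weights, warnings] = ecm_load_model_and_data_sbtab(filename, tmp_dir);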

ecm_save_model_and_data_neos(neos_directory, network, v, network.kinetics, c_data, u_data, enzyme_cost_weights, conc_min, conc_max, met_fix, conc_fix);