2. Building a Model

2.1. Model, Reactions and Metabolites

This simple example demonstrates how to create a model, create a reaction, and then add the reaction to the model.

We’ll use the ‘3OAS140’ reaction from the STM_1.0 model:

1.0 malACP[c] + 1.0 h[c] + 1.0 ddcaACP[c] \(\rightarrow\) 1.0 co2[c] + 1.0 ACP[c] + 1.0 3omrsACP[c]

First, create the model and reaction.

[1]:
from cobra import Model, Reaction, Metabolite
[2]:
model = Model('example_model')

reaction = Reaction('R_3OAS140')
reaction.name = '3 oxoacyl acyl carrier protein synthase n C140 '
reaction.subsystem = 'Cell Envelope Biosynthesis'
reaction.lower_bound = 0.  # This is the default
reaction.upper_bound = 1000.  # This is the default

We need to create metabolites as well. If we were using an existing model, we could use Model.get_by_id to get the appropriate Metabolite objects instead.

[3]:
ACP_c = Metabolite(
    'ACP_c',
    formula='C11H21N2O7PRS',
    name='acyl-carrier-protein',
    compartment='c')
omrsACP_c = Metabolite(
    'M3omrsACP_c',
    formula='C25H45N2O9PRS',
    name='3-Oxotetradecanoyl-acyl-carrier-protein',
    compartment='c')
co2_c = Metabolite('co2_c', formula='CO2', name='CO2', compartment='c')
malACP_c = Metabolite(
    'malACP_c',
    formula='C14H22N2O10PRS',
    name='Malonyl-acyl-carrier-protein',
    compartment='c')
h_c = Metabolite('h_c', formula='H', name='H', compartment='c')
ddcaACP_c = Metabolite(
    'ddcaACP_c',
    formula='C23H43N2O8PRS',
    name='Dodecanoyl-ACP-n-C120ACP',
    compartment='c')

Side note: SId

It is highly recommended that the ids for reactions, metabolites and genes are valid SBML identifiers (SId). SId is a data type derived from the basic XML typestring, but with restrictions about the characters permitted and the sequences in which those characters may appear.

letter   ::=   ’a’..’z’,’A’..’Z’
digit    ::=   ’0’..’9’
idChar   ::=   letter | digit | ’_’
SId      ::=   ( letter | ’_’ ) idChar*

The main limitation is that ids cannot start with numbers. Using SIds allows serialization to SBML. In addition features such as code completion and object access via the dot syntax will work in cobrapy.

Adding metabolites to a reaction uses a dictionary of the metabolites and their stoichiometric coefficients. A group of metabolites can be added all at once, or they can be added one at a time.

[4]:
reaction.add_metabolites({
    malACP_c: -1.0,
    h_c: -1.0,
    ddcaACP_c: -1.0,
    co2_c: 1.0,
    ACP_c: 1.0,
    omrsACP_c: 1.0
})

reaction.reaction  # This gives a string representation of the reaction
[4]:
'ddcaACP_c + h_c + malACP_c --> ACP_c + M3omrsACP_c + co2_c'

The gene_reaction_rule is a boolean representation of the gene requirements for this reaction to be active as described in Schellenberger et al 2011 Nature Protocols 6(9):1290-307. We will assign the gene reaction rule string, which will automatically create the corresponding gene objects.

[5]:
reaction.gene_reaction_rule = '( STM2378 or STM1197 )'
reaction.genes
[5]:
frozenset({<Gene STM1197 at 0x7fef047474c0>, <Gene STM2378 at 0x7fef04747dc0>})

At this point in time, the model is still empty

[6]:
print(f'{len(model.reactions)} reactions initially')
print(f'{len(model.metabolites)} metabolites initially')
print(f'{len(model.genes)} genes initially')
0 reactions initially
0 metabolites initially
0 genes initially

We will add the reaction to the model, which will also add all associated metabolites and genes

[7]:
model.add_reactions([reaction])

# The objects have been added to the model
print(f'{len(model.reactions)} reactions')
print(f'{len(model.metabolites)} metabolites')
print(f'{len(model.genes)} genes')
1 reactions
6 metabolites
2 genes

We can iterate through the model objects to observe the contents

[8]:
# Iterate through the the objects in the model
print("Reactions")
print("---------")
for x in model.reactions:
    print("%s : %s" % (x.id, x.reaction))

print("")
print("Metabolites")
print("-----------")
for x in model.metabolites:
    print('%9s : %s' % (x.id, x.formula))

print("")
print("Genes")
print("-----")
for x in model.genes:
    associated_ids = (i.id for i in x.reactions)
    print("%s is associated with reactions: %s" %
          (x.id, "{" + ", ".join(associated_ids) + "}"))
Reactions
---------
R_3OAS140 : ddcaACP_c + h_c + malACP_c --> ACP_c + M3omrsACP_c + co2_c

Metabolites
-----------
 malACP_c : C14H22N2O10PRS
      h_c : H
ddcaACP_c : C23H43N2O8PRS
    co2_c : CO2
    ACP_c : C11H21N2O7PRS
M3omrsACP_c : C25H45N2O9PRS

Genes
-----
STM2378 is associated with reactions: {R_3OAS140}
STM1197 is associated with reactions: {R_3OAS140}

2.2. Objective

Last we need to set the objective of the model. Here, we just want this to be the maximization of the flux in the single reaction we added and we do this by assigning the reaction’s identifier to the objective property of the model.

[9]:
model.objective = 'R_3OAS140'

The created objective is a symbolic algebraic expression and we can examine it by printing it

[10]:
print(model.objective.expression)
print(model.objective.direction)
1.0*R_3OAS140 - 1.0*R_3OAS140_reverse_60acb
max

which here shows that the solver will maximize the flux in the forward direction.

2.3. Model Validation

For exchange with other tools you can validate and export the model to SBML. For more information on serialization and available formats see the section “Reading and Writing Models”

[11]:
import tempfile
from pprint import pprint
from cobra.io import write_sbml_model, validate_sbml_model
with tempfile.NamedTemporaryFile(suffix='.xml') as f_sbml:
    write_sbml_model(model, filename=f_sbml.name)
    report = validate_sbml_model(filename=f_sbml.name)

pprint(report)
(<Model example_model at 0x7fef046eb160>,
 {'COBRA_CHECK': [],
  'COBRA_ERROR': [],
  'COBRA_FATAL': [],
  'COBRA_WARNING': [],
  'SBML_ERROR': [],
  'SBML_FATAL': [],
  'SBML_SCHEMA_ERROR': [],
  'SBML_WARNING': []})

The model is valid with no COBRA or SBML errors or warnings.

2.4. Exchanges, Sinks and Demands

Boundary reactions can be added using the model’s method add_boundary. There are three different types of pre-defined boundary reactions: exchange, demand, and sink reactions. All of them are unbalanced pseudo reactions, that means they fulfill a function for modeling by adding to or removing metabolites from the model system but are not based on real biology. An exchange reaction is a reversible reaction that adds to or removes an extracellular metabolite from the extracellular compartment. A demand reaction is an irreversible reaction that consumes an intracellular metabolite. A sink is similar to an exchange but specifically for intracellular metabolites, i.e., a reversible reaction that adds or removes an intracellular metabolite.

[12]:
print("exchanges", model.exchanges)
print("demands", model.demands)
print("sinks", model.sinks)
There are no boundary reactions in this model. Therefore specific types of boundary reactions such as 'exchanges', 'demands' or 'sinks' cannot be identified.
There are no boundary reactions in this model. Therefore specific types of boundary reactions such as 'exchanges', 'demands' or 'sinks' cannot be identified.
There are no boundary reactions in this model. Therefore specific types of boundary reactions such as 'exchanges', 'demands' or 'sinks' cannot be identified.
exchanges []
demands []
sinks []

Boundary reactions are defined on metabolites. First we add two metabolites to the model then we define the boundary reactions. We add glycogen to the cytosolic compartment c and CO2 to the external compartment e.

[13]:
model.add_metabolites([
    Metabolite(
    'glycogen_c',
    name='glycogen',
    compartment='c'
    ),
    Metabolite(
    'co2_e',
    name='CO2',
    compartment='e'
    ),
])
[14]:
# create exchange reaction
model.add_boundary(model.metabolites.get_by_id("co2_e"), type="exchange")
[14]:
Reaction identifierEX_co2_e
NameCO2 exchange
Memory address 0x07fef04703d90
Stoichiometry

co2_e <=>

CO2 <=>

GPR
Lower bound-1000.0
Upper bound1000.0
[15]:
# create exchange reaction
model.add_boundary(model.metabolites.get_by_id("glycogen_c"), type="sink")
[15]:
Reaction identifierSK_glycogen_c
Nameglycogen sink
Memory address 0x07fef046eb2e0
Stoichiometry

glycogen_c <=>

glycogen <=>

GPR
Lower bound-1000.0
Upper bound1000.0
[16]:
# Now we have an additional exchange and sink reaction in the model
print("exchanges", model.exchanges)
print("sinks", model.sinks)
print("demands", model.demands)
exchanges [<Reaction EX_co2_e at 0x7fef04703d90>]
sinks [<Reaction SK_glycogen_c at 0x7fef046eb2e0>]
demands []

To create a demand reaction instead of a sink use type demand instead of sink.

Information on all boundary reactions is available via the model’s property boundary.

[17]:
# boundary reactions
model.boundary
[17]:
[<Reaction EX_co2_e at 0x7fef04703d90>,
 <Reaction SK_glycogen_c at 0x7fef046eb2e0>]

A neat trick to get all metabolic reactions is

[18]:
# metabolic reactions
set(model.reactions) - set(model.boundary)
[18]:
{<Reaction R_3OAS140 at 0x7fef04737d00>}