1. Getting Started

1.1. Loading a model and inspecting it

To begin with, cobrapy comes with bundled models for Salmonella and E. coli, as well as a “textbook” model of E. coli core metabolism. To load a test model, type

[1]:
import cobra
from cobra.io import load_model

# "iJO1366" and "salmonella" are also valid arguments
model = load_model("textbook")
Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  1.000e+00  ratio =  1.000e+00
Problem data seem to be well scaled

The reactions, metabolites, and genes attributes of the cobrapy model are a special type of list called a cobra.DictList, and each one is made up of cobra.Reaction, cobra.Metabolite and cobra.Gene objects respectively.

[2]:
print(len(model.reactions))
print(len(model.metabolites))
print(len(model.genes))
95
72
137

When using Jupyter notebook this type of information is rendered as a table.

[3]:
model
[3]:
Name e_coli_core
Memory address 0x07feb1af21f28
Number of metabolites 72
Number of reactions 95
Number of groups 0
Objective expression 1.0*Biomass_Ecoli_core - 1.0*Biomass_Ecoli_core_reverse_2cdba
Compartments cytosol, extracellular

Just like a regular list, objects in the DictList can be retrieved by index. For example, to get the 30th reaction in the model (at index 29 because of 0-indexing):

[4]:
model.reactions[29]
[4]:
Reaction identifierEX_glu__L_e
NameL-Glutamate exchange
Memory address 0x7feb1b012eb8
Stoichiometry

glu__L_e -->

L-Glutamate -->

GPR
Lower bound0.0
Upper bound1000.0

Additionally, items can be retrieved by their id using the DictList.get_by_id() function. For example, to get the cytosolic atp metabolite object (the id is “atp_c”), we can do the following:

[5]:
model.metabolites.get_by_id("atp_c")
[5]:
Metabolite identifieratp_c
NameATP
Memory address 0x7feb1af41550
FormulaC10H12N5O13P3
Compartmentc
In 13 reaction(s) Biomass_Ecoli_core, ATPS4r, PGK, ACKr, ADK1, GLNS, ATPM, PFK, PPS, GLNabc, PYK, PPCK, SUCOAS

As an added bonus, users with an interactive shell such as IPython will be able to tab-complete to list elements inside a list. While this is not recommended behavior for most code because of the possibility for characters like “-” inside ids, this is very useful while in an interactive prompt:

[6]:
model.reactions.EX_glc__D_e.bounds
[6]:
(-10.0, 1000.0)

1.2. Reactions

We will consider the reaction glucose 6-phosphate isomerase, which interconverts glucose 6-phosphate and fructose 6-phosphate. The reaction id for this reaction in our test model is PGI.

[7]:
pgi = model.reactions.get_by_id("PGI")
pgi
[7]:
Reaction identifierPGI
Nameglucose-6-phosphate isomerase
Memory address 0x7feb1b09c2e8
Stoichiometry

g6p_c <=> f6p_c

D-Glucose 6-phosphate <=> D-Fructose 6-phosphate

GPRb4025
Lower bound-1000.0
Upper bound1000.0

We can view the full name and reaction catalyzed as strings

[8]:
print(pgi.name)
print(pgi.reaction)
glucose-6-phosphate isomerase
g6p_c <=> f6p_c

We can also view reaction upper and lower bounds. Because the pgi.lower_bound < 0, and pgi.upper_bound > 0, pgi is reversible.

[9]:
print(pgi.lower_bound, "< pgi <", pgi.upper_bound)
print(pgi.reversibility)
-1000.0 < pgi < 1000.0
True

The lower and upper bound of reactions can also be modified, and the reversibility attribute will automatically be updated. The preferred method for manipulating bounds is using reaction.bounds, e.g.

[10]:
old_bounds = pgi.bounds
pgi.bounds = (0, 1000.0)
print(pgi.lower_bound, "< pgi <", pgi.upper_bound)
print("Reversibility after modification:", pgi.reversibility)
pgi.bounds = old_bounds
print("Reversibility after resetting:", pgi.reversibility)
0 < pgi < 1000.0
Reversibility after modification: False
Reversibility after resetting: True

Bounds can also be modified one-at-a-time using reaction.lower_bound or reaction.upper_bound. This approach is less desirable than setting both bounds simultaneously with reaction.bounds because a user might accidently set a lower bound higher than an upper bound (or vice versa). If a lower bound is higher than upper bound (or vice versa), this will lead to an error.

[11]:
old_bounds = pgi.bounds
print('Upper bound prior to setting new lower bound:', pgi.upper_bound)
pgi.lower_bound = 1100
print('Upper bound after setting new lower bound:', pgi.upper_bound)
pgi.bounds = old_bounds
Upper bound prior to setting new lower bound: 1000.0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/v7/99h1ft7s0mzg1_83fmbc8mnm0000gn/T/ipykernel_23255/4168182522.py in <module>
      1 old_bounds = pgi.bounds
      2 print('Upper bound prior to setting new lower bound:', pgi.upper_bound)
----> 3 pgi.lower_bound = 1100
      4 print('Upper bound after setting new lower bound:', pgi.upper_bound)
      5 pgi.bounds = old_bounds

~/PycharmProjects/cobrapy/src/cobra/util/context.py in wrapper(self, new_value)
    109             context(partial(func, self, old_value))
    110
--> 111         func(self, new_value)
    112
    113     return wrapper

~/PycharmProjects/cobrapy/src/cobra/core/reaction.py in lower_bound(self, value)
    368         """
    369         # Validate bounds before setting them.
--> 370         self._check_bounds(value, self._upper_bound)
    371         self._lower_bound = value
    372         self.update_variable_bounds()

~/PycharmProjects/cobrapy/src/cobra/core/reaction.py in _check_bounds(lb, ub)
    292         if lb > ub:
    293             raise ValueError(
--> 294                 f"The lower bound must be less than or equal to the upper bound "
    295                 f"({lb} <= {ub})."
    296             )

ValueError: The lower bound must be less than or equal to the upper bound (1100 <= 1000.0).

We can also ensure the reaction is mass balanced. This function will return elements which violate mass balance. If it comes back empty, then the reaction is mass balanced.

[12]:
pgi.check_mass_balance()
[12]:
{}

In order to add a metabolite, we pass in a dict with the metabolite object and its coefficient

[13]:
pgi.add_metabolites({model.metabolites.get_by_id("h_c"): -1})
pgi.reaction
[13]:
'g6p_c + h_c <=> f6p_c'

The reaction is no longer mass balanced

[14]:
pgi.check_mass_balance()
[14]:
{'charge': -1.0, 'H': -1.0}

We can remove the metabolite, and the reaction will be balanced once again.

[15]:
pgi.subtract_metabolites({model.metabolites.get_by_id("h_c"): -1})
print(pgi.reaction)
print(pgi.check_mass_balance())
g6p_c <=> f6p_c
{}

It is also possible to build the reaction from a string. However, care must be taken when doing this to ensure reaction id’s match those in the model. The direction of the arrow is also used to update the upper and lower bounds.

[16]:
pgi.reaction = "g6p_c --> f6p_c + h_c + green_eggs + ham"
unknown metabolite 'green_eggs' created
unknown metabolite 'ham' created
[17]:
pgi.reaction
[17]:
'g6p_c --> f6p_c + green_eggs + h_c + ham'
[18]:
pgi.reaction = "g6p_c <=> f6p_c"
pgi.reaction
[18]:
'g6p_c <=> f6p_c'

1.3. Metabolites

We will consider cytosolic atp as our metabolite, which has the id "atp_c" in our test model.

[19]:
atp = model.metabolites.get_by_id("atp_c")
atp
[19]:
Metabolite identifieratp_c
NameATP
Memory address 0x7feb1af41550
FormulaC10H12N5O13P3
Compartmentc
In 13 reaction(s) Biomass_Ecoli_core, ATPS4r, PGK, ACKr, ADK1, GLNS, ATPM, PFK, PPS, GLNabc, PYK, PPCK, SUCOAS

We can print out the metabolite name and compartment (cytosol in this case) directly as string.

[20]:
print(atp.name)
print(atp.compartment)
ATP
c

We can see that ATP is a charged molecule in our model.

[21]:
atp.charge
[21]:
-4

We can see the chemical formula for the metabolite as well.

[22]:
print(atp.formula)
C10H12N5O13P3

The reactions attribute gives a frozenset of all reactions using the given metabolite. We can use this to count the number of reactions which use atp.

[23]:
len(atp.reactions)
[23]:
13

A metabolite like glucose 6-phosphate will participate in fewer reactions.

[24]:
model.metabolites.get_by_id("g6p_c").reactions
[24]:
frozenset({<Reaction Biomass_Ecoli_core at 0x7feb1afd4240>,
           <Reaction G6PDH2r at 0x7feb1b04bf60>,
           <Reaction GLCpts at 0x7feb1b050b38>,
           <Reaction PGI at 0x7feb1b09c2e8>})

1.4. Genes

The gene_reaction_rule is a boolean representation of the gene requirements for this reaction to be active as described in Schellenberger et al 2011 Nature Protocols 6(9):1290-307.

The GPR is stored as the GPR class in the gpr for a Reaction. A string representation of it is stored as the gene_reaction_rule for a Reaction object.

[25]:
gpr = pgi.gpr
print(gpr)
gpr_string = pgi.gene_reaction_rule
print(gpr_string)
b4025
b4025

Corresponding gene objects also exist. These objects are tracked by the reactions itself, as well as by the model

[26]:
pgi.genes
[26]:
frozenset({<Gene b4025 at 0x7feb1afb0f98>})
[27]:
pgi_gene = model.genes.get_by_id("b4025")
pgi_gene
[27]:
Gene identifierb4025
Namepgi
Memory address 0x7feb1afb0f98
FunctionalTrue
In 1 reaction(s) PGI

Each gene keeps track of the reactions it catalyzes

[28]:
pgi_gene.reactions
[28]:
frozenset({<Reaction PGI at 0x7feb1b09c2e8>})

Altering the gene_reaction_rule will create new gene objects if necessary and update all relationships.

[29]:
pgi.gene_reaction_rule = "(spam or eggs)"
pgi.genes
[29]:
frozenset({<Gene eggs at 0x7feb1b0b0cc0>, <Gene spam at 0x7feb1b0aacf8>})
[30]:
pgi_gene.reactions
[30]:
frozenset()

Newly created genes are also added to the model

[31]:
model.genes.get_by_id("spam")
[31]:
Gene identifierspam
Name
Memory address 0x7feb1b0aacf8
FunctionalTrue
In 1 reaction(s) PGI

The knock_out_model_genes function will evaluate the GPR and set the upper and lower bounds to 0 if the reaction is knocked out.

[32]:
cobra.manipulation.knock_out_model_genes(
    model, ["spam"])
print("after 1 KO: %4d < flux_PGI < %4d" % (pgi.lower_bound, pgi.upper_bound))

cobra.manipulation.knock_out_model_genes(
    model, ["eggs"])
print("after 2 KO:  %4d < flux_PGI < %4d" % (pgi.lower_bound, pgi.upper_bound))
after 1 KO: -1000 < flux_PGI < 1000
after 2 KO:     0 < flux_PGI <    0

When knocking out model genes in a context, it is reversed when leaving the context.

Quite often, one wants to make small changes to a model and evaluate the impacts of these. For example, we may want to knock-out all reactions sequentially, and see what the impact of this is on the objective function. One way of doing this would be to create a new copy of the model before each knock-out with model.copy(). However, even with small models, this is a very slow approach as models are quite complex objects. Better then would be to do the knock-out, optimizing and then manually resetting the reaction bounds before proceeding with the next reaction. Since this is such a common scenario however, cobrapy allows us to use the model as a context, to have changes reverted automatically.

[34]:
model = load_model('textbook')
for reaction in model.reactions[:5]:
    with model as model:
        reaction.knock_out()
        model.optimize()
        print('%s blocked (bounds: %s), new growth rate %f' %
              (reaction.id, str(reaction.bounds), model.objective.value))
ACALD blocked (bounds: (0, 0)), new growth rate 0.873922
ACALDt blocked (bounds: (0, 0)), new growth rate 0.873922
ACKr blocked (bounds: (0, 0)), new growth rate 0.873922
ACONTa blocked (bounds: (0, 0)), new growth rate -0.000000
ACONTb blocked (bounds: (0, 0)), new growth rate -0.000000

If we look at those knocked reactions, see that their bounds have all been reverted.

[35]:
[reaction.bounds for reaction in model.reactions[:5]]
[35]:
[(-1000.0, 1000.0),
 (-1000.0, 1000.0),
 (-1000.0, 1000.0),
 (-1000.0, 1000.0),
 (-1000.0, 1000.0)]

Nested contexts are also supported

[36]:
print('original objective: ', model.objective.expression)
with model:
    model.objective = 'ATPM'
    print('print objective in first context:', model.objective.expression)
    with model:
        model.objective = 'ACALD'
        print('print objective in second context:', model.objective.expression)
    print('objective after exiting second context:',
          model.objective.expression)
print('back to original objective:', model.objective.expression)
original objective:  1.0*Biomass_Ecoli_core - 1.0*Biomass_Ecoli_core_reverse_2cdba
print objective in first context: 1.0*ATPM - 1.0*ATPM_reverse_5b752
print objective in second context: 1.0*ACALD - 1.0*ACALD_reverse_fda2b
objective after exiting second context: 1.0*ATPM - 1.0*ATPM_reverse_5b752
back to original objective: 1.0*Biomass_Ecoli_core - 1.0*Biomass_Ecoli_core_reverse_2cdba

Most methods that modify the model are supported like this including adding and removing reactions and metabolites and setting the objective. Supported methods and functions mention this in the corresponding documentation.

While it does not have any actual effect, for syntactic convenience it is also possible to refer to the model by a different name than outside the context. Such as

[37]:
with model as inner:
    inner.reactions.PFK.knock_out
[ ]: