11. Gapfillling

Model gap filling is the task of figuring out which reactions have to be added to a model to make it feasible. Several such algorithms have been reported e.g. Kumar et al. 2009 and Reed et al. 2006. Cobrapy has a gap filling implementation that is very similar to that of Reed et al. where we use a mixed-integer linear program to figure out the smallest number of reactions that need to be added for a user-defined collection of reactions, i.e. a universal model. Briefly, the problem that we try to solve is

Minimize:

\[\sum_i c_i * z_i\]

subject to

\[Sv = 0\]
\[v^\star \geq t\]
\[l_i\leq v_i \leq u_i\]
\[v_i = 0 \textrm{ if } z_i = 0\]

Where l, u are lower and upper bounds for reaction i and z is an indicator variable that is zero if the reaction is not used and otherwise 1, c is a user-defined cost associated with using the ith reaction, \(v^\star\) is the flux of the objective and t a lower bound for that objective. To demonstrate, let’s take a model and remove some essential reactions from it.

[2]:
from cobra.io import load_model
from cobra.flux_analysis import gapfill
model = load_model("iYS1720")

In this model D-Fructose-6-phosphate is an essential metabolite. We will remove all the reactions using it, and at them to a separate model.

[3]:
import cobra
universal = cobra.Model("universal_reactions")
for i in [i.id for i in model.metabolites.f6p_c.reactions]:
    reaction = model.reactions.get_by_id(i)
    universal.add_reactions([reaction.copy()])
    model.remove_reactions([reaction])

Now, because of these gaps, the model won’t grow.

[4]:
model.optimize().objective_value
[4]:
0.0

We will use can use the model’s original objective, growth, to figure out which of the removed reactions are required for the model be feasible again. This is very similar to making the ‘no-growth but growth (NGG)’ predictions of Kumar et al. 2009.

[5]:
solution = gapfill(model, universal, demand_reactions=False)
for reaction in solution[0]:
    print(reaction.id)
TALA
GF6PTA

We can obtain multiple possible reaction sets by having the algorithm go through multiple iterations.

[6]:
result = gapfill(model, universal, demand_reactions=False, iterations=4)
for i, entries in enumerate(result):
    print("---- Run %d ----" % (i + 1))
    for e in entries:
        print(e.id)
---- Run 1 ----
TALA
GF6PTA
---- Run 2 ----
TALA
GF6PTA
---- Run 3 ----
TKT2
GF6PTA
PGI
---- Run 4 ----
TALA
GF6PTA

We can also instead of using the original objective, specify a given metabolite that we want the model to be able to produce.

[7]:
with model:
    model.objective = model.add_boundary(model.metabolites.f6p_c, type='demand')
    solution = gapfill(model, universal)
    for reaction in solution[0]:
        print(reaction.id)
PGI

Finally, note that using mixed-integer linear programming is computationally quite expensive and for larger models you may want to consider alternative gap filling methods and reconstruction methods.