{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Building a Model\n", "## Model, Reactions and Metabolites" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "This simple example demonstrates how to create a model, create a reaction, and then add the reaction to the model.\n", "\n", "We'll use the '3OAS140' reaction from the STM_1.0 model:\n", "\n", "1.0 malACP[c] + 1.0 h[c] + 1.0 ddcaACP[c] $\\rightarrow$ 1.0 co2[c] + 1.0 ACP[c] + 1.0 3omrsACP[c]\n", "\n", "First, create the model and reaction." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "from cobra import Model, Reaction, Metabolite" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "model = Model('example_model')\n", "\n", "reaction = Reaction('R_3OAS140')\n", "reaction.name = '3 oxoacyl acyl carrier protein synthase n C140 '\n", "reaction.subsystem = 'Cell Envelope Biosynthesis'\n", "reaction.lower_bound = 0. # This is the default\n", "reaction.upper_bound = 1000. # This is the default" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "We need to create metabolites as well. If we were using an existing model, we could use `Model.get_by_id` to get the appropriate Metabolite objects instead." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "ACP_c = Metabolite(\n", " 'ACP_c',\n", " formula='C11H21N2O7PRS',\n", " name='acyl-carrier-protein',\n", " compartment='c')\n", "omrsACP_c = Metabolite(\n", " 'M3omrsACP_c',\n", " formula='C25H45N2O9PRS',\n", " name='3-Oxotetradecanoyl-acyl-carrier-protein',\n", " compartment='c')\n", "co2_c = Metabolite('co2_c', formula='CO2', name='CO2', compartment='c')\n", "malACP_c = Metabolite(\n", " 'malACP_c',\n", " formula='C14H22N2O10PRS',\n", " name='Malonyl-acyl-carrier-protein',\n", " compartment='c')\n", "h_c = Metabolite('h_c', formula='H', name='H', compartment='c')\n", "ddcaACP_c = Metabolite(\n", " 'ddcaACP_c',\n", " formula='C23H43N2O8PRS',\n", " name='Dodecanoyl-ACP-n-C120ACP',\n", " compartment='c')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Side note: SId**\n", "\n", "It is highly recommended that the ids for reactions, metabolites and genes are valid SBML identifiers (`SId`). \n", "`SId` is a data type derived from the basic XML typestring, but with restrictions about the characters \n", "permitted and the sequences in which those characters may appear. \n", "```\n", "letter ::= ’a’..’z’,’A’..’Z’\n", "digit ::= ’0’..’9’\n", "idChar ::= letter | digit | ’_’\n", "SId ::= ( letter | ’_’ ) idChar*\n", "```\n", "The main limitation is that ids cannot start with numbers. Using `SId`s allows serialization to SBML. In addition\n", "features such as code completion and object access via the dot syntax will work in `cobrapy`." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Adding metabolites to a reaction uses a dictionary of the metabolites and their stoichiometric coefficients. A group of metabolites can be added all at once, or they can be added one at a time." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "'ddcaACP_c + h_c + malACP_c --> ACP_c + M3omrsACP_c + co2_c'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reaction.add_metabolites({\n", " malACP_c: -1.0,\n", " h_c: -1.0,\n", " ddcaACP_c: -1.0,\n", " co2_c: 1.0,\n", " ACP_c: 1.0,\n", " omrsACP_c: 1.0\n", "})\n", "\n", "reaction.reaction # This gives a string representation of the reaction" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The gene_reaction_rule is a boolean representation of the gene requirements for this reaction to be active as described in [Schellenberger et al 2011 Nature Protocols 6(9):1290-307](http://dx.doi.org/doi:10.1038/nprot.2011.308). We will assign the gene reaction rule string, which will automatically create the corresponding gene objects." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "frozenset({, })" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reaction.gene_reaction_rule = '( STM2378 or STM1197 )'\n", "reaction.genes" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "At this point in time, the model is still empty" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 reactions initially\n", "0 metabolites initially\n", "0 genes initially\n" ] } ], "source": [ "print(f'{len(model.reactions)} reactions initially')\n", "print(f'{len(model.metabolites)} metabolites initially')\n", "print(f'{len(model.genes)} genes initially')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "We will add the reaction to the model, which will also add all associated metabolites and genes" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 reactions\n", "6 metabolites\n", "2 genes\n" ] } ], "source": [ "model.add_reactions([reaction])\n", "\n", "# The objects have been added to the model\n", "print(f'{len(model.reactions)} reactions')\n", "print(f'{len(model.metabolites)} metabolites')\n", "print(f'{len(model.genes)} genes')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "We can iterate through the model objects to observe the contents" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Reactions\n", "---------\n", "R_3OAS140 : ddcaACP_c + h_c + malACP_c --> ACP_c + M3omrsACP_c + co2_c\n", "\n", "Metabolites\n", "-----------\n", " malACP_c : C14H22N2O10PRS\n", " h_c : H\n", "ddcaACP_c : C23H43N2O8PRS\n", " co2_c : CO2\n", " ACP_c : C11H21N2O7PRS\n", "M3omrsACP_c : C25H45N2O9PRS\n", "\n", "Genes\n", "-----\n", "STM2378 is associated with reactions: {R_3OAS140}\n", "STM1197 is associated with reactions: {R_3OAS140}\n" ] } ], "source": [ "# Iterate through the the objects in the model\n", "print(\"Reactions\")\n", "print(\"---------\")\n", "for x in model.reactions:\n", " print(\"%s : %s\" % (x.id, x.reaction))\n", "\n", "print(\"\")\n", "print(\"Metabolites\")\n", "print(\"-----------\")\n", "for x in model.metabolites:\n", " print('%9s : %s' % (x.id, x.formula))\n", "\n", "print(\"\")\n", "print(\"Genes\")\n", "print(\"-----\")\n", "for x in model.genes:\n", " associated_ids = (i.id for i in x.reactions)\n", " print(\"%s is associated with reactions: %s\" %\n", " (x.id, \"{\" + \", \".join(associated_ids) + \"}\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Objective" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Last we need to set the objective of the model. Here, we just want this to be the maximization of the flux in the single reaction we added and we do this by assigning the reaction's identifier to the `objective` property of the model." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "model.objective = 'R_3OAS140'" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The created objective is a symbolic algebraic expression and we can examine it by printing it" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.0*R_3OAS140 - 1.0*R_3OAS140_reverse_60acb\n", "max\n" ] } ], "source": [ "print(model.objective.expression)\n", "print(model.objective.direction)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "which here shows that the solver will maximize the flux in the forward direction." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model Validation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For exchange with other tools you can validate and export the model to SBML.\n", "For more information on serialization and available formats see the section \"Reading and Writing Models\"" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(,\n", " {'COBRA_CHECK': [],\n", " 'COBRA_ERROR': [],\n", " 'COBRA_FATAL': [],\n", " 'COBRA_WARNING': [],\n", " 'SBML_ERROR': [],\n", " 'SBML_FATAL': [],\n", " 'SBML_SCHEMA_ERROR': [],\n", " 'SBML_WARNING': []})\n" ] } ], "source": [ "import tempfile\n", "from pprint import pprint\n", "from cobra.io import write_sbml_model, validate_sbml_model\n", "with tempfile.NamedTemporaryFile(suffix='.xml') as f_sbml:\n", " write_sbml_model(model, filename=f_sbml.name)\n", " report = validate_sbml_model(filename=f_sbml.name)\n", "\n", "pprint(report)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The model is valid with no COBRA or SBML errors or warnings." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exchanges, Sinks and Demands\n", "Boundary reactions can be added using the model's method `add_boundary`.\n", "There are three different types of pre-defined boundary reactions: exchange, demand, and sink reactions. All of them are unbalanced pseudo reactions, that means they fulfill a function for modeling by adding to or removing metabolites from the model system but are not based on real biology. An exchange reaction is a reversible reaction that adds to or removes an extracellular metabolite from the extracellular compartment. A demand reaction is an irreversible reaction that consumes an intracellular metabolite. A sink is similar to an exchange but specifically for intracellular metabolites, i.e., a reversible reaction that adds or removes an intracellular metabolite." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "There are no boundary reactions in this model. Therefore specific types of boundary reactions such as 'exchanges', 'demands' or 'sinks' cannot be identified.\n", "There are no boundary reactions in this model. Therefore specific types of boundary reactions such as 'exchanges', 'demands' or 'sinks' cannot be identified.\n", "There are no boundary reactions in this model. Therefore specific types of boundary reactions such as 'exchanges', 'demands' or 'sinks' cannot be identified.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "exchanges []\n", "demands []\n", "sinks []\n" ] } ], "source": [ "print(\"exchanges\", model.exchanges)\n", "print(\"demands\", model.demands)\n", "print(\"sinks\", model.sinks)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Boundary reactions are defined on metabolites. First we add two metabolites to the model then\n", "we define the boundary reactions. We add glycogen to the cytosolic compartment `c` and CO2 to the external compartment `e`." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "model.add_metabolites([\n", " Metabolite(\n", " 'glycogen_c',\n", " name='glycogen',\n", " compartment='c'\n", " ),\n", " Metabolite(\n", " 'co2_e',\n", " name='CO2',\n", " compartment='e'\n", " ),\n", "])" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Reaction identifierEX_co2_e
NameCO2 exchange
Memory address0x07fef04703d90
Stoichiometry\n", "

co2_e <=>

\n", "

CO2 <=>

\n", "
GPR
Lower bound-1000.0
Upper bound1000.0
\n", " " ], "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create exchange reaction\n", "model.add_boundary(model.metabolites.get_by_id(\"co2_e\"), type=\"exchange\")" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Reaction identifierSK_glycogen_c
Nameglycogen sink
Memory address0x07fef046eb2e0
Stoichiometry\n", "

glycogen_c <=>

\n", "

glycogen <=>

\n", "
GPR
Lower bound-1000.0
Upper bound1000.0
\n", " " ], "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create exchange reaction\n", "model.add_boundary(model.metabolites.get_by_id(\"glycogen_c\"), type=\"sink\")" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "exchanges []\n", "sinks []\n", "demands []\n" ] } ], "source": [ "# Now we have an additional exchange and sink reaction in the model\n", "print(\"exchanges\", model.exchanges)\n", "print(\"sinks\", model.sinks)\n", "print(\"demands\", model.demands)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create a demand reaction instead of a sink use type `demand` instead of `sink`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Information on all boundary reactions is available via the model's property `boundary`." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[,\n", " ]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# boundary reactions\n", "model.boundary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A neat trick to get all metabolic reactions is" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# metabolic reactions\n", "set(model.reactions) - set(model.boundary)" ] } ], "metadata": { "kernelspec": { "display_name": "cobrapy", "language": "python", "name": "cobrapy" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }