MICOM is a Python package for metabolic modeling of microbial communities

Overview

https://github.com/micom-dev/micom/raw/master/docs/source/micom.png


Welcome

MICOM is a Python package for metabolic modeling of microbial communities. It is currently developed in the Gibbons Lab at the Institute for Systems Biology and in the Human Systems Biology Group of Prof. Osbaldo Resendis-Antonio at the National Institute of Genomic Medicine, Mexico.

MICOM allows you to construct a community model from a list of input COBRA models and manages the exchange fluxes between individuals as well as between individuals and the environment. It explicitly accounts for the different abundances of individuals in the community and can thus incorporate data from biomass quantification, cytometry, amplicon sequencing, or metagenomic shotgun sequencing.

It identifies a relevant flux space by incorporating an ecological model of the trade-off between individual taxon growth and community-wide growth, which shows good agreement with experimental data.
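A minimal sketch of what this looks like in practice, using the bundled test taxonomy that also appears in the API example further below; the printed attribute is illustrative rather than an exhaustive list of outputs.

from micom import Community
from micom.data import test_taxonomy

taxonomy = test_taxonomy()                    # example table with one row per taxon (id, file, ...)
com = Community(taxonomy)                     # builds the community model from the listed COBRA models
sol = com.cooperative_tradeoff(fluxes=True)   # community growth rate plus per-taxon fluxes
print(sol.fluxes)                             # wide flux table, one row per taxon plus the medium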

Attribution

MICOM is published in

MICOM: Metagenome-Scale Modeling To Infer Metabolic Interactions in the Gut Microbiota
Christian Diener, Sean M. Gibbons, Osbaldo Resendis-Antonio
mSystems 5:e00606-19
https://doi.org/10.1128/mSystems.00606-19

Please cite this publication when referencing MICOM. Thanks 😄

Installation

MICOM is available on PyPI and can be installed via

pip install micom

Getting started

Documentation can be found at https://micom-dev.github.io/micom.

Getting help

General questions on usage can be asked in GitHub Discussions
https://github.com/micom-dev/micom/discussions
We are also available on the cobrapy Gitter channel
https://gitter.im/opencobra/cobrapy
Questions specific to the MICOM QIIME 2 plugin (q2-micom) can also be asked on the QIIME 2 forum
https://forum.qiime2.org/c/community-plugin-support/
Comments
  • exchanges not identified correctly in CarveME models

    Good morning. I have recently used your program to analyze metabolic exchanges in a community. I only have one question: in the output I get two columns for each metabolite, one called, for example, EX_but_e and the other EX_but_m. EX_but_m appears only in one row, which is the medium row. I assumed that by summing EX_but_e over all the microbes in my community I would get the value in the EX_but_m column of the medium row, but that is not the case. Why?

    Sincerely, Arianna Basile
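    A hedged sketch of the comparison described in this issue, assuming sol is a CommunitySolution with fluxes and that com.abundances holds the relative abundances (that attribute name is an assumption). If per-taxon fluxes are reported per unit of taxon biomass, the medium exchange would correspond to an abundance-weighted sum rather than the plain sum.

    fluxes = sol.fluxes                                    # wide table: one row per taxon plus "medium"
    taxa = [t for t in fluxes.index if t != "medium"]

    naive_sum = fluxes.loc[taxa, "EX_but_e"].sum()         # plain sum over taxa (the check described above)
    medium_flux = fluxes.loc["medium", "EX_but_m"]         # exchange with the environment

    # hypothetical abundance weighting; com.abundances is an assumed accessor
    weighted_sum = (fluxes.loc[taxa, "EX_but_e"] * com.abundances.loc[taxa]).sum()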

    opened by arianccbasile 14
  • interpretation of plot_exchanges_per_sample default behavior

    The "Plotting consumed metabolites" section of the documentation describes the function plot_exchanges_per_sample with default parameter direction='import' as plotting the metabolites consumed by the microbial community.

    In the source code for plot_exchanges_per_sample, the subset with taxon='medium' is selected. But isn't the "import" for the 'medium' actually the export for the microbial community? This seems at odds with the description of plot_exchanges_per_sample(direction='import') as plotting the consumption patterns of the microbes.

    Perhaps I am misinterpreting the direction of the fluxes for the 'medium' in the output of the grow function?
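    For reference, a minimal usage sketch of the call being discussed; results would be the GrowthResults object returned by micom.workflows.grow, and the import from micom.viz is assumed here.

    from micom.viz import plot_exchanges_per_sample

    # direction="import" plots the medium rows, i.e. what is taken up from the environment
    viz = plot_exchanges_per_sample(results, direction="import")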

    opened by mmp3 6
  • optimize_all(fluxes=True) and optimize_single(fluxes=True) do not return fluxes

    Even with fluxes=True passed into community.optimize_all(), only maximal growth rates are returned. In fact, community.optimize_single() does not accept the fluxes argument.

    Using version '0.22.6'

    opened by michaelsilverstein 6
  • Fix #37 by allowing the user to define the compression...

    Fix #37 by allowing the user to define the compression type and level when building a zipped database. Default behavior stays the same (ZIP_STORED with no compression).

    This commit reworks the behavior of the compress parameter and adds the optional parameter compresslevel to micom.workflows.build_database().
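    A hedged usage sketch of the reworked parameters; the argument values below are illustrative assumptions, not the PR's exact options.

    from micom.workflows import build_database

    db = build_database(
        manifest,                  # model manifest DataFrame
        out_path="models.zip",     # hypothetical output archive
        compress="deflated",       # compression type (the default remains uncompressed ZIP_STORED)
        compresslevel=6,           # compression level introduced by this PR
    )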

    opened by nigiord 5
  • Fraction variation influences "status"

    Hi! I have a question about the "status" flag that comes as an output when "cooperative_tradeoff" is run. In particular, sometimes I get "optimal" as the result, but most of the time I get "numeric". What does that mean?

    Sincerely, Arianna Basile
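    A minimal sketch of inspecting the status across tradeoff values; the fraction keyword and the pickle file name are assumptions used for illustration.

    from micom import load_pickle

    com = load_pickle("community.pickle")       # hypothetical community model file
    for fraction in (1.0, 0.8, 0.5, 0.3):
        sol = com.cooperative_tradeoff(fraction=fraction)
        print(fraction, sol.status)             # "optimal" vs "numeric" as described above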

    opened by arianccbasile 5
  • Optimize community growth to match known relative abundance

    I'm new to MICOM and COBRA, and I'd like to use MICOM to predict the metabolites similar to MAMBO and the Garza et al. 2018 paper's approach which, as I understand it, is to maximize the correlation of microbial growth with the known relative abundances. Is this possible with MICOM?

    I'd guess this would involve changing the community objective function somehow, or are alternate objective functions already supported?

    Thanks!

    opened by krcurtis 5
  • Add basic installation instructions for QP solvers to the docs

    I'm trying to run com.cooperative_tradeoff on a community built from some SBML files but cannot, because I'm using the default GLPK solver and don't know how to use IBM CPLEX instead. I downloaded the academic edition but do not know how to set it as the solver for MICOM. Thanks in advance!
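    A hedged sketch of switching the solver once a QP-capable solver such as CPLEX or Gurobi and its Python bindings are installed; the solver keyword follows the cobrapy convention and is an assumption here, not documented MICOM behavior.

    from micom import Community

    com = Community(taxonomy, solver="cplex")   # taxonomy is a hypothetical taxonomy DataFrame
    # or, on an already built community model:
    com.solver = "cplex"
    sol = com.cooperative_tradeoff()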

    opened by jfoldi81 4
  • Error occurs when joining two models after changing their metabolite IDs

    Hi, I am trying to build a community from two models and I've already changed the metabolite IDs of each model. I renamed the metabolite IDs as KEGG ID plus compartment name, such as C12145cytoplasm ('C12145' is the KEGG ID and 'cytoplasm' is the name of the compartment). First, one thing I'd like to double-check: do I need to change the compartment IDs as well, even though I didn't use them for matching the models?

    Secondly, when I tried to join the two renamed models an error occurred:

    File "/Users/wintermute/opt/anaconda3/lib/python3.7/urllib/parse.py", line 107, in <genexpr>
        return tuple(x.decode(encoding, errors) if x else '' for x in args)
    AttributeError: 'Model' object has no attribute 'decode'
    

    Here is my code:

    sc = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/yeast7.6_changeID_final.xml')
    
    kp = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/KP_changeID_final3.xml')
    
    community = [sc,kp]
    
    micom.util.join_models(community)
    

    Does anyone have any idea about that? Thanks.

    opened by wintermute221 4
  • Why does problems.solve() return None when status is not optimal?

    Just curious, since the cobrapy Solution object contains additional information about what did not work that could be returned. Or does it just not make sense to return a CommunitySolution object when the optimization did not work? I'm most interested in knowing what the solver status is.

    opened by mmundy42 4
  • Adjust active demands when loading models

    Problem description

    Custom models and some models from AGORA have active demand reactions, for instance for biotin. This can make MICOM fail due to infeasible models in low-nutrient settings.

    Code Sample

    See https://github.com/micom-dev/media/issues/2 .

    Suggested fix

    Adjust those demands so that the zero flux solution is included. Raise a warning or info in the logger if that is the case.
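    A minimal sketch of the suggested fix in plain cobrapy terms: relax any demand reaction whose bounds exclude zero flux and note it in the logger (the model file name is a placeholder).

    import logging
    from cobra.io import read_sbml_model

    logger = logging.getLogger("micom")
    model = read_sbml_model("taxon_model.xml")   # placeholder input model

    for dm in model.demands:
        if dm.lower_bound > 0 or dm.upper_bound < 0:
            logger.info("Relaxing active demand %s so zero flux is feasible.", dm.id)
            dm.lower_bound = min(dm.lower_bound, 0.0)
            dm.upper_bound = max(dm.upper_bound, 0.0)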

    opened by cdiener 3
  • build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.

    Following your recommendation from #31, I execute build_database as follows:

    # df has columns:
    #     id  kingdom  phylum  class  order  family  genus  species  file
    m = build_database(manifest=df, out_path="/data/out", threads=8)
    

    which gives warning message (but still completes successfully):

    /usr/local/lib/python3.8/dist-packages/micom/workflows/build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.
      meta.index = meta[rank].str.replace("[^\\w\\_]", "_")
    Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━  79% 0:02:12
    

    System information:

    $ python3  -c "import micom; micom.show_versions()"
    
    System Information
    ==================
    OS                  Linux
    OS-release          5.4.0-1030-aws
    Python              3.8.5
    
    Package Versions
    ================
    cobra        0.21.0
    jinja2       2.10.1
    micom        0.22.5
    pip          20.0.2
    scikit-learn 0.24.1
    scipy         1.6.1
    setuptools   45.2.0
    symengine     0.7.2
    wheel        0.34.2
    
    opened by mmp3 3
  • Error while copying model using model.copy()

    Discussed in https://github.com/micom-dev/micom/discussions/69

    Originally posted by anubhavdas0907, March 8, 2022

    Hello Christian,

    I was trying to copy a community model to a different variable, but I get an error. I want to manipulate a model, without changing the original one.

    Following are the details.

    from micom import load_pickle
    model = load_pickle("ERR1883210.pickle")
    model_1 = model.copy()
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data/Conda_base/envs/MyConda/lib/python3.7/site-packages/cobra/core/model.py", line 321, in copy
        new = self.__class__()
    TypeError: __init__() missing 1 required positional argument: 'taxonomy'
    

    Can you please suggest what's wrong with this code and what the solution might be?

    Regards Anubhav

    opened by cdiener 2
  • Improve the warning for media metabolites with a missing transport reaction

    Is your feature related to a problem? Please describe it.

    The warning regarding media components with missing imports is a bit too fatalistic given that it is not a real problem in most cases. See #63 for instance.

    Describe the solution you would like.

    Maybe change it to an info message and only trigger a warning if a large number of metabolites cannot be consumed or if there are no consumable carbon and nitrogen sources.

    Describe alternatives you considered

    Just changing the text of the warning, but it may still be too verbose.

    opened by cdiener 0
  • Update and fix the docs

    Bundled action items

    This issue is a collection of small things that should be fixed in the docs.

    • [ ] fix links in the docs
    • [ ] provide a section summarizing the available DBs and media
    • [ ] Improve the workflows section and maybe split it
    • [ ] give more info on the tradeoff parameter and how to choose it
    • [ ] update the theoretical/methods intro
    • [ ] document the QIIME2 artifact readers
    • [ ] update the docs for elasticities
    • [x] add solver installation to docs
    opened by cdiener 0
  • Support for GTDB taxonomy?

    Is your feature related to a problem? Please describe it.

    The Genome Taxonomy Database (GTDB) is comprehensive (especially the new v202 release) and more robust than the NCBI microbial taxonomy, especially since the GTDB taxonomy is based entirely on genome phylogenetic relatedness.

    Although the MICOM docs are vague about the taxonomy that one must use, it appears that the NCBI taxonomy is required.

    Describe the solution you would like.

    Provide direct support for the GTDB taxonomy.

    opened by nick-youngblut 3
  • [MICOM 1.0 API] Proposed new format for fluxes

    This is a proposal for a new format for fluxes slated for MICOM 1.0. Feel free to comment 😄

    Current state

    The current format for fluxes returned by MICOM is a table in wide format:

    In [1]: from micom import Community
    
    In [2]: from micom.data import test_taxonomy
    
    In [3]: com = Community(test_taxonomy())
    Building ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
    
    In [7]: sol = com.cooperative_tradeoff(fluxes=True)
    
    In [8]: sol.fluxes
    Out[8]: 
    reaction               ACALD    ACALDt      ACKr    ACONTa    ACONTb     ACt2r          ADK1  ...     SUCDi    SUCOAS      TALA          THD2      TKT1      TKT2       TPI
    compartment                                                                                   ...                                                                          
    Escherichia_coli_1  0.049190 -0.008897 -0.004224  5.999485  5.999485 -0.004224  3.388665e-11  ...  5.017641 -5.017641  1.489184  1.924736e-10  1.489184  1.173698  7.513137
    Escherichia_coli_2 -0.079989 -0.115231  0.072559  6.001066  6.001066  0.072559  4.264225e-11  ...  5.033051 -5.033051  1.491048  1.924125e-10  1.491048  1.175562  7.495742
    Escherichia_coli_3  0.102350  0.197394 -0.100513  6.004985  6.004985 -0.100513  3.662292e-11  ...  5.083935 -5.083935  1.506075  1.926208e-10  1.506075  1.190589  7.460396
    Escherichia_coli_4 -0.071551 -0.073266  0.032177  6.023463  6.023463  0.032177  4.133342e-11  ...  5.122875 -5.122875  1.501628  1.926284e-10  1.501628  1.186143  7.440253
    medium                   NaN       NaN       NaN       NaN       NaN       NaN           NaN  ...       NaN       NaN       NaN           NaN       NaN       NaN       NaN
    
    [5 rows x 115 columns]
    
    

    This has resulted in some issues:

    1. It is incompatible with cobra.Solution.fluxes, which breaks a lot of cobra functionality, for instance the summary methods.
    2. It can be pretty sparse for very divergent models (many NA entries).
    3. It mixes medium and taxa fluxes.
    4. It does not specify whether exchange fluxes denote import or export, which is one of the most common help requests we receive.
    5. Basically all methods using flux results in MICOM convert them to a long format anyway.

    Proposed new API for fluxes

    CommunitySolution.fluxes will retain the cobrapy format and will be superseded by new accessors that all return fluxes in long format:

    CommunitySolution.exchange_fluxes

    Similar to the previous one but with the taxa annotated.

          reaction                     name               taxon          flux direction                       micom_id
    0      EX_ac_m     ac_m medium exchange              medium  1.814984e-11    export                        EX_ac_m
    1   EX_acald_m  acald_m medium exchange              medium  1.328645e-11    export                     EX_acald_m
    2     EX_akg_m    akg_m medium exchange              medium  3.225128e-12    export                       EX_akg_m
    3     EX_co2_m    co2_m medium exchange              medium  2.280983e+01    export                       EX_co2_m
    4    EX_etoh_m   etoh_m medium exchange              medium  1.515389e-11    export                      EX_etoh_m
    ..         ...                      ...                 ...           ...       ...                           
    

    CommunitySolution.internal_fluxes

        reaction                                               name               taxon          flux                    micom_id
    0      ACALD           Acetaldehyde dehydrogenase (acetylating)  Escherichia_coli_1  1.312146e+00   ACALD__Escherichia_coli_1
    1     ACALDt                  Acetaldehyde reversible transport  Escherichia_coli_1  3.236132e+00  ACALDt__Escherichia_coli_1
    2       ACKr                                     Acetate kinase  Escherichia_coli_1 -1.304078e+00    ACKr__Escherichia_coli_1
    3     ACONTa   Aconitase (half-reaction A, Citrate hydro-lyase)  Escherichia_coli_1  5.987675e+00  ACONTa__Escherichia_coli_1
    4     ACONTb  Aconitase (half-reaction B, Isocitrate hydro-l...  Escherichia_coli_1  5.987675e+00  ACONTb__Escherichia_coli_1
    

    This will consolidate GrowthResults and CommunitySolution and give a more readable format. All of those properties are generated on the fly when accessed.

    Additionally, we may also want to save the annotations in the solution, but they may be large, so it might be better to have a property on the model class like Community.annotations.
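    A hedged sketch of how the current wide table could be reshaped into the proposed long format with pandas; the column names follow the tables above, and the "__" split mirrors the micom_id column shown there.

    long = (
        sol.fluxes
        .reset_index()
        .melt(id_vars="compartment", var_name="reaction", value_name="flux")
        .dropna(subset=["flux"])                      # drops the sparse NA entries
        .rename(columns={"compartment": "taxon"})
    )
    # medium reactions keep their plain id, taxon reactions get the taxon suffix
    long["micom_id"] = long["reaction"].where(
        long["taxon"] == "medium", long["reaction"] + "__" + long["taxon"]
    )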

    Additional context

    A similar format change is planned for Community.knockout_taxa. The elasticities already use a long format.

    feature 
    opened by cdiener 0
  • Implement more checks and help for model tables

    The formats of the taxonomy table and the community model manifest can be unclear, and the two are often confused with one another. Provide a validation helper and better error messages.
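    A hedged sketch of the kind of validation helper proposed here; the required column names are taken from the manifest example in the build_database issue above and are assumptions, not a final API.

    import pandas as pd

    REQUIRED_COLS = {"id", "file"}
    RANKS = {"kingdom", "phylum", "class", "order", "family", "genus", "species"}

    def validate_taxonomy(df: pd.DataFrame) -> None:
        """Raise a descriptive error if the table cannot be used to build models."""
        missing = REQUIRED_COLS - set(df.columns)
        if missing:
            raise ValueError(f"Taxonomy table is missing required columns: {sorted(missing)}")
        if not set(df.columns) & RANKS:
            raise ValueError("Taxonomy table has no taxonomic rank column (e.g. genus or species).")
        if df["id"].duplicated().any():
            raise ValueError("Taxonomy table contains duplicated taxon ids.")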

    opened by cdiener 0
Releases: v0.32.3
Owner: Developers of the microbial community modeling package micom.