MICOM is a Python package for metabolic modeling of microbial communities

Overview

https://github.com/micom-dev/micom/raw/master/docs/source/micom.png

actions status coverage pypi status

Welcome

MICOM is a Python package for metabolic modeling of microbial communities currently developed in the Gibbons Lab at the Institute for Systems Biology and the Human Systems Biology Group of Prof. Osbaldo Resendis Antonio at the National Institute of Genomic Medicine Mexico.

MICOM allows you to construct a community model from a list on input COBRA models and manages exchange fluxes between individuals and individuals with the environment. It explicitly accounts for different abundances of individuals in the community and can thus incorporate data from biomass quantification, cytometry, ampliconsequencing, or metagenomic shotgun sequencing.

It identifies a relevant flux space by incorporating an ecological model for the trade-off between individual taxa growth and community-wide growth that shows good agreement with experimental data.

Attribution

MICOM is published in

MICOM: Metagenome-Scale Modeling To Infer Metabolic Interactions in the Gut Microbiota
Christian Diener, Sean M. Gibbons, Osbaldo Resendis-Antonio
mSystems 5:e00606-19
https://doi.org/10.1128/mSystems.00606-19

Please cite this publication when referencing MICOM. Thanks 😄

Installation

MICOM is available on PyPi and can be installed via

pip install micom

Getting started

Documentation can be found at https://micom-dev.github.io/micom .

Getting help

General questions on usage can be asked in Github Discussions
https://github.com/micom-dev/micom/discussions
We are also available on the cobrapy Gitter channel
https://gitter.im/opencobra/cobrapy
Questions specific to the MICOM Qiime2 plugin (q2-micom) can also be asked on the Qiime2 forum
https://forum.qiime2.org/c/community-plugin-support/
Comments
  • exchanges not identified correctly in CarveME models

    exchanges not identified correctly in CarveME models

    Goodmorning. I have recently used your program to analyze metabolic exchanges in a community. I have only a question: in the output, I have for each metabolite two columns one called, for example, EX_but_e and the other EX_but_m. The EX_but_m appears only in one row which is the medium one. I supposed that summing the EX_but_e for all the microbes in my community I should have got the value in the medium row at EX_but_m column. But it is not the case. Why?

    Sincerely, Arianna Basile

    opened by arianccbasile 14
  • interpretation of plot_exchanges_per_sample default behavior

    interpretation of plot_exchanges_per_sample default behavior

    The "Plotting consumed metabolites" section of the documentation describes the function plot_exchanges_per_sample with default parameter direction='import' as plotting the metabolites consumed by the microbial community.

    In the source code for plot_exchanges_per_sample, the subset with taxon='medium' is selected. But isn't the "import" for the 'medium' actually the export for the microbial community? This seems at odds with the description of plot_exchanges_per_sample(direction='import') as plotting the consumption patterns of the microbes.

    Perhaps I am misinterpreting the direction of the fluxes for the 'medium' in the output of the grow function?

    opened by mmp3 6
  • optimize_all(fluxes=True) and optimize_single(fluxes=True) do not return fluxes

    optimize_all(fluxes=True) and optimize_single(fluxes=True) do not return fluxes

    Even with fluxes=True passed into community.optimize_all(), only maximal growth rates are returned. In fact, community.optimize_single() does not accept the fluxes argument.

    Using version '0.22.6'

    opened by michaelsilverstein 6
  • Fix #37 by allowing the user to define the compression...

    Fix #37 by allowing the user to define the compression...

    …type and level when building a zipped database. Default behavior stays the same (ZIP_STORED with no compression).

    This commit reworks the behavior of the compress parameter and adds the optional parameter compresslevel to micom.workflows.build_database().

    opened by nigiord 5
  • Fraction variation influences

    Fraction variation influences "status"

    Hi! I have a question about the flag "status" which comes as an output when "cooperative_tradeoff" is run. In particular, sometimes I get as a result "optimal" but most of times I get "numeric". What does it mean?

    Sincerely, Arianna Basile

    opened by arianccbasile 5
  • Optimize community growth to match known relative abundance

    Optimize community growth to match known relative abundance

    I'm new to MICOM and COBRA, and I'd like to use MICOM to predict the metabolites similar to MAMBO and the Garza et al. 2018 paper's approach which, as I understand it, is to maximize the correlation of microbial growth with the known relative abundances. Is this possible with MICOM?

    I'd guess this would involve changing the community objective function somehow, or are alternate objective functions already supported?

    Thanks!

    opened by krcurtis 5
  • Add basic installation instructions for QP solvers to the docs

    Add basic installation instructions for QP solvers to the docs

    I'm trying to run com.cooperative_tradeoff on a community using some sbml files but cannot because I'm using the default GLPK and don't know how to use IBM Cplex instead. I downloaded it as an academic edition but do not know how to set it as the solver for micom. Thanks in advance!

    opened by jfoldi81 4
  • Error occurs when joining two models after changing their metabolite IDs

    Error occurs when joining two models after changing their metabolite IDs

    Hi, I am trying to build a community for two models and I've already changed the metabolite IDs for each model. I renamed the metabolite IDs as + such as C12145cytoplasm ('C12145' is KEGG ID and 'cytoplasm' is the name of the compartment). Firstly one thing I'd like to double-check is that do I need to change the compartment ID as well even though I didn't use the compartment ID for matching models?

    Secondly, when I tried to join the two renamed models an error occurred:

    File "/Users/wintermute/opt/anaconda3/lib/python3.7/urllib/parse.py", line 107, in <genexpr>
        return tuple(x.decode(encoding, errors) if x else '' for x in args)
    AttributeError: 'Model' object has no attribute 'decode'
    

    Here is my code:

    sc = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/yeast7.6_changeID_final.xml')
    
    kp = micom.util.load_model('/Users/wintermute/OneDrive - University of Cambridge/cambridge/during_cam/Department/work/code/coculture/KP_changeID_final3.xml')
    
    community = [sc,kp]
    
    micom.util.join_models(community)
    

    Does anyone have any idea about that? Thanks.

    opened by wintermute221 4
  • Why does problems.solve() return None when status is not optimal?

    Why does problems.solve() return None when status is not optimal?

    Just curious since there is additional information about what did not work in the cobrapy Solution object that could be returned. Or does it just not make sense to return a CommunitySolution object when the optimization did not work? I'm most interested in knowing what the solver status is.

    opened by mmundy42 4
  • Adjust active demands when loading models

    Adjust active demands when loading models

    Problem description

    Custom models and some models from AGORA have active demand reactions for instance for biotin. This can make MICOM fail due to infeasible models in low nutrient settings.

    Code Sample

    See https://github.com/micom-dev/media/issues/2 .

    Suggested fix

    Adjust those demands so that the zero flux solution is included. Raise a warning or info in the logger if that is the case.

    opened by cdiener 3
  • build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.

    build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.

    Following your recommendation from #31 , I execute build_database as follows:

    # df has columns: 
    #     id   kingdom   phylum  class  order  family   genus     species    file
    > m = build_database( manifest = df , out_path = "/data/out" , threads = 8 )
    

    which gives warning message (but still completes successfully):

    /usr/local/lib/python3.8/dist-packages/micom/workflows/build.py:177: FutureWarning: The default value of regex will change from True to False in a future version.
      meta.index = meta[rank].str.replace("[^\\w\\_]", "_")
    Running ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━  79% 0:02:12
    

    System information:

    $ python3  -c "import micom; micom.show_versions()"
    
    System Information
    ==================
    OS                  Linux
    OS-release 5.4.0-1030-aws
    Python              3.8.5
    
    Package Versions
    ================
    cobra        0.21.0
    jinja2       2.10.1
    micom        0.22.5
    pip          20.0.2
    scikit-learn 0.24.1
    scipy         1.6.1
    setuptools   45.2.0
    symengine     0.7.2
    wheel        0.34.2
    
    opened by mmp3 3
  • Error while copying model using model.copy()

    Error while copying model using model.copy()

    Discussed in https://github.com/micom-dev/micom/discussions/69

    Originally posted by anubhavdas0907 March 8, 2022 Hello Christian,

    I was trying to copy a community model to a different variable, but I get an error. I want to manipulate a model, without changing the original one.

    Following are the details.

    from micom import load_pickle
    model = load_pickle("ERR1883210.pickle")
    model_1 = model.copy()
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/data/Conda_base/envs/MyConda/lib/python3.7/site-packages/cobra/core/model.py", line 321, in copy
        new = self.__class__()
    TypeError: __init__() missing 1 required positional argument: 'taxonomy'
    

    Can you please suggest, what's wrong in this code, and what can be the solution?

    Regards Anubhav

    opened by cdiener 2
  • Improve the warning for media metabolites with a missing transport reaction

    Improve the warning for media metabolites with a missing transport reaction

    Checklist

    Is your feature related to a problem? Please describe it.

    The warning regarding media components with missing imports is a bit too fatalistic given that it is not a real problem in most cases. See #63 for instance.

    Describe the solution you would like.

    Maybe change it to be a an info and only trigger a warning if a large number of metabolites can not be consumed or if there are no consumable carbon and nitrogen sources.

    Describe alternatives you considered

    Just changing the text of the warning, but it may still be too verbose.

    Additional context

    opened by cdiener 0
  • Update and fix the docs

    Update and fix the docs

    Checklist

    Bundled action items

    This is issue is a collection of small things that should be fixed in the docs.

    • [ ] fix links in the docs
    • [ ] provide a section summarizing the available DBs and media
    • [ ] Improve the workflows section and maybe split it
    • [ ] give more info and the tradeoff parameter and how to choose it
    • [ ] update the theoretical/methods intro
    • [ ] document the QIIME2 artifact readers
    • [ ] update the docs for elasticities
    • [x] add solver installation to docs
    opened by cdiener 0
  • Support for GTDB taxonomy?

    Support for GTDB taxonomy?

    Checklist

    Is your feature related to a problem? Please describe it.

    The Genome Taxonomy Database (GTDB) is comprehensive (especially the new v202 release) and more robust than the NCBI microbial taxonomy, especially given that the GTDB taxonomy is completely based off of genome phylogenic relatedness.

    Although the MICOM docs are vague about the taxonomy that one must use, it appears that the NCBI taxonomy is required.

    Describe the solution you would like.

    Provide direct support for the GTDB taxonomy.

    opened by nick-youngblut 3
  • [MICOM 1.0 API] Proposed new format for fluxes

    [MICOM 1.0 API] Proposed new format for fluxes

    This is a proposal for a new format for fluxes slated for MICOM 1.0. Feel free to comment :smile:

    Checklist

    Current state

    The current format for fluxes returned by MICOM is a table in wide format:

    In [1]: from micom import Community
    
    In [2]: from micom.data import test_taxonomy
    
    In [3]: com = Community(test_taxonomy())
    Building ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
    
    In [7]: sol = com.cooperative_tradeoff(fluxes=True)
    
    In [8]: sol.fluxes
    Out[8]: 
    reaction               ACALD    ACALDt      ACKr    ACONTa    ACONTb     ACt2r          ADK1  ...     SUCDi    SUCOAS      TALA          THD2      TKT1      TKT2       TPI
    compartment                                                                                   ...                                                                          
    Escherichia_coli_1  0.049190 -0.008897 -0.004224  5.999485  5.999485 -0.004224  3.388665e-11  ...  5.017641 -5.017641  1.489184  1.924736e-10  1.489184  1.173698  7.513137
    Escherichia_coli_2 -0.079989 -0.115231  0.072559  6.001066  6.001066  0.072559  4.264225e-11  ...  5.033051 -5.033051  1.491048  1.924125e-10  1.491048  1.175562  7.495742
    Escherichia_coli_3  0.102350  0.197394 -0.100513  6.004985  6.004985 -0.100513  3.662292e-11  ...  5.083935 -5.083935  1.506075  1.926208e-10  1.506075  1.190589  7.460396
    Escherichia_coli_4 -0.071551 -0.073266  0.032177  6.023463  6.023463  0.032177  4.133342e-11  ...  5.122875 -5.122875  1.501628  1.926284e-10  1.501628  1.186143  7.440253
    medium                   NaN       NaN       NaN       NaN       NaN       NaN           NaN  ...       NaN       NaN       NaN           NaN       NaN       NaN       NaN
    
    [5 rows x 115 columns]
    
    

    This has resulted in some issues:

    1. It is incompatible with cobra.Solution.fluxes which breaks a lot of the cobra functionality like for instance summary methods.
    2. It can be pretty sparse for very divergent models (many NA entries)
    3. It mixes medium and taxa fluxes
    4. It does not specify if export fluxes denote import or export which is one of the most common help requests we receive
    5. Basically all methods using flux results in MICOM will convert them to a long format

    Proposed new API for fluxes

    CommunitySolution.fluxes will retain the cobrapy format and will superseded by new accessors that all return fluxes in long format:

    CommunitySolution.exchange_fluxes

    Similar to the previous one but with the taxa annotated.

          reaction                     name               taxon          flux direction                       micom_id
    0      EX_ac_m     ac_m medium exchange              medium  1.814984e-11    export                        EX_ac_m
    1   EX_acald_m  acald_m medium exchange              medium  1.328645e-11    export                     EX_acald_m
    2     EX_akg_m    akg_m medium exchange              medium  3.225128e-12    export                       EX_akg_m
    3     EX_co2_m    co2_m medium exchange              medium  2.280983e+01    export                       EX_co2_m
    4    EX_etoh_m   etoh_m medium exchange              medium  1.515389e-11    export                      EX_etoh_m
    ..         ...                      ...                 ...           ...       ...                           
    

    CommunitySolution.internal_fluxes

        reaction                                               name               taxon          flux                    micom_id
    0      ACALD           Acetaldehyde dehydrogenase (acetylating)  Escherichia_coli_1  1.312146e+00   ACALD__Escherichia_coli_1
    1     ACALDt                  Acetaldehyde reversible transport  Escherichia_coli_1  3.236132e+00  ACALDt__Escherichia_coli_1
    2       ACKr                                     Acetate kinase  Escherichia_coli_1 -1.304078e+00    ACKr__Escherichia_coli_1
    3     ACONTa   Aconitase (half-reaction A, Citrate hydro-lyase)  Escherichia_coli_1  5.987675e+00  ACONTa__Escherichia_coli_1
    4     ACONTb  Aconitase (half-reaction B, Isocitrate hydro-l...  Escherichia_coli_1  5.987675e+00  ACONTb__Escherichia_coli_1
    

    This will consolidate GrowthResults and CommunitySolution and gives a more readable format. All those properties are generated on the fly when accessing the property.

    Additionaly, we may also want to save the annotations in the solution but they may be large, so it might be better to have a property on the model class like Community.annotations.

    Additional context

    A similar format change is planned for Community.knockout_taxa. elasticities already uses a long format.

    feature 
    opened by cdiener 0
  • Implement more checks and help for model tables

    Implement more checks and help for model tables

    The format for the taxonomy table community model manifests can be unclear and both are often confused for one another. Provide a validation helper and better error messages.

    opened by cdiener 0
Releases(v0.32.3)
Owner
Developers of the microbial community modeling package micom.
Predicting Baseball Metric Clusters: Clustering Application in Python Using scikit-learn

Clustering Clustering Application in Python Using scikit-learn This repository contains the prediction of baseball metric clusters using MLB Statcast

Tom Weichle 2 Apr 18, 2022
A webpage that utilizes machine learning to extract sentiments from tweets.

Tweets_Classification_Webpage The goal of this project is to be able to predict what rating customers on social media platforms would give to products

Ayaz Nakhuda 1 Dec 30, 2021
Scikit-Garden or skgarden is a garden for Scikit-Learn compatible decision trees and forests.

Scikit-Garden or skgarden (pronounced as skarden) is a garden for Scikit-Learn compatible decision trees and forests.

260 Dec 21, 2022
Primitives for machine learning and data science.

An Open Source Project from the Data to AI Lab, at MIT MLPrimitives Pipelines and primitives for machine learning and data science. Documentation: htt

MLBazaar 65 Dec 29, 2022
PLUR is a collection of source code datasets suitable for graph-based machine learning.

PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. We provide scripts for downloading, processing, and loading the

Google Research 76 Nov 25, 2022
Extended Isolation Forest for Anomaly Detection

Table of contents Extended Isolation Forest Summary Motivation Isolation Forest Extension The Code Installation Requirements Use Citation Releases Ext

Sahand Hariri 377 Dec 18, 2022
A Streamlit demo to interactively visualize Uber pickups in New York City

Streamlit Demo: Uber Pickups in New York City A Streamlit demo written in pure Python to interactively visualize Uber pickups in New York City. View t

Streamlit 230 Dec 28, 2022
30 Days Of Machine Learning Using Pytorch

Objective of the repository is to learn and build machine learning models using Pytorch. 30DaysofML Using Pytorch

Mayur 119 Nov 24, 2022
ParaMonte is a serial/parallel library of Monte Carlo routines for sampling mathematical objective functions of arbitrary-dimensions

ParaMonte is a serial/parallel library of Monte Carlo routines for sampling mathematical objective functions of arbitrary-dimensions, in particular, the posterior distributions of Bayesian models in

Computational Data Science Lab 182 Dec 31, 2022
A machine learning model for Covid case prediction

CovidcasePrediction A machine learning model for Covid case prediction Problem Statement Using regression algorithms we can able to track the active c

VijayAadhithya2019rit 1 Feb 02, 2022
Credit Card Fraud Detection, used the credit card fraud dataset from Kaggle

Credit Card Fraud Detection, used the credit card fraud dataset from Kaggle

Sean Zahller 1 Feb 04, 2022
Dragonfly is an open source python library for scalable Bayesian optimisation.

Dragonfly is an open source python library for scalable Bayesian optimisation. Bayesian optimisation is used for optimising black-box functions whose

744 Jan 02, 2023
Metric learning algorithms in Python

metric-learn: Metric Learning in Python metric-learn contains efficient Python implementations of several popular supervised and weakly-supervised met

1.3k Dec 28, 2022
MIT-Machine Learning with Python–From Linear Models to Deep Learning

MIT-Machine Learning with Python–From Linear Models to Deep Learning | One of the 5 courses in MIT MicroMasters in Statistics & Data Science Welcome t

2 Aug 23, 2022
Bottleneck a collection of fast, NaN-aware NumPy array functions written in C.

Bottleneck Bottleneck is a collection of fast, NaN-aware NumPy array functions written in C. As one example, to check if a np.array has any NaNs using

Python for Data 835 Dec 27, 2022
Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máquina.

Estatistica para Ciência de Dados e Machine Learning Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máqui

Renan Barbosa 1 Jan 10, 2022
A single Python file with some tools for visualizing machine learning in the terminal.

Machine Learning Visualization Tools A single Python file with some tools for visualizing machine learning in the terminal. This demo is composed of t

Bram Wasti 35 Dec 29, 2022
AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications.

AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just a few lines of code, you can train and deploy high-accuracy m

Robin 55 Dec 27, 2022
Repositório para o #alurachallengedatascience1

1° Challenge de Dados - Alura A Alura Voz é uma empresa de telecomunicação que nos contratou para atuar como cientistas de dados na equipe de vendas.

Sthe Monica 16 Nov 10, 2022