Python3 command-line tool for the inference of Boolean rules and pathway analysis on omics data

Overview

BONITA-Python3

BONITA was originally written in Python 2 and tested with Python 2-compatible packages. This version of the packages ports BONITA to Python 3. Functionality remains the same. However, we refer users to the original release to reproduce figures from the BONITA paper.

BONITA- Boolean Omics Network Invariant-Time Analysis is a package for the inference of Boolean rules and pathway analysis on omics data. It can be applied to help uncover underlying relationships in biological data. Please see our publication for more information.

Authors: Rohith Palli (https://www.github.com/rpalli), Mukta G. Palshikar and Juilee Thakar

**BONITA ported to Python 3 by Mukta G. Palshikar (https://www.github.com/mgp13) and Jiayue Meng (https://www.github.com/jiayuemeng) **

For a demonstration of the BONITA pipeline, see the tutorial in Tutorials/BONITA_pipeline_tutorial.md. The instructions in the current README file cover all anticipated use cases.

Maintainer: Please contact Juilee Thakar at [email protected]

Citation

We would appreciate the citation of our manuscript describing the original BONITA release, below, for any use of our code.

Palli R, Palshikar MG, Thakar J (2019) Executable pathway analysis using ensemble discrete-state modeling for large-scale data. PLoS Comput Biol 15(9): e1007317. (https://doi.org/10.1371/journal.pcbi.1007317)

Installation

BONITA is designed for use with distributed computing systems. Necessary SLURM commands are included. If users are having trouble translating to PBS or other queueing standards for their computing environment, please contact Juilee Thakar at [email protected]

Create a conda environment to run BONITA

Use a terminal, or an Anaconda Prompt for the following:

  1. Create a conda environment using the provided YML file

conda env create –name BONITA --file platform_BONITA.yaml

  1. Activate the BONITA environment

activate BONITA

  1. Check that the BONITA environment is available and correctly installed:

conda info --envs

Install BONITA

You can download and use BONITA in one of two ways:

  1. Download a zipped folder containing all the files you need (github download link in green box above and to the right)
  2. Clone this git repository in the folder of your choice using the command

git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY

Next, the C code must be compiled using the make file. Simply type make while in the BONITA folder. make

Now you have a fully functional distribution of BONITA! Time to gather your data and get started.

Usage

You will need the following files to run BONITA:

  • omics data as a plaintext table (csv, tsv, or similar) with the first row containing a holder for gene symbol column then sample names and subsequent rows containing gene symbol in first column and column-normalized (rpm or rpkm in transcriptomics) abundance measures in other columns.
  • gmt file with list of KEGG pathways to be considered (can be downloaded from msigdb)
  • matrix of conditions with each line representing a sample and the first column containing the names of the samples and subsequent columns describing 1/0 if the sample is part of that condition or not.
  • list of contrasts you would like to run with each contrast on a single line

There are three main steps in BONITA: prepare pathways for rule inference, rule inference, and pathway analysis. All necessary files for an example run are provided in the Tutorials folder . The preparation step requires internet access to access the KEGG API.

Step 1: Pathway preparation

See the bash script pathwayPreparation.sh for examples

This step requires internet access.

There are three ways to complete this process:

  1. on a gmt of human pathways
  2. on all KEGG pathways for any organism, or
  3. on a list of KEGG pathways for any organism

Only Option 1 was used and tested in our manuscript. Caution should be exercised in interpreting results of other two methods. At a minimum, graphmls with impact scores and relative abundance should be examined before drawing conclusions about pathway differences.

Option 1: On a gmt of human pathways

BONITA needs omics data, gmt file, and an indication of what character is used to separate columns in the file. For example, a traditional comma separated value file (csv) would need BONITA input "-sep ,". Since tab can't be passed in as easily, a -t command will automatically flag tab as the separator. The commands are below:

comma separated: python pathway_analysis_setup.py -gmt Your_gmt_file -sep , Your_omics_data

tab separated: python pathway_analysis_setup.py -t -gmt Your_gmt_file Your_omics_data

Option 2: On all KEGG pathways for any organism

BONITA needs omics data, organism code, and an indication of what character is used to separate columns in the file. For example, a traditional comma separated value file (csv) would need BONITA input "-sep ,". Since tab can't be passed in as easily, a -t command will automatically flag tab as the separator. A three letter organism code from KEGG must be provided (lower case). Example codes include mmu for mouse and hsa for human. The commands are below: comma separated: python pathway_analysis_setup.py -org Your_org_code -sep , Your_omics_data

comma separated, human: python pathway_analysis_setup.py -org hsa -sep , Your_omics_data

comma separated, mouse: python pathway_analysis_setup.py -org mmu -sep , Your_omics_data

tab separated: python pathway_analysis_setup.py -t -org Your_org_code Your_omics_data

Option 3: On a list of KEGG pathways for any organism

BONITA needs omics data, organism code, the list of pathways, and an indication of what character is used to separate columns in the file. For example, a traditional comma separated value file (csv) would need BONITA input "-sep ,". Since tab can't be passed in as easily, a -t command will automatically flag tab as the separator. A three letter organism code from KEGG must be provided (lower case). Example codes include mmu for mouse and hsa for human. The list of pathways must include the 5 digit pathway identifier, must be seperated by commas, and must not include any other numbers. An example paths.txt is included in the inputData folder. The commands are below: comma separated: python pathway_analysis_setup.py -org Your_org_code -sep , -paths Your_pathway_list Your_omics_data

comma separated, human: python pathway_analysis_setup.py -org hsa -sep , -paths Your_pathway_list Your_omics_data

comma separated, mouse: python pathway_analysis_setup.py -org mmu -sep , -paths Your_pathway_list Your_omics_data

tab separated: python pathway_analysis_setup.py -t -org Your_org_code -paths Your_pathway_list Your_omics_data

Step 2: Rule inference

Simply run the script find_rules_pathway_analysis.sh which will automatically submit appropriate jobs to SLURM queue:

bash find_rules_pathway_analysis.sh

Step 3: Pathway Analysis

To accomplish this, the proper inputs must be provided to pathway_analysis_score_pathways.py. The cleaup.sh script will automatically put output of rule inference step into correct folders.

bash cleanup.sh

Then run the pathway analysis script:

python pathway_analysis_score_pathways.py Your_omics_data Your_condition_matrix Your_desired_contrasts -sep Separator_used_in_gmt_and_omics_data

If your files are tab separated, then the following command can be used: python pathway_analysis_score_pathways.py -t Your_omics_data Your_condition_matrix Your_desired_contrasts

Owner
Thakar lab uses AI and systems biology approaches to identify immune signatures that can predict outcome of an immune response to infections or vaccinations.
A powerful, colorful, beautiful command-line-interface for pypi.org

pypi-command-line pypi-command-line is a colorful, powerful, and beautiful command line interface for pypi.org that is actively maintained Detailed Do

Wasi Master 32 Jun 23, 2022
CLTools provides various tools and command to use in the terminal.

CLTools provides various tools and command to use in the terminal. As of date, CLTools is only able to generate temporary email addresses and receive emails. There are plans to integrate more tools a

Ashwin Chugh 2 Feb 14, 2022
Easily handle day to day CLI operation via Python instead of regular Bash programs.

pz Ever wished to use Python in Bash? Would you choose the Python syntax over sed, awk, ...? Should you exactly know what command would you use in Pyt

CZ.NIC 697 Jan 03, 2023
A command line tool to remove background from video and image

A command line tool to remove background from video and image, brought to you by BackgroundRemover.app which is an app made by nadermx powered by this tool

Johnathan Nader 1.7k Jan 01, 2023
spid-sp-test is a SAML2 SPID/CIE Service Provider validation tool that can be executed from the command line.

spid-sp-test spid-sp-test is a SAML2 SPID/CIE Service Provider validation tool that can be executed from the command line. This tool was born by separ

Developers Italia 30 Nov 08, 2022
A simple web-based SSH client.

Kommander A simple web-based SSH client. It supports: entering SSH login details (including private key and custom ports) and connecting user authenti

KingWaffleIII 2 Jan 01, 2022
Quo is a Python based toolkit for writing Command-Line Interface(CLI) applications.

Quo is a Python based toolkit for writing Command-Line Interface(CLI) applications. Quo is making headway towards composing speedy and orderly CLI applications while forestalling any disappointments

Secretum Inc. 16 Oct 15, 2022
A Python module and command-line utility for converting .ANS format ANSI art to HTML

ansipants A Python module and command-line utility for converting .ANS format ANSI art to HTML. Installation pip install ansipants Command-line usage

4 Oct 16, 2022
Python command line tool and python engine to label table fields and fields in data files.

Python command line tool and python engine to label table fields and fields in data files. It could help to find meaningful data in your tables and data files or to find Personal identifable informat

APICrafter 22 Dec 05, 2022
Oil is a new Unix shell. It's our upgrade path from bash to a better language and runtime

Oil is a new Unix shell. It's our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!

2.4k Jan 08, 2023
Pymongo based CLI client, to run operation on existing databases and collections

Mongodb-Operations-Console Pymongo based CLI client, to run operation on existing databases and collections Program developed by Gustavo Wydler Azuaga

Gus 1 Dec 01, 2021
Sink is a CLI tool that allows users to synchronize their local folders to their Google Drives. It is similar to the Git CLI and allows fast and reliable syncs with the drive.

Sink is a CLI synchronisation tool that enables a user to synchronise local system files and folders with their Google Drives. It follows a git C

Yash Thakre 16 May 29, 2022
Python and data science snippets on the command line

Python Snippet Tool A tool to get Python and data science snippets at Data Science Simplified on the command line. You can read my article to learn ho

Khuyen Tran 19 Dec 21, 2022
Python CLI for accessing CSCI320 PDM Database

p320_14 Python CLI for accessing CSCI320 PDM Database Authors: Aidan Mellin Dan Skigen Jacob Auger Kyle Baptiste Before running the application for th

Aidan Mellin 1 Nov 23, 2021
Stream comments, submissions from subreddits and users across reddit right in your terminal

reddit_from_terminal stream comments, submissions from subreddits and users across reddit right in your terminal Alert! : Can't watch media contents(p

Pritam Dhara 2 Dec 30, 2021
A CLI tool for creating disposable environments.

dispenv - Disposable Python Environments ⚠️ WIP Need to make an environment to work on a GitHub issue? Want to try out a new package and not leave the

Peter Baumgartner 3 Mar 14, 2022
tiptop is a command-line system monitoring tool in the spirit of top.

Command-line system monitoring. tiptop is a command-line system monitoring tool in the spirit of top. It displays various interesting system stats, gr

Nico Schlömer 1.3k Jan 08, 2023
A lightweight terminal-based password manager coded with Python using SQLCipher for SQLite database encryption.

password-manager A lightweight terminal-based password manager coded with Python using SQLCipher for SQLite database encryption. Screenshot Pre-requis

Leonardo de Araujo 15 Oct 15, 2022
Get COVID-19 vaccination schedules from booking.moh.gov.ge in the CLI

vaccination.py Get COVID-19 vaccination schedules from booking.moh.gov.ge in the CLI. Installation $ pip install vaccination Usage Make sure the Pytho

Temuri Takalandze 11 Dec 08, 2021
WazirX Portfolio Tracker on your Terminal!

If you have been investing in crypto in India, there is a very good chance that you are using WazirX. If you are using WazirX, then you definitely know that there is no P&L report, no green arrows no

Raunit 15 Jan 10, 2022