Cash in on Expressed Barcode Tags (EBTs) from NGS Sequencing Data with Python

Overview

Cash in on Expressed Barcode Tags (EBTs) from NGS Sequencing Data with Python

Cashier is a tool developed by Russell Durrett for the analysis and extraction of expressed barcode tags.

This python implementation offers the same flexibility and simple command line operation.

Like it's predecessor it is a wrapper for the tools cutadapt, fastx-toolkit, and starcode.

Dependencies

  • cutadapt (sequence extraction)
  • starcode (sequence clustering)
  • fastx-toolkit (PHred score filtering)
  • pear (paired end read merging)
  • pysam (sam file convertion to fastq)

Recommended Installation Procedure

It's recommended to use conda to install and manage the dependencies for this package

conda env create -f https://raw.githubusercontent.com/brocklab/pycashier/main/environment.yml # or mamba env create -f ....
conda activate cashierenv
pycashier --help

Additionally you may install with pip. Though it will be up to you to ensure all the non-python dependencies are on the path and installed correctly.

pip install pycashier

Usage

Pycashier has one required argument which is the directory containing the fastq or sam files you wish to process.

conda activate cashierenv
pycashier ./fastqs

For additional parameters see pycashier -h.

As the files are processed two additional directories will be created pipeline and outs.

Currently all intermediary files generated as a result of the program will be found in pipeline.

While the final processed files will be found within the outs directory.

Merging Files

Pycashier can now take paired end reads and perform a merging of the reads to produce a fastq which can then be used with cashier's default feature.

pycashier ./fastqs -m

Processing Barcodes from 10X bam files

Pycashier can also extract gRNA barcodes along with 10X cell and umi barcodes.

Firstly we are only interested in the unmapped reads. From the cellranger bam output you would obtain these reads using samtools.

samtools view -f 4 possorted_genome_bam.bam > unmapped.sam

Then similar to normal barcode extraction you can pass a directory of these unmapped sam files to pycashier and extract barcodes. You can also still specify extraction parameters that will be passed to cutadapt as usual.

Note: The default parameters passed to cutadapt are unlinked adapters and minimum barcode length of 10 bp.

pycashier ./unmapped_sams -sc

When finished the outs directory will have a .tsv containing the following columns: Illumina Read Info, UMI Barcode, Cell Barcode, gRNA Barcode

Usage notes

Pycashier will NOT overwrite intermediary files. If there is an issue in the process, please delete either the pipeline directory or the requisite intermediary files for the sample you wish to reprocess. This will allow the user to place new fastqs within the source directory or a project folder without reprocessing all samples each time.

  • Currently, pycashier expects to find .fastq.gz files when merging and .fastq files when extracting barcodes. This behavior may change in the future.
  • If there are reads from multiple lanes they should first be concatenated with cat sample*R1*.fastq.gz > sample.R1.fastq.gz
  • Naming conventions:
    • Sample names are extracted from files using the first string delimited with a period. Please take this into account when naming sam or fastq files.
    • Each processing step will append information to the input file name to indicate changes, again delimited with periods.
Machine Learning powered app to decide whether a photo is food or not.

Food Not Food dot app ( 🍔 🚫 🍔 ) Code for building a machine Learning powered app to decide whether a photo is of food or not. See it working live a

Daniel Bourke 48 Dec 28, 2022
SmartGrid - Een poging tot een optimale SmartGrid oplossing, door Dirk Kuiper & Lars Zwaan

SmartGrid - Een poging tot een optimale SmartGrid oplossing, door Dirk Kuiper & Lars Zwaan

1 Jan 12, 2022
Install packages with pip as if you were in the past!

A PyPI time machine Do you wish you could just install packages with pip as if you were at some fixed date in the past? If so, the PyPI time machine i

Thomas Robitaille 51 Jan 09, 2023
This code can help you with auto update for-TV-advertisements in the store.

Auto-update-files-for-TV-advertisements-in-the-store This code can help you with auto update for-TV-advertisements in the store. It was write for Rasp

Max 2 Feb 20, 2022
PressurePlate is a multi-agent environment that requires agents to cooperate during the traversal of a gridworld.

PressurePlate is a multi-agent environment that requires agents to cooperate during the traversal of a gridworld. The grid is partitioned into several rooms, and each room contains a plate and a clos

Autonomous Agents Research Group (University of Edinburgh) 6 Dec 03, 2022
🏆 A ranked list of awesome Python open-source libraries and tools. Updated weekly.

Best-of Python 🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly. This curated list contains 230 awesome open-source pr

Machine Learning Tooling 2.7k Jan 03, 2023
Created a Python Keylogger script.

Python Script Simple Keylogger Script WHAT IS IT? Created a Python Keylogger script. HOW IT WORKS Once the script has been executed, it will automatic

AC 0 Dec 12, 2021
Script em python, utilizando PySimpleGUI, para a geração de arquivo txt a ser importado no sistema de Bilhetagem Eletrônica da RioCard, no Estado do Rio de Janeiro.

pedido-vt-riocard Script em python, utilizando PySimpleGUI, para a geração de arquivo txt a ser importado no sistema de Bilhetagem Eletrônica da RioCa

Carlos Bruno Gomes 1 Dec 01, 2021
디텍션 유틸 모음

Object detection utils 유틸모음 설명 링크 convert convert 관련코드 https://github.com/AI-infinyx/ob_utils/tree/main/convert crawl 구글, 네이버, 빙 등 크롤링 관련 https://gith

codetest 41 Jan 22, 2021
This repo will have a small amount of Chrome tools that can be used for DFIR, Hacking, Deception, whatever your heart desires.

Chrome-Tools Overview Welcome to the repo. This repo will have a small amount of Chrome tools that can be used for DFIR, Hacking, Deception, whatever

5 Jun 08, 2022
Site de gestion de cave à vin utilisant une BDD manipulée avec SQLite3 via Python

cave-vin Site de gestion de cave à vin utilisant une bdd manipulée avec MySQL ACCEDER AU SITE : Pour accéder à votre cave vous aurez besoin de lancer

Elouann Lucas 0 Jul 05, 2022
Birthday program - A program that lookups a birthday txt file and compares to the current date to check for birthdays

Birthday Program This is a program that lookups a birthday txt file and compares

Daquiver 4 Feb 02, 2022
Class XII computer science project.

Computer Science Project — Class XII Kshitij Srivastava (XI – A) Introduction The aim of this project is to create a fully operational system for a me

Kshitij Srivastava 2 Jul 21, 2022
🤖🧭Creates google-like navigation menu using python-telegram-bot wrapper

python telegram bot menu pagination Makes a google style pagination line for a list of items. In other words it builds a menu for navigation if you ha

Sergey Smirnov 9 Nov 27, 2022
Notifies server owners of mod updates, also notifies of player deaths and player joins through Discord.

ProjectZomboid-ServerAssistant Notifies server owners of mod updates, also notifies of player deaths and player joins through Discord. A Python based

3 Sep 30, 2022
A Bot Which Can generate Random Account Based On Your Hits.

AccountGenBot This Bot Can Generate Account With Hits You Save (Randomly) Keyfeatures Join To Use Support Limit Account Generation Using Sql Customiza

DevsExpo 30 Oct 21, 2022
Stock Monitoring

Stock Monitoring Description It is a stock monitoring script. This repository is still under developing. Getting Started Prerequisites & Installing pi

Sission 1 Feb 03, 2022
Blender addon, import and update mixamo animation

This is a blender addon for import and update mixamo animations.

ywaby 7 Apr 19, 2022
LanguageCreator - Simple library for easy creation transpilator.

LanguageCreator - Simple library for easy creation transpilator. Create transpilators in one hour! Install. Download code, rename folder to "LanguageC

Ivan Perzhinsky. 2 Dec 31, 2021
A one place destination to check whatever is trending on the top social and news websites at present.

UpTrend A one place destination to check whatever is trending on the top social and news websites at present. Explore the docs » View Demo · Report Bu

Google Developer Student Clubs - JGEC 10 Oct 03, 2021