A variant caller for the GBA gene using WGS data

Related tags

MiscellaneousGauchian
Overview

Gauchian: WGS-based GBA variant caller

Gauchian is a targeted variant caller for the GBA gene based on a whole-genome sequencing (WGS) BAM file. Gauchian uses a novel method to solve the problems caused by the high sequence similarity with the pseudogene paralog GBAP1 and is able to detect variants accurately in the Exons 9-11 homology region, such as large deletions or duplications between GBA and GBAP1, and GBAP1-like variants in GBA, including p.A495P, p.L483P, p.D448H, c.1263del, RecNciI, RecTL and c.1263del+RecTL. In addition to these challenging variants, Gauchian also calls known pathogenic or likely pathogenic GBA variants classified in ClinVar. Please refer to our preprint for more details about the method.

Running the program

This Python3 program can be run as follows:

python -m gauchian --manifest MANIFEST_FILE \
                   --genome [19/37/38] \
                   --prefix OUTPUT_FILE_PREFIX \
                   --outDir OUTPUT_DIRECTORY \
                   --threads NUMBER_THREADS

The manifest is a text file in which each line should list the absolute path to an input BAM/CRAM file. For CRAM input, it’s suggested to provide the path to the reference fasta file with --reference in the command.

Interpreting the output

The program produces a .tsv file in the directory specified by --outDir. The fields are explained below:

Fields in tsv Explanation
Sample Sample name
is_biallelic_GBAP1-like_variant_exon9-11 Whether the sample is called as biallelic for GBAP1-like variants in exon9-11
is_carrier_GBAP1-like_variant_exon9-11 Whether the sample is called as a carrier for GBAP1-like variants in exon9-11
total_CN Total copy number of GBA+GBAP1
deletion_breakpoint_in_GBA_gene Whether the deletion breakpoint is in GBA gene if a deletion exists
GBAP1-like_variant_exon9-11 GBAP1-like variants called in exon9-11, two alleles separated by /
other_variants Other variants called (non-GBAP1-like variants or variants outside of exon9-11)

A .json file is also produced that contains more information about each sample.

Fields in json Explanation
Coverage_MAD Median absolute deviation of depth, measure of sample quality
Median_depth Sample median depth
deletion_CN CN of the unique region between GBA and GBAP1. This value plus 2 is the total CN
deletion_CN_raw Raw normalized depth of the unique region between GBA and GBAP1
variant_raw_count Supporting reads for each variant
snp_call GBA copy number call at GBA/GBAP1 differentiating sites
snp_raw Raw GBA copy number at GBA/GBAP1 differentiating sites
haplotypes Summary of haplotypes assembled across GBA/GBAP1 differentiating sites in Exon9-11
You might also like...
Data Structures and Algorithms Python - Practice data structures and algorithms in python with few small projects

Data Structures and Algorithms All the essential resources and template code nee

Adansons Base is a data management tool that organizes metadata of unstructured data and creates and organizes datasets.

Adansons Base is a data management tool that organizes metadata of unstructured data and creates and organizes datasets. It makes dataset creation more effective and helps find essential insights from training results and improves AI performance.

Open-source data observability for modern data teams
Open-source data observability for modern data teams

Use cases Monitor your data warehouse in minutes: Data anomalies monitoring as dbt tests Data lineage made simple, reliable, and automated dbt operati

A demo of a data science project using Kedro

iris Overview This is your new Kedro project, which was generated using Kedro 0.17.4. Take a look at the Kedro documentation to get started. Rules and

Data Poisoning based on Adversarial Attacks using Non-Robust Features

Data Poisoning based on Adversarial Attacks using Non-Robust Features Usage python main.py [-h] [--gpu | -g GPU] [--eps |-e EPSILON] [--pert | -p PER

Cisco IOS-XE Operations Program. Shows operational data using restconf and yang
Cisco IOS-XE Operations Program. Shows operational data using restconf and yang

XE-Ops View operational and config data from devices running Cisco IOS-XE software. NoteS The build folder is the latest build. All other files are fo

Run python scripts and pass data between multiple python and node processes using this npm module

Run python scripts and pass data between multiple python and node processes using this npm module. process-communication has a event based architecture for interacting with python data and errors inside nodejs.

ARRU seismic backprojection - Earthquake waveform detection and P/S arrivals picking on continuous data using ARRU phase picker Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.
Download and process GOES-16 and GOES-17 data from NOAA's archive on AWS using Python.

Download and display GOES-East and GOES-West data GOES-East and GOES-West satellite data are made available on Amazon Web Services through NOAA's Big

Comments
  • UserWarning: multiple_iterators not implemented for CRAM

    UserWarning: multiple_iterators not implemented for CRAM

    When running with .cram file, got the following warnings /gauchian/depth_calling/snp_count.py:131: UserWarning: multiple_iterators not implemented for CRAM ignore_orphan=False /gauchian/depth_calling/haplotype.py:189: UserWarning: multiple_iterators not implemented for CRAM min_base_quality=13

    Will these warnings affect the quality of calls?

    opened by LNGDingj 1
Releases(v1.0.2)
Owner
Illumina
Illumina Open Source Software
Illumina
Types for the Rasterio package

types-rasterio Types for the rasterio package A work in progress Install Not yet published to PyPI pip install types-rasterio These type definitions

Kyle Barron 7 Sep 10, 2021
Static bytecode simulator

SEA Static bytecode simulator for creating dependency/dependant based experimental bytecode format for CPython. Example a = random() if a = 5.0:

Batuhan Taskaya 23 Jun 10, 2022
Fastest python library for making asynchronous group requests.

FGrequests: Fastest Asynchronous Group Requests Installation Install using pip: pip install fgrequests Documentation Pretty easy to use. import fgrequ

Farid Chowdhury 14 Nov 22, 2022
A Python library that helps data scientists to infer causation rather than observing correlation.

A Python library that helps data scientists to infer causation rather than observing correlation.

QuantumBlack Labs 1.7k Jan 04, 2023
Graphene Metanode is a locally hosted node for one account and several trading pairs, which uses minimal RAM resources.

Graphene Metanode is a locally hosted node for one account and several trading pairs, which uses minimal RAM resources. It provides the necessary user stream data and order book data for trading in a

litepresence 5 May 08, 2022
Web app to find your chance of winning at Texas Hold 'Em

poker_mc Web app to find your chance of winning at Texas Hold 'Em A working version of this project is deployed at poker-mc.ue.r.appspot.com. It's run

Aadith Vittala 7 Sep 15, 2021
Easytile blender - Simple Blender 2.83 addon for tiling meshes easily

easytile_blender Dead simple, barebones Blender (2.83) addon for placing meshes as tiles. Installation In Blender, go to Edit Preferences Add-ons

Sam Gibson 6 Jul 19, 2022
A set of tools for ripping music from Konami mobile games

Konami Mobile Ripping Toolset A set of tools for ripping music from Konami mobile games Contents nigger.py for niggering konami's website, ripping all

5 Oct 20, 2022
Python Projects is an Open Source to enhance your python skills

Welcome! 👋🏽 Python Project is Open Source to enhance your python skills. You're free to contribute. 🤓 You just need to give us your scripts written

Tristán 6 Nov 28, 2022
Taxonomy addition for complete trees

TACT: Taxonomic Addition for Complete Trees TACT is a Python app for stochastic polytomy resolution. It uses birth-death-sampling estimators across an

Jonathan Chang 3 Jun 07, 2022
⚙️ Compile, Read and update your .conf file in python

⚙️ Compile, Read and update your .conf file in python

Reece Harris 2 Aug 15, 2022
NYCU(NCTU)-差勤-助教

NCTU-TA-fill 填寫 差勤-助教時數 有沒有覺得在差勤系統填助教時數有點浪費生命? 今天有個懶鬼浪費好多時間幫大家寫了code 只要填好的必要的資料,就可以讓電腦自動幫你完成差勤助教的時數填寫喔! https://pt-attendance.nctu.edu.tw/verify/userL

14 Dec 21, 2021
RangDev Notepad App With Python

RangDev Notepad-App-With-Python Take down quick and speedy notes! This is a small project of a notepad app built with Tkinter and SQLite3. Database cr

rangga.alrasya 1 Dec 01, 2021
A web UI for managing your 351ELEC device ROMs.

351ELEC WebUI A web UI for managing your 351ELEC device ROMs. Requirements Python 3 or Python 2.7 are required. If the ftfy package is installed, it w

Ben Phelps 5 Sep 26, 2022
Library for Memory Trace Statistics in Python

Memory Search Library for Memory Trace Statistics in Python The library uses tracemalloc as a core module, which is why it is only available for Pytho

Memory Search 1 Dec 20, 2021
Code for the manim-generated scenes used in 3blue1brown videos

This project contains the code used to generate the explanatory math videos found on 3Blue1Brown. This almost entirely consists of scenes generated us

Grant Sanderson 4.1k Jan 02, 2023
Encode and decode cancro lang files to and from brainfuck

cancrolang Encode and decode cancro lang files to and from brainfuck. examples python3 main.py -f hello.cancro --run Hello World! the interpreter is n

witer33 1 Dec 20, 2021
VCM EE1.2 P-layer feature map anchor generation 137th MPEG-VCM

VCM EE1.2 P-layer feature map anchor generation 137th MPEG-VCM

IPSL 6 Oct 18, 2022
Force you (or your user) annotate Python function type hints.

Must-typing Force you (or your user) annotate function type hints. Notice: It's more like a joke, use it carefully. If you call must_typing in your mo

Konge 13 Feb 19, 2022
A tool to determine optimal projects for Gridcoin crunchers. Maximize your magnitude!

FindTheMag FindTheMag helps optimize your BOINC client for Gridcoin mining. You can group BOINC projects into two groups: "preferred" projects and "mi

7 Oct 04, 2022