A spatial genome aligner for analyzing multiplexed DNA-FISH imaging data.

Related tags

Deep Learningjie
Overview

jie

jie is a spatial genome aligner. This package parses true chromatin imaging signal from noise by aligning signals to a reference DNA polymer model.

The codename is a tribute to the Chinese homophones:

  • 结 (jié) : a knot, a nod to the mysterious and often entangled structures of DNA
  • 解 (jiĕ) : to solve, to untie, our bid to uncover these structures amid noise and uncertainty
  • 姐 (jiĕ) : sister, our ability to resolve tightly paired replicated chromatids

Installation

Step 1 - Clone this repository:

git clone https://github.com/b2jia/jie.git
cd jie

Step 2 - Create a new conda environment and install dependencies:

conda create --name jie -f environment.yml
conda activate jie

Step 3 - Install jie:

pip install -e .

To test, run:

python -W ignore test/test_jie.py

Usage

jie is an exposition of chromatin tracing using polymer physics. The main function of this package is to illustrate the utility and power of spatial genome alignment.

jie is NOT an all-purpose spatial genome aligner. Chromatin imaging is a nascent field and data collection is still being standardized. This aligner may not be compatible with different imaging protocols and data formats, among other variables.

We provide a vignette under jie/jupyter/, with emphasis on inspectability. This walks through the intuition of our spatial genome alignment and polymer fiber karyotyping routines:

00-spatial-genome-alignment-walk-thru.ipynb

We also provide a series of Jupyter notebooks (jie/jupyter/), with emphasis on reproducibility. This reproduces figures from our accompanying manuscript:

01-seqFISH-plus-mouse-ESC-spatial-genome-alignment.ipynb
02-seqFISH-plus-mouse-ESC-polymer-fiber-karyotyping.ipynb
03-seqFISH-plus-mouse-brain-spatial-genome-alignment.ipynb
04-seqFISH-plus-mouse-brain-polymer-fiber-karyotyping.ipynb
05-bench-mark-spatial-genome-agignment-against-chromatin-tracing-algorithm.ipynb

A command-line tool forthcoming.

Motivation

Multiplexed DNA-FISH is a powerful imaging technology that enables us to peer directly at the spatial location of genes inside the nucleus. Each gene appears as tiny dot under imaging.

Pivotally, figuring out which dots are physically linked would trace out the structure of chromosomes. Unfortunately, imaging is noisy, and single-cell biology is extremely variable. The two confound each other, making chromatin tracing prohibitively difficult!

For instance, in a diploid cell line with two copies of a gene we expect to see two spots. But what happens when we see:

  • Extra signals:
    • Is it noise?
      • Off-target labeling: The FISH probes might inadvertently label an off-target gene
    • Or is it biological variation?
      • Aneuploidy: A cell (ie. cancerous cell) may have more than one copy of a gene
      • Cell cycle: When a cell gets ready to divide, it duplicates its genes
  • Missing signals:
    • Is it noise?
      • Poor probe labeling: The FISH probes never labeled the intended target gene
    • Or is it biological variation?
      • Copy Number Variation: A cell may have a gene deletion

If true signal and noise are indistinguishable, how do we know we are selecting true signals during chromatin tracing? It is not obvious which spots should be connected as part of a chromatin fiber. This dilemma was first aptly characterized by Ross et al. (https://journals.aps.org/pre/abstract/10.1103/PhysRevE.86.011918), which is nothing short of prescient...!

jie is, conceptually, a spatial genome aligner that disambiguates spot selection by checking each imaged signal against a reference polymer physics model of chromatin. It relies on the key insight that the spatial separation between two genes should be congruent with its genomic separation.

It makes no assumptions about the expected copy number of a gene, and when it traces chromatin it does so instead by evaluating the physical likelihood of the chromatin fiber. In doing so, we can uncover copy number variations and even sister chromatids from multiplexed DNA-FISH imaging data.

Citation

Contact

Author: Bojing (Blair) Jia
Email: b2jia at eng dot ucsd dot edu
Position: MD-PhD Student, Ren Lab

For other work related to single-cell biology, 3D genome, and chromatin imaging, please visit Prof. Bing Ren's website: http://renlab.sdsc.edu/

Owner
Bojing Jia
How do we better describe the world around us?
Bojing Jia
This is the official code of L2G, Unrolling and Recurrent Unrolling in Learning to Learn Graph Topologies.

Learning to Learn Graph Topologies This is the official code of L2G, Unrolling and Recurrent Unrolling in Learning to Learn Graph Topologies. Requirem

Stacy X PU 16 Dec 09, 2022
Configure SRX interfaces with Scrapli

Configure SRX interfaces with Scrapli Overview This example will show how to configure interfaces on Juniper's SRX firewalls. In addition to the Pytho

Calvin Remsburg 1 Jan 07, 2022
Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

Transformer-vocabulary-transfer Implementation of the paper "Fine-Tuning Transfo

LEYA 13 Nov 30, 2022
🎃 Core identification module of AI powerful point reading system platform.

ppReader-Kernel Intro Core identification module of AI powerful point reading system platform. Usage 硬件: Windows10、GPU:nvdia GTX 1060 、普通RBG相机 软件: con

CrashKing 1 Jan 11, 2022
A Unified Generative Framework for Various NER Subtasks.

This is the code for ACL-ICJNLP2021 paper A Unified Generative Framework for Various NER Subtasks. Install the package in the requirements.txt, then u

177 Jan 05, 2023
C3DPO - Canonical 3D Pose Networks for Non-rigid Structure From Motion.

C3DPO: Canonical 3D Pose Networks for Non-Rigid Structure From Motion By: David Novotny, Nikhila Ravi, Benjamin Graham, Natalia Neverova, Andrea Vedal

Meta Research 309 Dec 16, 2022
Baseline of DCASE 2020 task 4

Couple Learning for SED This repository provides the data and source code for sound event detection (SED) task. The improvement of the Couple Learning

21 Oct 18, 2022
Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs

Continuous Query Decomposition This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neura

UCL Natural Language Processing 71 Dec 29, 2022
Diverse graph algorithms implemented using JGraphT library.

# 1. Installing Maven & Pandas First, please install Java (JDK11) and Python 3 if they are not already. Next, make sure that Maven (for importing J

See Woo Lee 3 Dec 17, 2022
Open source person re-identification library in python

Open-ReID Open-ReID is a lightweight library of person re-identification for research purpose. It aims to provide a uniform interface for different da

Tong Xiao 1.3k Jan 01, 2023
Dilated Convolution with Learnable Spacings PyTorch

Dilated-Convolution-with-Learnable-Spacings-PyTorch Ismail Khalfaoui Hassani Dilated Convolution with Learnable Spacings (abbreviated to DCLS) is a no

15 Dec 09, 2022
Convex optimization for fun and profit.

CFMM Optimal Routing This repository contains the code needed to generate the figures used in the paper Optimal Routing for Constant Function Market M

Guillermo Angeris 183 Dec 29, 2022
A Simplied Framework of GAN Inversion

Framework of GAN Inversion Introcuction You can implement your own inversion idea using our repo. We offer a full range of tuning settings (in hparams

Kangneng Zhou 13 Sep 27, 2022
Codes for "Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier"

Deep-RTC [project page] This repository contains the source code accompanying our ECCV 2020 paper. Solving Long-tailed Recognition with Deep Realistic

Gina Wu 16 May 26, 2022
Dcf-game-infrastructure-public - Contains all the components necessary to run a DC finals (attack-defense CTF) game from OOO

dcf-game-infrastructure All the components necessary to run a game of the OOO DC

Order of the Overflow 46 Sep 13, 2022
This is a TensorFlow implementation for C2-Rec

This is a TensorFlow implementation for C2-Rec We refer to the repo SASRec. Requirements requirement.txt Datasets This repo includes Amazon Beauty dat

7 Nov 14, 2022
Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)

Depth-supervised NeRF: Fewer Views and Faster Training for Free Project | Paper | YouTube Pytorch implementation of our method for learning neural rad

524 Jan 08, 2023
Code for the paper 'A High Performance CRF Model for Clothes Parsing'.

Clothes Parsing Overview This code provides an implementation of the research paper: A High Performance CRF Model for Clothes Parsing Edgar Simo-S

Edgar Simo-Serra 119 Nov 21, 2022
An attempt at the implementation of GLOM, Geoffrey Hinton's paper for emergent part-whole hierarchies from data

GLOM TensorFlow This Python package attempts to implement GLOM in TensorFlow, which allows advances made by several different groups transformers, neu

Rishit Dagli 32 Feb 21, 2022
Open-World Entity Segmentation

Open-World Entity Segmentation Project Website Lu Qi*, Jason Kuen*, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia This projec

DV Lab 410 Jan 03, 2023