Pyspark sam - Analyze Big Sequence Alignments with PySpark in AWS EMR

Overview

pyspark_sam

This repo hosts my code for the article "Analyze Big Sequence Alignments with PySpark in AWS EMR".

Prerequisite

  1. Spark

  2. AWS CLI

  3. AWS Account

Run

Follow the instruction in the article. Once you have uploaded the files into your S3 bucket, run

aws emr create-cluster --name "Spark_step_pip" \
    --release-label emr-6.5.0 \
    --applications Name=Spark \
    --log-uri s3://[your_S3_bucket]/logs/ \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --bootstrap-actions Path=s3://[your_S3_bucket]/emr_bootstrap.sh \
    --use-default-roles --auto-terminate \
    --steps "Type=Spark,Name=SparkProgram,ActionOnFailure=CONTINUE,Args=[--deploy-mode,cluster,--master,yarn,--py-files,s3://[your_S3_bucket]/helper_function.py,s3://[your_S3_bucket]/spark_3mer.py,s3://[your_S3_bucket]/test.sam,[your_S3_bucket],sankey.json]" 

When the job finishes, download the sankey.json. And run this command to visualize:

python sankey.py sankey.json

Authors

  • Sixing Huang - Concept and Coding

License

This project is licensed under the MIT License - see the LICENSE file for details

Owner
Sixing Huang
A triple Neo4j certified data scientist. I am currently working at BGI in Shenzhen.
Sixing Huang
Python binding to the OneTimeSecret API

Thin Python binding for onetimesecret.com API. Unicode-safe. Description of API itself you can find here: https://onetimesecret.com/docs/api Usage:

Vladislav Stepanov 10 Jun 12, 2022
ABACUS Aroio API for Webinterfaces and App-Connections

ABACUS Aroio API for Webinterfaces and App-Connections Setup Start virtual python environment if you don't have python3 running setup: $ python3 -m ve

Abacus Aroio Developer Team 1 Apr 01, 2021
Get random jokes bapack2 from jokes-bapack2-api

Random Jokes Bapack2 Get random jokes bapack2 from jokes-bapack2-api Requirements Python Requests HTTP library How to Run py random-jokes-bapack2.py T

Miftah Afina 1 Nov 18, 2021
You can submit any PR and have SWAGS. Happy Hacktoberfest !

Excluded project Repository 🔴 🔴 🔴 - PR limit is reached. Please use another Repository Hacktoberfest 2021 🎉 🗣 Hacktoberfest encourages participat

Hansajith 63 Oct 21, 2022
Keypirinha plugin to install packages via Chocolatey

Keypiriniha Chocolatey This is a package for the fast keystroke launcher keypirinha (http://keypirinha.com/) It allows you to search & install package

Shadab Zafar 4 Nov 26, 2022
Discord bot script for sending multiple media files to a discord channel according to discord limitations.

Discord Bulk Image Sending Bot Send bulk images to Discord channel. This is a bot script that will allow you to send multiple images to Discord channe

Nikola Arbov 1 Jan 13, 2022
Download nitro generator that generates free nitro code that you can use for Discord

Download nitro generator that generates free nitro code that you can use for Discord, run it and wait for free nitro to come

Umut Bayraktar 154 Jan 05, 2023
Ethereum transactions and wallet information for people you follow on Twitter.

ethFollowing Ethereum transactions and wallet information for people you follow on Twitter. Set up Setup python environment (requires python 3.8): vir

Brian Donohue 2 Dec 28, 2021
go-cqhttp API typing annoations, return data models and utils for nonebot

go-cqhttp API typing annoations, return data models and utils for nonebot

风屿 6 Jan 04, 2023
NFTs Upload to OpenSea CuseEdition

NFTs-Upload-to-OpenSea-CuseEdition YOUTUBE VIDEO - Soon... Download Python and

Lil Cuse 2 Jan 04, 2022
A Discord bot that generates inspirational quotes & motivating messages whenever a user is sad

Encourage bot is a discord bot that allows users to randomly get Inspirational quotes messages and gives motivational encouragements whenever someone says that he's sad/depressed.

1 Nov 25, 2021
Python client for the iNaturalist APIs

pyinaturalist Introduction iNaturalist is a community science platform that helps people get involved in the natural world by observing and identifyin

Nicolas Noé 79 Dec 22, 2022
Script to automatically book a vaccine slot on Doctolib for today or tomorrow, following rules from the French Government.

DOCTOSHOTGUN This script lets you automatically book a vaccine slot on Doctolib for today or tomorrow, following rules from the French Government. Pyt

Romain Bignon 560 Dec 19, 2022
A telegram bot for generate fake details. Written in python using telethon

FakeDataGenerator A telegram bot for generate fake details. Written in python using telethon. Mandatory variables API_HASH Get it from my telegram.org

Oxidised-Man 6 Dec 19, 2021
Snipe fair coin launches. Contact @dannsniper on telegram for whitelist

Pancakeswap-sniper Pancakeswap Sniper bot Full version of Pancakeswap sniping bot used to snipe during fair coin launches. With advanced options and a

36 Nov 01, 2021
Easy-apply-bot - A LinkedIn Easy Apply bot to help with my job search.

easy-apply-bot A LinkedIn Easy Apply bot to help with my job search. Getting Started First, clone the repository somewhere onto your computer, or down

Matthew Alunni 5 Dec 09, 2022
Senexia - A powerful telegram bot to manage your groups as effectively as possible

⚡ Kenechi bot ⚡ A Powerful, Smart And Simple Group Manager ... Written with AioG

Akhi 2 Jan 11, 2022
Discord Panel is an AIO panel for Discord that aims to have all the needed tools related to user token interactions, as in nuking and also everything you could possibly need for raids

Discord Panel Discord Panel is an AIO panel for Discord that aims to have all the needed tools related to user token interactions, as in nuking and al

11 Mar 30, 2022
Termux Pkg

PKG Install Termux All Basic Pkg. Installation : pkg update && pkg upgrade && pkg install python && pkg install python2 && pkg install git && git clon

ɴᴏʙɪᴛᴀシ︎ 1 Oct 28, 2021