Brandyn White, Andrew Miller

Source: https://github.com/bwhite/hadoopy/
Issues: https://github.com/bwhite/hadoopy/issues
Docs: http://bwhite.github.com/hadoopy/
IRC: #hadoopy @ freenode.net

Requirements
- Python development headers (python-dev)
- Build tools (build-essential)

Optional
- Cython (>= 0.13); without it, the build falls back to the pregenerated .c files

Features
- Oozie support
- Automated job parallelization ('auto-oozie') available in the hadoopy_flow project (maintained out of branch)
- TypedBytes support (very fast)
- Local execution of unmodified MapReduce jobs with launch_local
- Read/write sequence files of TypedBytes directly to HDFS from Python (readtb, writetb)
- Works on OS X
- Allows printing to stdout and stderr in Hadoop tasks without causing problems (uses the 'pipe hopping' technique; both streams are available in the task's stderr)
- Critical path is in Cython
- Works on clusters without any extra installation, Python, or any Python libraries (uses the PyInstaller copy included in this source tree)
- Simple HDFS access (readtb and ls) inside Python, even inside running jobs
- Unit test interface
- Reporting using status and counters (and print statements; no need to be scared of them in Hadoopy)
- Supports the design patterns in the Lin/Dyer book (http://www.umiacs.umd.edu/~jimmylin/book.html)

Limitations
- Hadoop Local mode is currently unsupported due to a bug in Hadoop's handling of the distributed cache in this mode. Use pseudo-distributed mode instead for now (https://github.com/bwhite/hadoopy/issues/40).

Used in
- A Case for Query by Image and Text Content: Searching Computer Help using Screenshots and Keywords (to appear in WWW'11)
- Web-Scale Computer Vision using MapReduce for Multimedia Data Mining (at KDD'10)
- Vitrieve: Visual Search engine
- Picarus: Hadoop computer vision toolbox

Ubuntu Install (others are similar)
    sudo apt-get install python-dev build-essential
    sudo python setup.py install

Python MapReduce library written in Cython.
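
The features above describe jobs written as plain Python map and reduce functions. As a rough illustration only, the sketch below shows a word-count mapper and reducer written as generators that yield key/value pairs, plus a hypothetical helper, run_in_memory, that maps, groups by key, and reduces in a single process. run_in_memory is not part of hadoopy and does not reproduce launch_local; it is an assumption added here so the example runs without Hadoop or hadoopy installed. In an actual job script, the two functions would presumably be handed to the library's own entry point rather than to this helper.

```python
from collections import defaultdict


def mapper(key, value):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in value.split():
        yield word, 1


def reducer(key, values):
    """Sum the counts collected for one word."""
    yield key, sum(values)


def run_in_memory(kvs, mapper, reducer):
    """Hypothetical helper (not a hadoopy function).

    Applies the mapper to each input pair, groups the mapper output by key,
    and applies the reducer to each group in sorted key order.
    """
    groups = defaultdict(list)
    for key, value in kvs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)

    results = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, iter(groups[out_key])))
    return results


# Input pairs are (line offset, line text), an assumed input layout.
counts = run_in_memory([(0, "a b a"), (1, "b c")], mapper, reducer)
print(counts)
```

Keeping the mapper and reducer free of any Hadoop-specific calls is what makes this kind of in-process check possible; the same two functions can be exercised in a unit test before being submitted to a cluster.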