Intake is a lightweight package for finding, investigating, loading and disseminating data.

Last update: Jan 01, 2023

Overview

Intake: A general interface for loading data

Intake is a lightweight set of tools for loading and sharing data in data science projects. Intake helps you:

Load data from a variety of formats (see the current list of known plugins) into containers you already know, like Pandas dataframes, Python lists, NumPy arrays, and more.
Convert boilerplate data loading code into reusable Intake plugins.
Describe data sets in catalog files for easy reuse and sharing between projects and with others.
Share catalog information (and data sets) over the network with the Intake server.
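
As a rough illustration of the catalog idea described above, the sketch below writes a small CSV file and a catalog file that describes it. It assumes the YAML catalog layout and the intake.open_catalog / read() calls of Intake releases before 2.0; the source name example_table, the file names, and the helper functions are hypothetical and chosen only for this example.

```python
import tempfile
from pathlib import Path

# Catalog text in the YAML layout used by Intake catalog files (pre-2.0 releases).
# The source name "example_table" and the file "example.csv" are hypothetical.
CATALOG_TEXT = """\
sources:
  example_table:
    description: Small CSV file used for illustration
    driver: csv
    args:
      urlpath: "{{ CATALOG_DIR }}/example.csv"
"""


def write_example_catalog(directory):
    """Write a tiny CSV file and a catalog describing it; return the catalog path."""
    directory = Path(directory)
    (directory / "example.csv").write_text("name,value\na,1\nb,2\n")
    catalog_path = directory / "catalog.yml"
    catalog_path.write_text(CATALOG_TEXT)
    return catalog_path


def load_example(catalog_path):
    """Open the catalog with Intake and read the source into a dataframe."""
    import intake  # imported here so the rest of the sketch runs without Intake

    cat = intake.open_catalog(str(catalog_path))
    return cat["example_table"].read()


catalog_path = write_example_catalog(tempfile.mkdtemp())
print(catalog_path.read_text())
```

With Intake and the CSV driver's dependencies installed, load_example(catalog_path) should return the two-row table. Anyone who receives the catalog file and the data can load it the same way without knowing how the file is parsed.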

Documentation is available at Read the Docs.

The status of Intake and related packages is available at the Status Dashboard.

Weekly news about this repo and other related projects can be found on the wiki.

Install

The recommended method is to use conda:

conda install -c conda-forge intake

You can also install using pip, in which case you choose how many of the optional dependencies to install. The simplest option has the fewest requirements:

pip install intake

Additional sets of optional dependencies can be requested with the extras [server], [plot] and [dataframe]. To include everything:

pip install intake[complete]

Note that you may well need specific drivers and other plugins, which usually have additional dependencies of their own.
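
After installing, it can help to confirm that the package is visible to the Python environment you intend to use. The following is a minimal sketch that uses only the standard library; the helper name describe_install is hypothetical and not part of Intake.

```python
from importlib import metadata, util


def describe_install(package="intake"):
    """Return a one-line summary of whether a top-level package is installed."""
    # find_spec returns None when a top-level module cannot be found.
    if util.find_spec(package) is None:
        return f"{package}: not installed"
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this name.
        version = "unknown version"
    return f"{package}: {version}"


print(describe_install("intake"))
```

The same helper can be pointed at any driver or plugin package you expect to be present, since those are installed separately and bring their own dependencies.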

Development

- Create a development Python environment with the required dependencies, ideally with conda. The requirements can be found in the yml files in the scripts/ci/ directory of this repo, e.g. conda env create -f scripts/ci/environment-py38.yml and then conda activate test_env
- Install Intake in editable mode using pip install -e .[complete]
- Use pytest to run the tests.
- Create a fork on GitHub to be able to submit PRs.
- We respect, but do not enforce, PEP 8 standards; all new code should be covered by tests.
