The goal of pandas-log is to provide feedback about basic pandas operations. It provides simple wrapper functions for the most common pandas functions; the wrappers add logging on top of the original behavior.

Last update: Dec 13, 2022

Overview

pandas-log

The goal of pandas-log is to provide feedback about basic pandas operations. It provides simple wrapper functions for the most common functions, such as .query, .apply, .merge, .groupby and more.
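To make the idea of a "wrapper function that adds logs" concrete, here is a minimal sketch in plain pandas. It is an illustration only and is not taken from pandas-log's source: the names `_logged` and `logging_enabled` are hypothetical, and the approach (temporarily replacing a few DataFrame methods inside a context manager) is an assumption about how such wrappers could be built.

```python
import contextlib
import functools

import pandas as pd


def _logged(method, name, records):
    """Wrap a DataFrame method so each call records row counts (hypothetical helper)."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        # Only record calls that return a new dataframe
        if isinstance(result, pd.DataFrame):
            records.append((name, len(self), len(result)))
        return result
    return wrapper


@contextlib.contextmanager
def logging_enabled(method_names=("query", "dropna")):
    """Temporarily swap selected DataFrame methods for logging wrappers."""
    records = []
    originals = {n: getattr(pd.DataFrame, n) for n in method_names}
    try:
        for n, m in originals.items():
            setattr(pd.DataFrame, n, _logged(m, n, records))
        yield records
    finally:
        # Always restore the original methods, even if the block raised
        for n, m in originals.items():
            setattr(pd.DataFrame, n, m)


frame = pd.DataFrame({"name": ["Alfred", "Batman", "Catwoman"], "age": [70, 35, 30]})

with logging_enabled() as records:
    kept = frame.query("name != 'Batman'")

for name, before, after in records:
    print(f"{name}: {before} -> {after} rows")
```

Outside the `with` block the original methods are back in place, so ordinary pandas code is unaffected.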

Why pandas-log?

pandas-log is a Python implementation of the R package tidylog, and provides feedback about basic pandas operations.

pandas has been invaluable for the data science ecosystem. A typical pandas workflow consists of a series of steps that transform raw data into an understandable, usable format. These steps need to run in a certain sequence, and if the result is unexpected it is hard to understand what happened. pandas-log logs metadata on each operation, which helps pinpoint where the issue was introduced.

Let's look at an example. First, we need to import pandas-log after pandas and create a dataframe:

import pandas as pd
import numpy as np
import pandas_log

# Create the example dataframe inside the pandas-log context
with pandas_log.enable():
    df = pd.DataFrame({
        "name": ["Alfred", "Batman", "Catwoman"],
        "toy": [np.nan, "Batmobile", "Bullwhip"],
        "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
    })

pandas-log will give you feedback, for instance when filtering a data frame or adding a new variable:

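with pandas_log.enable():
    # .str.lower() skips the missing toy value, which str.lower would reject
    (df.assign(toy=lambda x: x["toy"].str.lower())
       .query("name != 'Batman'"))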

pandas-log can be especially helpful in longer pipes:

with pandas_log.enable():
    (df.assign(toy=lambda x: x["toy"].str.lower())
       .query("name != 'Batman'")
       .dropna()
       .assign(lower_name=lambda x: x["name"].str.lower())
       .reset_index())
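To show the kind of per-step bookkeeping that makes a long pipe easier to debug, the chain above can be traced by hand in plain pandas. This is a sketch under stated assumptions, not pandas-log's output or API: `log_step` and `history` are hypothetical names, and only row and column counts are recorded.

```python
import numpy as np
import pandas as pd


def log_step(frame, label, func, history):
    """Apply func to frame and record how the shape changed (hypothetical helper)."""
    result = func(frame)
    history.append({
        "step": label,
        "rows_before": len(frame),
        "rows_after": len(result),
        "cols_before": frame.shape[1],
        "cols_after": result.shape[1],
    })
    return result


df = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman"],
    "toy": [np.nan, "Batmobile", "Bullwhip"],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
})

history = []
final = (
    # DataFrame.pipe passes the dataframe as the first argument of log_step
    df.pipe(log_step, "assign toy", lambda d: d.assign(toy=d["toy"].str.lower()), history)
      .pipe(log_step, "query", lambda d: d.query("name != 'Batman'"), history)
      .pipe(log_step, "dropna", lambda d: d.dropna(), history)
      .pipe(log_step, "assign lower_name",
            lambda d: d.assign(lower_name=d["name"].str.lower()), history)
      .pipe(log_step, "reset_index", lambda d: d.reset_index(), history)
)

for record in history:
    print(f"{record['step']}: rows {record['rows_before']} -> {record['rows_after']}, "
          f"cols {record['cols_before']} -> {record['cols_after']}")
```

With this data the trace makes a silent problem visible: after the filter only Alfred and Catwoman remain, both have a missing `born` value, and so `dropna` removes every remaining row and the final result is empty.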

For medium article go here

For a full walkthrough go here

Installation

pandas-log is currently installable from PyPI:

pip install pandas-log

Contributing

Follow the contribution docs for a full description of the process of contributing to pandas-log.

Owner

Eyal Trabelsi

The easy way to write your own flavor of Pandas

Create HTML profiling reports from pandas DataFrame objects

Out-of-Core DataFrames for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀

cuDF - GPU DataFrame Library

Pandas Google BigQuery

Koalas: pandas API on Apache Spark

Universal 1d/2d data containers with Transformers functionality for data analysis.

A Python package for manipulating 2-dimensional tabular data structures

Modin: Speed up your Pandas workflows by changing a single line of code

High performance datastore for time series and tick data

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

sqldf for pandas

A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner

NumPy and Pandas interface to Big Data