Data Inspector is an open-source python library that brings 15++ types of different functions to make EDA, data cleaning easier.

Overview

Data Inspector

Author MIT Contributions welcome Stars Downloads

Data Inspector is an open-source python library that brings 15 types of different functions to make EDA, data cleaning easier.

Author: Kazi Amit Hasan

Project Description:

Data Inspector brings 15++ essential exploratory data analysis, data cleaning automations to make a dataset understandable. This is a perfect tool to get started with you data.

Latest Added Feature:

Added regplots in the library

Installation:

pip install data-inspector

Package available at https://pypi.org/project/data-inspector/

Available automation:

  1. Line plot : line_plot(data, x_data, y_data, x_label="", y_label="", title="")
  2. Skew feature: plot_skewed_feature(data, column)
  3. Showing data distribution: show_distribution(data, column)
  4. Scatter plot: plot_scatter(data,x_data, y_data)
  5. Correlation plot: plot_correlation(data)
  6. Create histogram: histogram(data,column, x_label, y_label, title)
  7. Create bar plot: plot_bar(data, column, xlabel, ylabel, title)
  8. Create boxplots of all features: box_plot(data)
  9. Checking dataset's shape: datasetShape(data)
  10. Get dataset's diagnostic plots: diagnostic_plots(data, variable)
  11. Divide numerical and categorical features: divideFeatures(data)
  12. Fill NaN values: fillNan(data, column, value)
  13. Get pearson's correlation between two variables: get_correlation(column_1, column_2, data)
  14. Plotting kde plots: plot_cont_kde(data, var)
  15. Automatic calculating the missing values and their percentage along with visualization : calculating_missing_values(data)
  16. Regression plot with 95% CI : plot_regplot(data,x_data, y_data)

Tutorial:

Link: https://github.com/AmitHasanShuvo/data-inspector/blob/main/notebook/example%20notebook.ipynb
Colab link: https://colab.research.google.com/drive/1mj9gz2XyQprSYdKMUKlKkJ9Qi8XmleHW?usp=sharing

Some visualizations:



How to cite:

@online{data-inspector,
title={data-inspector},
url={https://pypi.org/project/data-inspector/},
urldate = {2021-08-21}, 
publisher={Kazi Amit Hasan}
}

Future Works:

  1. Add some automations for time series data.

How to contribute:

Any contribution would be highly appreciated. Kindly go through the guidelines for contributing in github.

You might also like...
Sphinx-performance - CLI tool to measure the build time of different, free configurable Sphinx-Projects
Sphinx-performance - CLI tool to measure the build time of different, free configurable Sphinx-Projects

CLI tool to measure the build time of different, free configurable Sphinx-Projec

A module filled with many useful functions and modules in various subjects.
A module filled with many useful functions and modules in various subjects.

Usefulpy Check out the Usefulpy site Usefulpy site is not always up to date Download and Import download and install with with pip download usefulpyth

Template repo to quickly make a tested and documented GitHub action in Python with Poetry

Python + Poetry GitHub Action Template Getting started from the template Rename the src/action_python_poetry package. Globally replace instances of ac

Make posters from Markdown files.
Make posters from Markdown files.

MkPosters Create posters using Markdown. Supports icons, admonitions, and LaTeX mathematics. At the moment it is restricted to the specific layout of

A tutorial for people to run synthetic data replica's from source healthcare datasets
A tutorial for people to run synthetic data replica's from source healthcare datasets

Synthetic-Data-Replica-for-Healthcare Description What is this? A tailored hands-on tutorial showing how to use Python to create synthetic data replic

A Python library for setting up projects using tabular data.

A Python library for setting up projects using tabular data. It can create project folders, standardize delimiters, and convert files to CSV from either individual files or a directory.

Source Code for 'Practical Python Projects' (video) by Sunil Gupta
Source Code for 'Practical Python Projects' (video) by Sunil Gupta

Apress Source Code This repository accompanies %Practical Python Projects by Sunil Gupta (Apress, 2021). Download the files as a zip using the green b

Automatically open a pull request for repositories that have no CONTRIBUTING.md file

automatic-contrib-prs Automatically open a pull request for repositories that have no CONTRIBUTING.md file for a targeted set of repositories. What th

The source code that powers readthedocs.org

Welcome to Read the Docs Purpose Read the Docs hosts documentation for the open source community. It supports Sphinx docs written with reStructuredTex

Releases(eda)
  • eda(Aug 19, 2021)

    Data Inspector brings a total of 15 essential exploratory data analysis, data cleaning automations to make a dataset understandable. This is a perfect tool to get started with you data.

    PYPI link: https://pypi.org/project/data-inspector/

    Source code(tar.gz)
    Source code(zip)
Owner
Kazi Amit Hasan
ML Engineer at ACI Limited | Kaggle Competition Expert (x4) | Researcher
Kazi Amit Hasan
The tutorial is a collection of many other resources and my own notes

Why we need CTC? --- looking back on history 1.1. About CRNN 1.2. from Cross Entropy Loss to CTC Loss Details about CTC 2.1. intuition: forward algor

手写AI 7 Sep 19, 2022
Coursera learning course Python the basics. Programming exercises and tasks

HSE_Python_the_basics Welcome to BAsics programming Python! You’re joining thousands of learners currently enrolled in the course. I'm excited to have

PavelRyzhkov 0 Jan 05, 2022
Leetcode Practice

LeetCode Practice Description This is my LeetCode Practice. Visit LeetCode Website for detailed question description. The code in this repository has

Leo Hsieh 75 Dec 27, 2022
NoVmpy - NoVmpy with python

git clone -b dev-1 https://github.com/wallds/VTIL-Python.git cd VTIL-Python py s

263 Dec 23, 2022
Documentation for GitHub Copilot

NOTE: GitHub Copilot discussions have moved to the Copilot Feedback forum. GitHub Copilot Welcome to the GitHub Copilot user community! In this reposi

GitHub 21.3k Dec 28, 2022
A Material Design theme for MkDocs

A Material Design theme for MkDocs Create a branded static site from a set of Markdown files to host the documentation of your Open Source or commerci

Martin Donath 12.3k Jan 04, 2023
This is the repository that includes the code material for the ESweek 2021 for the Education Class Lecture A3 "Learn to Drive (and Race!) Autonomous Vehicles"

ESweek2021_educationclassA3 This is the repository that includes the code material for the ESweek 2021 for the Education Class Lecture A3 "Learn to Dr

F1TENTH Autonomous Racing Community 29 Dec 06, 2022
A web app builds using streamlit API with python backend to analyze and pick insides from multiple data formats.

Data-Analysis-Web-App Data Analysis Web App can analysis data in multiple formates(csv, txt, xls, xlsx, ods, odt) and gives shows you the analysis in

Kumar Saksham 19 Dec 09, 2022
A curated list of awesome mathematics resources

A curated list of awesome mathematics resources

Cyrille Rossant 6.7k Jan 05, 2023
Word document generator with python

In this study, real world data is anonymized. The content is completely different, but the structure is the same. It was a script I prepared for the backend of a work using UiPath.

Ezgi Turalı 3 Jan 30, 2022
Automated Integration Testing and Live Documentation for your API

Automated Integration Testing and Live Documentation for your API

ScanAPI 1.3k Dec 30, 2022
Material for the ros2 crash course

Material for the ros2 crash course

Emmanuel Dean 1 Jan 22, 2022
🏆 A ranked list of awesome python developer tools and libraries. Updated weekly.

Best-of Python Developer Tools 🏆 A ranked list of awesome python developer tools and libraries. Updated weekly. This curated list contains 250 awesom

Machine Learning Tooling 646 Jan 07, 2023
learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your personal portfolio

learn python in 100 days, a simple step could be follow from beginner to master of every aspect of python programming and project also include side project which you can use as demo project for your

BDFD 6 Nov 05, 2022
FireEye Related Projects

FireEye FireEye Related Projects Tor-IP-Collector Simple python script that will collect a list of TOR IPs from the SecOps Institute Github and inject

Taran Ulrich 2 Nov 12, 2022
A next-generation curated knowledge sharing platform for data scientists and other technical professions.

Knowledge Repo The Knowledge Repo project is focused on facilitating the sharing of knowledge between data scientists and other technical roles using

Airbnb 5.2k Dec 27, 2022
Variable Transformer Calculator

✠ VASCO - VAriable tranSformer CalculatOr Software que calcula informações de transformadores feita para a matéria de "Conversão Eletromecânica de Ene

Arthur Cordeiro Andrade 2 Feb 12, 2022
This repository outlines deploying a local Kubeflow v1.3 instance on microk8s and deploying a simple MNIST classifier using KFServing.

Zero to Inference with Kubeflow Getting Started This repository houses all of the tools, utilities, and example pipeline implementations for exploring

Ed Henry 3 May 18, 2022
Simple yet powerful CAD (Computer Aided Design) library, written with Python.

Py-MADCAD it's time to throw parametric softwares out ! Simple yet powerful CAD (Computer Aided Design) library, written with Python. Installation

jimy byerley 124 Jan 06, 2023
Uses diff command to compare expected output with student's submission output

AUTOGRADER for GRADESCOPE using diff with partial grading Description: Uses diff command to compare expected output with student's submission output U

2 Jan 11, 2022