Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle.

Overview

2019-indian-election-eda

Exploratory Data Analysis of the 2019 Indian General Elections using a dataset from Kaggle.

This project is a part of the Course - Data Analysis using Python: Zero to Pandas offered by Jovian.ai.

We perform Exploratory Data Analyis on the 2019 Indian General Elections dataset. Here we use various Python libraries to perform Data Cleaning and Visualization. The Dataset which is used in this project is from Kaggle, authored by the user Prakrut Chauhan.

Link to the Dataset used - https://www.kaggle.com/prakrutchauhan/indian-candidates-for-general-election-2019

The dataset contains information of all the candidates who contested the elections from various Constituencies. Data includes personal information like Assets, Education, Criminal Record, etc. as well as electoral information such as Contesting Constituency, Political Party, Total Votes received, etc.

The Libraries used in the Project are:

  • Matplotlib (for visualization of data),

  • Seaborn (used alongside Matplotlib for visualization),

  • Numpy (used for operations on numeric data),

  • Pandas (used for utilising DataFrames and organising the data),

  • Jovian (used for downloading dataset and to run, save and upload the Notebook).

Apart from the above mentioned libraries, we use the opendatasets package to directly download the files from Kaggle and parse the data. Link to the package - https://github.com/JovianML/opendatasets

To view the Jupyter Notebook containing the EDA, click on the .ipynb file to open it. Scroll down to see the analysis. Some contents might not be visible in Dark Theme, so I recommend viewing the notebook in Light Theme.

The Notebook can also be viewed in Google Colab and Binder or can be downloaded and viewed locally.

Link to a Blog Post will be added soon.

Hope you like my work !!!

Owner
Souradeep Banerjee
Souradeep Banerjee
📊 Python Flask game that consolidates data from Nasdaq, allowing the user to practice buying and selling stocks.

Web Trader Web Trader is a trading website that consolidates data from Nasdaq, allowing the user to search up the ticker symbol and price of any stock

Paulina Khew 21 Aug 30, 2022
A DSL for data-driven computational pipelines

"Dataflow variables are spectacularly expressive in concurrent programming" Henri E. Bal , Jennifer G. Steiner , Andrew S. Tanenbaum Quick overview Ne

1.9k Jan 03, 2023
Analyze the Gravitational wave data stored at LIGO/VIRGO observatories

Gravitational-Wave-Analysis This project showcases how to analyze the Gravitational wave data stored at LIGO/VIRGO observatories, using Python program

1 Jan 23, 2022
Cleaning and analysing aggregated UK political polling data.

Analysing aggregated UK polling data The tweet collection & storage pipeline used in email-service is used to also collect tweets from @britainelects.

Ajay Pethani 0 Dec 22, 2021
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.

Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.

HoloViz 2.9k Jan 06, 2023
This tool parses log data and allows to define analysis pipelines for anomaly detection.

logdata-anomaly-miner This tool parses log data and allows to define analysis pipelines for anomaly detection. It was designed to run the analysis wit

AECID 32 Nov 27, 2022
Randomisation-based inference in Python based on data resampling and permutation.

Randomisation-based inference in Python based on data resampling and permutation.

67 Dec 27, 2022
Employee Turnover Analysis

Employee Turnover Analysis Submission to the DataCamp competition "Can you help reduce employee turnover?"

Jannik Wiedenhaupt 1 Feb 13, 2022
Powerful, efficient particle trajectory analysis in scientific Python.

freud Overview The freud Python library provides a simple, flexible, powerful set of tools for analyzing trajectories obtained from molecular dynamics

Glotzer Group 195 Dec 20, 2022
Full ELT process on GCP environment.

Rent Houses Germany - GCP Pipeline Project: The goal of the project is to extract data about house rentals in Germany, store, process and analyze it u

Felipe Demenech Vasconcelos 2 Jan 20, 2022
A fast, flexible, and performant feature selection package for python.

linselect A fast, flexible, and performant feature selection package for python. Package in a nutshell It's built on stepwise linear regression When p

88 Dec 06, 2022
Pandas and Spark DataFrame comparison for humans

DataComPy DataComPy is a package to compare two Pandas DataFrames. Originally started to be something of a replacement for SAS's PROC COMPARE for Pand

Capital One 259 Dec 24, 2022
Python data processing, analysis, visualization, and data operations

Python This is a Python data processing, analysis, visualization and data operations of the source code warehouse, book ISBN: 9787115527592 Descriptio

FangWei 1 Jan 16, 2022
MeSH2Matrix - A set of Python codes for the generation of biomedical ontologies from the MeSH keywords of the PubMed scholarly publications

A set of Python codes for the generation of biomedical ontologies from the MeSH keywords of the PubMed scholarly publications

SisonkeBiotik 6 Nov 30, 2022
Recommendations from Cramer: On the show Mad-Money (CNBC) Jim Cramer picks stocks which he recommends to buy. We will use this data to build a portfolio

Backtesting the "Cramer Effect" & Recommendations from Cramer Recommendations from Cramer: On the show Mad-Money (CNBC) Jim Cramer picks stocks which

Gábor Vecsei 12 Aug 30, 2022
Sample code for Harry's Airflow online trainng course

Sample code for Harry's Airflow online trainng course You can find the videos on youtube or bilibili. I am working on adding below things: the slide p

102 Dec 30, 2022
CubingB is a timer/analyzer for speedsolving Rubik's cubes, with smart cube support

CubingB is a timer/analyzer for speedsolving Rubik's cubes (and related puzzles). It focuses on supporting "smart cubes" (i.e. bluetooth cubes) for recording the exact moves of a solve in real time.

Zach Wegner 5 Sep 18, 2022
Data Analytics on Genomes and Genetics

Data Analytics performed on On genomes and Genetics dataset to predict genetic disorder and disorder subclass. DONE by TEAM SIGMA!

1 Jan 12, 2022
Pizza Orders Data Pipeline Usecase Solved by SQL, Sqoop, HDFS, Hive, Airflow.

PizzaOrders_DataPipeline There is a Tony who is owning a New Pizza shop. He knew that pizza alone was not going to help him get seed funding to expand

Melwin Varghese P 4 Jun 05, 2022
Semi-Automated Data Processing

Perform semi automated exploratory data analysis, feature engineering and feature selection on provided dataset by visualizing every possibilities on each step and assisting the user to make a meanin

Arun Singh Babal 1 Jan 17, 2022