Churn prediction with PySpark

Last update: Aug 13, 2021

Related tags

Data Analysis Churn_Prediction

Overview

Churn Prediction

Objective

The goal is to develop a machine learning model that predicts which customers will leave the company.

About Dataset

The dataset consists of 10,000 observations and 12 variables.
The independent variables contain information about the customers.
The dependent variable indicates whether the customer has left (churned).

Variables

Surname – Customer surname
CreditScore – Customer's credit score
Geography – Country where the customer is located
Gender – Customer's gender
Age – Customer's age
Tenure – Number of years the person has been a customer
NumOfProducts – Number of bank products the customer uses
HasCrCard – Credit card status (0=No, 1=Yes)
IsActiveMember – Active membership status (0=No, 1=Yes)
EstimatedSalary – Customer's estimated salary
Exited – Whether the customer exited (0=No, 1=Yes)
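
The variables above could feed a Spark ML pipeline roughly as follows. This is a minimal sketch, not the repository's actual code: the column names come from the list above, while the CSV path argument, the choice of logistic regression, the 80/20 split, the helper names and the decision to drop Surname as a feature are all assumptions made for illustration.

```python
# Column names are taken from the variable list above. Everything else
# (model choice, split ratio, helper names) is an illustrative assumption.
CATEGORICAL_COLS = ["Geography", "Gender"]
NUMERIC_COLS = [
    "CreditScore", "Age", "Tenure", "NumOfProducts",
    "HasCrCard", "IsActiveMember", "EstimatedSalary",
]
LABEL_COL = "Exited"  # Surname is deliberately left out of the features


def indexed_name(col):
    """Name of the numeric column produced by indexing a categorical column."""
    return f"{col}_idx"


def feature_columns():
    """Columns fed to the assembler: numeric ones plus indexed categoricals."""
    return NUMERIC_COLS + [indexed_name(c) for c in CATEGORICAL_COLS]


def build_pipeline():
    """Build an unfitted Spark ML pipeline; needs an active SparkSession."""
    # Imported here so the column helpers above work without Spark installed.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import StringIndexer, VectorAssembler

    # Turn each string column into a numeric index column.
    indexers = [
        StringIndexer(inputCol=c, outputCol=indexed_name(c), handleInvalid="keep")
        for c in CATEGORICAL_COLS
    ]
    # Pack all feature columns into the single vector column Spark ML expects.
    assembler = VectorAssembler(inputCols=feature_columns(), outputCol="features")
    clf = LogisticRegression(featuresCol="features", labelCol=LABEL_COL)
    return Pipeline(stages=indexers + [assembler, clf])


def train_and_evaluate(csv_path):
    """Fit on about 80% of the rows; return area under ROC on the rest."""
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("churn-sketch").getOrCreate()
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    train, test = df.randomSplit([0.8, 0.2], seed=42)
    model = build_pipeline().fit(train)
    predictions = model.transform(test)
    evaluator = BinaryClassificationEvaluator(
        labelCol=LABEL_COL, metricName="areaUnderROC"
    )
    return evaluator.evaluate(predictions)
```

Calling train_and_evaluate with the path to the dataset's CSV file would return a single score between 0 and 1. Logistic regression is used here only because it is a simple baseline; any other Spark ML classifier could be swapped into the last pipeline stage.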

Python Package for DataHerb: create, search, and load datasets.

The Python Package for DataHerb: a DataHerb core service to create and load datasets.

4 Feb 11, 2022

Feature Detection Based Template Matching

Feature Detection Based Template Matching The classification of the photos was made using the OpenCv template Matching method. Installation Use the pa

2 Nov 18, 2021

This tool parses log data and allows to define analysis pipelines for anomaly detection.

logdata-anomaly-miner This tool parses log data and allows to define analysis pipelines for anomaly detection. It was designed to run the analysis wit

32 Nov 27, 2022

MIR Cheatsheet - Survival Guidebook for MIR Researchers in the Lab

3 Jul 02, 2022

Big Data & Cloud Computing for Oceanography

DS2 Class 2022, Big Data & Cloud Computing for Oceanography Home of the 2022 ISblue Big Data & Cloud Computing for Oceanography class (IMT-A, ENSTA, I

5 Mar 19, 2022

A data parser for the internal syncing data format used by Fog of World.

A data parser for the internal syncing data format used by Fog of World. The parser is not designed to be a well-coded library with good performance, it is more like a demo for showing the data struc

40 Dec 12, 2022

A lightweight, hub-and-spoke dashboard for multi-account Data Science projects

A lightweight, hub-and-spoke dashboard for cross-account Data Science Projects Introduction Modern Data Science environments often involve many indepe

3 Oct 30, 2021

InDels analysis of CRISPR lines by NGS amplicon sequencing technology for a multicopy gene family.

CRISPRanalysis InDels analysis of CRISPR lines by NGS amplicon sequencing technology for a multicopy gene family. In this work, we present a workflow

2 Jan 31, 2022

cLoops2: full stack analysis tool for chromatin interactions

cLoops2: full stack analysis tool for chromatin interactions Introduction cLoops2 is an extension of our previous work, cLoops. From loop-calling base

25 Dec 14, 2022

Data Science Environment Setup in single line

datascienv is a package that helps you set up your environment in a single line of code with all dependencies. It also includes pyforest, which provides a single-line import of all required ML libraries.

55 Dec 16, 2022

A python package which can be pip installed to perform statistics and visualize binomial and gaussian distributions of the dataset

GBiStat package A python package to assist programmers with data analysis. This package could be used to plot : Binomial Distribution of the dataset p

4 Oct 17, 2022

Fit models to your data in Python with Sherpa.

My first Python project is a simple Mad Libs program.

Elasticsearch tool for easily collecting and batch inserting Python data and pandas DataFrames

Airflow ETL With EKS EFS Sagemaker

Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

Very basic but functional Kakuro solver written in Python.

INFO-H515 - Big Data Scalable Analytics

An Integrated Experimental Platform for time series data anomaly detection.

Tkinter Izhikevich Neuron Model With Python