PySpark ML Bank Churn Prediction

Overview

PySpark-Bank-Churn

  • Surname: the customer's surname; it has no effect on the output.
  • CreditScore: contains random values and has no effect on a customer leaving the bank.
  • Geography: a customer’s location can affect their decision to leave the bank.
  • Gender: it’s interesting to explore whether gender plays a role in a customer leaving the bank.
  • Age: this is certainly relevant, since older customers are less likely to leave their bank than younger ones.
  • Tenure: refers to the number of years that the customer has been a client of the bank. Normally, older clients are more loyal and less likely to leave a bank.
  • NumOfProducts: refers to the number of products that a customer has purchased through the bank.
  • HasCrCard: denotes whether or not a customer has a credit card. This column is also relevant, since people with a credit card are less likely to leave the bank.
  • IsActiveMember: active customers are less likely to leave the bank.
  • Balance: also a very good indicator of customer churn, as people with a higher balance in their accounts are less likely to leave the bank compared to those with lower balances.
  • EstimatedSalary: as with balance, people with lower salaries are more likely to leave the bank compared to those with higher salaries.
  • Exited (dependent variable): whether or not the customer left the bank.
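
The column list above could be turned into a PySpark ML pipeline roughly as follows. This is a minimal sketch, not the repository's actual code: the grouping of columns into categorical and numeric, the choice of logistic regression, and the helper name `build_pipeline` are all assumptions made for illustration. Surname is left out because the list says it has no effect; every other column is kept as a candidate feature.

```python
# Column roles follow the feature list above; the grouping itself is an assumption.
LABEL_COL = "Exited"
DROPPED_COLS = ["Surname"]  # described above as having no effect on the output
CATEGORICAL_COLS = ["Geography", "Gender"]
NUMERIC_COLS = [
    "CreditScore", "Age", "Tenure", "Balance",
    "NumOfProducts", "HasCrCard", "IsActiveMember", "EstimatedSalary",
]


def build_pipeline():
    """Return an unfitted pyspark.ml Pipeline (hypothetical helper).

    Call this only after a SparkSession exists, because the PySpark ML
    stages are backed by JVM objects.
    """
    # Imported lazily so the column definitions can be used without Spark.
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import OneHotEncoder, StringIndexer, VectorAssembler

    # Map each string category to a numeric index, tolerating unseen values.
    indexers = [
        StringIndexer(inputCol=c, outputCol=f"{c}_idx", handleInvalid="keep")
        for c in CATEGORICAL_COLS
    ]
    # One-hot encode the indexed categories.
    encoder = OneHotEncoder(
        inputCols=[f"{c}_idx" for c in CATEGORICAL_COLS],
        outputCols=[f"{c}_vec" for c in CATEGORICAL_COLS],
    )
    # Combine numeric and encoded columns into a single feature vector.
    assembler = VectorAssembler(
        inputCols=NUMERIC_COLS + [f"{c}_vec" for c in CATEGORICAL_COLS],
        outputCol="features",
    )
    # Logistic regression is an arbitrary baseline choice for this sketch.
    classifier = LogisticRegression(featuresCol="features", labelCol=LABEL_COL)
    return Pipeline(stages=[*indexers, encoder, assembler, classifier])
```

With a DataFrame `df` that holds these columns, one would typically split it with `df.randomSplit([0.8, 0.2], seed=42)`, call `build_pipeline().fit(train)`, and then `transform(test)` to obtain predictions; the split ratio and seed here are placeholders.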

Acknowledgements

As we know, it is much more expensive to sign up a new client than to keep an existing one.

It is advantageous for banks to know what leads a client towards the decision to leave the company.

Churn prevention allows companies to develop loyalty programs and retention campaigns to keep as many customers as possible.

Owner
kemalgunay
Ph.D | Data Science Researcher