A classification model capable of accurately predicting the price of secondhand cars

Overview

Title: Secondhand-Car-Price-Predictor

Project Status: [Completed]

Project Intro/Objective

The purpose of this project is to create a classification model capable of accurately predicting the price of secondhand cars. The data used for model building is open source and has been added to this repository. Most of the packages used come pre-installed in common development environments and tools such as Colab and Jupyter. This can be useful for people looking to improve the way they code their predictive models and to find efficient ways to deal with tabular data!

Methods Used

  • Inferential Statistics
  • Machine Learning
  • Feature Engineering
  • Predictive Modeling
  • Deep Learning
  • Data Visualization
  • Classification

Technologies

  • Python
  • Pandas, TensorFlow, scikit-learn
  • Colab

Project Description

  • This notebook is based on an open-source dataset available on www.kaggle.com, where I have created models to predict the selling price of secondhand cars on the basis of various parameters and attributes! The best score was 92.57%, with the best MSE being around 4900.
  • All models can be improved with more stringent hyper-parameter tuning. This can be achieved by random selection (random search), brute-force methods (exhaustive grid search), etc. Various other classifiers can also be used, but only the most standard ones have been considered in this notebook.
  • Recommended standard practices for data transformation, outlier detection, and null value substitution have been incorporated in this notebook.
  • Visualizations are included in the notebook to explain the importance and significance of certain parameters, and they can be easily understood by people from non-technical backgrounds. Various parameter tuning and scaling methods that helped me achieve better results are also shown!
  • This code has been upvoted by 10 people, including Kaggle Grandmasters (people highly recognised for their achievements in the data science community). I have received a bronze medal for my code in the community.
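
The workflow described in the bullets above (null value substitution, scaling, hyper-parameter tuning by random selection, and reporting a score alongside MSE) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's actual code: the data is synthetic, the model (`RandomForestRegressor`) and the parameter ranges are hypothetical choices, and the price is treated as a continuous target, which is consistent with MSE being reported.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the car dataset (assumption: the real features differ).
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

# Blank out roughly 5% of the values to mimic missing entries in tabular data.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Null substitution, scaling, and the model live in one pipeline so that the
# preprocessing is re-fitted on the training part of every cross-validation fold.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", RandomForestRegressor(random_state=0)),
])

# Hypothetical search space; the notebook's actual ranges are not stated.
param_distributions = {
    "model__n_estimators": [20, 50, 100],
    "model__max_depth": [3, 5, None],
    "model__min_samples_leaf": [1, 2, 4],
}

# Random selection: try 5 sampled combinations with 3-fold cross-validation.
search = RandomizedSearchCV(
    pipeline,
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="neg_mean_squared_error",
    random_state=0,
)
search.fit(X_train, y_train)

# Evaluate the refitted best pipeline on the held-out split.
predictions = search.predict(X_test)
test_r2 = r2_score(y_test, predictions)
test_mse = mean_squared_error(y_test, predictions)
print("best params:", search.best_params_)
print(f"R^2: {test_r2:.3f}  MSE: {test_mse:.1f}")
```

Swapping `RandomizedSearchCV` for `GridSearchCV` (with `param_grid` in place of `param_distributions` and no `n_iter`) gives the brute-force variant, at the cost of fitting every combination.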

Getting Started

One can simply download the notebook and dataset, open them in a platform such as Jupyter or Colab, and run each cell to see the results!

This Python 3 environment comes with many helpful analytics libraries installed. It is defined by the kaggle/python Docker image (https://github.com/kaggle/docker-python). For example, here are several helpful packages to load:

    import numpy as np  # linear algebra
    import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)

Input data files are available in the read-only "../input/" directory. For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory:

    import os

    for dirname, _, filenames in os.walk('/kaggle/input'):
        for filename in filenames:
            print(os.path.join(dirname, filename))
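
Once the files are listed, the dataset can be loaded with `pd.read_csv`. The sketch below is a hedged illustration rather than the notebook's code: `find_csv_files`, the file name `cars.csv`, and the column names are hypothetical, and a temporary directory stands in for `/kaggle/input` so the example runs outside Kaggle.

```python
import os
import tempfile

import pandas as pd


def find_csv_files(root):
    """Return the sorted paths of every .csv file under `root`."""
    paths = []
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower().endswith(".csv"):
                paths.append(os.path.join(dirname, filename))
    return sorted(paths)


# Demo on a throwaway directory; on Kaggle, root would be '/kaggle/input'.
with tempfile.TemporaryDirectory() as root:
    sample_path = os.path.join(root, "cars.csv")  # hypothetical file name
    with open(sample_path, "w") as handle:
        # Hypothetical columns; the second row has a missing km_driven value.
        handle.write("year,km_driven,selling_price\n")
        handle.write("2015,40000,350000\n")
        handle.write("2018,,520000\n")

    csv_paths = find_csv_files(root)
    cars = pd.read_csv(csv_paths[0])

# A quick look at the size and at missing values per column.
print(cars.shape)
print(cars.isna().sum())
```

Checking `isna().sum()` right after loading is a convenient first step before any null value substitution.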

Contact

Owner
Akarsh Singh
Data Scientist, Grad Student, Avid Researcher in the domains of ML, Deep Learning, and Stats. In a nutshell, I enjoy transforming data into valuable knowledge!