Tools for calculating and visualizing Elo-like ratings of MLB teams using Retosheet data

Overview

Overview

This project uses historical baseball games data to calculate an Elo-like rating for MLB teams based on regular season match ups. The Elo rating system was originally developed for ranking chess players but also can be applied to other individual and team sports. In particular it works well for measuring performance of baseball teams because of the large number of games played per season.

Wikipedia has more technical details on the rating system: https://en.m.wikipedia.org/wiki/Elo_rating_system

A similar analysis was done by 538: https://projects.fivethirtyeight.com/complete-history-of-mlb/

This repo consists of three primary pieces of code:

  • Parser.py imports and cleans the game-level data. It performs basic calculations, such as determining game winners.
  • Elo.py calculates Elo ratings for a given set of games data
  • Visualization.R plots the output in a variety of interesting ways

Data

Source: https://www.retrosheet.org/gamelogs/index.html

This data was compiled by Retosheet.org. The only stipulation for use of this data is prominent display of this statement:

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at "www.retrosheet.org".

The formats of the data can be found in the Formats.py module. The Parser.py script imports the annual files, applies some cleaning and formatting, and stores the consolidated data in a SQLite database for later use.

Examples

This graph shows the distribution of each team's Elo rating over the course of the decade 2010-2019. The ratings are weighted by in-season days.

Distribution of Elo Ratings

This graph shows the progress of the 5 teams in the AL West over the 5-year period starting in 2014.

American League West Ratings 2014 - 2019

Each team was also ranked by their Elo relative to the rest of the league on each day. The graph below shows the time spent at each rank for the 5 teams in the AL West. The gradient indicates the years that those ranks occured.

American League West Ranks 2010 - 2019

Future improvements

  • Finish tuning parameters for best model performance
  • Expand analysis to earlier time periods (requires handling of teams entering/leaving league)
Owner
Lukas Owens
Lukas Owens
A programming language built on top of Python to easily allow Swahili speakers to get started with programming without ever knowing English

pyswahili A programming language built over Python to easily allow swahili speakers to get started with programming without ever knowing english pyswa

Jordan Kalebu 72 Dec 15, 2022
Ana's Portfolio

Ana's Portfolio ✌️ Welcome to my Portfolio! You will find here different Projects I have worked on (from scratch) 💪 Projects 💻 1️⃣ Hangman game (Mad

Ana Katherine Cortes Sobrino 9 Mar 15, 2022
A minimalistic wrapper around PyOpenGL to save development time

glpy glpy is pyOpenGl wrapper which lets you work with pyOpenGl easily.It is not meant to be a replacement for pyOpenGl but runs on top of pyOpenGl to

Abhinav 9 Apr 02, 2022
Visualize data of Vietnam's regions with interactive maps.

Plotting Vietnam Development Map This is my personal project that I use plotly to analyse and visualize data of Vietnam's regions with interactive map

1 Jun 26, 2022
The plottify package is makes matplotlib plots more legible

plottify The plottify package is makes matplotlib plots more legible. It's a thin wrapper around matplotlib that automatically adjusts font sizes, sca

Andy Jones 97 Nov 04, 2022
Python wrapper for Synoptic Data API. Retrieve data from thousands of mesonet stations and networks. Returns JSON from Synoptic as Pandas DataFrame

☁ Synoptic API for Python (unofficial) The Synoptic Mesonet API (formerly MesoWest) gives you access to real-time and historical surface-based weather

Brian Blaylock 23 Jan 06, 2023
Colormaps for astronomers

cmastro: colormaps for astronomers 🔭 This package contains custom colormaps that have been used in various astronomical applications, similar to cmoc

Adrian Price-Whelan 12 Oct 11, 2022
Draw tree diagrams from indented text input

Draw tree diagrams This repository contains two very different scripts to produce hierarchical tree diagrams like this one: $ ./classtree.py collectio

Luciano Ramalho 8 Dec 14, 2022
A simple agent-based model used to teach the basics of OOP in my lectures

Pydemic A simple agent-based model of a pandemic. This is used to teach basic principles of object-oriented programming to master students. It is not

Fabien Maussion 2 Jun 08, 2022
Missing data visualization module for Python.

missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha

Aleksey Bilogur 3.4k Dec 29, 2022
A guide for using Bootstrap 5 classes in Dash Bootstrap Components V1

dash-bootstrap-cheatsheet This handy interactive cheatsheet makes it easy to use the Bootstrap 5 classes with your Dash app made with the latest versi

10 Dec 22, 2022
Rockstar - Makes you a Rockstar C++ Programmer in 2 minutes

Rockstar Rockstar is one amazing library, which will make you a Rockstar Programmer in just 2 minutes. In last decade, people learned C++ in 21 days.

4k Jan 05, 2023
Python toolkit for defining+simulating+visualizing+analyzing attractors, dynamical systems, iterated function systems, roulette curves, and more

Attractors A small module that provides functions and classes for very efficient simulation and rendering of iterated function systems; dynamical syst

1 Aug 04, 2021
Gallery of applications built using bqplot and widget libraries like ipywidgets, ipydatagrid etc.

bqplot Gallery This is a gallery of bqplot examples. View the gallery at https://bqplot.github.io/bqplot-gallery. Contributing new examples Clone this

8 Aug 23, 2022
Homework 2: Matplotlib and Data Visualization

Homework 2: Matplotlib and Data Visualization Overview These data visualizations were created for my introductory computer science course using Python

Sophia Huang 12 Oct 20, 2022
PanGraphViewer -- show panenome graph in an easy way

PanGraphViewer -- show panenome graph in an easy way Table of Contents Versions and dependences Desktop-based panGraphViewer Library installation for

16 Dec 17, 2022
A shimmer pre-load component for Plotly Dash

dash-loading-shimmer A shimmer pre-load component for Plotly Dash Installation Get it with pip: pip install dash-loading-extras Or maybe you prefer Pi

Lucas Durand 4 Oct 12, 2022
Lightweight data validation and adaptation Python library.

Valideer Lightweight data validation and adaptation library for Python. At a Glance: Supports both validation (check if a value is valid) and adaptati

Podio 258 Nov 22, 2022
EPViz is a tool to aid researchers in developing, validating, and reporting their predictive modeling outputs.

EPViz (EEG Prediction Visualizer) EPViz is a tool to aid researchers in developing, validating, and reporting their predictive modeling outputs. A lig

Jeff 2 Oct 19, 2022
Visualize your pandas data with one-line code

PandasEcharts 简介 基于pandas和pyecharts的可视化工具 安装 pip 安装 $ pip install pandasecharts 源码安装 $ git clone https://github.com/gamersover/pandasecharts $ cd pand

陈华杰 2 Apr 13, 2022