ForecastGA is a Python tool to forecast Google Analytics data using several popular time series models.

Overview

ForecastGA

A Python tool to forecast GA data using several popular time series models.

Open In Colab

Logo for ForecastGA

About

Welcome to ForecastGA

ForecastGA is a tool that combines a couple of popular libraries, Atspy and googleanalytics, with a few enhancements.

  • The models are made more intuitive to upgrade and add by having the tool logic separate from the model training and prediction.
  • When calling am.forecast_insample(), any kwargs included (e.g. learning_rate) are passed to the train method of the model.
  • Google Analytics profiles are specified by simply passing the URL (e.g. https://analytics.google.com/analytics/web/?authuser=2#/report-home/aXXXXXwXXXXXpXXXXXX).
  • You can provide a data dict with GA config options or a Pandas Series as the input data.
  • Multiple log levels.
  • Auto GPU detection (via Torch).
  • List all available models, with descriptions, by calling forecastga.print_model_info().
  • Google API info can be passed in the data dict or uploaded as a JSON file named identity.json.
  • Created a companion Google Colab notebook to easily run on GPU.
  • A handy plot function for Colab, forecastga.plot_colab(forecast_in, title="Insample Forecast", dark_mode=True) that formats nicely and also handles Dark Mode!

Models Available

  • ARIMA : Automated ARIMA Modelling
  • Prophet : Modeling Multiple Seasonality With Linear or Non-linear Growth
  • ProphetBC : Prophet Model with Box-Cox transform of the data
  • HWAAS : Exponential Smoothing With Additive Trend and Additive Seasonality
  • HWAMS : Exponential Smoothing with Additive Trend and Multiplicative Seasonality
  • NBEATS : Neural basis expansion analysis (now fixed at 20 Epochs)
  • Gluonts : RNN-based Model (now fixed at 20 Epochs)
  • TATS : Seasonal and Trend no Box Cox
  • TBAT : Trend and Box Cox
  • TBATS1 : Trend, Seasonal (one), and Box Cox
  • TBATP1 : TBATS1 but Seasonal Inference is Hardcoded by Periodicity
  • TBATS2 : TBATS1 With Two Seasonal Periods

How To Use

Find Model Info:

forecastga.print_model_info()

Initialize Model:

Google Analytics:
data = { 'client_id': '',
         'client_secret': '',
         'identity': '',
         'ga_start_date': '2018-01-01',
         'ga_end_date': '2019-12-31',
         'ga_metric': 'sessions',
         'ga_segment': 'organic traffic',
         'ga_url': 'https://analytics.google.com/analytics/web/?authuser=2#/report-home/aXXXXXwXXXXXpXXXXXX',
         'omit_values_over': 2000000
        }

model_list = ["TATS", "TBATS1", "TBATP1", "TBATS2", "ARIMA"]
am = forecastga.AutomatedModel(data , model_list=model_list, forecast_len=30 )
Pandas DataFrame:
# CSV with columns: Date and Sessions
df = pd.read_csv('ga_sessions.csv')
df.Date = pd.to_datetime(df.Date)
df = df.set_index("Date")
data = df.Sessions

model_list = ["TATS", "TBATS1", "TBATP1", "TBATS2", "ARIMA"]
am = forecastga.AutomatedModel(data , model_list=model_list, forecast_len=30 )

Forecast Insample:

forecast_in, performance = am.forecast_insample()

Forecast Outsample:

forecast_out = am.forecast_outsample()

Ensemble Performance:

all_ensemble_in, all_ensemble_out, all_performance = am.ensemble(forecast_in, forecast_out)

Pretty Plot in Google Colab

forecastga.plot_colab(forecast_in, title="Insample Forecast", dark_mode=True)

Installation

Windows users may need to manually install the two items below via conda :

  1. conda install pystan
  2. conda install pytorch -c pytorch
  3. !pip install --upgrade git+https://github.com/jroakes/ForecastGA.git

otherwise, pip install --upgrade forecastga

This repo support GPU training. Below are a few libraries that may have to be manually installed to support.

pip install --upgrade mxnet-cu101
pip install --upgrade torch 1.7.0+cu101

Acknowledgements

  1. Majority of forecasting code taken from https://github.com/firmai/atspy and refactored heavily.
  2. Google Analytics based off of: https://github.com/debrouwere/google-analytics
  3. Thanks to richardfergie for the addition of the Prophet Box-Cox model to control negative predictions.

Contribute

The goal of this repo is to grow the list of available models to test. If you would like to contribute one please read on. Feel free to have fun naming your models.

  1. Fork the repo.
  2. In the /src/forecastga/models folder there is a model called template.py. You can use this as a template for creating your new model. All available variables are there. Forecastga ensures each model has the right data and calls only the train and forecast methods for each model. Feel free to add additional methods that your model requires.
  3. Edit the /src/forecastga/models/__init__.py file to add your model's information. Follow the format of the other entries. Forecastga relies on loc to find the model and class to find the class to use.
  4. Edit requirments.txt with any additional libraries needed to run your model. Keep in mind that this repo should support GPU training if available and some libraries have separate GPU-enabled versions.
  5. Issue a pull request.

If you enjoyed this tool consider buying me some beer at: Paypalme

Owner
JR Oakes
Hacker, SEO, NC State fan, co-organizer of Raleigh and RTP Meetups, as well as @sengineland author. Tweets are my own.
JR Oakes
The Master's in Data Science Program run by the Faculty of Mathematics and Information Science

The Master's in Data Science Program run by the Faculty of Mathematics and Information Science is among the first European programs in Data Science and is fully focused on data engineering and data a

Amir Ali 2 Jun 17, 2022
Flood modeling by 2D shallow water equation

hydraulicmodel Flood modeling by 2D shallow water equation. Refer to Hunter et al (2005), Bates et al. (2010). Diffusive wave approximation Local iner

6 Nov 30, 2022
Evidence enables analysts to deliver a polished business intelligence system using SQL and markdown.

Evidence enables analysts to deliver a polished business intelligence system using SQL and markdown

915 Dec 26, 2022
Python Practicum - prepare for your Data Science interview or get a refresher.

Python-Practicum Python Practicum - prepare for your Data Science interview or get a refresher. Data Data visualization using data on births from the

Jovan Trajceski 1 Jul 27, 2021
PyClustering is a Python, C++ data mining library.

pyclustering is a Python, C++ data mining library (clustering algorithm, oscillatory networks, neural networks). The library provides Python and C++ implementations (C++ pyclustering library) of each

Andrei Novikov 1k Jan 05, 2023
A collection of robust and fast processing tools for parsing and analyzing web archive data.

ChatNoir Resiliparse A collection of robust and fast processing tools for parsing and analyzing web archive data. Resiliparse is part of the ChatNoir

ChatNoir 24 Nov 29, 2022
PyChemia, Python Framework for Materials Discovery and Design

PyChemia, Python Framework for Materials Discovery and Design PyChemia is an open-source Python Library for materials structural search. The purpose o

Materials Discovery Group 61 Oct 02, 2022
Candlestick Pattern Recognition with Python and TA-Lib

Candlestick-Pattern-Recognition-with-Python-and-TA-Lib Goal Look at the S&P500 to try and get a better understanding of these candlestick patterns and

Ganesh Jainarain 11 Oct 07, 2022
Airflow ETL With EKS EFS Sagemaker

Airflow ETL With EKS EFS & Sagemaker (en desarrollo) Diagrama de la solución Imp

1 Feb 14, 2022
Lale is a Python library for semi-automated data science.

Lale is a Python library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-

International Business Machines 293 Dec 29, 2022
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python

Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊

Thomas 2 May 26, 2022
This cosmetics generator allows you to generate the new Fortnite cosmetics, Search pak and search cosmetics!

COSMETICS GENERATOR This cosmetics generator allows you to generate the new Fortnite cosmetics, Search pak and search cosmetics! Remember to put the l

ᴅᴊʟᴏʀ3xᴢᴏ 11 Dec 13, 2022
Python ELT Studio, an application for building ELT (and ETL) data flows.

The Python Extract, Load, Transform Studio is an application for performing ELT (and ETL) tasks. Under the hood the application consists of a two parts.

Schlerp 55 Nov 18, 2022
A computer algebra system written in pure Python

SymPy See the AUTHORS file for the list of authors. And many more people helped on the SymPy mailing list, reported bugs, helped organize SymPy's part

SymPy 9.9k Dec 31, 2022
This mini project showcase how to build and debug Apache Spark application using Python

Spark app can't be debugged using normal procedure. This mini project showcase how to build and debug Apache Spark application using Python programming language. There are also options to run Spark a

Denny Imanuel 1 Dec 29, 2021
API>local_db>AWS_RDS - Disclaimer! All data used is for educational purposes only.

APIlocal_dbAWS_RDS Disclaimer! All data used is for educational purposes only. ETL pipeline diagram. Aim of project By creating a fully working pipe

0 Apr 25, 2022
statDistros is a Python library for dealing with various statistical distributions

StatisticalDistributions statDistros statDistros is a Python library for dealing with various statistical distributions. Now it provides various stati

1 Oct 03, 2021
pipeline for migrating lichess data into postgresql

How Long Does It Take Ordinary People To "Get Good" At Chess? TL;DR: According to 5.5 years of data from 2.3 million players and 450 million games, mo

Joseph Wong 182 Nov 11, 2022
A model checker for verifying properties in epistemic models

Epistemic Model Checker This is a model checker for verifying properties in epistemic models. The goal of the model checker is to check for Pluralisti

Thomas Träff 2 Dec 22, 2021
Common bioinformatics database construction

biodb Common bioinformatics database construction 1.taxonomy (Substance classification database) Download the database wget -c https://ftp.ncbi.nlm.ni

sy520 2 Jan 04, 2022