Helper tools for constructing probability distributions from expert-elicited data, for use in Monte Carlo simulations.

Last update: Nov 04, 2022

Elicited

Credit to Brett Hoover, packaging by @magoo

Usage

pip install elicited

import elicited as e

elicited is just a helper tool when using numpy and scipy, so you'll need these in your code.

import numpy as np
from scipy.stats import poisson, zipf, beta, pareto, lognorm

Lognormal

See Occurrence and Applications for examples of lognormal distributions in nature.

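Expert: Most customers hold around $20K (val_mod), but I could imagine a customer with $2.5M (val_max).

val_mod = 20000    # most likely value (mode)
val_max = 2500000  # plausible extreme; named val_max to avoid shadowing the built-in max()

mean, stdv = e.elicitLogNormal(val_mod, val_max)
# scipy's lognorm takes s = sigma and scale = exp(mu) of the underlying normal
asset_values = lognorm(s=stdv, scale=np.exp(mean))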
asset_values.rvs(100)
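To make the step from the expert's two numbers to lognormal parameters more concrete, here is a minimal sketch that uses only numpy and scipy. It assumes the stated maximum is read as the 95th percentile; the helper name `lognormal_from_mode_and_max` and the `quantile` parameter are hypothetical, and elicited's own elicitLogNormal may use a different rule.

```python
import numpy as np
from scipy.stats import lognorm, norm


def lognormal_from_mode_and_max(val_mod, val_max, quantile=0.95):
    """Hypothetical helper: solve for (mu, sigma) of the underlying normal.

    Uses two facts about the lognormal:
      mode          = exp(mu - sigma**2)
      ppf(quantile) = exp(mu + z * sigma), with z = norm.ppf(quantile)
    Treating val_max as the `quantile`-th percentile is an assumption.
    """
    z = norm.ppf(quantile)
    log_ratio = np.log(val_max / val_mod)
    # Subtracting the two equations gives sigma**2 + z*sigma - log_ratio = 0;
    # keep the positive root.
    sigma = (-z + np.sqrt(z**2 + 4.0 * log_ratio)) / 2.0
    mu = np.log(val_mod) + sigma**2
    return mu, sigma


mu, sigma = lognormal_from_mode_and_max(20000, 2500000)
sketch_values = lognorm(s=sigma, scale=np.exp(mu))

implied_mode = np.exp(mu - sigma**2)   # recovers the elicited mode
implied_max = sketch_values.ppf(0.95)  # recovers the elicited max
```

Checking that the fitted distribution reproduces the elicited mode and maximum is a cheap sanity test before using it in a simulation.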

Pareto

The 80/20 rule. See Occurrence and Applications.

Expert: The legal costs of an incident could be devastating. Typically costs are almost zero (val_min) but a black swan could be $100M (val_max).

val_min, val_max = 1, 100000000  # val_min = 1 is an assumed stand-in for "almost zero"

b = e.elicitPareto(val_min, val_max)
p = pareto(b, loc=val_min-1., scale=1.)
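As a rough illustration of how a shape parameter could be derived for the `loc=val_min-1, scale=1` parameterization used above, the sketch below solves for the value of `b` that places a chosen quantile at the expert's maximum. The 95th-percentile choice, the helper name `pareto_shape_from_range`, and `val_min = 1` are assumptions for illustration; this is not necessarily how elicitPareto works.

```python
import numpy as np
from scipy.stats import pareto


def pareto_shape_from_range(val_min, val_max, quantile=0.95):
    """Hypothetical helper: shape b so that ppf(quantile) == val_max.

    With loc = val_min - 1 and scale = 1, scipy's pareto has
    ppf(q) = loc + (1 - q) ** (-1 / b), and its support starts at val_min.
    """
    return -np.log(1.0 - quantile) / np.log(val_max - val_min + 1.0)


val_min, val_max = 1, 100000000  # val_min = 1 is an assumed illustrative value
b_sketch = pareto_shape_from_range(val_min, val_max)
legal_costs = pareto(b_sketch, loc=val_min - 1.0, scale=1.0)

lowest_cost = legal_costs.ppf(0.0)    # left edge of the support
extreme_cost = legal_costs.ppf(0.95)  # matches val_max by construction
```

A smaller `quantile` makes the tail heavier, so values beyond the expert's maximum appear more often in the samples.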

PERT

See PERT Distribution

Expert: Our customers have anywhere from $500-$6000 (val_min / val_max), but it's most typically around $4500 (val_mod)

val_min, val_mod, val_max = 500, 4500, 6000

PERT_a, PERT_b = e.elicitPERT(val_min, val_mod, val_max)
pert = beta(PERT_a, PERT_b, loc=val_min, scale=val_max-val_min)
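For reference, the classic PERT distribution is a beta distribution rescaled to the interval from val_min to val_max. The sketch below uses the textbook formulas with a weight of 4 on the mode; the helper name `pert_to_beta` and the `lam` parameter are hypothetical, and it is an assumption that elicitPERT follows the same convention.

```python
from scipy.stats import beta


def pert_to_beta(val_min, val_mod, val_max, lam=4.0):
    """Hypothetical helper: classic PERT shape parameters (lam = 4)."""
    span = val_max - val_min
    a = 1.0 + lam * (val_mod - val_min) / span
    b = 1.0 + lam * (val_max - val_mod) / span
    return a, b


val_min, val_mod, val_max = 500, 4500, 6000
a, b = pert_to_beta(val_min, val_mod, val_max)
pert_sketch = beta(a, b, loc=val_min, scale=val_max - val_min)

# Mode of the rescaled beta: val_min + span * (a - 1) / (a + b - 2)
implied_mode = val_min + (val_max - val_min) * (a - 1.0) / (a + b - 2.0)
# For the classic PERT this equals (val_min + 4 * val_mod + val_max) / 6
implied_mean = pert_sketch.mean()
```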

Zipf's

See Applications

Expert: If we get sued, there will usually be only a few litigants (nMin). Very rarely, maybe once in every thousand cases (pMax), there could be 30 or more (nMax).

nMin = 1
nMax = 30
pMax = 1/1000

Zs = e.elicitZipf(nMin, nMax, pMax, report=True)

litigants = zipf(Zs, nMin-1)

litigants.rvs(100)
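One way to picture what such a fit involves: search for the Zipf exponent whose tail probability at nMax equals pMax. The sketch below does this with a root finder, assuming nMin = 1 and reading pMax as P(X >= nMax). The helper `zipf_tail` and the search bracket are hypothetical choices, and elicitZipf may define the target differently.

```python
from scipy.optimize import brentq
from scipy.special import zeta
from scipy.stats import zipf


def zipf_tail(a, n):
    """P(X >= n) for a Zipf distribution with exponent a on 1, 2, 3, ..."""
    # zeta(a, n) is the Hurwitz zeta function: the sum of k**-a for k >= n.
    return zeta(a, n) / zeta(a, 1)


nMin, nMax, pMax = 1, 30, 1 / 1000

# Find the exponent whose tail probability at nMax equals pMax.
# The bracket [1.01, 10] is an assumption that holds for these inputs.
a_sketch = brentq(lambda a: zipf_tail(a, nMax) - pMax, 1.01, 10.0)

litigants_sketch = zipf(a_sketch, nMin - 1)
tail_check = litigants_sketch.sf(nMax - 1)  # P(X > nMax - 1) = P(X >= nMax)
```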

Reference: Other Useful Elicitations

These distributions are listed as a courtesy; they are simple enough to parameterize directly from elicited data without a helper function.

Uniform

A "zero knowledge" distribution where all values within the range are equally likely. Similar to random.randint(a, b), except that scipy's uniform is continuous and returns floats, so random.uniform(a, b) is the closer analogue.

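Expert: The crowd will be between 50 (val_min) and 500 (val_max) due to fire code restrictions and the existing residents in the building.

from scipy.stats import uniform

# named val_min / val_max to avoid shadowing the built-ins min(), max() and range()
val_min = 50
val_max = 500

# scipy's uniform takes loc (lower bound) and scale (width of the range)
crowd_size = uniform(loc=val_min, scale=val_max - val_min)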
crowd_size.rvs(100)

Poisson

Expert: About 3000 customers (average) add a credit card to their account every quarter.

from scipy.stats import poisson
average = 3000
upsells = poisson(average)
upsells.rvs(100)
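Since these frozen distributions are meant to feed Monte Carlo simulations, here is a minimal sketch of drawing repeatable samples and summarizing them. The seed, the trial count, and the 5th/50th/95th percentile summary are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(seed=42)  # arbitrary seed, only for repeatability

average = 3000
upsells = poisson(average)

# Draw many simulated quarters and summarize the spread of outcomes.
n_trials = 10000
samples = upsells.rvs(n_trials, random_state=rng)
low, median, high = np.percentile(samples, [5, 50, 95])
```

Passing `random_state` makes a run reproducible, which helps when comparing the effect of different elicited inputs.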

Owner

Ryan McGeehan
