Jupyter notebooks for the book "The Elements of Statistical Learning".

Owner

Madiyar

Pypeln is a simple yet powerful Python library for creating concurrent data pipelines.

Pypeln Pypeln (pronounced as "pypeline") is a simple yet powerful Python library for creating concurrent data pipelines. Main Features Simple: Pypeln

1.4k Dec 31, 2022

Retail-Sim is a Python package for easily creating a synthetic dataset of a retail store.

Retailer's Sale Data Simulation Retail-Sim is a Python package for easily creating a synthetic dataset of a retail store. Simulation Model Simulator consists

7 Sep 30, 2022

An orchestration platform for the development, production, and observation of data assets.

Dagster An orchestration platform for the development, production, and observation of data assets. Dagster lets you define jobs in terms of the data f

6.2k Jan 08, 2023

Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.

weightedcalcs weightedcalcs is a pandas-based Python library for calculating weighted means, medians, standard deviations, and more. Features Plays we

98 Dec 31, 2022

Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation with customer segments in Python.

What is Retentioneering? Retentioneering is a Python framework and library to assist product analysts and marketing analysts as it makes it easier to

581 Jan 07, 2023

INF42 - Topological Data Analysis

TDA INF421 (Design and Analysis of Algorithms) Project: Topological Data Analysis SphereMin Given a point cloud, this program contains

2 Jan 07, 2022

A variant of LinUCB bandit algorithm with local differential privacy guarantee

Contents LDP LinUCB Description Model Architecture Dataset Environment Requirements Script Description Script and Sample Code Script Parameters Launch

4 Oct 25, 2022

An Integrated Experimental Platform for time series data anomaly detection.

Curve We are sorry to tell contributors and users that we have decided to archive the project temporarily because of the collaborators' work plans. There are no

486 Dec 21, 2022

The repo for mlbtradetrees.com. Analyze any trade in baseball history!

7 Nov 20, 2022

MIR Cheatsheet - Survival Guidebook for MIR Researchers in the Lab

3 Jul 02, 2022

Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions.

About Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions. The tool provides rich data and a summary g

9 Nov 16, 2022

My solution to the book A Collection of Data Science Take-Home Challenges

DS-Take-Home Solution to the book "A Collection of Data Science Take-Home Challenges". Note: Please don't contact me for the dataset. This repository

1.5k Jan 03, 2023

Nobel Data Analysis

Nobel_Data_Analysis This project is for analyzing a set of data about people who have won the Nobel Prize in different fields and different countries

1 Jan 24, 2022

Example Of Splunk Search Query With Python And Splunk Python SDK

SSQAuto (Splunk Search Query Automation) Example Of Splunk Search Query With Python And Splunk Python SDK installation: ➜ ~ git clone https://github.c

1 Nov 14, 2021

Show you how to integrate Zeppelin with Airflow

Introduction This repository shows you how to integrate Zeppelin with Airflow. The philosophy behind the integration is to make the transition f

11 Dec 30, 2022

Analyzing Covid-19 Outbreaks in Ontario

My group and I took Covid-19 outbreak statistics from Ontario and analyzed them to find patterns and make predictions about the virus.

0 Jan 20, 2022

Building house price data pipelines with Apache Beam and Spark on GCP

This project covers the full process, from building a web crawler that extracts raw house price data to creating ETL pipelines with Google Cloud Platform services.

1 Nov 22, 2021

Python scripts aim to use a Random Forest machine learning algorithm to predict the water affinity of Metal-Organic Frameworks

The following Python scripts aim to use a Random Forest machine learning algorithm to predict the water affinity of Metal-Organic Frameworks (MOFs). The training set is extracted from the Cambridge S

1 Jan 09, 2022

Snakemake workflow for converting FASTQ files to self-contained CRAM files with maximum lossless compression.

Snakemake workflow: name A Snakemake workflow for description Usage The usage of this workflow is described in the Snakemake Workflow Catalog. If

1 Dec 16, 2021

PyIOmica (pyiomica) is a Python package for omics analyses.

PyIOmica (pyiomica) This repository contains PyIOmica, a Python package that provides bioinformatics utilities for analyzing (dynamic) omics datasets.

13 Jun 29, 2022
