take home quiz

Overview

guess the correlation

data inspection

a pretty normal distribution

dist

train/val/test split

splitting amount

.dataset:                150000 instances
├─80%─├─80%─training      96000 instances
│     └─20%─validation    24000 instances
├─20%─testing             30000 instances

after a rough glance at the dataset distribution, considered the dataset is pretty normal distributed and has enough instances to keep the variance low after 80/20 splitting.

splitting method

def _split_dataset(self, split, training=True):
    if split == 0.0:
        return None, None

    # self.correlations_frame = pd.read_csv('path/to/csv_file')
    n_samples = len(self.correlations_frame)

    idx_full = np.arange(n_samples)

    # fix seed for referenceable testing set
    np.random.seed(0)
    np.random.shuffle(idx_full)

    if isinstance(split, int):
        assert split > 0
        assert split < n_samples, "testing set size is configured to be larger than entire dataset."
        len_test = split
    else:
        len_test = int(n_samples * split)

    test_idx = idx_full[0:len_test]
    train_idx = np.delete(idx_full, np.arange(0, len_test))

    if training:
        dataset = self.correlations_frame.ix[train_idx]
    else:
        dataset = self.correlations_frame.ix[test_idx]

    return dataset

training/validation splitting uses the same logic

model inspection

CorrelationModel(
  (features): Sequential(
    (0): Conv2d(1, 16, kernel_size=(3, 3), stride=(2, 2), padding=(2, 2))
    #(0): params: (3*3*1+1) * 16 = 160
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    #(1): params: 16 * 2 = 32
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (4): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2))
    #(4): params: (3*3*16+1) * 32 = 4640
    (5): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    #(5): params: 32 * 2 = 64
    (6): ReLU(inplace=True)
    (7): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (8): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    #(8): params: (3*3*32+1) * 64 = 18496
    (9): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    #(9): params: 64 * 2 = 128
    (10): ReLU(inplace=True)
    (11): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (12): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    #(12): params: (3*3*64+1) * 32 = 18464
    (13): ReLU(inplace=True)
    (14): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (15): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (#15): params: (3*3*32+1) * 16 = 4624
    (16): ReLU(inplace=True)
    (17): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (18): Conv2d(16, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (#18): params: (3*3*16+1) * 8 = 1160
    (19): ReLU(inplace=True)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (linear): Sequential(
    (0): Conv2d(8, 1, kernel_size=(1, 1), stride=(1, 1))
    #(0): params: (8+1) * 1 = 9
    (1): Tanh()
  )
)
Trainable parameters: 47777

loss function

the loss function of choice is smooth_l1, which has the advantages of both l1 and l2 loss

def SmoothL1(yhat, y):                                                  <--- final choice
    return torch.nn.functional.smooth_l1_loss(yhat, y)

def MSELoss(yhat, y):
    return torch.nn.functional.mse_loss(yhat, y)

def RMSELoss(yhat, y):
    return torch.sqrt(MSELoss(yhat, y))

def MSLELoss(yhat, y):
    return MSELoss(torch.log(yhat + 1), torch.log(y + 1))

def RMSLELoss(yhat, y):
    return torch.sqrt(MSELoss(torch.log(yhat + 1), torch.log(y + 1)))

evaluation metric

def mse(output, target):
    # mean square error
    with torch.no_grad():
        assert output.shape[0] == len(target)
        mae = torch.sum(MSELoss(output, target)).item()
    return mae / len(target)

def mae(output, target):
    # mean absolute error
    with torch.no_grad():
        assert output.shape[0] == len(target)
        mae = torch.sum(abs(target-output)).item()
    return mae / len(target)

def mape(output, target):
    # mean absolute percentage error
    with torch.no_grad():
        assert output.shape[0] == len(target)
        mape = torch.sum(abs((target-output)/target)).item()
    return mape / len(target)

def rmse(output, target):
    # root mean square error
    with torch.no_grad():
        assert output.shape[0] == len(target)
        rmse = torch.sum(torch.sqrt(MSELoss(output, target))).item()
    return rmse / len(target)

def msle(output, target):
    # mean square log error
    with torch.no_grad():
        assert output.shape[0] == len(target)
        msle = torch.sum(MSELoss(torch.log(output + 1), torch.log(target + 1))).item()
    return msle / len(target)

def rmsle(output, target):
    # root mean square log error
    with torch.no_grad():
        assert output.shape[0] == len(target)
        rmsle = torch.sum(torch.sqrt(MSELoss(torch.log(output + 1), torch.log(target + 1)))).item()
    return rmsle / len(target)

training result

trainer - INFO -     epoch          : 1
trainer - INFO -     smooth_l1loss  : 0.0029358651146370296
trainer - INFO -     mse            : 9.174910654958997e-05
trainer - INFO -     mae            : 0.04508562459920844
trainer - INFO -     mape           : 0.6447089369893074
trainer - INFO -     rmse           : 0.0008826211761528006
trainer - INFO -     msle           : 0.0002885178522810747
trainer - INFO -     rmsle          : 0.0016459243478796756
trainer - INFO -     val_loss       : 0.000569225614812846
trainer - INFO -     val_mse        : 1.7788300462901436e-05
trainer - INFO -     val_mae        : 0.026543946107228596
trainer - INFO -     val_mape       : 0.48582320946455004
trainer - INFO -     val_rmse       : 0.0005245986936303476
trainer - INFO -     val_msle       : 9.091730712680146e-05
trainer - INFO -     val_rmsle      : 0.0009993902465794235
                    .
                    .
                    .
                    .
                    .
                    .
trainer - INFO -     epoch          : 7                           <--- final model
trainer - INFO -     smooth_l1loss  : 0.00017805844737449661
trainer - INFO -     mse            : 5.564326480453019e-06
trainer - INFO -     mae            : 0.01469234253714482
trainer - INFO -     mape           : 0.2645472921580076
trainer - INFO -     rmse           : 0.0002925463738307978
trainer - INFO -     msle           : 3.3151906652316634e-05
trainer - INFO -     rmsle          : 0.0005688522928685416
trainer - INFO -     val_loss       : 0.00017794455110561102
trainer - INFO -     val_mse        : 5.560767222050344e-06
trainer - INFO -     val_mae        : 0.014510956528286139
trainer - INFO -     val_mape       : 0.25059283276398975
trainer - INFO -     val_rmse       : 0.0002930224982944007
trainer - INFO -     val_msle       : 3.403802761204133e-05
trainer - INFO -     val_rmsle      : 0.0005525556141122554
trainer - INFO - Saving checkpoint: saved/models/correlation/1031_043742/checkpoint-epoch7.pth ...
trainer - INFO - Saving current best: model_best.pth ...
                    .
                    .
                    .
                    .
                    .
                    .
trainer - INFO -     epoch          : 10                           <--- early stop
trainer - INFO -     smooth_l1loss  : 0.00014610137016279624
trainer - INFO -     mse            : 4.565667817587382e-06
trainer - INFO -     mae            : 0.013266990386570494
trainer - INFO -     mape           : 0.24146838792661826
trainer - INFO -     rmse           : 0.00026499629460158757
trainer - INFO -     msle           : 2.77259079665176e-05
trainer - INFO -     rmsle          : 0.0005148174095957074
trainer - INFO -     val_loss       : 0.00018394086218904705
trainer - INFO -     val_mse        : 5.74815194340772e-06
trainer - INFO -     val_mae        : 0.01494487459709247
trainer - INFO -     val_mape       : 0.27262411576509477
trainer - INFO -     val_rmse       : 0.0002979971170425415
trainer - INFO -     val_msle       : 3.1850282267744966e-05
trainer - INFO -     val_rmsle      : 0.0005451643197642019
trainer - INFO - Validation performance didn't improve for 2 epochs. Training stops.

loss graph dist

testing result

Loading checkpoint: saved/models/correlation/model_best.pth ...
Done
Testing set samples: 30000
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 59/59 [00:19<00:00,  3.04it/s]
Testing result:
{'loss': 0.0001722179292468354, 'mse': 6.77461177110672e-07, 'mae': 0.014289384969075522, 'mape': 0.2813985677083333, 'rmse': 3.6473782857259115e-05, 'msle': 3.554690380891164e-06, 'rmsle': 7.881066799163819e-05}
Owner
HR Wu
HR Wu
Developing and Comparing Vision-based Algorithms for Vision-based Agile Flight

DodgeDrone: Vision-based Agile Drone Flight (ICRA 2022 Competition) Would you like to push the boundaries of drone navigation? Then participate in the

Robotics and Perception Group 115 Dec 10, 2022
python based clash stars made by grade 7 and 5

clash_stars python based clash stars made by grade 7 and 5 How to play: PLAYER ONE (LEFT PLAYER) Move: W,A,S,D Shoot: SHIFT PLAYER TWO (RIGHT PLAYER)

5 Oct 22, 2021
Interpreting-compiling programming language.

HoneyASM The programming language written on Python, which can be as interpreted as compiled. HoneyASM is easy for use very optimized PL, which can so

TalismanChet 1 Dec 25, 2021
Blender Add-on to Add Metal Materials to Your Scene

Blender QMM (Quick Metal Materials) Blender Addon to Add Metal Materials to Your Scene Installation Download the latest ZIP from Releases. Usage This

Don Schnitzius 27 Dec 26, 2022
EFB Docker image with efb-telegram-master and efb-wechat-slave

efb-wechat-docker EFB Docker image with efb-telegram-master and efb-wechat-slave Features Container run by non-root user. Support add environment vari

Haukeng 1 Nov 10, 2022
Nateve transpiler developed with python.

Adam Adam is a Nateve Programming Language transpiler developed using Python. Nateve Nateve is a new general domain programming language open source i

Nateve 7 Jan 15, 2022
An account generator for guilded.gg that I made a while back and decided to bring back up

An account generator for guilded.gg that I made a while back and decided to bring back up

8 Nov 17, 2022
Stori QA Automation Challenge

Stori-QA-Automation-Challenge This is the repository is created for the Stori QA Intern Automation Engineer Challenge! In this you can find the Requir

Daniel Castañeda 0 Feb 20, 2022
Package pyVHR is a comprehensive framework for studying methods of pulse rate estimation relying on remote photoplethysmography (rPPG)

Package pyVHR (short for Python framework for Virtual Heart Rate) is a comprehensive framework for studying methods of pulse rate estimation relying on remote photoplethysmography (rPPG)

PHUSE Lab 261 Jan 03, 2023
This is the course repository for the Spring 2022 iteration of MACS 30123 "Large-Scale Computing for the Social Sciences" at the University of Chicago.

Large-Scale Computing for the Social Sciences Spring 2022 - MACS 30123/MAPS 30123/PLSC 30123 Instructor Information TA Information TA Information Cour

6 May 06, 2022
Ssma is a tool that helps you collect your badges in a satr platform

satr-statistics-maker ssma is a tool that helps you collect your badges in a satr platform 🎖️ Requirements python = 3.7 Installation first clone the

TheAwiteb 3 Jan 04, 2022
freeCodeCamp Scientific Computing with Python Project for Certification.

Time_Calculator_freeCodeCamp freeCodeCamp Scientific Computing with Python Project for Certification. Write a function named add_time that takes in tw

Rajdeep Mondal 1 Dec 23, 2021
适用于HoshinoBot下的人生重来模拟器插件

LifeRestart for HoshinoBot 原作地址 python版原地址 本项目地址 安装方法 这是一个HoshinoBot的人生重来模拟器插件 这个项目使用的HoshinoBot的消息触发器,如果你了解其他机器人框架的api(比如nonebot)可以只修改消息触发器就将本项目移植到其他

黛笙笙 16 Sep 03, 2022
Pre-1.0 door/chest sound injector for Minecraft

doorjector Pre-1.0 door/chest sound injector for Minecraft. While the game is running, doorjector hotswaps the new sounds for the old right before the

Sam 1 Nov 20, 2021
Extrator de dados do jupiterweb

Extrator de dados do jupiterweb O programa é composto de dois arquivos: Um constando apenas de classes complementares que representam as unidades e as

Bruno Aricó 2 Nov 28, 2022
tool to automate exploitation of android degubg bridge vulnerability

DISCLAIMER DISCLAIMER: ANY MALICIOUS USE OF THE CONTENTS FROM THIS ARTICLE WILL NOT HOLD THE AUTHOR RESPONSIBLE HE CONTENTS ARE SOLELY FOR EDUCATIONAL

6 Feb 12, 2022
I³ Tracker for Essential Open Innovation Datasets

I³ Tracker for Essential Open Innovation Datasets This repository is set up to track, version, and contribute updates to the I³ Essential Open Innovat

1 Feb 08, 2022
✔️ Create to-do lists to easily manage your ideas and work.

Todo List + Add task + Remove task + List completed task + List not completed task + Set clock task time + View task statistics by date Changelog v 1.

Abbas Ataei 30 Nov 28, 2022
the classic version Of torrentleechx #Unmaintained #Archived

TorrentleechX-Classic Old Modified Version Repo #Unmaintained #Archived for support join here working example group Leech Here For Any Issues/Imroveme

XcodersHub 18 Jan 30, 2022
A simple project which is a ecm to found a good way to provide a path to img_dir in gooey

ECM to find a good way for img_dir Path in Gooey This code is just an ECM to find a good way to indicate a path of image in image_dir variable. We loo

Jean-Emmanuel Longueville 1 Oct 25, 2021