A real world application of a Recurrent Neural Network on a binary classification of time series data

Overview

What is this

This is a real world application of a Recurrent Neural Network on a binary classification of time series data. This project includes data cleanup, model creation, fitting, and testing/reporting and was designed and analysed in less than 24 hours.

Challenge and input

Three input files were provided for this challenge:

  • aigua.csv
  • aire.csv
  • amoni.csv (amoni_pred.csv is the same thing with integers rather than booleans)

The objective is to train a Machine Learning classifier that can predict dangerous drift on amoni.

Analysis procedure

Gretl has benn used to analyze the data.

Ideally, fuzzing techniques would be applied that would remove the input noise on amoni from the correlation with aigua.csv and aire.csv. After many hours of analysis I decided that the input files aire.csv and aigua.csv did not provide enough valuable data.

After much analysis of the amoni.csv file, I identified a technique that was able to remove most of the noise.

The technique has been implemented into the run.py file. This file cleanups up the data on amoni_pred.csv. It groups data by time intervals and gets the mean. It removes values that are too small. It clips the domain of the values. It removes noise by selecting the minimum values in a window slice. And (optionally) it corrects the dangerous drift values.

Generating the model

Once the file amoni_pred_base.csv has been created after cleaning up the input, we can move on to generating the model. Models are created and trained by the pred.py file. This file creates a Neural Network architecture with Recurrent Neural Networks (RNN). To be more precise, this NN has been tested with SimpleRNN and Long Short Term Memory (LSTM) layers. LSTM were chosed because they were seen to converge faster and provide better results and flexibility.

The input has been split on train/test sets. In order to test the network on fully unknown intervals, the test window time is non overlapping with the train window.

In order to allow prediction of a value, a window time slice is fed on to the LSTM layers. This window only includes past values and does not provide a lookahead cheat opportunity. The model is trained with checkpoints tracking testing accuracy. Loss and accuracy graphs are automatically generated for the training and testing sets.

Testing the models

After the models have been generated, the file test.py predicts the drift and dangerous values on the input data, It also provides accuracy metrics and saves the resulting file output.csv. This file can then be analysed with Gretl.

Performance

Our models are capable of achieving:

  • ~ 75% Accuracy on dangerous drifts with minimal time delays
  • ~ 80% Accuracy on drifts with minimal time delays

Moreover, with the set of corrections of the dangerous drift input values explained in previous sections, our model can achieve:

  • ~ 87% Accuracy on dangerous drifts with minimal time delays

Future Work / Improvements

Many improvements are possible on this architecture. First of all, fine tuning of the hyper parameters (clean up data set values, NN depth, type of layers, etc) should all be considered. Furthermore, more data should be collected, because the current data set only provides information for ~ 8 drifts. On top of that, more advanced noise analysis techniques should be applied, like fuzzing, exponential smoothing etc.

Other possible techniques

Yes, Isolation Forests are probably a better idea. But LSTM layers are cool :)

Show me some pictures

In blue, expected dangerous drift predictions. In orange the prediction by the presented model.

Screenshot1

Furthermore, with the patched dangerous drift patch:

Screenshot2

Owner
Josep Maria Salvia Hornos
Studying Business Management & Computer Science :D
Josep Maria Salvia Hornos
A pre-trained model with multi-exit transformer architecture.

ElasticBERT This repository contains finetuning code and checkpoints for ElasticBERT. Towards Efficient NLP: A Standard Evaluation and A Strong Baseli

fastNLP 48 Dec 14, 2022
This is a repository for a semantic segmentation inference API using the OpenVINO toolkit

BMW-IntelOpenVINO-Segmentation-Inference-API This is a repository for a semantic segmentation inference API using the OpenVINO toolkit. It's supported

BMW TechOffice MUNICH 34 Nov 24, 2022
Code for the ICCV'21 paper "Context-aware Scene Graph Generation with Seq2Seq Transformers"

ICCV'21 Context-aware Scene Graph Generation with Seq2Seq Transformers Authors: Yichao Lu*, Himanshu Rai*, Cheng Chang*, Boris Knyazev†, Guangwei Yu,

Layer6 Labs 37 Dec 18, 2022
LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations

LIMEcraft LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations The LIMEcraft algorithm is an explanatory method based on

MI^2 DataLab 4 Aug 01, 2022
M2MRF: Many-to-Many Reassembly of Features for Tiny Lesion Segmentation in Fundus Images

M2MRF: Many-to-Many Reassembly of Features for Tiny Lesion Segmentation in Fundus Images This repo is the official implementation of paper "M2MRF: Man

12 Dec 14, 2022
A Real-ESRGAN equipped Colab notebook for CLIP Guided Diffusion

#360Diffusion automatically upscales your CLIP Guided Diffusion outputs using Real-ESRGAN. Latest Update: Alpha 1.61 [Main Branch] - 01/11/22 Layout a

78 Nov 02, 2022
DI-smartcross - Decision Intelligence Platform for Traffic Crossing Signal Control

DI-smartcross DI-smartcross - Decision Intelligence Platform for Traffic Crossin

OpenDILab 213 Jan 02, 2023
Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch

NÜWA - Pytorch (wip) Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch. This repository will be popul

Phil Wang 463 Dec 28, 2022
C3d-pytorch - Pytorch porting of C3D network, with Sports1M weights

C3D for pytorch This is a pytorch porting of the network presented in the paper Learning Spatiotemporal Features with 3D Convolutional Networks How to

Davide Abati 311 Jan 06, 2023
A multi-scale unsupervised learning for deformable image registration

A multi-scale unsupervised learning for deformable image registration Shuwei Shao, Zhongcai Pei, Weihai Chen, Wentao Zhu, Xingming Wu and Baochang Zha

ShuweiShao 2 Apr 13, 2022
HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty

HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty Giorgio Cantarini, Francesca Odone, Nicoletta Noceti, Federi

18 Aug 02, 2022
Turn based roguelike in python

pyTB Turn based roguelike in python Documentation can be found here: http://mcgillij.github.io/pyTB/index.html Screenshot Dependencies Written in Pyth

Jason McGillivray 4 Sep 29, 2022
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with ONNX, TensorRT, ncnn, and OpenVINO supported.

Introduction YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and ind

7.7k Jan 03, 2023
code release for USENIX'22 paper `On the Security Risks of AutoML`

This project is a minimized runnable project cut from trojanzoo, which contains more datasets, models, attacks and defenses. This repo will not be mai

Ren Pang 5 Apr 19, 2022
Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020).

SentiBERT Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020). https://arxiv.org/abs/20

Da Yin 66 Aug 13, 2022
TensorFlow port of PyTorch Image Models (timm) - image models with pretrained weights.

TensorFlow-Image-Models Introduction Usage Models Profiling License Introduction TensorfFlow-Image-Models (tfimm) is a collection of image models with

Martins Bruveris 227 Dec 20, 2022
DETReg: Unsupervised Pretraining with Region Priors for Object Detection

DETReg: Unsupervised Pretraining with Region Priors for Object Detection Amir Bar, Xin Wang, Vadim Kantorov, Colorado J Reed, Roei Herzig, Gal Chechik

Amir Bar 283 Dec 27, 2022
Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study

Optimal Adaptive Allocation using Deep Reinforcement Learning in a Dose-Response Study Supplementary Materials for Kentaro Matsuura, Junya Honda, Imad

Kentaro Matsuura 4 Nov 01, 2022
AntroPy: entropy and complexity of (EEG) time-series in Python

AntroPy is a Python 3 package providing several time-efficient algorithms for computing the complexity of time-series. It can be used for example to e

Raphael Vallat 153 Dec 27, 2022
Neural style transfer in PyTorch.

style-transfer-pytorch An implementation of neural style transfer (A Neural Algorithm of Artistic Style) in PyTorch, supporting CPUs and Nvidia GPUs.

Katherine Crowson 395 Jan 06, 2023