TCube generates rich and fluent narratives that describe the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of pretrained language models (PLMs).

Last update: Oct 31, 2021

Overview

TCube: Domain-Agnostic Neural Time series Narration

This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narration" (to appear in IEEE ICDM 2021).

The PLMs used in this effort (T5, BART, and GPT-2) are implemented using the HuggingFace library (https://huggingface.co/) and fine-tuned on the WebNLG v3 (https://gitlab.com/shimorina/webnlg-dataset/-/tree/master/release_v3.0) and DART (https://arxiv.org/abs/2007.02871) datasets.

Clones of both datasets are available under /Finetune PLMs/Datasets in this repository.

The PLMs fine-tuned on WebNLG/DART could not be uploaded because of the 1 GB limit of Git LFS. However, this repository includes ready-made scripts (detailed below) for conveniently fine-tuning these models.

The entire repository is based on Python 3.6, and the results are visualized through IPython notebooks.

Dependencies

Interactive Environments

notebook
ipywidgets==7.5.1

Deep Learning Frameworks

torch==1.7.1 (suited to your CUDA version)
pytorch-lightning==0.9.0
transformers==3.1.0

NLP Toolkits

sentencepiece==0.1.91
nltk

Scientific Computing, Data Manipulation, and Visualizations

numpy
scipy
scikit-learn (imported as sklearn)
matplotlib
pandas
pwlf

Evaluation

rouge-score
textstat
lexical_diversity
language-tool-python

Misc

xlrd
tqdm
cython

Please make sure that these Python packages, at the versions specified, are installed in a separate virtual environment.
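As a quick sanity check, the pinned versions listed above can be compared against what is installed in the active environment. The following is only a minimal sketch and is not part of the repository: the names `PINNED`, `find_mismatches`, and `installed_versions` are hypothetical, and only the packages that the list pins explicitly are checked.

```python
from typing import Dict, Iterable, List, Optional

# Versions copied from the dependency list above (only the pinned ones).
PINNED = {
    "ipywidgets": "7.5.1",
    "torch": "1.7.1",
    "pytorch-lightning": "0.9.0",
    "transformers": "3.1.0",
    "sentencepiece": "0.1.91",
}


def find_mismatches(installed: Dict[str, Optional[str]]) -> List[str]:
    """Return one message per package that is missing or at another version."""
    problems = []
    for name, wanted in PINNED.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        # Drop a local suffix such as "+cu110" that CUDA builds of torch carry.
        elif found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up installed versions; None marks a package that is not installed."""
    try:
        # Available from Python 3.8 onwards.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for Python 3.6/3.7, which this repository targets.
        import pkg_resources

        PackageNotFoundError = pkg_resources.DistributionNotFound

        def version(name):
            return pkg_resources.get_distribution(name).version

    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for line in find_mismatches(installed_versions(PINNED)):
        print(line)
```

Running the script inside the virtual environment prints nothing when every pinned package matches, and one line per missing or mismatched package otherwise.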

Data-Preprocessing Scripts

Under /Finetune PLMs in this repository there are two scripts for pre-processing the WebNLG and DART datasets:

preprocess_webnlg.py
preprocess_dart.py

These scripts read the original datasets from /Finetune PLMs/Datasets/WebNLGv3 and /Finetune PLMs/Datasets/DART and write CSV files to /Finetune PLMs/Datasets, splitting each dataset into train, dev, and test sets in the format required by our PLMs.
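To illustrate the kind of transformation such a pre-processing step performs, the sketch below flattens (subject, predicate, object) triples into a single source string and writes source/target pairs to CSV. It is an illustration under assumptions, not the code in preprocess_webnlg.py or preprocess_dart.py: the separators (" | " and " && "), the column names `input_text` and `target_text`, and the helper names `linearize` and `write_pairs` are all hypothetical.

```python
import csv
import io
from typing import Iterable, List, TextIO, Tuple

Triple = Tuple[str, str, str]


def linearize(triples: Iterable[Triple]) -> str:
    """Flatten triples into one string (separator choice is an assumption)."""
    return " && ".join(
        " | ".join(part.strip() for part in triple) for triple in triples
    )


def write_pairs(rows: Iterable[Tuple[List[Triple], str]], handle: TextIO) -> None:
    """Write (linearized triples, reference text) pairs as CSV rows."""
    writer = csv.writer(handle)
    # Hypothetical column names; the repository's scripts may use others.
    writer.writerow(["input_text", "target_text"])
    for triples, text in rows:
        writer.writerow([linearize(triples), text])


if __name__ == "__main__":
    example = [
        (
            [("Alan_Bean", "occupation", "Test_pilot")],
            "Alan Bean worked as a test pilot.",
        )
    ]
    buffer = io.StringIO()
    write_pairs(example, buffer)
    print(buffer.getvalue())
```

The same function can write to a real file by passing a handle opened with `open(path, "w", newline="")`, once per train, dev, and test split.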

Fine-tuning Scripts

Under /Finetune PLMs in this repository there are three scripts for fine-tuning T5, BART, and GPT-2:

finetuneT5.py
finetuneBART.py
finetuneGPT2.py

Visualization and Evaluation Notebooks

In the root directory are 10 notebooks. For the descriptions of the time-series datasets used:

Datasets.ipynb

For comparisons of segmentation and regime-change detection algorithms:

Error Determination.ipynb
Regime Detection.ipynb
Segmentation.ipynb
Trend Detection Plot.ipynb

For the evaluation of the TCube framework on respective time-series datasets:

T3-COVID.ipynb
T3-DOTS.ipynb
T3-Pollution.ipynb
T3-Population.ipynb
T3-Temperature.ipynb

Citation and Contact

If any part of this code repository or the TCube framework is used in your work, please cite our paper. Thanks!

Contact: Mandar Sharma ([email protected]), First Author.

