A universal framework for learning timestamp-level representations of time series

Last update: Dec 30, 2022

Related tags

Overview

TS2Vec

This repository contains the official implementation for the paper Learning Timestamp-Level Representations for Time Series with Hierarchical Contrastive Loss.

Requirements

The recommended requirements for TS2Vec are specified as follows:

Python 3.8
scipy==1.6.1
torch==1.8.1
numpy==1.19.2
pandas==1.0.1
scikit_learn==0.24.1

The dependencies can be installed by:

pip install -r requirements.txt

Data

The datasets can be obtained and put into datasets/ folder in the following way:

128 UCR datasets should be put into datasets/UCR/ so that each data file can be located by datasets/UCR/<dataset_name>/<dataset_name>_*.csv.
30 UEA datasets should be put into datasets/UEA/ so that each data file can be located by datasets/UEA/<dataset_name>/<dataset_name>_*.arff.
3 ETT datasets should be placed at datasets/ETTh1.csv, datasets/ETTh2.csv and datasets/ETTm1.csv.
Electricity dataset should be resampled into hourly data of 321 clients over the last 3 years and placed at datasets/electricity.csv.

Usage

To train and evaluate TS2Vec on a dataset, run the following command:

python train.py <dataset_name> <run_name> --archive <archive> --batch-size <batch_size> --repr-dims <repr_dims> --gpu <gpu> --eval

The detailed descriptions about the arguments are as following:

Parameter name	Description of parameter
dataset_name	The dataset name
run_name	The folder name used to save model, output and evaluation metrics. This can be set to any word
archive	The archive name that the dataset belongs to. This can be set to `UCR`, `UEA`, `forecast_csv` or `forecast_csv_univar`
batch_size	The batch size (defaults to 8)
repr_dims	The representation dimensions (defaults to 320)
gpu	The gpu no. used for training and inference (defaults to 0)
eval	Whether to perform evaluation after training

(For descriptions of more arguments, run python train.py -h.)

After training and evaluation, the trained encoder, output and evaluation metrics can be found in training/DatasetName__RunName_Date_Time/.

Scripts: The scripts for reproduction are provided in scripts/ folder.

Code Example

from ts2vec import TS2Vec
import datautils

# Load the ECG200 dataset from UCR archive
train_data, train_labels, test_data, test_labels = datautils.load_UCR('ECG200')
# (Both train_data and test_data have a shape of n_instances x n_timestamps x n_features)

# Train a TS2Vec model
model = TS2Vec(
    input_dims=1,
    device=0,
    output_dims=320
)
loss_log = model.fit(
    train_data,
    verbose=True
)

# Compute timestamp-level representations for test set
test_repr = model.encode(test_data)  # n_instances x n_timestamps x output_dims

# Compute instance-level representations for test set
test_repr = model.encode(test_data, encoding_window='full_series')  # n_instances x output_dims

# Sliding inference for test set
test_repr = model.encode(
    test_data,
    casual=True,
    sliding_length=1,
    sliding_padding=50
)  # n_instances x n_timestamps x output_dims
# (The timestamp t's representation vector is computed using the observations located in [t-50+1, t])

A universal framework for learning timestamp-level representations of time series

Related tags

Overview

TS2Vec

Requirements

Data

Usage

Code Example

Owner

Zhihan Yue

Multi-Horizon-Forecasting-for-Limit-Order-Books

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Efficient Two-Step Networks for Temporal Action Segmentation (Neurocomputing 2021)

High-Resolution Image Synthesis with Latent Diffusion Models

Microscopy Image Cytometry Toolkit

Adabelief-Optimizer - Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"

Using knowledge-informed machine learning on the PRONOSTIA (FEMTO) and IMS bearing data sets. Predict remaining-useful-life (RUL).

Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)

Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions"

Improving Query Representations for DenseRetrieval with Pseudo Relevance Feedback:A Reproducibility Study.

ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.

Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering

Can we visualize a large scientific data set with a surrogate model? We're building a GAN for the Earth's Mantle Convection data set to see if we can!

Code for "Infinitely Deep Bayesian Neural Networks with Stochastic Differential Equations"

Config files for my GitHub profile.

Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"

Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility ICCV2021

Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

Satellite labelling tool for manual labelling of storm top features such as overshooting tops, above-anvil plumes, cold U/Vs, rings etc.

Causal estimators for use with WhyNot