Implementation of ETSformer, a state-of-the-art time-series Transformer, in PyTorch

Last update: Dec 30, 2022

Overview

ETSformer - Pytorch

Install

$ pip install etsformer-pytorch

Usage

import torch
from etsformer_pytorch import ETSFormer

model = ETSFormer(
    time_features = 4,
    model_dim = 512,                # in paper they use 512
    embed_kernel_size = 3,          # kernel size for 1d conv for input embedding
    layers = 2,                     # number of encoder and corresponding decoder layers
    heads = 8,                      # number of exponential smoothing attention heads
    K = 4,                          # num frequencies with highest amplitude to keep (attend to)
    dropout = 0.2                   # dropout (in paper they did 0.2)
)

timeseries = torch.randn(1, 1024, 4)

pred = model(timeseries, num_steps_forecast = 32) # (1, 32, 4) - (batch, num steps forecast, num time features)
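
The `K` argument above is described as the number of highest-amplitude frequencies to keep. The sketch below illustrates that idea with a plain FFT. It is not the library's implementation: `topk_frequency_filter` is a hypothetical helper written for this example, and dropping the constant (DC) term before ranking is an assumption.

```python
import math
import torch

def topk_frequency_filter(x: torch.Tensor, k: int) -> torch.Tensor:
    """Hypothetical helper: keep the k highest-amplitude frequencies per feature.

    x has shape (batch, time, features); the result has the same shape.
    """
    n = x.shape[1]
    freqs = torch.fft.rfft(x, dim=1)              # complex, (batch, n // 2 + 1, features)
    amplitude = freqs.abs()
    amplitude[:, 0] = 0.0                         # assumption: skip the constant (DC) term when ranking
    top_idx = amplitude.topk(k, dim=1).indices    # (batch, k, features)
    mask = torch.zeros_like(amplitude).scatter_(1, top_idx, 1.0)
    return torch.fft.irfft(freqs * mask, n=n, dim=1)

# A strong 4-cycle sine plus a weak 10-cycle sine; keeping K = 1 leaves only the strong one.
t = torch.arange(64, dtype=torch.float32) / 64
strong = torch.sin(2 * math.pi * 4 * t)
weak = 0.1 * torch.sin(2 * math.pi * 10 * t)
signal = (strong + weak).reshape(1, 64, 1)
filtered = topk_frequency_filter(signal, k=1)
```

With `k = 1` the weak component is masked out, so `filtered` closely matches `strong`.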

To use ETSFormer for classification, wrap it in ClassificationWrapper, which applies cross-attention pooling over all latents and the level output

import torch
from etsformer_pytorch import ETSFormer, ClassificationWrapper

etsformer = ETSFormer(
    time_features = 1,
    model_dim = 512,
    embed_kernel_size = 3,
    layers = 2,
    heads = 8,
    K = 4,
    dropout = 0.2
)

adapter = ClassificationWrapper(
    etsformer = etsformer,
    dim_head = 32,
    heads = 16,
    dropout = 0.2,
    level_kernel_size = 5,
    num_classes = 10
)

timeseries = torch.randn(1, 1024)

logits = adapter(timeseries) # (1, 10)

Citation

@misc{woo2022etsformer,
    title   = {ETSformer: Exponential Smoothing Transformers for Time-series Forecasting}, 
    author  = {Gerald Woo and Chenghao Liu and Doyen Sahoo and Akshat Kumar and Steven Hoi},
    year    = {2022},
    eprint  = {2202.01381},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

Comments

What are your thoughts on using latents for additional classification task
Hi! I was wondering whether you have thought about aggregating the seasonal and growth latents for additional tasks, for example classification. In your opinion, what are possible ways to combine the latents into a single feature vector? The easiest would be to take the mean along the layer and time dimensions, but that seems too naive. Another idea I had is to use a cross-attention mechanism with a single query token to aggregate the latents:

all_latents = torch.cat([latent_growths, latent_seasonals], dim=-1)
all_latents = rearrange(all_latents, 'b n l d -> (b l) n d')
#
q = nn.Parameter(torch.randn(all_latents_dim))
q = repeat(q, 'd -> b 1 d', b = all_latents.shape[0])
agg_latent = cross_attention(query=q, context=all_latents)
agg_latent = rearrange(all_latents, '(b l) n d -> b (l n) d')
agg_latent = agg_latent.mean(dim=1) # may be we should have done it before cross attention?

Would be great to hear your thoughts
opened by inspirit 15
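
One way to make the pseudocode in this comment runnable is a single learned query that cross-attends over the latents. This is a sketch under assumptions, not the repository's ClassificationWrapper: `LatentPooler` is a hypothetical name, the latents are assumed to have shape (batch, time, layers, dim) as in the `'b n l d'` pattern, and time and layers are flattened into one context axis so that no averaging step is needed afterwards.

```python
import torch
from torch import nn

class LatentPooler(nn.Module):
    """Hypothetical module: pools growth and seasonal latents into one vector per sample."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, dim))   # single learned query token
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, latent_growths: torch.Tensor, latent_seasonals: torch.Tensor) -> torch.Tensor:
        # Concatenate along the feature axis, as in the comment's first line.
        all_latents = torch.cat([latent_growths, latent_seasonals], dim=-1)  # (b, n, l, dim)
        b, n, l, d = all_latents.shape
        context = all_latents.reshape(b, n * l, d)           # flatten time and layers
        q = self.query.expand(b, -1, -1)                     # (b, 1, dim)
        pooled, _ = self.attn(q, context, context)           # (b, 1, dim)
        return pooled.squeeze(1)                             # (b, dim)

growths = torch.randn(2, 32, 3, 8)     # (batch, time, layers, dim / 2)
seasonals = torch.randn(2, 32, 3, 8)
pooler = LatentPooler(dim=16, heads=4)
agg_latent = pooler(growths, seasonals)
```

Attending over the flattened (time, layers) axis lets the query weigh layers against each other, at the cost of a longer context than the per-layer variant in the comment.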
Pre LayerNorm might be required for k,v?

https://github.com/lucidrains/ETSformer-pytorch/blob/2561053007e919409b3255eb1d0852c68799d24f/etsformer_pytorch/etsformer_pytorch.py#L440

In my early tests I see some instability in training. Would it be a good idea to apply LayerNorm to the latents before constructing the keys and values?

opened by inspirit 5
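
A minimal sketch of the suggestion in this issue: normalize the latents before projecting them to keys and values. This is a generic single-head cross-attention written for illustration, not the code at the linked line; `PreNormCrossAttention` and its attribute names are assumptions.

```python
import torch
from torch import nn

class PreNormCrossAttention(nn.Module):
    """Hypothetical single-head cross-attention with LayerNorm on the context."""

    def __init__(self, dim: int):
        super().__init__()
        self.norm_context = nn.LayerNorm(dim)
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_kv = nn.Linear(dim, dim * 2, bias=False)
        self.scale = dim ** -0.5

    def forward(self, query: torch.Tensor, latents: torch.Tensor) -> torch.Tensor:
        latents = self.norm_context(latents)                 # pre-norm before building k and v
        q = self.to_q(query)
        k, v = self.to_kv(latents).chunk(2, dim=-1)
        attn = (q @ k.transpose(-1, -2) * self.scale).softmax(dim=-1)
        return attn @ v

torch.manual_seed(0)
layer = PreNormCrossAttention(dim=16)
query = torch.randn(2, 1, 16)
latents = torch.randn(2, 32, 16)
out = layer(query, latents)
out_scaled = layer(query, latents * 1000.0)   # same latents at a much larger magnitude
```

Because LayerNorm removes the scale of its input, `out` and `out_scaled` come out nearly identical, which is the kind of insensitivity to latent magnitude the suggestion is after.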
growth_term calculation error

https://github.com/lucidrains/ETSformer-pytorch/blob/e1d8514b44d113ead523aa6307986833e68eecc5/etsformer_pytorch/etsformer_pytorch.py#L233-L235

It looks like you are not using growth and growth_smoothing_weights to calculate growth_term.

opened by inspirit 4
Backward gradient error
Hello,

I was trying to run the provided class and got the following error: Function ScatterBackward0 returned an invalid gradient at index 1 - got [64, 4, 128] but expected shape compatible with [64, 33, 128]

model = ETSFormer(
    time_features = 9,
    model_dim = 128,
    embed_kernel_size = 3,
    layers = 2,
    heads = 4,
    K = 4,
    dropout = 0.2
)

input = torch.rand(64, 64, 9)
x = model(input, num_steps_forecast = 16)
opened by inspirit 3
Does ETS-Former allow adding features

@lucidrains Thanks for making the code of the model available!

In your paper, you state that the model infers seasonal patterns itself, so that there is no need to add time features like week, month, etc.

Still, to broaden the applicability of your approach, does the current implementation allow adding other features (time-invariant or time-varying), e.g., categorical or numeric?

opened by StatMixedML 2
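
Whether the library supports covariates is not answered on this page. As a generic workaround, and purely an assumption rather than a documented feature, extra features can be concatenated along the feature axis, with static features repeated over time, and only the target channels read from the forecast. All tensor names below are hypothetical.

```python
import torch

batch, steps = 8, 96
target = torch.randn(batch, steps, 1)          # series to forecast
time_varying = torch.randn(batch, steps, 3)    # e.g. numeric covariates per time step
static = torch.randn(batch, 2)                 # e.g. embedded categorical features per series

# Repeat the static features at every time step so they can share the feature axis.
static_over_time = static.unsqueeze(1).expand(-1, steps, -1)               # (batch, steps, 2)
model_input = torch.cat([target, time_varying, static_over_time], dim=-1)  # (batch, steps, 6)

# A model built with time_features = 6 would forecast all 6 channels;
# the target would then be read back by slicing, e.g. pred[..., :1].
```

The drawback of this approach is that the model also has to forecast the covariate channels, which may not be desirable.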
wrong order of arguments

https://github.com/lucidrains/ETSformer-pytorch/blob/2e0d465576c15fc8d84c4673f93fdd71d45b799c/etsformer_pytorch/etsformer_pytorch.py#L327

You pass the latents to the Level module in the wrong order: according to its forward method, growth should come first, then seasonal.

opened by inspirit 1
Clarification regarding data pre-processing

Hello,

I was trying to run ETSformer on the ETT dataset. The paper mentions that the dataset is split 60/20/20 into train, validation and test sets. Could you give some insight into how the dataset split is done in the code?

Thank you.

opened by vageeshmaiya 2
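
A minimal sketch of a chronological 60/20/20 split, assuming the rows are already ordered in time. It is not taken from the paper's code, and `chronological_split` is a hypothetical helper.

```python
def chronological_split(series, train_frac=0.6, val_frac=0.2):
    """Split an ordered sequence into train, validation and test parts without shuffling."""
    n = len(series)
    train_end = int(round(n * train_frac))
    val_end = train_end + int(round(n * val_frac))
    # The test part is whatever remains after train and validation.
    return series[:train_end], series[train_end:val_end], series[val_end:]

train, val, test = chronological_split(list(range(10)))
```

Keeping the order intact means the validation and test sets contain only observations that come after the training data, which avoids leaking future values into training.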

Releases (0.0.16)

0.0.16 (Mar 22, 2022)
0.0.15 (Mar 22, 2022)
0.0.14a (Mar 22, 2022)
0.0.12 (Mar 20, 2022)
0.0.11 (Mar 20, 2022)
0.0.10 (Mar 20, 2022)
0.0.9 (Mar 20, 2022)
0.0.8 (Mar 20, 2022)
0.0.7 (Mar 19, 2022)
0.0.6 (Mar 18, 2022)
0.0.5 (Mar 17, 2022)
0.0.4 (Mar 17, 2022)
0.0.3a (Mar 16, 2022)
0.0.1 (Mar 15, 2022)

Owner

Phil Wang

Working with Attention. It's all we need
