Norm-based Analysis of Transformer

Implementations for three papers that propose analyzing Transformers using vector norms:

The first paper (Kobayashi et al., EMNLP 2020) proposed analyzing attention, a core component of the Transformer, using vector norms rather than attention weights.
Transformer analyses have focused on the mixing performed by attention and have typically inspected attention weights.
However, attention's output is determined by more than the attention weights: the input vectors themselves and the vector transformations applied to them also matter.
This paper therefore proposed a norm-based analysis of attention that takes all of these factors into account (see the sketch below).
→ Check this paper's code: Code for emnlp2020.
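A minimal sketch of the idea behind the norm-based measure, using a toy single-head setting in NumPy; the names (W_V, W_O) and shapes are illustrative assumptions, not the repository's actual API:

```python
import numpy as np

# Toy single-head setting: 5 tokens, 8-dimensional vectors (illustrative sizes).
rng = np.random.default_rng(0)
seq_len, d_model = 5, 8

X = rng.normal(size=(seq_len, d_model))          # input vectors x_j
W_V = rng.normal(size=(d_model, d_model)) * 0.1  # value transformation (assumed)
W_O = rng.normal(size=(d_model, d_model)) * 0.1  # output transformation (assumed)
alpha = rng.random(size=(seq_len, seq_len))      # attention weights alpha_{i,j}
alpha /= alpha.sum(axis=-1, keepdims=True)       # each row sums to 1

# Transformed vectors f(x_j) = x_j W_V W_O (biases omitted for brevity).
f_X = X @ W_V @ W_O

# Weight-based analysis inspects only alpha_{i,j} ...
weight_based = alpha

# ... while norm-based analysis measures ||alpha_{i,j} f(x_j)||, i.e. how large
# a vector token j actually contributes to the output for token i.
norm_based = alpha * np.linalg.norm(f_X, axis=-1)  # broadcasts over columns j

print("weight-based, token 0:", weight_based[0])
print("norm-based,   token 0:", norm_based[0])
```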

The second paper (Kobayashi et al., EMNLP 2021) proposed analyzing the attention block (i.e., attention, the residual connection, and layer normalization) using vector norms.
Transformer analyses have focused on the mixing performed by attention.
However, the Transformer contains components other than attention, and they can play roles other than mixing.
This paper therefore proposed expanding the scope of Transformer analysis from attention alone to the whole attention block (see the sketch below).
→ Check this paper's code: Code for emnlp2021.
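A minimal sketch of why the rest of the attention block matters, again with illustrative NumPy stand-ins (the "attention output" here is random, not produced by a real model): the residual connection adds the unmixed input back in, and comparing norms shows how much of the block output is preserved input versus mixed output before layer normalization rescales it.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8

X = rng.normal(size=(seq_len, d_model))               # block inputs x_i
attn_out = rng.normal(size=(seq_len, d_model)) * 0.3  # stand-in attention output

# Residual connection: the unmixed input is added back to the mixed output.
pre_ln = attn_out + X

# Comparing norms contrasts the "preserving" role of the residual connection
# with the "mixing" role of attention.
print("||attention part|| per token:", np.linalg.norm(attn_out, axis=-1))
print("||residual part||  per token:", np.linalg.norm(X, axis=-1))

# Layer normalization then recenters and rescales each token's vector
# (learned gain/bias omitted), further changing each part's contribution.
mean = pre_ln.mean(axis=-1, keepdims=True)
std = pre_ln.std(axis=-1, keepdims=True)
block_out = (pre_ln - mean) / (std + 1e-6)
print("block output shape:", block_out.shape)
```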

The third paper (Kobayashi et al., ICLR 2024) extends this line of analysis to feed-forward blocks; its implementation will be public soon!

Citation

If you use our code for academic work, please cite:

@inproceedings{kobayashi-etal-2020-attention,  
   title = {Attention is Not Only a Weight: Analyzing Transformers with Vector Norms},  
   author = {Kobayashi, Goro and Kuribayashi, Tatsuki and Yokoi, Sho and Inui, Kentaro},  
   booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},  
   year = "2020",  
   url = "https://www.aclweb.org/anthology/2020.emnlp-main.574",  
   pages = "7057--7075",  
}
@inproceedings{kobayashi-etal-2021-incorporating,
   title = {Incorporating Residual and Normalization Layers into Analysis of Masked Language Models},
   author = {Kobayashi, Goro and Kuribayashi, Tatsuki and Yokoi, Sho and Inui, Kentaro},
   booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
   year = "2021",
   url = "https://arxiv.org/abs/2109.07152",
   pages = "to appear",
}
@inproceedings{kobayashi2024analyzing,
   title={Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Map},
   author={Goro Kobayashi and Tatsuki Kuribayashi and Sho Yokoi and Kentaro Inui},
   booktitle={The Twelfth International Conference on Learning Representations},
   year={2024},
   url={https://openreview.net/forum?id=mYWsyTuiRp}
}
