
PyTorch implementation of MLP-Mixer

MLP-Mixer: an all-MLP architecture composed of alternating token-mixing and channel-mixing operations.

  • Token-mixing resembles involution in that its weights are channel-agnostic, but involution is more flexible because its weights are spatially specific. This difference makes involution friendlier to transfer to downstream tasks such as detection and segmentation.

  • Channel-mixing is like a 1x1 convolution, permitting information exchange across channels.

The combination of the two is similar to replacing the 3x3 convolution in the ResNet bottleneck block with involution while keeping the 1x1 convolutions, which gives rise to our convolution-free, attention-free architecture, RedNet.

In any case, the take-home message is shared: an all-MLP architecture can rival convolution- or self-attention-based architectures.
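
To make the token-mixing / channel-mixing split concrete, below is a minimal PyTorch sketch of one Mixer block. The class names and layer sizes (`num_patches`, `dim`, `token_dim`, `channel_dim`) are illustrative assumptions and do not necessarily match the code in this repository.

```python
import torch
import torch.nn as nn

class MlpBlock(nn.Module):
    """Two fully connected layers with GELU, applied along the last dimension."""
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, in_dim)
        self.act = nn.GELU()

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))


class MixerBlock(nn.Module):
    """One Mixer block: token-mixing across patches, then channel-mixing across channels."""
    def __init__(self, num_patches, dim, token_dim, channel_dim):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = MlpBlock(num_patches, token_dim)   # mixes information across patches
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = MlpBlock(dim, channel_dim)        # per-patch MLP, like a 1x1 convolution

    def forward(self, x):                     # x: (batch, num_patches, dim)
        # Token-mixing: transpose so the MLP acts along the patch axis.
        y = self.norm1(x).transpose(1, 2)     # (batch, dim, num_patches)
        x = x + self.token_mlp(y).transpose(1, 2)
        # Channel-mixing: MLP along the channel axis.
        x = x + self.channel_mlp(self.norm2(x))
        return x


# Example with hypothetical sizes: 196 patches (14x14), 512 channels.
block = MixerBlock(num_patches=196, dim=512, token_dim=256, channel_dim=2048)
out = block(torch.randn(2, 196, 512))  # -> (2, 196, 512)
```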

Acknowledgement

The implementation is based on the JAX/Flax code in the appendix of the original paper.
