Simple-implementation-of-Mobile-Former

At present, only the model is implemented; it has not been trained yet. There may be bugs in the code, and some details may differ from the original paper. If you are interested, you are welcome to discuss.

Add: CutUp, MixUp, RandomErasing, and SyncBatchNorm for DDP training
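For readers unfamiliar with MixUp, the idea can be sketched without any framework: two samples and their one-hot labels are blended with the same weight `lam`, drawn from a Beta(alpha, alpha) distribution. The function name `mixup_pair`, the `alpha` default, and the list-based inputs are illustrative assumptions, not this repository's code, which presumably works on batched PyTorch tensors.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two samples and their one-hot labels with one shared weight.

    x1, x2: flat lists of pixel values; y1, y2: one-hot label lists.
    alpha is the Beta distribution parameter (hypothetical default).
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    # Two toy "images" of two pixels each, belonging to classes 0 and 1.
    x, y, lam = mixup_pair([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
    print(lam, x, y)
```

The mixed label stays a valid distribution (its entries still sum to 1), so it can be fed to a soft-target cross-entropy loss.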

There are two ways to align q, k, and v in the new code. A: split the token dimension into heads (N); B: broadcast x during the product (Y)
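One possible reading of the two options, written as pure shape bookkeeping: in option A the feature dimension is divided among the heads, while in option B the feature map x keeps all its channels and gets a singleton head axis that broadcasts against per-head queries during the matrix product. The helper names (`batched_matmul_shape`, `option_a_shapes`, `option_b_shapes`), the example sizes, and the assumption that queries come from the tokens and keys from x are guesses for illustration, not taken from this repository's code.

```python
def batched_matmul_shape(a, b):
    """Shape of a @ b with NumPy/PyTorch-style broadcasting of leading axes."""
    *a_batch, n, k1 = a
    *b_batch, k2, p = b
    if k1 != k2:
        raise ValueError(f"inner dims differ: {k1} vs {k2}")
    # Right-align the batch axes and pad the shorter list with 1s.
    width = max(len(a_batch), len(b_batch))
    a_batch = [1] * (width - len(a_batch)) + a_batch
    b_batch = [1] * (width - len(b_batch)) + b_batch
    out = []
    for x, y in zip(a_batch, b_batch):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {x} with {y}")
        out.append(max(x, y))
    return tuple(out) + (n, p)


def option_a_shapes(b, m, d, hw, heads):
    """A: split the feature dim d of queries and keys across heads."""
    if d % heads:
        raise ValueError("d must be divisible by heads")
    q = (b, heads, m, d // heads)
    k_t = (b, heads, d // heads, hw)  # transposed keys; assumes x has d channels
    return batched_matmul_shape(q, k_t)


def option_b_shapes(b, m, c, hw, heads):
    """B: keep x whole and broadcast it over a singleton head axis."""
    q = (b, heads, m, c)   # per-head queries, each of width c
    x_t = (b, 1, c, hw)    # transposed x with a head axis of size 1
    return batched_matmul_shape(q, x_t)


if __name__ == "__main__":
    print(option_a_shapes(b=2, m=6, d=192, hw=49, heads=2))
    print(option_b_shapes(b=2, m=6, c=16, hw=49, heads=2))
```

Under these assumptions both options produce an attention map of shape (batch, heads, tokens, positions); they differ in whether each head sees a slice of the features (A) or all channels of x (B).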

Add: Build the model from a config (mf52, mf294, mf508) in config.py; the parameter counts are almost the same as in the paper

Train: python main.py --name mf294 --data path/to/ImageNet --dist-url 'tcp://127.0.0.1:12345' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0 --batch-size 256
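The flags in the training command could be parsed along the following lines. This is a hypothetical sketch built only from the flag names shown in that command: the types, defaults, the `choices` list for `--name`, and the `build_parser` name are assumptions, and the actual main.py may define more options or different defaults.

```python
import argparse


def build_parser():
    """Argument parser mirroring the flags used in the training command."""
    p = argparse.ArgumentParser(description="Mobile-Former training (sketch)")
    # Assumed to select one of the configs listed for config.py.
    p.add_argument("--name", choices=["mf52", "mf294", "mf508"], default="mf294")
    p.add_argument("--data", required=True, help="path to the ImageNet root")
    p.add_argument("--dist-url", default="tcp://127.0.0.1:12345")
    p.add_argument("--dist-backend", default="nccl")
    p.add_argument("--multiprocessing-distributed", action="store_true")
    p.add_argument("--world-size", type=int, default=1)
    p.add_argument("--rank", type=int, default=0)
    p.add_argument("--batch-size", type=int, default=256)
    return p


if __name__ == "__main__":
    # Parse an explicit list so the sketch runs without command-line input.
    args = build_parser().parse_args(
        ["--name", "mf294", "--data", "path/to/ImageNet",
         "--multiprocessing-distributed", "--batch-size", "256"]
    )
    print(args.name, args.data, args.batch_size)
```

argparse converts dashes in flag names to underscores, so `--dist-url` is read back as `args.dist_url`.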

Inference:

Paper: https://arxiv.org/pdf/2108.05895.pdf

https://github.com/xiaolai-sqlai/mobilenetv3

https://github.com/lucidrains/vit-pytorch

https://github.com/Islanna/DynamicReLU
