Pretrained Pytorch face detection (MTCNN) and recognition (InceptionResnet) models


Face Recognition Using Pytorch

This is a repository for Inception Resnet (V1) models in pytorch, pretrained on VGGFace2 and CASIA-Webface.

Pytorch model weights were initialized using parameters ported from David Sandberg's tensorflow facenet repo.

Also included in this repo is an efficient pytorch implementation of MTCNN for face detection prior to inference. These models are also pretrained. To our knowledge, this is the fastest MTCNN implementation available.

Quick start

  1. Install:

    # With pip:
    pip install facenet-pytorch
    
    # or clone this repo, removing the '-' to allow python imports:
    git clone https://github.com/timesler/facenet-pytorch.git facenet_pytorch
    
    # or use a docker container (see https://github.com/timesler/docker-jupyter-dl-gpu):
    docker run -it --rm timesler/jupyter-dl-gpu pip install facenet-pytorch && ipython
  2. In python, import facenet-pytorch and instantiate models:

    from facenet_pytorch import MTCNN, InceptionResnetV1
    
    # If required, create a face detection pipeline using MTCNN:
    mtcnn = MTCNN(image_size=<image_size>, margin=<margin>)
    
    # Create an inception resnet (in eval mode):
    resnet = InceptionResnetV1(pretrained='vggface2').eval()
  3. Process an image:

    from PIL import Image
    
    img = Image.open(<image path>)
    
    # Get cropped and prewhitened image tensor
    img_cropped = mtcnn(img, save_path=<optional save path>)
    
    # Calculate embedding (unsqueeze to add batch dimension)
    img_embedding = resnet(img_cropped.unsqueeze(0))
    
    # Or, if using for VGGFace2 classification
    resnet.classify = True
    img_probs = resnet(img_cropped.unsqueeze(0))

See help(MTCNN) and help(InceptionResnetV1) for usage and implementation details.

Pretrained models

See: models/inception_resnet_v1.py

The following models have been ported to pytorch (with links to download pytorch state_dict's):

Model name                  LFW accuracy (as listed here)   Training dataset
20180408-102900 (111MB)     0.9905                          CASIA-Webface
20180402-114759 (107MB)     0.9965                          VGGFace2

There is no need to manually download the pretrained state_dict's; they are downloaded automatically on model instantiation and cached for future use in the torch cache. To use an Inception Resnet (V1) model for facial recognition/identification in pytorch, use:

from facenet_pytorch import InceptionResnetV1

# For a model pretrained on VGGFace2
model = InceptionResnetV1(pretrained='vggface2').eval()

# For a model pretrained on CASIA-Webface
model = InceptionResnetV1(pretrained='casia-webface').eval()

# For an untrained model with 100 classes
model = InceptionResnetV1(num_classes=100).eval()

# For an untrained 1001-class classifier
model = InceptionResnetV1(classify=True, num_classes=1001).eval()

Both pretrained models were trained on 160x160 px images, so will perform best if applied to images resized to this shape. For best results, images should also be cropped to the face using MTCNN (see below).

By default, the above models will return 512-dimensional embeddings of images. To enable classification instead, either pass classify=True to the model constructor, or set the object attribute afterwards with model.classify = True. For VGGFace2, the pretrained model will output logit vectors of length 8631, and for CASIA-Webface, logit vectors of length 10575.
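
For illustration, here is a minimal sketch of the two modes, using a random tensor as a stand-in for a cropped, prewhitened face:

import torch
from facenet_pytorch import InceptionResnetV1

resnet = InceptionResnetV1(pretrained='vggface2').eval()

# Stand-in for a cropped, prewhitened 160x160 face (batch of 1)
x = torch.randn(1, 3, 160, 160)

with torch.no_grad():
    embedding = resnet(x)      # embedding mode: shape (1, 512)
    resnet.classify = True
    logits = resnet(x)         # classification mode: shape (1, 8631) for the VGGFace2 model

print(embedding.shape, logits.shape)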

Example notebooks

Complete detection and recognition pipeline

Face recognition can be easily applied to raw images by first detecting faces using MTCNN before calculating embeddings or probabilities using an Inception Resnet model. The example code at examples/infer.ipynb provides a complete example pipeline utilizing datasets, dataloaders, and optional GPU processing.
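
Stripped of the dataset and dataloader machinery, the core of that pipeline looks roughly like the sketch below (the image path is hypothetical):

import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

mtcnn = MTCNN(image_size=160, margin=0, device=device)
resnet = InceptionResnetV1(pretrained='vggface2').eval().to(device)

img = Image.open('person.jpg')   # hypothetical image path

# Detect, crop and prewhiten the face; mtcnn returns None if no face is found
face = mtcnn(img)
if face is not None:
    with torch.no_grad():
        embedding = resnet(face.unsqueeze(0).to(device))   # shape: (1, 512)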

Face tracking in video streams

MTCNN can be used to build a face tracking system (using the MTCNN.detect() method). A full face tracking example can be found at examples/face_tracking.ipynb.
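
A minimal detection-per-frame sketch is shown below; the notebook adds drawing, saving and smarter frame handling. The video filename is hypothetical, and OpenCV frames are converted to RGB before detection:

import cv2
from PIL import Image
from facenet_pytorch import MTCNN

mtcnn = MTCNN()

cap = cv2.VideoCapture('video.mp4')   # hypothetical video file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # OpenCV yields BGR ndarrays; MTCNN expects RGB images
    rgb = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    boxes, probs = mtcnn.detect(rgb)
    if boxes is not None:
        for x1, y1, x2, y2 in boxes.astype(int):
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow('faces', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()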

Finetuning pretrained models with new data

In most situations, the best way to implement face recognition is to use the pretrained models directly, with either a clustering algorithm or a simple distance metric to determine the identity of a face. However, if finetuning is required (i.e., if you want to select identity based on the model's output logits), an example can be found at examples/finetune.ipynb.
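
As a rough sketch of the distance-based approach (file names are hypothetical, and each image is assumed to contain a detectable face):

import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

mtcnn = MTCNN(image_size=160, margin=0)
resnet = InceptionResnetV1(pretrained='vggface2').eval()

def embed(path):
    # Crop the face and return its 512-dimensional embedding
    face = mtcnn(Image.open(path))
    with torch.no_grad():
        return resnet(face.unsqueeze(0))

anchor = embed('person_a_1.jpg')
same = embed('person_a_2.jpg')
other = embed('person_b.jpg')

print((anchor - same).norm().item())    # smaller distance: likely the same identity
print((anchor - other).norm().item())   # larger distance: likely a different identity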

Guide to MTCNN in facenet-pytorch

This guide demonstrates the functionality of the MTCNN module. Topics covered are:

  • Basic usage
  • Image normalization
  • Face margins
  • Multiple faces in a single image
  • Batched detection
  • Bounding boxes and facial landmarks
  • Saving face datasets

See the notebook on kaggle.
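
For orientation, here is a minimal sketch touching a few of these topics (multiple faces, bounding boxes, landmarks, and saving crops); the image path is hypothetical:

from PIL import Image
from facenet_pytorch import MTCNN, extract_face

# keep_all=True returns every detected face rather than only the most probable one
mtcnn = MTCNN(keep_all=True)

img = Image.open('group_photo.jpg')   # hypothetical image containing several faces

# Bounding boxes, detection probabilities and 5-point facial landmarks
boxes, probs, points = mtcnn.detect(img, landmarks=True)

# Save each detected face to disk, e.g. to build a small face dataset
if boxes is not None:
    for i, box in enumerate(boxes):
        extract_face(img, box, save_path='face_{}.png'.format(i))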

Performance comparison of face detection packages

This notebook demonstrates the use of three face detection packages:

  1. facenet-pytorch
  2. mtcnn
  3. dlib

Each package is tested for its speed in detecting the faces in a set of 300 images (all frames from one video), with GPU support enabled. Performance is based on Kaggle's P100 notebook kernel. Results are summarized below.

Package                        FPS (1080x1920)   FPS (720x1280)   FPS (540x960)
facenet-pytorch                12.97             20.32            25.50
facenet-pytorch (non-batched)  9.75              14.81            19.68
dlib                           3.80              8.39             14.53
mtcnn                          3.04              5.70             8.23

See the notebook on kaggle.
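
The difference between the batched and non-batched rows above reflects passing whole batches of equally-sized frames to the detector at once. A minimal sketch of batched detection (frame filenames are hypothetical, and a CUDA device is assumed):

from PIL import Image
from facenet_pytorch import MTCNN

mtcnn = MTCNN(device='cuda:0')   # assumes a CUDA GPU is available

# All frames must share the same dimensions to be processed as one batch
frames = [Image.open('frame_{:04d}.jpg'.format(i)) for i in range(32)]

# Passing a list of equally-sized images runs detection as a batch;
# boxes[i] holds the bounding boxes found in frames[i] (None if no face was found)
boxes, probs = mtcnn.detect(frames)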

The FastMTCNN algorithm

This algorithm demonstrates how to achieve extremely efficient face detection specifically in videos, by taking advantage of similarities between adjacent frames.

See the notebook on kaggle.
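
The notebook has the full implementation; the sketch below is not that exact code, but it illustrates the core idea of running full detection only on a strided subset of frames and reusing the latest boxes in between (the stride value is arbitrary):

from facenet_pytorch import MTCNN

class StridedDetector:
    """Run full MTCNN detection every `stride` frames and reuse the most
    recent boxes for the frames in between, trading a little accuracy for speed."""

    def __init__(self, stride=4, **mtcnn_kwargs):
        self.stride = stride
        self.mtcnn = MTCNN(**mtcnn_kwargs)
        self.last_boxes = None

    def __call__(self, frames):
        # `frames` is a list of equally-sized RGB frames (PIL images or ndarrays)
        all_boxes = []
        for i, frame in enumerate(frames):
            if i % self.stride == 0:
                self.last_boxes, _ = self.mtcnn.detect(frame)
            all_boxes.append(self.last_boxes)
        return all_boxes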

Running with docker

The package and any of the example notebooks can be run with docker (or nvidia-docker) using:

docker run --rm -p 8888:8888 \
    -v ./facenet-pytorch:/home/jovyan \
    -v <path to data>:/home/jovyan/data \
    timesler/jupyter-dl-gpu \
    pip install facenet-pytorch && jupyter lab

Navigate to the examples/ directory and run any of the ipython notebooks.

See timesler/jupyter-dl-gpu for docker container details.

Use this repo in your own git project

To use this code in your own git repo, I recommend first adding this repo as a submodule. Note that the dash ('-') in the repo name should be removed when cloning as a submodule as it will break python when importing:

git submodule add https://github.com/timesler/facenet-pytorch.git facenet_pytorch

Alternatively, the code can be installed as a package using pip:

pip install facenet-pytorch

Conversion of parameters from Tensorflow to Pytorch

See: models/utils/tensorflow2pytorch.py

Note that this functionality is not needed to use the models in this repo, which depend only on the saved pytorch state_dict's.

Following instantiation of the pytorch model, each layer's weights were loaded from equivalent layers in the pretrained tensorflow models from davidsandberg/facenet.

The outputs of the original tensorflow models and the pytorch-ported models have been tested for equivalence and are identical:


>>> compare_model_outputs(mdl, sess, torch.randn(5, 160, 160, 3).detach())

Passing test data through TF model

tensor([[-0.0142,  0.0615,  0.0057,  ...,  0.0497,  0.0375, -0.0838],
        [-0.0139,  0.0611,  0.0054,  ...,  0.0472,  0.0343, -0.0850],
        [-0.0238,  0.0619,  0.0124,  ...,  0.0598,  0.0334, -0.0852],
        [-0.0089,  0.0548,  0.0032,  ...,  0.0506,  0.0337, -0.0881],
        [-0.0173,  0.0630, -0.0042,  ...,  0.0487,  0.0295, -0.0791]])

Passing test data through PT model

tensor([[-0.0142,  0.0615,  0.0057,  ...,  0.0497,  0.0375, -0.0838],
        [-0.0139,  0.0611,  0.0054,  ...,  0.0472,  0.0343, -0.0850],
        [-0.0238,  0.0619,  0.0124,  ...,  0.0598,  0.0334, -0.0852],
        [-0.0089,  0.0548,  0.0032,  ...,  0.0506,  0.0337, -0.0881],
        [-0.0173,  0.0630, -0.0042,  ...,  0.0487,  0.0295, -0.0791]],
       grad_fn=<DivBackward0>)

Distance 1.2874517096861382e-06

In order to re-run the conversion of tensorflow parameters into the pytorch model, ensure you clone this repo with submodules, as the davidsandberg/facenet repo is included as a submodule and parts of it are required for the conversion.

References

  1. David Sandberg's facenet repo: https://github.com/davidsandberg/facenet

  2. F. Schroff, D. Kalenichenko, J. Philbin. FaceNet: A Unified Embedding for Face Recognition and Clustering, arXiv:1503.03832, 2015. PDF

  3. Q. Cao, L. Shen, W. Xie, O. M. Parkhi, A. Zisserman. VGGFace2: A dataset for recognising faces across pose and age, International Conference on Automatic Face and Gesture Recognition, 2018. PDF

  4. D. Yi, Z. Lei, S. Liao and S. Z. Li. CASIAWebface: Learning Face Representation from Scratch, arXiv:1411.7923, 2014. PDF

  5. K. Zhang, Z. Zhang, Z. Li and Y. Qiao. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks, IEEE Signal Processing Letters, 2016. PDF

Comments
  • memory leak

    Hello, I'm facing a memory leak and I can't find out why. I am simply looping through a lot of images and it gradually fills all my memory. This is my setup (facenet-pytorch==2.0.1):

    from facenet_pytorch import MTCNN, InceptionResnetV1
    from PIL import Image

    mtcnn = MTCNN(image_size=64, keep_all=True)
    resnet = InceptionResnetV1(pretrained='vggface2').eval()
    
    for nth, img_path in enumerate(img_paths):
        img = Image.open(img_path.resolve())
        boxes, probs = mtcnn.detect(img)
    
    opened by marisancans 17
  • Loading TorchScript Module : class method not recognized during compilation

    I know it's not strictly related to the facenet-pytorch library, but I hope that maybe you or others can give me some help.

    My objective is to use the following class which simply uses your excellent work:

    from facenet_pytorch import MTCNN, InceptionResnetV1
    import torch
    from torch.utils.data import DataLoader
    from torchvision import datasets
    import numpy as np
    import pandas as pd
    import os
    
    workers = 0 if os.name == 'nt' else 4
    
    def collate_fn(x):
        return x[0]
    
    def describe(x):
        print("Type: {}".format(x.type()))
        print("Shape/size: {}".format(x.shape))
        print("Values: \n{}".format(x))
    
    class GetFaceEmbedding(torch.nn.Module):
        def __init__(self):
            super(GetFaceEmbedding, self).__init__()
    
        @classmethod
        def getFaceEmbedding(self, imagePath):
    
            device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
            print('Running on device: {}'.format(device))
    
            mtcnn = MTCNN(
                image_size=160, margin=0, min_face_size=20,
                thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True,
                device=device
            )
            resnet = InceptionResnetV1(pretrained='vggface2').eval().to(device)
            dataset = datasets.ImageFolder(imagePath)
            dataset.idx_to_class = {i:c for c, i in dataset.class_to_idx.items()}
            loader = DataLoader(dataset, collate_fn=collate_fn, num_workers=workers)
            aligned = []
            names = []
            for x, y in loader:
                x_aligned, prob = mtcnn(x, return_prob=True)
                if x_aligned is not None:
                    print('Face detected with probability: {:8f}'.format(prob))
                    aligned.append(x_aligned)
                    names.append(dataset.idx_to_class[y])
            aligned = torch.stack(aligned).to(device)
            embeddings = resnet(aligned).detach().cpu()
            return embeddings
    

    With python it works fine:

    (venv373) (base) [email protected]:~/PyTorchMatters/facenet_pytorch/examples$ python3 ./getFaceEmbedding-01.py
    Running on device: cpu
    Face detected with probability: 0.999430
    Type: torch.FloatTensor
    Shape/size: torch.Size([1, 512])
    Values: 
    tensor([[ 3.6307e-02, -8.8092e-02, -3.5002e-02, -8.2932e-02,  1.9032e-02,
              2.3228e-02,  2.4253e-02, -3.7844e-02, -6.8906e-02,  2.0351e-02,
             -6.7093e-02,  3.6181e-02, -2.5933e-02, -6.0015e-02,  2.6653e-02,
              9.4335e-02, -2.9241e-02, -2.8357e-02,  7.2207e-02, -3.7747e-02,
              6.3515e-03, -3.0220e-02, -2.4530e-02,  1.0004e-01,  6.6520e-02,
              ....
              3.2497e-02,  2.3421e-02, -5.3921e-02,  1.9589e-02, -2.8655e-03,
              1.3474e-02, -2.2743e-02,  3.2976e-02, -5.6658e-02,  2.0837e-02,
            -4.7152e-02, -6.5534e-02]])
    

    Following the indications found here: https://pytorch.org/tutorials/advanced/cpp_export.html

    I added to getFaceEmbedding.py these lines:

    my_module = GetFaceEmbedding()
    sm = torch.jit.script(my_module)
    sm.save("annotated_get_face_embedding.pt")
    

    I then saved the serialized file:

    (venv373) (base) [email protected]:~/PyTorchMatters/facenet_pytorch/examples$ python3 ./getFaceEmbedding.py
    
    -rw-r--r-- 1 marco marco 1,4K mar 19 18:52 annotated_get_face_embedding.pt
    

    And created this cpp file:

    (venv373) (base) [email protected]:~/PyTorchMatters/facenet_pytorch/examples$ nano faceEmbedding.cpp
    
    #include <torch/script.h>
    #include <iostream>
    #include <memory>
    #include <filesystem>
    
    int main(int argc, const char* argv[]) {
        //if(argc != 3) {
        //    std::cerr << "usage:usage: faceEmbedding <path-to-exported-script-module> <path-to-
       // image-file> \n";
        //    return -1;
        //}
    
      torch::jit::script::Module module;
      try {
          // Deserialize the ScriptModule from a file using torch::jit::load().
          module = torch::jit::load(argv[1]);
          std::filesystem::path imgPath = argv[2];
    
          // Execute the model and turn its output into a tensor
          at::Tensor output = module.getFaceEmbedding(imgPath).ToTensor();
      }
      catch (const c10::Error& e) {
          std::cerr << "error loading the model\n";
          return -1;
      }
      std::cout << "ok\n";
    } // end of main() function
    

    But during the compilation phase I get this error :

    (venv373) (base) [email protected]:~/PyTorchMatters/facenet_pytorch/examples$ mkdir build
    (venv373) (base) [email protected]:~/PyTorchMatters/facenet_pytorch/examples$ cd build
    (venv373) (base) [email protected]:~/PyTorchMatters/facenet_pytorch/examples/build$ cmake -DCMAKE_PREFIX_PATH=/home/marco/PyTorchMatters/libtorch ..
    -- The C compiler identification is GNU 9.2.1
    -- The CXX compiler identification is GNU 9.2.1
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Looking for pthread.h
    -- Looking for pthread.h - found
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
    -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
    -- Looking for pthread_create in pthreads
    -- Looking for pthread_create in pthreads - not found
    -- Looking for pthread_create in pthread
    -- Looking for pthread_create in pthread - found
    -- Found Threads: TRUE  
    -- Found torch: /home/marco/PyTorchMatters/libtorch/lib/libtorch.so  
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/marco/PyTorchMatters/facenet_pytorch/examples/build
    
    (venv373) (base) [email protected]:~/PyTorchMatters/facenet_pytorch/examples/build$ cmake --build . --config Release
    Scanning dependencies of target faceEmbedding
    [ 50%] Building CXX object CMakeFiles/faceEmbedding.dir/faceEmbedding.cpp.o
        /home/marco/PyTorchMatters/facenet_pytorch/examples/faceEmbedding.cpp: In function ‘int 
    main(int, const char**)’:
    /home/marco/PyTorchMatters/facenet_pytorch/examples/faceEmbedding.cpp:20:34: error: ‘struct 
    torch::jit::script::Module’ has no member named ‘getFaceEmbedding’
       20 |       at::Tensor output = module.getFaceEmbedding(imgPath).ToTensor();
          |                                  ^~~~~~~~~~~~~~~~
    CMakeFiles/faceEmbedding.dir/build.make:62: recipe for target 'CMakeFiles/faceEmbedding.dir
    /faceEmbedding.cpp.o' failed
    make[2]: *** [CMakeFiles/faceEmbedding.dir/faceEmbedding.cpp.o] Error 1
    CMakeFiles/Makefile2:75: recipe for target 'CMakeFiles/faceEmbedding.dir/all' failed
    make[1]: *** [CMakeFiles/faceEmbedding.dir/all] Error 2
    Makefile:83: recipe for target 'all' failed
    make: *** [all] Error 2
    

    How to solve the problem? Looking forward to your kind help. Marco

    opened by marcoippolito 12
  • how to setup the pretrained model locally?

    The internet is too slow to download them and frequently disconnects.

    I could get the pretrained model files some other way.

    So how should I set them up locally?

    Thank you!

    opened by flydragon2018 7
  • how to upload my own dataset ?

    Thanks for this repo. I have a problem: how do I use my own dataset with the resnet? Are Inception Resnet and resnet the same thing? I'm new to NNs, thanks for your advice.

    opened by jordan-bys 6
  • VGGFace2 fine-tune: Poor training accuracy on CPU

    I'm using this code to fine-tune the model, but I'm getting very poor training accuracy. Could you please point out what the issue could be?

    # define mtcnn
    mtcnn = MTCNN(
        image_size=160, margin=0, min_face_size=20,
        thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True,
        device=device
    )
    
    # load data
    dataset = datasets.ImageFolder(data_dir, transform=transforms.Resize((512, 512)))
    
    dataset.samples = [
        (p, p.replace(data_dir, data_dir + '_cropped'))
            for p, _ in dataset.samples
    ]
    
    loader = DataLoader(
        dataset,
        num_workers=workers,
        batch_size=batch_size,
        collate_fn=training.collate_pil
    )
    
    print('applying mtcnn...')
    
    # apply mtcnn
    for i, (x, y) in enumerate(loader):
        mtcnn(x, save_path=y)
        print('\rBatch {} of {}'.format(i + 1, len(loader)), end='')
        
    # Remove mtcnn to reduce GPU memory usage
    del mtcnn
    
    print('')
    print('starting training...')
    
    print(len(dataset.class_to_idx))
    
    resnet = InceptionResnetV1(
        classify=False,
        pretrained='vggface2',
        num_classes=len(dataset.class_to_idx)
    ).to(device)
    
    optimizer = optim.Adam(resnet.parameters(), lr=0.001)
    scheduler = MultiStepLR(optimizer, [5, 10])
    
    trans = transforms.Compose([
        np.float32,
        transforms.ToTensor(),
        fixed_image_standardization
    ])
    
    dataset = datasets.ImageFolder(data_dir + '_cropped', transform=trans)
    img_inds = np.arange(len(dataset))
    np.random.shuffle(img_inds)
    train_inds = img_inds[:int(0.8 * len(img_inds))]
    val_inds = img_inds[int(0.8 * len(img_inds)):]
    
    train_loader = DataLoader(
        dataset,
        num_workers=workers,
        batch_size=batch_size,
        sampler=SubsetRandomSampler(train_inds)
    )
    
    val_loader = DataLoader(
        dataset,
        num_workers=workers,
        batch_size=batch_size,
        sampler=SubsetRandomSampler(val_inds)
    )
    
    loss_fn = torch.nn.CrossEntropyLoss()
    metrics = {
        'fps': training.BatchTimer(),
        'acc': training.accuracy
    }
    
    writer = SummaryWriter()
    writer.iteration, writer.interval = 0, 10
    
    print('\n\nInitial')
    print('-' * 10)
    resnet.eval()
    training.pass_epoch(
        resnet, loss_fn, val_loader,
        batch_metrics=metrics, show_running=True, device=device,
        writer=writer
    )
    
    for epoch in range(epochs):
        print('\nEpoch {}/{}'.format(epoch + 1, epochs))
        print('-' * 10)
    
        resnet.train()
        training.pass_epoch(
            resnet, loss_fn, train_loader, optimizer, scheduler,
            batch_metrics=metrics, show_running=True, device=device,
            writer=writer
        )
    
        resnet.eval()
        training.pass_epoch(
            resnet, loss_fn, val_loader,
            batch_metrics=metrics, show_running=True, device=device,
            writer=writer
        )
    
    writer.close()
    
    # save trained model
    torch.save(resnet.state_dict(), trained_model)
    

    and this is the accuracy I'm getting after 8 epochs:

    Initial

    Valid | 1/1 | loss: 6.2234 | fps: 1.4201 | acc: 0.0000

    Epoch 1/8

    Train | 1/1 | loss: 6.2504 | fps: 0.8825 | acc: 0.0000
    Valid | 1/1 | loss: 6.2631 | fps: 1.5254 | acc: 0.0000

    Epoch 2/8

    Train | 1/1 | loss: 6.1771 | fps: 1.3746 | acc: 0.0000
    Valid | 1/1 | loss: 6.2687 | fps: 1.3014 | acc: 0.0000

    Epoch 3/8

    Train | 1/1 | loss: 6.1627 | fps: 1.0087 | acc: 0.2500
    Valid | 1/1 | loss: 6.2471 | fps: 1.5138 | acc: 0.0000

    Epoch 4/8

    Train | 1/1 | loss: 6.1570 | fps: 1.0297 | acc: 0.2500
    Valid | 1/1 | loss: 6.2371 | fps: 1.8226 | acc: 0.0000

    Epoch 5/8

    Train | 1/1 | loss: 6.1445 | fps: 0.8727 | acc: 0.5000
    Valid | 1/1 | loss: 6.2335 | fps: 0.9244 | acc: 0.0000

    Epoch 6/8

    Train | 1/1 | loss: 6.1274 | fps: 1.1550 | acc: 0.5000
    Valid | 1/1 | loss: 6.2234 | fps: 1.6978 | acc: 0.0000

    Epoch 7/8

    Train | 1/1 | loss: 6.1252 | fps: 1.4416 | acc: 0.2500
    Valid | 1/1 | loss: 6.1999 | fps: 1.7895 | acc: 0.0000

    Epoch 8/8

    Train | 1/1 | loss: 6.1179 | fps: 1.4245 | acc: 0.5000
    Valid | 1/1 | loss: 6.1874 | fps: 1.6070 | acc: 0.0000

    opened by AliAmjad 5
  • InceptionResnetV1: UnpicklingError: invalid load key, '<'.

    Hi! This repo is one of the best libraries to perform the task. I have been using your package for the last few weeks on a daily basis without encountering any issues. However, since the last few hours (3 July 2020 GMT+5:30), instantiating the resnet model with pretrained weights (both vggface2 and casia-webface) in the following way: resnet = InceptionResnetV1(pretrained='vggface2') has generated the following error on every run:

    ---------------------------------------------------------------------------
    UnpicklingError                           Traceback (most recent call last)
    <ipython-input-76-a829445a43a7> in <module>()
    ----> 1     resnet = InceptionResnetV1(pretrained='vggface2')
    
    3 frames
    /usr/local/lib/python3.6/dist-packages/facenet_pytorch/models/inception_resnet_v1.py in __init__(self, pretrained, classify, num_classes, dropout_prob, device)
        259 
        260         if pretrained is not None:
    --> 261             load_weights(self, pretrained)
        262 
        263         if self.num_classes is not None:
    
    /usr/local/lib/python3.6/dist-packages/facenet_pytorch/models/inception_resnet_v1.py in load_weights(mdl, name)
        334             with open(cached_file, 'wb') as f:
        335                 f.write(r.content)
    --> 336         state_dict.update(torch.load(cached_file))
        337 
        338     mdl.load_state_dict(state_dict)
    
    /usr/local/lib/python3.6/dist-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
        591                     return torch.jit.load(f)
        592                 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
    --> 593         return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
        594 
        595 
    
    /usr/local/lib/python3.6/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
        761             "functionality.".format(type(f)))
        762 
    --> 763     magic_number = pickle_module.load(f, **pickle_load_args)
        764     if magic_number != MAGIC_NUMBER:
        765         raise RuntimeError("Invalid magic number; corrupt file?")
    
    UnpicklingError: invalid load key, '<'.
    

    My guess is that the pretrained model files were changed/corrupted. Please have a look at it.

    opened by deeox 5
  • Use it as a feature extractor

    Hi Tim, I would like to use this facenet as a feature extractor, i.e., delete the last few FC layers.

    My code:

    model = InceptionResnetV1(pretrained='vggface2', num_classes=8, classify=True).to(device)
    print(model)

    Result of the last few layers:

    (avgpool_1a): AdaptiveAvgPool2d(output_size=1)
    (dropout): Dropout(p=0.6, inplace=False)
    (last_linear): Linear(in_features=1792, out_features=512, bias=False)
    (last_bn): BatchNorm1d(512, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
    (logits): Linear(in_features=512, out_features=8, bias=True)

    Then I delete the last FC and BN, so I do the following:

    model = nn.Sequential(*list(model.children())[:-2])
    print("After Delete")
    print(model)

    Result of the last few layers:

    (13): AdaptiveAvgPool2d(output_size=1)
    (14): Dropout(p=0.6, inplace=False)
    (15): Linear(in_features=1792, out_features=512, bias=False)

    Yes. This architecture is what I want.

    But if I want to use these pretrained weights to get a 512-d output, I get the following error:

    File "/home/xxx/anaconda3/envs/face_env/lib/python3.7/site/packages/torch/nn/functional.py", line 1371, in linear
        output = input.matmul(weight.t())
    RuntimeError: size mismatch, m1: [3584 x 1], m2: [1792 x 512] at /opt/conda/conda-bld/pytorch_1565272271120/work/aten/src/THC/generic/THCTensorMathBlas.cu:273

    opened by FrankXinqi 5
  • Add methods for model training

    The repo currently only includes models (MTCNN and InceptionResnetV1). It would be good to add code for updating pretrained models given new data or different training hyperparameters.

    enhancement 
    opened by timesler 5
  • TypeError: len() of unsized object

    I get this error when I read the image with PIL and even cv2.

    Traceback (most recent call last):
      File "preprocessing_siwm.py", line 81, in <module>
        mtcnn_caller(new_testset_path, image_folder+'/',image)
      File "preprocessing_siwm.py", line 49, in mtcnn_caller
        img_cropped = mtcnn(img, save_path=path+'spoof/'+image+'/'+image+'.png')
      File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/facenet_pytorch/models/mtcnn.py", line 262, in forward
        batch_boxes, batch_probs, batch_points, img, method=self.selection_method
      File "/usr/local/lib/python3.6/dist-packages/facenet_pytorch/models/mtcnn.py", line 408, in select_boxes
        if len(boxes) == 0:
    TypeError: len() of unsized object

    I tried checking whether the image existed using "if img is not None:", but that doesn't help. Please guide me if I'm missing something. Thanks a lot.

    opened by GayatriPurandharT 4
  • Compatibility for ndarray batches of images

    It was not possible to pass the mtcnn forward method a batch of np ndarray images (a 4D array) stacked on the first dimension, even though this is described as possible in the documentation and already works with the detection methods. I propose a small change to make it possible. I have tested it against a batch of np ndarray images, a list of ndarray images, a list of PIL images, and single PIL and np ndarray images, and it is working. Great package, thank you!

    opened by mawanda-jun 4
  • Fine-tuning & pre-whitening

    Thanks @timesler for the great work! I was trying to fine-tune on a small dataset and noticed that the network overfits very quickly. I think it's useful to train only the last linear layer ("logits" in your code). This worked fine for me:

    optimizer = optim.Adam(net.module.logits.parameters())
    

    Unrelated-ly, the pre-whiten transform may be screwing up the identification a bit as it normalizes each face using its own mean & std-dev. In one of my test videos, an African guy is identified as a white guy. So using a standard pre-processing for all faces may be a good idea. I see there's a related pull-request. Maybe I will experiment a bit myself next week.

    Thanks again.

    opened by cbasavaraj 4
  • 「...'aten::nonzero' is not currently supported...」on M1

    Problem: multiple errors occurred when I try to run the example from help(MTCNN) on my MacBook Air M1.

    Env: macOS Monterey 12.6, MacBook Air (M1, 2020), 8GB RAM, iBoot 7459.141.1

    conda 4.13.0 Python 3.8.13 # packages in environment at /opt/homebrew/anaconda3/envs/pytorch: # # Name Version Build Channel bzip2 1.0.8 h3422bc3_4 conda-forge ca-certificates 2022.9.24 h4653dfc_0 conda-forge certifi 2022.9.24 pypi_0 pypi charset-normalizer 2.1.1 pypi_0 pypi facenet-pytorch 2.5.2 pypi_0 pypi idna 3.4 pypi_0 pypi libffi 3.4.2 h3422bc3_5 conda-forge libsqlite 3.39.4 h76d750c_0 conda-forge libzlib 1.2.12 h03a7124_4 conda-forge ncurses 6.3 h07bb92c_1 conda-forge numpy 1.23.4 pypi_0 pypi opencv-python 4.6.0.66 pypi_0 pypi openssl 3.0.5 h03a7124_2 conda-forge pillow 9.2.0 pypi_0 pypi pip 22.2.2 pyhd8ed1ab_0 conda-forge python 3.8.13 hd3575e6_0_cpython conda-forge readline 8.1.2 h46ed386_0 conda-forge requests 2.28.1 pypi_0 pypi setuptools 65.4.1 pyhd8ed1ab_0 conda-forge sqlite 3.39.4 h2229b38_0 conda-forge tk 8.6.12 he1e0b03_0 conda-forge torch 1.14.0.dev20221012 pypi_0 pypi torchvision 0.15.0.dev20221012 pypi_0 pypi typing-extensions 4.4.0 pypi_0 pypi urllib3 1.26.12 pypi_0 pypi wheel 0.37.1 pyhd8ed1ab_0 conda-forge xz 5.2.6 h57fd34a_0 conda-forge

    Result:

    (pytorch) Running on device: mps
    /opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/facenet_pytorch/models/utils/detect_face.py:210: UserWarning: The operator 'aten::nonzero' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
      mask_inds = mask.nonzero()
    Traceback (most recent call last):
      File "facenetMTCNN_example.py", line 13, in <module>
        boxes, probs, points = mtcnn.detect(img, landmarks=True)
      File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/facenet_pytorch/models/mtcnn.py", line 313, in detect
        batch_boxes, batch_points = detect_face(
      File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/facenet_pytorch/models/utils/detect_face.py", line 79, in detect_face
        pick = batched_nms(boxes_scale[:, :4], boxes_scale[:, 4], image_inds_scale, 0.5)
      File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 75, in batched_nms
        return _batched_nms_coordinate_trick(boxes, scores, idxs, iou_threshold)
      File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/jit/_trace.py", line 1136, in wrapper
        return fn(*args, **kwargs)
      File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 94, in _batched_nms_coordinate_trick
        keep = nms(boxes_for_nms, scores, iou_threshold)
      File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 41, in nms
        return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
      File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_ops.py", line 442, in __call__
        return self._op(*args, **kwargs or {})
    NotImplementedError: The operator 'torchvision::nms' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

    Here's the code:

    from PIL import Image, ImageDraw
    from facenet_pytorch import MTCNN, extract_face
    import cv2
    import torch

    device = 'mps' if torch.backends.mps.is_available() and torch.backends.mps.is_built() else 'cpu'
    print("Running on device: {}".format(device))

    #cap = cv2.VideoCapture(0)
    img = Image.open('./photos/Chris/16871.jpg')

    mtcnn = MTCNN(keep_all=True, device=device)
    boxes, probs, points = mtcnn.detect(img, landmarks=True)
    print(boxes, probs, points)

    img_draw = img.copy()
    draw = ImageDraw.Draw(img_draw)
    for i, (box, point) in enumerate(zip(boxes, points)):
        draw.rectangle(box.tolist(), width=5)
        for p in point:
            draw.rectangle((p - 10).tolist() + (p + 10).tolist(), width=10)
        extract_face(img, box, save_path='detected_face_{}.png'.format(i))
    img_draw.save('annotated_faces.png')

    opened by changchiyou 0
  • ONNX conversion

    Has anybody successfully converted the MTCNN to ONNX? Keep getting the error:

    ~\Anaconda3\envs\facenet\lib\site-packages\facenet_pytorch\models\utils\detect_face.py in detect_face(imgs, minsize, pnet, rnet, onet, threshold, factor, device)
         81         offset += boxes_scale.shape[0]
         82 
    ---> 83     boxes = torch.cat(boxes, dim=0)
         84     image_inds = torch.cat(image_inds, dim=0)
         85 
    
    RuntimeError: torch.cat(): expected a non-empty list of Tensors
    
    opened by Fritskee 0
  • output embeddings dimension is weird

    My input is torch.Size([1, 3, 160, 160]). Why are the output dimensions torch.Size([1, 1792, 3, 3]) and not 512?

    I initialize the model like this - resnet = InceptionResnetV1('vggface2').eval()

    opened by GiilDe 0
  • Unexpected EOF Error for Resnet

    Running the following code produces an "Unexpected end of file" error:

    resnet = InceptionResnetV1(pretrained='vggface2').eval()

    Traceback (most recent call last):
      File "<pyshell#4>", line 1, in <module>
        resnet = InceptionResnetV1(pretrained='vggface2').eval()
      File "C:\Users\AVezey\AppData\Local\Programs\Python\Python310\lib\site-packages\facenet_pytorch\models\inception_resnet_v1.py", line 262, in __init__
        load_weights(self, pretrained)
      File "C:\Users\AVezey\AppData\Local\Programs\Python\Python310\lib\site-packages\facenet_pytorch\models\inception_resnet_v1.py", line 329, in load_weights
        state_dict = torch.load(cached_file)
      File "C:\Users\AVezey\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 713, in load
        return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
      File "C:\Users\AVezey\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 938, in _legacy_load
        typed_storage._storage._set_from_file(
    RuntimeError: unexpected EOF, expected 386827 more bytes. The file might be corrupted.

    opened by avezey-ci 0
  • numpy throwing deprecation warning for creating ndarray from nested s…

    numpy throwing deprecation warning for creating ndarray from nested s…

    Numpy is throwing deprecation warnings for creating arrays from nested sequences on line 183 of detect_face.py and on lines 339, 340, 341 of mtcnn.py when running the example training script. The fix is to pass 'dtype=object' as a parameter when creating the ndarray. E.g., on line 339 of mtcnn.py np.array(boxes) becomes np.array(boxes, dtype=object)

    opened by MarkhamLee 0