Deep Learning Slide Captcha

Overview

滑动验证码深度学习识别

本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。

只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例:

克隆项目

运行命令:

git clone https://github.com/Python3WebSpider/DeepLearningSlideCaptcha.git

数据准备

使用 LabelImg 工具标注自行标注一批数据,大约 200 张以上即可训练出不错的效果。

LabelImg:https://github.com/tzutalin/labelImg

标注要求:

  • 圈出验证码目标滑块区域的完整完整矩形,无需标注源滑块。
  • 目标矩形命名为 target 这个类别。
  • 建议使用 LabelImg 的快捷键提高标注效率。

环境准备

建议在 GPU 环境和虚拟 Python 环境下执行如下命令:

pip3 install -r requirements.txt

预训练模型下载

YOLOV3 的训练要加载预训练模型才能有不错的训练效果,预训练模型下载:

bash prepare.sh

下载完成之后会在 weights 文件夹下出现模型权重文件,供训练使用。

训练

本项目已经提供了标注好的数据集,在 data/captcha,可以直接使用。

如果要训练自己的数据,数据格式准备见:https://github.com/eriklindernoren/PyTorch-YOLOv3#train-on-custom-dataset

当前数据训练脚本:

bash train.sh

实测 P100 训练时长约 15 秒一个 epoch,大约几分钟即可训练出较好效果。

测试

训练完毕之后会在 checkpoints 文件夹生成 pth 文件,可直接使用模型来预测生成标注结果。

此时 checkpoints 文件夹会生成训练好的 pth 文件。

当前数据测试脚本:

sh detect.sh

该脚本会读取 captcha 下的 test 文件夹所有图片,并将处理后的结果输出到 test 文件夹。

运行结果样例:

Performing object detection:
        + Batch 0, Inference Time: 0:00:00.044223
        + Batch 1, Inference Time: 0:00:00.028566
        + Batch 2, Inference Time: 0:00:00.029764
        + Batch 3, Inference Time: 0:00:00.032430
        + Batch 4, Inference Time: 0:00:00.033373
        + Batch 5, Inference Time: 0:00:00.027861
        + Batch 6, Inference Time: 0:00:00.031444
        + Batch 7, Inference Time: 0:00:00.032110
        + Batch 8, Inference Time: 0:00:00.029131

Saving images:
(0) Image: 'data/captcha/test/captcha_4497.png'
        + Label: target, Conf: 0.99999
(1) Image: 'data/captcha/test/captcha_4498.png'
        + Label: target, Conf: 0.99999
(2) Image: 'data/captcha/test/captcha_4499.png'
        + Label: target, Conf: 0.99997
(3) Image: 'data/captcha/test/captcha_4500.png'
        + Label: target, Conf: 0.99999
(4) Image: 'data/captcha/test/captcha_4501.png'
        + Label: target, Conf: 0.99997
(5) Image: 'data/captcha/test/captcha_4502.png'
        + Label: target, Conf: 0.99999
(6) Image: 'data/captcha/test/captcha_4503.png'
        + Label: target, Conf: 0.99997
(7) Image: 'data/captcha/test/captcha_4504.png'
        + Label: target, Conf: 0.99998
(8) Image: 'data/captcha/test/captcha_4505.png'
        + Label: target, Conf: 0.99998

样例结果:

协议

本项目基于开源 GNU 协议 ,另外本项目不提供任何有关滑动轨迹相关模拟和 JavaScript 逆向分析方案。

本项目仅供学习交流使用,请勿用于非法用途,本人不承担任何法律责任。

如有侵权请联系个人删除,谢谢。

Owner
Python3WebSpider
Python3WebSpider
Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation"

Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation", if you find this useful and use

57 Dec 27, 2022
This is a repository with the code for the ACL 2019 paper

The Story of Heads This is the official repo for the following papers: (ACL 2019) Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy

231 Nov 15, 2022
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences an

Microsoft 8k Jan 04, 2023
TalkingHead-1KH is a talking-head dataset consisting of YouTube videos

TalkingHead-1KH Dataset TalkingHead-1KH is a talking-head dataset consisting of YouTube videos, originally created as a benchmark for face-vid2vid: On

173 Dec 29, 2022
This is the offical website for paper ''Category-consistent deep network learning for accurate vehicle logo recognition''

The Pytorch Implementation of Category-consistent deep network learning for accurate vehicle logo recognition This is the offical website for paper ''

Wanglong Lu 28 Oct 29, 2022
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

SLM: Structural Language Models of Code This is an official implementation of the model described in: "Structural Language Models of Code" [PDF] To ap

73 Nov 06, 2022
a reimplementation of Holistically-Nested Edge Detection in PyTorch

pytorch-hed This is a personal reimplementation of Holistically-Nested Edge Detection [1] using PyTorch. Should you be making use of this work, please

Simon Niklaus 375 Dec 06, 2022
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

AudioCLIP Extending CLIP to Image, Text and Audio This repository contains implementation of the models described in the paper arXiv:2106.13043. This

458 Jan 02, 2023
RIM: Reliable Influence-based Active Learning on Graphs.

RIM: Reliable Influence-based Active Learning on Graphs. This repository is the official implementation of RIM. Requirements To install requirements:

Wentao Zhang 4 Aug 29, 2022
Attention Probe: Vision Transformer Distillation in the Wild

Attention Probe: Vision Transformer Distillation in the Wild Jiahao Wang, Mingdeng Cao, Shuwei Shi, Baoyuan Wu, Yujiu Yang In ICASSP 2022 This code is

Wang jiahao 3 Oct 31, 2022
Style transfer between images was performed using the VGG19 model

Style transfer between images was performed using the VGG19 model. The necessary codes, libraries and all other information of this project are available below

Onur yılmaz 2 May 09, 2022
Point Cloud Registration Network

PCRNet: Point Cloud Registration Network using PointNet Encoding Source Code Author: Vinit Sarode and Xueqian Li Paper | Website | Video | Pytorch Imp

ViNiT SaRoDe 59 Nov 19, 2022
A real-time speech emotion recognition application using Scikit-learn and gradio

Speech-Emotion-Recognition-App A real-time speech emotion recognition application using Scikit-learn and gradio. Requirements librosa==0.6.3 numpy sou

Son Tran 6 Oct 04, 2022
A denoising autoencoder + adversarial losses and attention mechanisms for face swapping.

faceswap-GAN Adding Adversarial loss and perceptual loss (VGGface) to deepfakes'(reddit user) auto-encoder architecture. Updates Date Update 2018-08-2

3.2k Dec 30, 2022
VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.

What's New Below we share, in reverse chronological order, the updates and new releases in VISSL. All VISSL releases are available here. [Oct 2021]: V

Meta Research 2.9k Jan 07, 2023
TLDR: Twin Learning for Dimensionality Reduction

TLDR (Twin Learning for Dimensionality Reduction) is an unsupervised dimensionality reduction method that combines neighborhood embedding learning with the simplicity and effectiveness of recent self

NAVER 105 Dec 28, 2022
Implementation for "Manga Filling Style Conversion with Screentone Variational Autoencoder" (SIGGRAPH ASIA 2020 issue)

Manga Filling with ScreenVAE SIGGRAPH ASIA 2020 | Project Website | BibTex This repository is for ScreenVAE introduced in the following paper "Manga F

30 Dec 24, 2022
Implementation of gaze tracking and demo

Predicting Customer Demand by Using Gaze Detecting and Object Tracking This project is the integration of gaze detecting and object tracking. Predict

2 Oct 20, 2022
Unrolled Generative Adversarial Networks

Unrolled Generative Adversarial Networks Luke Metz, Ben Poole, David Pfau, Jascha Sohl-Dickstein arxiv:1611.02163 This repo contains an example notebo

Ben Poole 292 Dec 06, 2022
Minimal diffusion models - Minimal code and simple experiments to play with Denoising Diffusion Probabilistic Models (DDPMs)

Minimal code and simple experiments to play with Denoising Diffusion Probabilist

Rithesh Kumar 16 Oct 06, 2022