ThaiPersonalCardExtract

Library for extracting information from Thai personal identity cards, implemented with easyocr and tesseract.

New Feature v1.3.2 🎁

  • Increase performance.
  • Support Thai Government Lottery: extracts data from lottery tickets; works well with images produced by a scanner (16 Aug. 2021)
  • Refactor Output Structure.
  • Support Thai Driving License (Beta): can extract data from driving license photos in some formats only. The Department of Land Transport issues cards in many formats, each with the data in different positions, so accuracy is low.

Examples

Example image file.

[Images: three real card photos]

warpPerspective image crop.

[Images: two perspective-corrected card crops]

Keypoints detected in the image.

[Image: detected keypoints]

Results: regions of interest extracted by the library

Identification Number

FullNameTH

NameEN

LastNameEN

BirthdayTH

BirthdayEN

Religion

Address

DateOfIssueTH

DateOfIssueEN

DateOfExpiryTH

DateOfExpiryEN

Recommendations

  • Image resolution should be at least 600x350.
  • Use images with minimal reflections for good results.
  • The identity card should fill about 75% of the image; if it does not, crop the image so that only the card area is left.
  • For faster processing, resize the image and use a CUDA GPU.
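
The size recommendations above can be checked before an image is handed to OCR. The following is a minimal sketch, not part of the library: the helper names `meets_minimum_size` and `downscaled_size`, the orientation-agnostic check, and the 1200-pixel cap are all assumptions made for illustration. The functions work on plain width/height values, so any image library can supply them.

```python
from __future__ import annotations

MIN_LONG_SIDE, MIN_SHORT_SIDE = 600, 350  # minimum size recommended above


def meets_minimum_size(width: int, height: int) -> bool:
    """Return True if the image is at least 600x350.

    Assumption: either orientation is accepted (landscape or portrait).
    """
    long_side, short_side = max(width, height), min(width, height)
    return long_side >= MIN_LONG_SIDE and short_side >= MIN_SHORT_SIDE


def downscaled_size(width: int, height: int, max_side: int = 1200) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side is at most max_side.

    The aspect ratio is preserved and images are never enlarged.
    max_side=1200 is an arbitrary illustrative cap, not a library setting.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(meets_minimum_size(4000, 3000))
    print(downscaled_size(4000, 3000))
```

The computed size can then be passed to whatever resize routine you already use before calling the reader.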

Installation

Install the stable release with pip:

pip install thai-personal-card-extract

For the latest development release:

pip install git+https://github.com/ggafiled/ThaiPersonalCardExtract.git

Note 1: On Windows, install tesseract first by following the instructions in this Medium article: https://medium.com/@navapat.tpb/734dae2fb4d3 Make sure the setup is complete before using the library.

Note 2: On Linux, install tesseract by following the official instructions at https://github.com/tesseract-ocr/tesseract
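
Before constructing a reader, it can help to confirm that the tesseract executable is actually reachable. The sketch below uses only the Python standard library; `find_tesseract` is a hypothetical helper, not part of ThaiPersonalCardExtract, and the Windows path is simply the one used in the usage examples of this README.

```python
import os
import shutil
from typing import Optional


def find_tesseract(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return a usable path to the tesseract executable, or None.

    explicit_path is checked first (with and without a .exe suffix),
    then the system PATH is searched.
    """
    if explicit_path:
        for candidate in (explicit_path, explicit_path + ".exe"):
            if os.path.isfile(candidate):
                return candidate
    return shutil.which("tesseract")


if __name__ == "__main__":
    path = find_tesseract("D:/Program Files/Tesseract-OCR/tesseract")
    if path is None:
        print("tesseract not found; install it before using the library.")
    else:
        print("Using tesseract at", path)
```

The returned path is the kind of value the `tesseract_cmd` option expects on Windows.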

Usage

# With built-in config options.

import ThaiPersonalCardExtract as card
reader = card.PersonalCard(
    lang=card.THAI,
    provider=card.DEFAULT,
    tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract",
    save_extract_result=True,
    path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract")
result = reader.extractInfo('examples/card.jpg')
print(result)


# Free-style: example of using the PersonalCard class to extract data from a national identity card

from ThaiPersonalCardExtract import PersonalCard
# On Windows, pass tesseract_cmd to point to your tesseract command path.
reader = PersonalCard(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract")
result = reader.extractInfo('examples/card.jpg')
print(result)


# Free-style: example of using the DrivingLicense class to extract data from a driving license

from ThaiPersonalCardExtract import DrivingLicense
# On Windows, pass tesseract_cmd to point to your tesseract command path.
reader = DrivingLicense(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract")
result = reader.extractInfo('examples/card.jpg')
print(result)


# Free-style: example of using the ThaiGovernmentLottery class to extract data from a lottery ticket

from ThaiPersonalCardExtract import ThaiGovernmentLottery
# On Windows, you also need to pass tesseract_cmd with your tesseract command path.
reader = ThaiGovernmentLottery(save_extract_result=True, path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract/thai_government_lottery")
result = reader.extractInfo("../examples/card7.jpg")
print(result)

The output is a namedtuple; each field holds one value that the library was able to extract. See the Python documentation for collections.namedtuple to learn more about working with it.

#Output of PersonalCard
    Card(Identification_Number='9999999999999', FullNameTH='นาย อายุมฺมุราเสะ', PrefixTH='นาย', NameTH='อายุมฺมุราเสะ', LastNameTH='อายุมฺมุราเสะ', PrefixEN='.Mr.Shoyo', NameEN='', LastNameEN='Hinatao', BirthdayTH='21 มี.ย. 2539', BirthdayEN='21 Jun..1996', Religion='พุทธ', Address='ท8ปฺ` 99/1 มิซีโฮะ เขตฮานามิกาวา อำเภอชิบ', DateOfIssueTH='11 ส.ค. 2554', DateOfIssueEN='11 Ang. 2021', DateOfExpiryTH='11 ส.ค. 2574', DateOfExpiryEN='11 Aug. 2031,')

#Output of DrivingLicense
    Card(License_Number='98765432', IssueDateTH='ผังทาทม', ExpiryDateTH='', IssueDateEN='14 August 2664', ExpiryDateEN='14 August 2574', NameTH='า? โนบกะ โนบี', NameEN='MRONOREAUMANE', BirthDayTH='', BirthDayEN='wa hs OKRA', Identity_Number='', Province='นคาราชศีมา')

#Output of ThaiGovernmentLottery
    Lottery(LotteryNumber='424603', LessonNumber='08', SetNumber='23', Year='2564') #type namedtuple 
    
 Fields can be accessed like this:
 print(result.LotteryNumber)
 print(result.LessonNumber)
 print(result.SetNumber)
 print(result.Year)
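
Because the result is a standard namedtuple, the usual namedtuple tools apply. The sketch below builds a stand-in `Lottery` tuple with the field names shown in the sample output (it does not call the library) and shows attribute access, conversion to a dict, and positional unpacking.

```python
from collections import namedtuple

# Stand-in for the library's result type, using the field names from the sample output.
Lottery = namedtuple("Lottery", ["LotteryNumber", "LessonNumber", "SetNumber", "Year"])

result = Lottery(LotteryNumber="424603", LessonNumber="08", SetNumber="23", Year="2564")

# Access a field by attribute name.
print(result.LotteryNumber)

# Convert to a dict, for example before serializing to JSON.
as_dict = result._asdict()
print(as_dict["Year"])

# Unpack positionally, in field order.
number, lesson, set_number, year = result
print(number, lesson, set_number, year)
```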

To set the lang attribute to tha:

from ThaiPersonalCardExtract import PersonalCard
# On Windows, pass tesseract_cmd to point to your tesseract command path.
reader = PersonalCard(lang="tha", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract")
result = reader.extractInfo('examples/card.jpg')
print(result)

With lang set to tha, the output contains only the Thai-language fields, for example:

{
   "Identification_Number": "9999999999999",
   "FullNameTH": "นาย อายุมฺมุราเสะ",
   "PrefixTH": "นาย",
   "NameTH": "อายุมฺมุราเสะ",
   "LastNameTH": "อายุมฺมุราเสะ",
   "BirthdayTH": "21 มี.ย. 2539",
   "Religion": "พุทธ",
   "Address": "ท๒ 99/1 มิชีโฮะ เขตฮานามิกาวา อำเภอชิบ;",
   "DateOfIssueTH": "11 ส.ค. 2554",
   "DateOfExpiryTH": "11 ส.ค. 2574"
}
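
The Thai date fields use Buddhist Era (BE) years, which run 543 years ahead of the Gregorian calendar, so 2539 BE corresponds to 1996 CE. A minimal sketch of converting the year, assuming the year is the last four-digit number in the string; `buddhist_to_gregorian_year` is a hypothetical helper and not a library function:

```python
import re
from typing import Optional

BE_OFFSET = 543  # Buddhist Era year minus Gregorian year


def buddhist_to_gregorian_year(date_text: str) -> Optional[int]:
    """Return the Gregorian year for a Thai date string such as '21 มี.ย. 2539'.

    Only the year is converted; the day and month are ignored.
    Returns None when no four-digit year is found (for example, when OCR failed).
    """
    years = re.findall(r"(?<!\d)\d{4}(?!\d)", date_text)
    if not years:
        return None
    return int(years[-1]) - BE_OFFSET


if __name__ == "__main__":
    print(buddhist_to_gregorian_year("21 มี.ย. 2539"))
    print(buddhist_to_gregorian_year("11 ส.ค. 2574"))
```

Comparing the converted year with the matching English field is one cheap way to spot OCR errors in either field.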

You can also set the OCR provider to one of the following: default (uses both easyocr and tesseract; recommended), easyocr, or tesseract.

from ThaiPersonalCardExtract import PersonalCard
# On Windows, pass tesseract_cmd to point to your tesseract command path.
reader = PersonalCard(lang="tha", provider="default", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract")
result = reader.extractInfo('examples/card.jpg')
print(result)

Config Options

You can configure an instance with the following keyword arguments.

Parameter name       Value type  Description
lang                 String      Expected result language: mix (all fields, both Thai and English), tha, or eng. Default is 'mix'. Applies to DrivingLicense and PersonalCard.
provider             String      OCR provider: default (uses both easyocr and tesseract; recommended), easyocr, or tesseract. Default is 'default'. Applies to DrivingLicense and PersonalCard.
template_threshold   Double      Rate used to calculate template similarity. Default is 0.7.
sift_rate            Int         Feature keypoint rate. Default is 25,000.
tesseract_cmd        String      Path to your tesseract command. Windows only.
save_extract_result  Boolean     Set to True to save the extracted images. Default is False.
path_to_save         String      Path where the extracted images are saved; used together with save_extract_result=True.
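
The options above can be collected and validated before they reach the constructor. This is a sketch only: `ReaderOptions` is a hypothetical helper (not part of the library) whose defaults and allowed values are copied from the options listed above. The final constructor call is left as a comment because it needs the package and tesseract to be installed.

```python
from dataclasses import asdict, dataclass
from typing import Optional

ALLOWED_LANGS = {"mix", "tha", "eng"}
ALLOWED_PROVIDERS = {"default", "easyocr", "tesseract"}


@dataclass
class ReaderOptions:
    """Hypothetical container mirroring the documented keyword arguments."""

    lang: str = "mix"
    provider: str = "default"
    template_threshold: float = 0.7
    sift_rate: int = 25000
    tesseract_cmd: Optional[str] = None  # only needed on Windows
    save_extract_result: bool = False
    path_to_save: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject values outside the documented choices early.
        if self.lang not in ALLOWED_LANGS:
            raise ValueError(f"lang must be one of {sorted(ALLOWED_LANGS)}")
        if self.provider not in ALLOWED_PROVIDERS:
            raise ValueError(f"provider must be one of {sorted(ALLOWED_PROVIDERS)}")

    def as_kwargs(self) -> dict:
        """Return the options as keyword arguments, dropping unset (None) values."""
        return {key: value for key, value in asdict(self).items() if value is not None}


if __name__ == "__main__":
    options = ReaderOptions(lang="tha", provider="tesseract")
    print(options.as_kwargs())
    # reader = PersonalCard(**options.as_kwargs())  # requires the library
```

Validating up front turns a misspelled option into an immediate ValueError instead of a failure somewhere inside the OCR pipeline.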

Donate Me

promptpay

Mr.Nattapol Krobklang

Comments
  • 'PersonalCard' object has no attribute 'extractInfo'

    AttributeError Traceback (most recent call last)
    /var/folders/cc/d15mjmhn5c5fscqfwq3fql5h0000gp/T/ipykernel_6920/1182933257.py in
    ----> 1 result = reader.extractInfo('ThaiPersonalCardExtract/examples/extract/image_scan.jpg')
          2 print(result)

    AttributeError: 'PersonalCard' object has no attribute 'extractInfo'

    opened by suwika
Releases(v1.3.4)
  • v1.3.4(Sep 2, 2021)

    New Feature v1.3.4 🎁

    • Support Thai identity card laser code extraction. (02 Sep. 2021)
    • Fix bug where the dataset folder did not include the thai_government_lottery resource. (23 Aug. 2021) #1
    • Increase performance.
    • Support Thai Government Lottery: extracts data from lottery tickets; works well with images produced by a scanner (16 Aug. 2021)
    • Refactor Output Structure.
    • Support Thai Driving License (Beta): can extract data from driving license photos in some formats only. The Department of Land Transport issues cards in many formats, each with the data in different positions, so accuracy is low.
  • v1.3.1(Aug 16, 2021)

    New Feature v1.3.1 🎁 Increase performance. Support Thai Driving License (Beta): can extract data from driving license photos in some formats only. The Department of Land Transport issues cards in many formats, each with the data in different positions, so accuracy is low. Support Thai Government Lottery (16 Aug. 2021)

  • v1.3.0(Aug 14, 2021)

    Increase performance. Support Thai Driving License (Beta): can extract data from driving license photos in some formats only. The Department of Land Transport issues cards in many formats, each with the data in different positions, so accuracy is low. Restructured the system files.

  • v1.2.1(Aug 13, 2021)

    New Feature 🎁

    • More areas extracted.
    • lang attribute: get only the areas in the given language.
    • provider attribute: set the OCR provider; easyocr and tesseract are now supported.
  • v1.0-beta(Aug 11, 2021)

Owner
ggafiled
I only update once in a while.