A collection of common regular expressions bundled with an easy to use interface.

Overview

CommonRegex

Find all times, dates, links, phone numbers, emails, ip addresses, prices, hex colors, and credit card numbers in a string. We did the hard work so you don't have to.

Pull requests welcome!

Installation

Install via pip

sudo pip install commonregex

or via setup.py

python setup.py install

Usage

>>> from commonregex import CommonRegex
>>> parsed_text = CommonRegex("""John, please get that article on www.linkedin.com to me by 5:00PM 
                               on Jan 9th 2012. 4:00 would be ideal, actually. If you have any 
                               questions, You can reach me at (519)-236-2723x341 or get in touch with
                               my associate at [email protected]""")
>>> parsed_text.times
['5:00PM', '4:00']
>>> parsed_text.dates
['Jan 9th 2012']
>>> parsed_text.links
['www.linkedin.com']
>>> parsed_text.phones
['(519)-236-2727']
>>> parsed_text.phones_with_exts
['(519)-236-2723x341']
>>> parsed_text.emails
['[email protected]']

Alternatively, you can generate a single CommonRegex instance and use it to parse multiple segments of text.

>>> parser = CommonRegex()
>>> parser.times("When are you free?  Do you want to meet up for coffee at 4:00?")
['4:00']

Finally, all regular expressions used are publicly exposed.

>>> from commonregex import email
>>> import re
>>> text = "...get in touch with my associate at [email protected]"
>>> re.sub(email, "[email protected]", text)
'...get in touch with my associate at [email protected]'
>>> from commonregex import time
>>> for m in time.finditer("Does 6:00 or 7:00 work better?"):
>>>     print m.start(), m.group()     
5 6:00 
13 7:00 

Please note that this module is currently English/US specific.

Supported Methods/Attributes

  • obj.dates, obj.dates()
  • obj.times, obj.times()
  • obj.phones, obj.phones()
  • obj.phones_with_exts, obj.phones_with_exts()
  • obj.links, obj.links()
  • obj.emails, obj.emails()
  • obj.ips, obj.ips()
  • obj.ipv6s, obj.ipv6s()
  • obj.prices, obj.prices()
  • obj.hex_colors, obj.hex_colors()
  • obj.credit_cards, obj.credit_cards()
  • obj.btc_addresses, obj.btc_addresses()
  • obj.street_addresses, obj.street_addresses()
  • obj.zip_codes, obj.zip_codes()
  • obj.po_boxes, obj.po_boxes()
  • obj.ssn_number, obj.ssn_number()

CommonRegex Ports:

CommonRegexRust

[CommonRegexJS] (https://github.com/talyssonoc/CommonRegexJS)

[CommonRegexScala] (https://github.com/everpeace/CommonRegexScala)

[CommonRegexJava] (https://github.com/talyssonoc/CommonRegexJava)

[CommonRegexCobra] (https://github.com/PurityLake/CommonRegex-Cobra)

[CommonRegexDart] (https://github.com/aufdemrand/CommonRegexDart)

[CommonRegexRuby] (https://github.com/talyssonoc/CommonRegexRuby)

[CommonRegexPHP] (https://github.com/james2doyle/CommonRegexPHP)

Analytics

Owner
Madison May
Machine Learning Architect at @IndicoDataSolutions
Madison May
Ked interpreter built with Lex, Yacc and Python

Ked Ked is the first programming language known to hail from The People's Republic of Cork. It was first discovered and partially described by Adam Ly

Eoin O'Brien 1 Feb 08, 2022
bib2xml - A tool for getting Word formatted XML from Bibtex files

bib2xml - A tool for getting Word formatted XML from Bibtex files Processes Bibtex files (.bib), produces Word Bibliography XML (.xml) output Why not

Matheus Sartor 1 May 05, 2022
The last walk-through project in code institute diploma course

Welcome Rocky.C, This is the Code Institute student template for Gitpod. We have preinstalled all of the tools you need to get started. It's perfectly

Rocky.C 1 Jan 31, 2022
This is a repository containing the backend and the frontend of a simple pokédex.

Pokémon This is a repository containing the backend and the frontend of a simple pokédex. This is a work in progress project! Project Structure 🗂 pok

André Rato 1 Nov 28, 2021
A new mini-batch framework for optimal transport in deep generative models, deep domain adaptation, approximate Bayesian computation, color transfer, and gradient flow.

BoMb-OT Python3 implementation of the papers On Transportation of Mini-batches: A Hierarchical Approach and Improving Mini-batch Optimal Transport via

Khai Ba Nguyen 18 Nov 14, 2022
Checks for Vaccine Availability at your district and notifies you using E-mail, subscribe to our website.

Vaccine Availability Notifier Project Description Checks for Vaccine Availability at your district and notifies you using E-mail every 10 mins. Kindly

Farhan Hai Khan 19 Jun 03, 2021
Project Interface For nextcord-ext

Project Interface For nextcord-ext

nextcord-ext 1 Nov 13, 2021
GUI for the Gammu library.

Wammu GUI for the Gammu library. Homepage https://wammu.eu/ License GNU GPL version 3 or later. First start On first start you will be asked for set

Gammu 60 Dec 14, 2022
Automate the boilerplate while initializing your Python project

Rubric Automate the boilerplate while initializing your Python project Preface Rubric is an opinionated project initializer for Python. It assum

Redowan Delowar 23 Dec 16, 2022
Python Freecell Solver

freecell Python Freecell Solver Very early version right now. You can pick a board by changing the file path in freecell.py If you want to play a game

Ben Kaufman 1 Nov 26, 2021
CaskDB is a disk-based, embedded, persistent, key-value store based on the Riak's bitcask paper, written in Python.

CaskDB - Disk based Log Structured Hash Table Store CaskDB is a disk-based, embedded, persistent, key-value store based on the Riak's bitcask paper, w

886 Dec 27, 2022
HSPyLib is a Python library that will elevate your experience to another level.

HomeSetup Python Library - HSPyLib Your mature python application HSPyLib is a Python library that will elevate your experience to another level. It r

Hugo Saporetti Junior 4 Dec 14, 2022
Import Apex legends mprt files exported from Legion

Apex-mprt-importer-for-Blender Import Apex legends mprt files exported from Legion. REQUIRES CAST IMPORTER Usage: Use a VPK extracter to extract the m

15 Dec 18, 2022
A small program to vote for Councilors at 42 Heilbronn.

This Docker container is build to run on server an provide an easy to use interface for every student to vote for their councillors. To run docker on

Kevin Hirsig 2 Jan 17, 2022
UF3: a python library for generating ultra-fast interatomic potentials

Ultra-Fast Force Fields (UF3) S. R. Xie, M. Rupp, and R. G. Hennig, "Ultra-fast interpretable machine-learning potentials", preprint arXiv:2110.00624

Ultra-Fast Force Fields 24 Nov 13, 2022
The FLARE team's open-source library to disassemble Common Intermediate Language (CIL) instructions.

dncil is a Common Intermediate Language (CIL) disassembly library written in Python that supports parsing the header, instructions, and exception hand

MANDIANT 95 Jan 08, 2023
The bidirectional mapping library for Python.

bidict The bidirectional mapping library for Python. Status bidict: has been used for many years by several teams at Google, Venmo, CERN, Bank of Amer

Joshua Bronson 1.2k Dec 31, 2022
Your E-Canteen that is convenient and accessible wherever you are in the campus

Food Web E-Canteen System Your E-Canteen that is convenient and accessible wherever you are in the campus. Table of Contents About The Project Contrib

Pudding 5 Jan 07, 2023
A very simple boarding app with DRF

CRUD project with DRF A very simple boarding app with DRF. About The Project 유저 정보를 갖고 게시판을 다루는 프로젝트 입니다. Version Python: 3.9 DB: PostgreSQL 13 Django

1 Nov 13, 2021
ChainJacking is a tool to find which of your Go lang direct GitHub dependencies is susceptible to ChainJacking attack.

ChainJacking is a tool to find which of your Go lang direct GitHub dependencies is susceptible to ChainJacking attack.

Checkmarx 36 Nov 02, 2022