Similar looking domain detection using python fuzzywuzzy

Overview

Similar-looking-domain-detection-using-python-fuzzywuzzy

Major cause of phishing and BEC incident is similar looking domain, if you detect it early, you can prevent incidents early, python fuzzywuzzy module let you do that and here is the process.

By statistics every day thousands of domains are registered, some are use for legit purpose and some are not. BEC incidents incresing every day and cost millions to businesses, the core of BEC is spoofed email that looks similar to your business email. Because of these similar looking domain we fall pray to BEC incidents. Sometimes we end up submitting our credential when we received any email having link that looks like genuine website. e.g. microsoft.com vs micr0soft.com.

How can we detect/prevent such incident?

In python you have module named "fuzzywuzzy" that looks for similarity in strings and gives score of how similar strings are, like 90% match, 66% match. Use that to look for simiar domains

  1. Gather data (SIEM) having domain related information e.g. Proxy logs, DNS logs, Mail logs.
  2. Have list of domains related to your business ( your owned domain, list of vendor domains with whom you carry out business )
  3. Now run fuzz module against this data and check ratio which is more than 50% ( e.g. given below )
  4. Do the analysis of domains (check whois data) which looks similar to your domain, if genuine add to list gathered in Step 2
  5. If domain is not genuine, start digging more on that domain, like any email received(mail logs), any user visited the domain(proxy logs)
  6. You can run such checks on hourly, daily basis.... thats it.

Here are coule of examples.

  1. Basic example

    from fuzzywuzzy import fuzz
    a = "microsoft.com"
    b = "micros0ft.com"
    print("Match ratio is ", str(fuzz.ratio(a, b)), "%") // fuzz.ration(a,b) function gives you match score

  2. Working code

    from fuzzywuzzy import fuzz

    dns_data = open(r'/home/user/Desktop/BEC/your_domain.txt','r') # List of genuine domains owned by you
    output = open(r'/home/user/Desktop/BEC/output.txt','w') # Output file

    for dns in dns_data:

    domain = open(r'/home/user/Desktop/BEC/domain-names-data.txt','r') # domain data gathered from proxy/dns/mail logs
    for site in domain:

    if ( fuzz.ratio(site.rstrip(),dns.rstrip()) > 80 ):

    output.write("Match ratio is: \t" + dns.rstrip() + "\t" + site.rstrip() + "\t" + str(fuzz.ratio(site.rstrip(),dns.rstrip())))
    output.write("\n")

If you have access to whois database then you can run this code against newly registered domain everyday and probably you can get the result early!!!

I have run this code against newly registered doamin on 3rd Nov. Legit domains considered are top 1000 domains. Results are amazing as to how many similar looking domains are registered everyday and no wonder we receive lot of offerers from amzon apples :) Check out Output.txt file

Feel free to share your thoughts!!!

Audio Steganography is a technique used to transmit hidden information by modifying an audio signal in an imperceptible manner.

Audio Steganography Audio Steganography is a technique used to transmit hidden information by modifying an audio signal in an imperceptible manner. Ab

Karan Yuvraj Singh 1 Oct 17, 2021
Standard implementations of FedLab and its provided benchmarks.

FedLab-benchmarks This repo contains standard implementations of FedLab and its provided benchmarks. Currently, following algorithms or benchrmarks ar

SMILELab-FL 104 Dec 05, 2022
Application for easy configuration of swap file and swappiness priority in slackware and others linux distributions.

Swap File Program created with the objective of assisting in the configuration of swap file in Distributions such as Slackware. Required packages: pyt

Mauricio Ferrari 3 Aug 06, 2022
Automatic generator of readmes for git repositories (Includes file' listing)

Readme Generator We are bored of write the same things once and once again. We trust in the comments made inside of our files, and we decided to autom

Natalia Vera Duran 6 Jul 20, 2021
This two python programs can convert km to miles and miles to km

km-to-miles These two little python programs can convert kilometers to miles and miles to kilometers Needed Python3 or a online python compiler with t

Chandula Janith 3 Jan 30, 2022
A plugin to simplify creating multi-page Dash apps

Multi-Page Dash App Plugin A plugin to simplify creating multi-page Dash apps. This is a preview of functionality that will of Dash 2.1. Background Th

Plotly 19 Dec 09, 2022
Make some improvements in the Pizza class and pizzashop file by refactoring.

Make some improvements in the Pizza class and pizzashop file by refactoring.

James Brucker 1 Oct 18, 2021
Dynamic key remapper for Wayland Window System, especially for Sway

wayremap Dynamic keyboard remapper for Wayland. It works on both X Window Manager and Wayland, but focused on Wayland as it intercepts evdev input and

Kay Gosho 50 Nov 29, 2022
Search, generate & deliver Msfvenom payloads in an quick and easy way

Goal Search, generate & deliver payloads in an quick and easy way Be as simple as possible BUT with all msfvenom payloads. Ever lost time searching th

2 Mar 03, 2022
async parser for JET

This project is mainly aims to provide an async parsing option for NTDS.dit database file for obtaining user secrets.

15 Mar 08, 2022
BOLT12 Lightning Address Format

BOLT12 Address Support (DRAFT!) Inspired by the awesome lightningaddress.com, except for BOLT12: Supports BOLT12 Allows BOLT12 vendor string authentic

Rusty Russell 28 Sep 14, 2022
NetConfParser is a tool that helps you analyze the rpcs coming and going from a netconf client to a server

NetConfParser is a tool that helps you analyze the rpcs coming and going from a netconf client to a server

Aero 1 Mar 31, 2022
A dictionary that can be flattened and re-inflated

deflatable-dict A dictionary that can be flattened and re-inflated. Particularly useful if you're interacting with yaml, for example. Installation wit

Lucas Sargent 2 Oct 18, 2021
A way to write regex with objects instead of strings.

Py Idiomatic Regex (AKA iregex) Documentation Available Here An easier way to write regex in Python using OOP instead of strings. Makes the code much

Ryan Peach 18 Nov 15, 2021
A thing to simplify listening for PG notifications with asyncpg

asyncpg-listen This library simplifies usage of listen/notify with asyncpg: Handles loss of a connection Simplifies notifications processing from mult

ANNA 18 Dec 23, 2022
A script copies movie and TV files to your GD drive, or create Hard Link in a seperate dir, in Emby-happy struct.

torcp A script copies movie and TV files to your GD drive, or create Hard Link in a seperate dir, in Emby-happy struct. Usage: python3 torcp.py -h Exa

ccf2012 105 Dec 22, 2022
Personal Toolbox Package

Jammy (Jam) A personal toolbox by Qsh.zh. Usage setup For core package, run pip install jammy To access functions in bin git clone https://gitlab.com/

5 Sep 16, 2022
A program to convert celcius to faranheit. made with python

Temp-Converter What is Temp-Converter Temp-Converter is little program made with pyhton to convert celcius to faranheit. Needed A python interpreter P

Chandula Janith 0 Nov 27, 2021
Python program for analyzing the output files of phonopy.

PhononTools Description Python program to analyze the results generated by phonopy. Using the .yaml and .dat files that phonopy generates one can plot

Harry LaBollita 8 Nov 27, 2022
Script for generating Hearthstone card spoilers & checklists

This is a script for generating text spoilers and set checklists for Hearthstone. Installation & Running Python 3.6 or higher is required. Copy/clone

John T. Wodder II 1 Oct 11, 2022