Searches through git repositories for high entropy strings and secrets, digging deep into commit history

Overview

truffleHog

codecov

Searches through git repositories for secrets, digging deep into commit history and branches. This is effective at finding secrets accidentally committed.

Join The Slack

Have questions? Feedback? Jump in slack and hang out with me

https://join.slack.com/t/trufflehog-community/shared_invite/zt-pw2qbi43-Aa86hkiimstfdKH9UCpPzQ

NEW

truffleHog previously functioned by running entropy checks on git diffs. This functionality still exists, but high signal regex checks have been added, and the ability to suppress entropy checking has also been added.

truffleHog --regex --entropy=False https://github.com/dxa4481/truffleHog.git

or

truffleHog file:///user/dxa4481/codeprojects/truffleHog/

With the --include_paths and --exclude_paths options, it is also possible to limit scanning to a subset of objects in the Git history by defining regular expressions (one per line) in a file to match the targeted object paths. To illustrate, see the example include and exclude files below:

include-patterns.txt:

src/
# lines beginning with "#" are treated as comments and are ignored
gradle/
# regexes must match the entire path, but can use python's regex syntax for
# case-insensitive matching and other advanced options
(?i).*\.(properties|conf|ini|txt|y(a)?ml)$
(.*/)?id_[rd]sa$

exclude-patterns.txt:

(.*/)?\.classpath$
.*\.jmx$
(.*/)?test/(.*/)?resources/

These filter files could then be applied by:

trufflehog --include_paths include-patterns.txt --exclude_paths exclude-patterns.txt file://path/to/my/repo.git

With these filters, issues found in files in the root-level src directory would be reported, unless they had the .classpath or .jmx extension, or if they were found in the src/test/dev/resources/ directory, for example. Additional usage information is provided when calling trufflehog with the -h or --help options.

These features help cut down on noise, and makes the tool easier to shove into a devops pipeline.

Example

Install

pip install truffleHog

Customizing

Custom regexes can be added with the following flag --rules /path/to/rules. This should be a json file of the following format:

{
    "RSA private key": "-----BEGIN EC PRIVATE KEY-----"
}

Things like subdomain enumeration, s3 bucket detection, and other useful regexes highly custom to the situation can be added.

Feel free to also contribute high signal regexes upstream that you think will benefit the community. Things like Azure keys, Twilio keys, Google Compute keys, are welcome, provided a high signal regex can be constructed.

trufflehog's base rule set sources from https://github.com/dxa4481/truffleHogRegexes/blob/master/truffleHogRegexes/regexes.json

To explicitly allow particular secrets (e.g. self-signed keys used only for local testing) you can provide an allow list --allow /path/to/allow in the following format:

{
    "local self signed test key": "-----BEGIN EC PRIVATE KEY-----\nfoobar123\n-----END EC PRIVATE KEY-----",
    "git cherry pick SHAs": "regex:Cherry picked from .*",
}

Note that values beginning with regex: will be used as regular expressions. Values without this will be literal, with some automatic conversions (e.g. flexible newlines).

How it works

This module will go through the entire commit history of each branch, and check each diff from each commit, and check for secrets. This is both by regex and by entropy. For entropy checks, truffleHog will evaluate the shannon entropy for both the base64 char set and hexidecimal char set for every blob of text greater than 20 characters comprised of those character sets in each diff. If at any point a high entropy string >20 characters is detected, it will print to the screen.

Help

usage: trufflehog [-h] [--json] [--regex] [--rules RULES] [--allow ALLOW]
                  [--entropy DO_ENTROPY] [--since_commit SINCE_COMMIT]
                  [--max_depth MAX_DEPTH]
                  git_url

Find secrets hidden in the depths of git.

positional arguments:
  git_url               URL for secret searching

optional arguments:
  -h, --help            show this help message and exit
  --json                Output in JSON
  --regex               Enable high signal regex checks
  --rules RULES         Ignore default regexes and source from json list file
  --allow ALLOW         Explicitly allow regexes from json list file
  --entropy DO_ENTROPY  Enable entropy checks
  --since_commit SINCE_COMMIT
                        Only scan from a given commit hash
  --branch BRANCH       Scans only the selected branch
  --max_depth MAX_DEPTH
                        The max commit depth to go back when searching for
                        secrets
  -i INCLUDE_PATHS_FILE, --include_paths INCLUDE_PATHS_FILE
                        File with regular expressions (one per line), at least
                        one of which must match a Git object path in order for
                        it to be scanned; lines starting with "#" are treated
                        as comments and are ignored. If empty or not provided
                        (default), all Git object paths are included unless
                        otherwise excluded via the --exclude_paths option.
  -x EXCLUDE_PATHS_FILE, --exclude_paths EXCLUDE_PATHS_FILE
                        File with regular expressions (one per line), none of
                        which may match a Git object path in order for it to
                        be scanned; lines starting with "#" are treated as
                        comments and are ignored. If empty or not provided
                        (default), no Git object paths are excluded unless
                        effectively excluded via the --include_paths option.

Running with Docker

First, enter the directory containing the git repository

cd /path/to/git

To launch the trufflehog with the docker image, run the following"

docker run --rm -v "$(pwd):/proj" dxa4481/trufflehog file:///proj

-v mounts the current working dir (pwd) to the /proj dir in the Docker container

file:///proj references that very same /proj dir in the container (which is also set as the default working dir in the Dockerfile)

Wishlist

  • A way to detect and not scan binary diffs
  • Don't rescan diffs if already looked at in another branch
  • A since commit X feature
  • Print the file affected
Comments
  • fix #8 - add `--include` and `--exclude` options

    fix #8 - add `--include` and `--exclude` options

    Fixes issue #8 by adding --include_paths and --exclude_paths options that allow the user to limit scanning to a subset of objects in the Git history by defining regular expressions (one per line) in a file to match the targeted object paths.

    If provided, the --include_paths option should point to a file with regular expressions (one per line), at least one of which must match a Git object path in order for it to be scanned. If empty or not provided (default), all Git object paths are included (unless otherwise excluded via the --exclude_paths option).

    Likewise, the --exclude_paths option, when provided, should point to a file with regular expressions, none of which may match a Git object path in order for it to be scanned. If empty or not provided (default), no Git object paths are excluded (unless effectively excluded via the --include_paths option).

    In either file, lines starting with "#" are treated as comments and are ignored.

    opened by milo-minderbinder 22
  • fix --since_commit parameter

    fix --since_commit parameter

    Hi, how can I contribute to this project? I was running truffleHog and using the --since_commit parameter, however it was buggy and did not work as expected. I made a very small change, and it worked as expected. Do you accept PRs or should I just tell you the change so you can verify it?

    opened by fahrishb 18
  • The regex functionality is not working as expected

    The regex functionality is not working as expected

    I git cloned the truffleHog repository. Changed my regexChecks.py file to look like below:

    import re
    
    regexes = {
        "Slack Token XOXP": re.compile('xoxp.*'),
        "Slack Token XOXB": re.compile('xoxb.*'),
        "Slack Token XOXO": re.compile('xoxo.*'),
        "Slack Token XOXA": re.compile('xoxa.*'),
        "AWS API Key": re.compile('AKIA.*'),
        "Private key": re.compile('-----BEGIN PRIVATE KEY-----.*')
    }
    

    I then installed the libraries required to run the tool by typing pip install -r requirements.txt. My requirements.txt file looked like below:

    GitPython==2.1.5
    gitdb2==2.0.2
    smmap2==2.0.2
    

    Finally, I ran the tool by typing - python truffleHog.py --regex --entropy=False https://github.com/secretuser1/secretrepo.git

    It printed out the Private Key, Slack Token XOXP and Slack Token XOXB. It should have also printed out the AWS key here - https://github.com/secretuser1/secretrepo/blob/master/secretfile.txt#L2 but it did not, even though the regex is present.

    Any idea why?

    opened by anshumanbh 14
  • Adding the capability for scanning a directory

    Adding the capability for scanning a directory

    This PR adds the capability for truffleHog to recursively scan a directory instead of a Git repository with all its history. This can be useful in CI pipelines or other situations where it is desirable to scan the codebase at a single point in time. Additionally, it can also be used to scan code that is not stored in Git.

    I've done some minor refactoring to the existing scanning code to reduce code duplication.

    opened by runako 14
  • ValueError: unknown reasons (During run application on EC2 RedHat)

    ValueError: unknown reasons (During run application on EC2 RedHat)

    If anyone can help I will be appreciate! Describe the bug Having an error during running the app on EC2 RedHat : ValueError: unknown reasons

    I installed on redhat ec2 instance trufflehog. By default there python 2.7 and 3.6 Trufflehog was installed from pip3. pip3 freeze shows that everything installed : gitdb==4.0.5 gitdb2==4.0.2 GitPython==3.0.6 smmap==3.0.5 truffleHog==2.2.1 truffleHogRegexes==0.0.7

    After installation and running next command ( just to check does it work or not) trufflehog --regex --entropy=False https://github.com/dxa4481/truffleHog.git ( I got next error) : Traceback (most recent call last): File "/usr/local/bin/trufflehog", line 11, in sys.exit(main()) File "/usr/local/lib64/python3.6/site-packages/truffleHog/truffleHog.py", line 93, in main surpress_output=False, branch=args.branch, repo_path=args.repo_path, path_inclusions=path_inclusions, path_exclusions=path_exclusions, allow=allow) File "/usr/local/lib64/python3.6/site-packages/truffleHog/truffleHog.py", line 351, in find_strings diff_hash = hashlib.md5((str(prev_commit) + str(curr_commit)).encode('utf-8')).digest() ValueError: unknown reasons

    opened by RepositoryOfCode 12
  •  Git issue when trying to scan cloned project in Apple-mac: trufflehog file:///

    Git issue when trying to scan cloned project in Apple-mac: trufflehog file:///

    Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/3.6/bin/trufflehog", line 10, in sys.exit(main()) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 82, in main surpress_output=False, branch=args.branch, repo_path=args.repo_path, path_inclusions=path_inclusions, path_exclusions=path_exclusions) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 309, in find_strings project_path = clone_git_repo(git_url) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/truffleHog/truffleHog.py", line 152, in clone_git_repo Repo.clone_from(git_url, project_path) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/repo/base.py", line 925, in clone_from return cls._clone(git, url, to_path, GitCmdObjectDB, progress, **kwargs) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/repo/base.py", line 880, in _clone finalize_process(proc, stderr=stderr) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/util.py", line 341, in finalize_process proc.wait(**kwargs) File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/git/cmd.py", line 291, in wait raise GitCommandError(self.args, status, errstr) git.exc.GitCommandError: Cmd('git') failed due to: exit code(128) cmdline: git clone -v file:///GitHub/Github-guardian/ /var/folders/37/md_m401d073bw1vt7q863pbw0000gn/T/tmpchgjs_ln stderr: 'Cloning into '/var/folders/37/md_m401d073bw1vt7q863pbw0000gn/T/tmpchgjs_ln'... fatal: '/GitHub/Github-guardian/' does not appear to be a git repository fatal: Could not read from remote repository.

    Please make sure you have the correct access rights and the repository exists. '

    opened by dgurazada 12
  • Depth limits are needed to prevent long jobs

    Depth limits are needed to prevent long jobs

    When leveraging trufflehog for repo scans, it would be helpful to introduce the concept of depth limits, to ensure that when a scan is performed, it only goes to a certain number of commits back. On a test repository that I have, there is a huge number of commits dating back to 2014, and the job is running well more than 24 hours to go deep across all of them.

    opened by dend 11
  • WindowsError: [Error 5] Access is denied

    WindowsError: [Error 5] Access is denied

    Traceback (most recent call last): File "trufflehog.py", line 106, in <module> find_strings(args.git_url) File "trufflehog.py", line 98, in find_strings shutil.rmtree(project_path) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 247, in rmtree rmtree(fullname, ignore_errors, onerror) File "C:\Python27\lib\shutil.py", line 252, in rmtree onerror(os.remove, fullname, sys.exc_info()) File "C:\Python27\lib\shutil.py", line 250, in rmtree os.remove(fullname) WindowsError: [Error 5] Access is denied: 'temp\\[uuid]\\.git\\objects\\pack\\pack-[uuid].idx'

    When scanning some repos. (This one crashes half way through, This one crashes at startup)

    opened by Peter-Maguire 11
  • i cant see the result

    i cant see the result

    1. See error

    {"level":"debug","msg":"Cloning remote Git repo without authentication","time":"2022-04-05T16:19:28Z"} {"level":"debug","msg":"Git repo local path: /tmp/trufflehog944564607","time":"2022-04-05T16:23:19Z"}

    2022/04/05 16:44:16 [updater parent] prog exited with 1

    I can see the result if found or note even if I use --json I cant see the saved file its always clone the repo in tmp folder after finish scanning it should delete the cloned folder in the tmp

    bug 
    opened by abramas 10
  • Hardcoded thresholds of 20 in get_strings_of_set()

    Hardcoded thresholds of 20 in get_strings_of_set()

    threshold keyword variable is declared and used on the last if statement Line 39, but not in the first else statement Line 35

    def get_strings_of_set(word, char_set, threshold=20):
        count = 0
        letters = ""
        strings = []
        for char in word:
            if char in char_set:
                letters += char
                count += 1
            else:
                if count > 20:
                    strings.append(letters)
                letters = ""
                count = 0
        if count > threshold:
            strings.append(letters)
    
    opened by bandrel 10
  • gitdb update breaks trufflehog

    gitdb update breaks trufflehog

    Probably related to #198

    We install inside a docker container using:

    $ pip install truffleHog==2.0.99
    

    We run:

    $ trufflehog --regex --entropy=False .
    

    Starting today this errored with:

    Traceback (most recent call last):
       File "/usr/local/bin/trufflehog", line 5, in <module>
         from truffleHog.truffleHog import main
       File "/usr/local/lib/python3.8/site-packages/truffleHog/truffleHog.py", line 17, in <module>
         from git import Repo
       File "/usr/local/lib/python3.8/site-packages/git/__init__.py", line 38, in <module>
         from git.config import GitConfigParser  # @NoMove @IgnorePep8
       File "/usr/local/lib/python3.8/site-packages/git/config.py", line 16, in <module>
         from git.compat import (
       File "/usr/local/lib/python3.8/site-packages/git/compat.py", line 16, in <module>
         from gitdb.utils.compat import (
     ModuleNotFoundError: No module named 'gitdb.utils.compat'
    

    A quick dive down the dependency tree showed that the trufflehog dependency on gitpython-2.1.1 (here) is pulling in gitdb2-3.0.2 (here) which has removed the gitdb.utils.compat (PR)

    Our fix for now is to use (may be useful to others):

    pip install gitdb2==3.0.0 truffleHog==2.0.99
    
    opened by danieldooley 9
  • Use access-token endpoint for validity check

    Use access-token endpoint for validity check

    This PR fixes the issue https://github.com/trufflesecurity/trufflehog/issues/990, it should correctly report keys as valid even if they are missing the user_read scope.

    opened by clonsdale-canva 1
  • Buildkite token validation missing tokens without user_read scope

    Buildkite token validation missing tokens without user_read scope

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    TruffleHog Version

    3.21.0

    Expected Behavior

    Buildkite token is reported as valid

    Actual Behavior

    Buildkite token is not validated as the API call fails due to missing user_read scope

    Additional Context

    The logic to check if a buildkite token is valid will send out an API call to the /user endpoint https://github.com/trufflesecurity/trufflehog/blob/009756dce61948a66cf90a8b14018460c91ab4f0/pkg/detectors/buildkite/buildkite.go#L51. This will miss all tokens which do not have the read_user scope.

    Instead, we can use the access-token endpoint, which will return 200 for any valid token, and report on the scopes present / ID of the token - https://buildkite.com/docs/apis/rest-api/access-token.

    bug 
    opened by clonsdale-canva 0
  • Add max-depth limit to GitHub subcommand

    Add max-depth limit to GitHub subcommand

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    Description

    Ability to limit the depth of the commit history being scanned for GitHub users We need the ability to set a --max-depth= limit to GitHub sub command.

    Problem to be Addressed

    It is very noisy for large GitHub enterprises to detect new issues due to the inability to ignore historical commit history that one has already remediated. Results have to be saved into a spreadsheet or database and then diff'd to see what has changed.

    Description of the Preferred Solution

    The ability to set a --max-depth= limit to GitHub sub command. This would be very beneficial when attempting to scan a GitHub enterprise repositories as a group.

    Additional Context

    References

    • #0000
    enhancement 
    opened by dwilliamsstc 0
  • go install - missing dot in first path element

    go install - missing dot in first path element

    build github.com/trufflesecurity/trufflehog/v3: cannot load embed: malformed module path "embed": missing dot in first path element

    Community Note

    • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
    • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
    • If you are interested in working on this issue or have submitted a pull request, please leave a comment

    TruffleHog Version

    Trace Output

    Expected Behavior

    Actual Behavior

    Steps to Reproduce

    Distributor ID: Elementary Description: elementary OS 6.1 Jólnir Release: 6.1 Codename: jolnir

    Additional Context

    References

    • #0000
    bug 
    opened by rip752 0
  • Run certain Detector Type

    Run certain Detector Type

    trufflehog version: trufflehog dev

    Currently I am running trufflehog as a pre-commit hook with all possible Detector type. Is it possible to only run few Detector types , say AWS keys, Private keys as such?

    bug 
    opened by Priyadhana 0
Releases(v3.21.0)
Simple python script for generating custom high-secure passwords for securing your social-apps ❤️

Opensource Project Simple Python Password Generator This repository is just for peoples who want to generate strong-passwords for there social-account

K A R T H I K 15 Dec 01, 2022
SonicWALL SSL-VPN Web Server Vulnerable Exploit

SonicWALL SSL-VPN Web Server Vulnerable Exploit

44 Nov 15, 2022
Confluence Server Webwork OGNL injection

CVE-2021-26084 - Confluence Server Webwork OGNL injection An OGNL injection vulnerability exists that would allow an authenticated user and in some in

Fellipe Oliveira 295 Jan 06, 2023
Scout Suite - an open source multi-cloud security-auditing tool,

Description Scout Suite is an open source multi-cloud security-auditing tool, which enables security posture assessment of cloud environments. Using t

NCC Group Plc 5k Jan 05, 2023
AnonStress-Stored-XSS-Exploit - An exploit and demonstration on how to exploit a Stored XSS vulnerability in anonstress

AnonStress Stored XSS Exploit An exploit and demonstration on how to exploit a S

صلى الله على محمد وآله 3 Jun 22, 2022
Raphael is a vulnerability scanning tool based on Python3.

Raphael Raphael是一款基于Python3开发的插件式漏洞扫描工具。 Raphael is a vulnerability scanning too

b4zinga 5 Mar 21, 2022
S2-062 (CVE-2021-31805) / S2-061 / S2-059 RCE

CVE-2021-31805 Remote code execution S2-062 (CVE-2021-31805) Due to Apache Struts2's incomplete fix for S2-061 (CVE-2020-17530), some tag attributes c

warin9 31 Nov 22, 2022
Exploiting CVE-2021-42278 and CVE-2021-42287 to impersonate DA from standard domain user

Exploiting CVE-2021-42278 and CVE-2021-42287 to impersonate DA from standard domain user Known issues it will not work outside kali , i will update it

Hossam 867 Dec 22, 2022
A Feature Rich Modular Malware Configuration Extraction Utility for MalDuck

Malware Configuration Extractor A Malware Configuration Extraction Tool and Modules for MalDuck This project is FREE as in FREE 🍺 , use it commercial

c3rb3ru5 103 Dec 18, 2022
Generate MIPS reverse shell shellcodes easily !

MIPS-Reverse MIPS-Reverse is a tool that can generate shellcodes for the MIPS architecture that launches a reverse shell where you can specify the IP

29 Jul 27, 2021
Downloads SEP, Baseband and BuildManifest automatically for signed iOS version's for connected iDevice

FutureHelper Supports macOS and Windows Downloads SEP, Baseband and BuildManifest automatically for signed iOS version's (including beta firmwares) fo

Kasim Hussain 7 Jan 05, 2023
Exploit for CVE-2021-3129

laravel-exploits Exploit for CVE-2021-3129

Ambionics Security 228 Nov 25, 2022
NS-Defacer: a auto html injecter, In other words It's a auto defacer to deface a lot of websites in less time

Overview NS-Defacer is a auto html injecter, In other words It's a auto defacer

NightSec 10 Nov 19, 2022
A quick script to spot the usage of Unicode Bidi (bidirectional) characters that could lead to an Invisible Backdoor

Invisible Backdoor Detector is a little Python script that allows you to spot and remove Bidi characters that could lead to an invisible backdoor. If you don't know what that is you should check the

SecSI 28 Dec 29, 2022
An IDA pro python script to decrypt Qbot malware string

Qbot-Strings-Decrypter An IDA pro python script to decrypt Qbot malware strings.

stuckinvim 6 Sep 01, 2022
Script Crack Facebook Elite 🚶‍♂

elite Script Crack Facebook Elite 🚶‍♂ Install Script $ pkg update && pkg upgrade $ termux-setup-storage $ pkg install git $ pkg install python $ pip

Yumasaa 1 Jan 02, 2022
A Python script that can be used to check if a SAP system is affected by CVE-2022-22536

Vulnerability assessment for CVE-2022-22536 This repository contains a Python script that can be used to check if a SAP system is affected by CVE-2022

Onapsis Inc. 42 Dec 01, 2022
A Superfast SMS & Call bomber for Linux And Termux !

A Superfast SMS & Call bomber for Linux And Termux !

Anubhav Kashyap 15 Feb 21, 2022
The Multi-Tool Web Vulnerability Scanner.

🟥 RapidScan v1.2 - The Multi-Tool Web Vulnerability Scanner RapidScan has been ported to Python3 i.e. v1.2. The Python2.7 codebase is available on v1

skavngr 1.3k Dec 31, 2022
APKLeaks - Scanning APK file for URIs, endpoints & secrets.

APKLeaks - Scanning APK file for URIs, endpoints & secrets.

dw1 3.5k Jan 09, 2023