This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.

Overview

Robots.txt tester

With this script, you can enumerate all URLs present in robots.txt files, and test whether you can access them or not.

example

Setup

Clone the repository and install the dependencies :

git clone https://github.com/p0dalirius/robotstester
cd robotstester
python3 setup.py install

Usage

robotstester -u http://www.example.com/

You can find here a complete list of options :

[~] Robots.txt tester, v1.2.0

usage: robotstester.py [-h] (-u URL | -f URLSFILE) [-v] [-q] [-k] [-L] [-t THREADS] [-p] [-j JSONFILE] [-x PROXY] [-b COOKIES]

This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.

optional arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     URL to the robots.txt to test e.g. https://example.com:port/path
  -f URLSFILE, --urlsfile URLSFILE
                        List of robots.txt urls to test
  -v, --verbose         verbosity level (-v for verbose, -vv for debug)
  -q, --quiet           Show no information at all
  -k, --insecure        Allow insecure server connections when using SSL (default: False)
  -L, --location        Follow redirects (default: False)
  -t THREADS, --threads THREADS
                        Number of threads (default: 5)
  -p, --parsable        Parsable output
  -j JSONFILE, --jsonfile JSONFILE
                        Save results to specified JSON file.
  -x PROXY, --proxy PROXY
                        Specify a proxy to use for requests (e.g., http://localhost:8080)
  -b COOKIES, --cookies COOKIES
                        Specify cookies to use in requests. (e.g., --cookies "cookie1=blah;cookie2=blah")

Contributing

Pull requests are welcome. Feel free to open an issue if you want to add other features.

You might also like...
Import modules and files straight from URLs.

Import Python code from modules straight from the internet.

A python script made for personal use to monitor for sports card restocks on target.com since they are sold out often

TargetProductMonitor A python script made for personal use to monitor for sports card resocks on target.com since they are sold out often. When a rest

My sister is a GR of her class. She had to mark attendance of students from screenshots of teams meeting on an excel sheet. I resolved her problem by reading names from screenshots using PyTesseract and marking them present on the excel using Pandas in Python. It took me 1hr to write the code and it is saving half an hour everyday.
Manipulation OpenAI Gym environments to simulate robots at the STARS lab

liegroups Python implementation of SO2, SE2, SO3, and SE3 matrix Lie groups using numpy or PyTorch. [Documentation] Installation To install, cd into t

Serverless demo showing users how they can capture (and obfuscate) their Lambda payloads in Datadog APM
Serverless demo showing users how they can capture (and obfuscate) their Lambda payloads in Datadog APM

Serverless-capture-lambda-payload-demo Serverless demo showing users how they can capture (and obfuscate) their Lambda payloads in Datadog APM This wi

A one place destination to check whatever is trending on the top social and news websites at present.
A one place destination to check whatever is trending on the top social and news websites at present.

UpTrend A one place destination to check whatever is trending on the top social and news websites at present. Explore the docs » View Demo · Report Bu

 Python requirements.txt Guesser
Python requirements.txt Guesser

Python-Requirements-Guesser ⚠️ This is alpha quality software. Work in progress Attempt to guess requirements.txt modules versions based on Git histor

Birthday program - A program that lookups a birthday txt file and compares to the current date to check for birthdays
Birthday program - A program that lookups a birthday txt file and compares to the current date to check for birthdays

Birthday Program This is a program that lookups a birthday txt file and compares

Write a program that works out whether if a given year is a leap year
Write a program that works out whether if a given year is a leap year

Leap Year 💪 This is a Difficult Challenge 💪 Instructions Write a program that works out whether if a given year is a leap year. A normal year has 36

Comments
  • [Feature]  Add waybackmachine capability

    [Feature] Add waybackmachine capability

    In the past few days I've been experiencing using waybackmachine to enumerate robots.txt endpoints.

    Sometimes robots.txt gets removed and sometimes the removed content can be juicy. Thus the ideia of searching every WBM to look for old robots entries.

    I've implemented a quick and basic script to do a PoC, but I feel like this repo has the power to bring it to the next level since a lot of good features are already done.

    https://gist.github.com/felipecaon/035ad1718c3cae681d2afb03c699795f

    The gist works by getting all the robots.txt entries from WBM, parsing and sending to stdout. The script does not remove dps, just do a basic word removal.

    If I have the time I may be able to open a PR. But if someone wants to takes it further, I would love to see that. The core waybackmachine endpoints to be used are on my gist file.

    opened by felipecaon 0
Releases(1.2)
  • 1.2(Jul 7, 2021)

    Added --parsable option :cat2:

    usage: robotstester.py [-h] (-u URL | -f URLSFILE) [-v] [-q] [-k] [-L] [-t THREADS] [-p] [-j JSONFILE] [-x PROXY] [-b COOKIES]
    
    This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.
    
    optional arguments:
      -h, --help            show this help message and exit
      -u URL, --url URL     URL to the robots.txt to test e.g. https://example.com:port/path
      -f URLSFILE, --urlsfile URLSFILE
                            List of robots.txt urls to test
      -v, --verbose         verbosity level (-v for verbose, -vv for debug)
      -q, --quiet           Show no information at all
      -k, --insecure        Allow insecure server connections when using SSL (default: False)
      -L, --location        Follow redirects (default: False)
      -t THREADS, --threads THREADS
                            Number of threads (default: 5)
      -p, --parsable        Parsable output
      -j JSONFILE, --jsonfile JSONFILE
                            Save results to specified JSON file.
      -x PROXY, --proxy PROXY
                            Specify a proxy to use for requests (e.g., http://localhost:8080)
      -b COOKIES, --cookies COOKIES
                            Specify cookies to use in requests. (e.g., --cookies "cookie1=blah;cookie2=blah")
    
    Source code(tar.gz)
    Source code(zip)
  • 1.0(Jul 5, 2021)

    [~] Robots.txt tester, v1.0
    
    usage: robotstester.py [-h] [-u URL | -f URLSFILE] [-v] [-q] [-k] [-L] [-t THREADS] [-j JSONFILE] [-x PROXY] [-b COOKIES]
    
    This Python script can enumerate all URLs present in robots.txt files, and test whether they can be accessed or not.
    
    optional arguments:
      -h, --help            show this help message and exit
      -u URL, --url URL     URL to the robots.txt to test e.g. https://example.com:port/path
      -f URLSFILE, --urlsfile URLSFILE
                            List of robots.txt urls to test
      -v, --verbose         verbosity level (-v for verbose, -vv for debug)
      -q, --quiet           Show no information at all
      -k, --insecure        Allow insecure server connections when using SSL (default: False)
      -L, --location        Follow redirects (default: False)
      -t THREADS, --threads THREADS
                            Number of threads (default: 5)
      -j JSONFILE, --jsonfile JSONFILE
                            Save results to specified JSON file.
      -x PROXY, --proxy PROXY
                            Specify a proxy to use for requests (e.g., http://localhost:8080)
      -b COOKIES, --cookies COOKIES
                            Specify cookies to use in requests. (e.g., --cookies "cookie1=blah;cookie2=blah")
    
    Source code(tar.gz)
    Source code(zip)
Owner
Podalirius
Hacker of everything
Podalirius
Some basic sorting algos

Sorting-Algos Some basic sorting algos HacktoberFest 2021 This repository consists of mezzo-level projects that undertake a simple task and perform it

Manthan Ghasadiya 7 Dec 13, 2022
A Python software implementation of the Intel 4004 processor

Pyntel4004 A Python software implementation of the Intel 4004 processor. General Information Two pass assembler using the original mnemonics, directiv

alshapton 5 Oct 01, 2022
The Begin button and menu for the Meadows operating system. The start button for UNIX/Linux.

By: Seanpm2001, Meadows Et; Al. Top README.md Read this article in a different language Sorted by: A-Z Sorting options unavailable ( af Afrikaans Afri

Sean P. Myrick V19.1.7.2 4 Aug 28, 2022
A prototype COG-based tile server for sparse Mars datasets

Mars tiler Mars Tiler is a prototype web application that serves tiles from cloud-optimized GeoTIFFs, with an emphasis on supporting planetary dataset

Daven Quinn 3 Mar 23, 2022
A framework to create reusable Dash layout.

dash_component_template A framework to create reusable Dash layout.

The TolTEC Project 4 Aug 04, 2022
Code for the manim-generated scenes used in 3blue1brown videos

This project contains the code used to generate the explanatory math videos found on 3Blue1Brown. This almost entirely consists of scenes generated us

Grant Sanderson 4.1k Jan 02, 2023
A program to generate random numbers b/w 0 to 10 using time

random-num-using-time A program to generate random numbers b/w 0 to 10 using time it uses python's in-built module datetime and an equation which retu

Atul Kushwaha 1 Oct 01, 2022
Bots in moderation and a game (for now)

Tutorial: come far funzionare il bot e durarlo per 24/7 (o quasi...) Ci sono 17 passi per seguire: Andare sul sito Replit https://replit.com/ Vedrete

ZacyKing 1 Dec 27, 2021
BestBuy Script Designed to purchase any item when it becomes available.

prerequisites: Selnium; undetected-chromedriver. This Script is designed to order an Item provided a link from BestBuy.com only.

Bransen Smith 0 Jan 12, 2022
A script to download all the challenges and files from the CTFd instance.

Python CTFd Downloader A script to download all the challenges and files from the CTFd instance. Installation Clone this repo: git clone https://githu

Jacob Elliott 19 Dec 16, 2022
NCAR/UCAR virtual Python Tutorial Seminar Series lesson on MetPy.

The Project Pythia Python Tutorial Seminar Series continues with a lesson on MetPy on Wednesday, 2 February 2022 at 1 PM Mountain Standard Time.

Project Pythia Tutorials 6 Oct 09, 2022
Aides to reduce a cheat file with a personal selection of the cheats you want to use.

Retroarch Cheat File Reducer Description Aides to reduce a cheat file with a personal selection of the cheats you want to use. Instructions Copy a sel

1 Jan 09, 2022
A refresher for PowerBI Desktop documents

PowerBI_Refresher-NPP Informació Per executar el programa s'ha de tenir instalat el python versio 3 o mes. Requeriments a requirements.txt. El fitxer

Nil Pujol 1 May 02, 2022
tox-gh is a tox plugin which helps running tox on GitHub Actions with multiple different Python versions on multiple workers in parallel

tox-gh is a tox plugin which helps running tox on GitHub Actions with multiple different Python versions on multiple workers in parallel. This project is inspired by tox-travis.

tox development team 19 Dec 26, 2022
rebalance is a simple Python 3.9+ library for rebalancing investment portfolios

rebalance rebalance is a simple Python 3.9+ library for rebalancing investment portfolios. It supports cash flow rebalancing with contributions and wi

Darik Harter 5 Feb 26, 2022
Python 100daysofcode

#python #100daysofcode Python is a simple, general purpose ,high level & object-oriented programming language even it's is interpreted scripting langu

Tara 1 Feb 10, 2022
Tool to audit and fix Python project requirements.

Requirement Auditor Utility to revise and updated python requirement files.

Luis Carlos Berrocal 1 Nov 07, 2021
UUID_ApiGenerator - This an API that will return a key-value pair of randomly generated UUID

This an API that will return a key-value pair of randomly generated UUID. Key will be a timestamp and value will be UUID. While the

1 Jan 28, 2022
Web service which feeds Navitia with real-time disruptions

Chaos Chaos is the web service which can feed Navitia with real-time disruptions. It can work together with Kirin which can feed Navitia with real-tim

KISIO Digital 7 Jan 07, 2022
API moment - LussovAPI

LussovAPI TL;DR: py API container, pip install -r requirements.txt, example, main configuration Long version: Install Dependancies Download file requi

William Pedersen 1 Nov 30, 2021