Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool for scraping Amazon product data to CSV or JSON format(s).

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually

You'll also need to install requirements for the project to run. Locate amazon-product-scraper folder via terminal and type pip install -r requirements.txt:

Usage

To launch the Amazon scraper locate the amazon-product-scraper folder via terminal and type python amazon_scraper.py -k "your keyword". This will start the program.

NOTE: you must declare either -k or --keyword before entering your keyword. It's a required argument.

Example:

amazon_scraper.py - the name of a scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For highest success rate, I suggest Residential Proxies over Datacenter as they're almost impossible to detect and have the smallest footprint. If you decide to use different proxy provider services keep in mind that you'll have to make some minor adjustments in get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. Default output format is .csv.

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }]

Example of product data #2: CSV

amazon-csv-product-data-example

Todo list console based application. Todo's save to a seperate file.

Todo list console based application. Todo's save to a seperate file.

1 Dec 24, 2021
A set of libraries and functions for simplifying automating Cisco devices through SecureCRT.

This is a set of libraries for automating Cisco devices (and to a lesser extent, bash prompts) over ssh/telnet in SecureCRT.

Matthew Spangler 7 Mar 30, 2022
CLI based diff viewer

Rich Diff CLI based diff viewer

Suresh Kumar 24 Nov 15, 2022
A command line connect 4 game against a minimax agent.

A command line connect 4 game against a minimax agent.

1 Oct 17, 2021
MiShell is a multi-platform, multi-architecture project based on the first version (MiShell32)

MiShell is a multi-platform, multi-architecture project based on the first version (MiShell32), which offers super super small reverse shell payloads great for injection in buffer overflow vulnerabil

Kamyar Hatamnezhad 0 Oct 27, 2022
Python command line tool and python engine to label table fields and fields in data files.

Python command line tool and python engine to label table fields and fields in data files. It could help to find meaningful data in your tables and data files or to find Personal identifable informat

APICrafter 22 Dec 05, 2022
Doing set operations on files considered as sets of lines

CLI tool that can be used to do set operations like union on files considering them as a set of lines. Notes It ignores all empty lines with whitespac

Partho 11 Sep 06, 2022
Terminal with builtin ortholinear keyboard and touch screen as a home automation interface.

OLKB-Terminal Terminal with builtin ortholinear keyboard and touch screen as a home automation interface. Features Step and STLs available for non-com

Jeff Eberl 50 Oct 07, 2022
Bad Apple printed out on the console with Python!

Bad Apple printed out on the console with Python!

CalvinLoke 186 Dec 01, 2022
CLI client for RFC 4226's HOTP and RFC 6238's TOTP.

One Time Password (OTP, TOTP/HOTP) OTP serves as additional protection in case of password leaks. onetimepass allows you to manage OTP codes and gener

Apptension 4 Jan 05, 2022
instant coding answers via the command line

howdoi instant coding answers via the command line Sherlock, your neighborhood command-line sloth sleuth. Are you a hack programmer? Do you find yours

Benjamin Gleitzman 9.8k Jan 08, 2023
Easily handle day to day CLI operation via Python instead of regular Bash programs.

pz Ever wished to use Python in Bash? Would you choose the Python syntax over sed, awk, ...? Should you exactly know what command would you use in Pyt

CZ.NIC 697 Jan 03, 2023
Sebuah tools agar tydak menjadi sider :v vrohh

Sebuah tools agar tydak menjadi sider :v vrohh

xN7-SEVEN 1 Mar 27, 2022
Patool is a portable command line archive file manager

Patool Patool is an archive file manager. Various archive formats can be created, extracted, tested, listed, searched, repacked and compared with pato

318 Jan 04, 2023
Command line client for Audience Insights

Dynamics 365 Audience Insights CLI The AuI CLI is a command line tool for Dynamics 365 Audience Insights. It is based on the customerinsights Python l

Microsoft 8 Jan 09, 2023
Pyrdle - Play Wordle in the CLI. Write an algorithm to play Wordle for you. Ruin all of the fun you've been having

Pyrdle - Play Wordle in the CLI. Write an algorithm to play Wordle for you. Ruin all of the fun you've been having

Charles Tapley Hoyt 11 Feb 11, 2022
🌈 Generate color palettes based on Neovim colorschemes.

Iris Iris is a Neovim plugin that generates a normalized color palette based on your colorscheme. It is named for the goddess Iris of Greek mythology,

N. G. Scheurich 45 Jul 28, 2022
A library for creating text-based graphs in the terminal

tplot is a Python package for creating text-based graphs. Useful for visualizing data to the terminal or log files.

Jeroen Delcour 164 Dec 14, 2022
A minimal and ridiculously good looking command-line-interface toolkit.

Pyceo Pyceo is a Python package for creating beautiful, composable, and ridiculously good looking command-line-user-interfaces without having to write

Juan-Pablo Scaletti 21 Mar 25, 2022
MsfMania is a command line tool developed in Python that is designed to bypass antivirus software on Windows and Linux/Mac in the future

MsfMania MsfMania is a command line tool developed in Python that is designed to bypass antivirus software on Windows and Linux/Mac in the future. Sum

446 Dec 21, 2022