Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool for scraping Amazon product data to CSV or JSON format(s).

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually

You'll also need to install requirements for the project to run. Locate amazon-product-scraper folder via terminal and type pip install -r requirements.txt:

Usage

To launch the Amazon scraper locate the amazon-product-scraper folder via terminal and type python amazon_scraper.py -k "your keyword". This will start the program.

NOTE: you must declare either -k or --keyword before entering your keyword. It's a required argument.

Example:

amazon_scraper.py - the name of a scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For highest success rate, I suggest Residential Proxies over Datacenter as they're almost impossible to detect and have the smallest footprint. If you decide to use different proxy provider services keep in mind that you'll have to make some minor adjustments in get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. Default output format is .csv.

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }]

Example of product data #2: CSV

amazon-csv-product-data-example

RSS reader client for CLI (Command Line Interface),

rReader is RSS reader client for CLI(Command Line Interface)

Lee JunHaeng 10 Dec 24, 2022
Official AIdea command line tool

AIdea CLI Official AIdea command line tool for https://aidea-web.tw. Installation Make sure you have installed both Python 3 and pip package manager.

AIdea 5 Dec 15, 2021
Pyreadline3 - Windows implementation of the GNU readline library

pyreadline3 The pyreadline3 package is based on the stale package pyreadline loc

32 Jan 06, 2023
Display Images in your terminal with python

Term-Img Display Images in your terminal with python NOTE: This project is a work in progress and not everything on here has actually been implemented

My avatar ;D 118 Jan 05, 2023
A simple file transfer tools, similar to rz / sz but compatible with tmux (control mode), which works with iTerm2 and has a nice progress bar

trzsz A simple file transfer tools, similar to rz/sz but compatible with tmux (control mode), which works with iTerm2 and has a nice progress bar. Why

561 Jan 05, 2023
Declarative CLIs with argparse and dataclasses

argparse_dataclass Declarative CLIs with argparse and dataclasses. Features Features marked with a ✓ are currently implemented; features marked with a

Mike DePalatis 29 Dec 06, 2022
Open-Source Python CLI package for copying DynamoDB tables and items in parallel batch processing + query natural & Global Secondary Indexes (GSIs)

Python Command-Line Interface Package to copy Dynamodb data in parallel batch processing + query natural & Global Secondary Indexes (GSIs).

1 Oct 31, 2021
Task-manager-CLI with Priority Modification

Task-manager-CLI with Priority Modification The functions for the app have been written in task.py file. 1. Install Node.js This project requires Node

1 Jan 21, 2022
Urial (URI Addition tooL) intelligently updates URIs stored in Finder comments of macOS files

Urial Urial (URI addition tool) is a simple but intelligent command-line tool to add or replace URIs found inside macOS Finder comments. Table of cont

Mike Hucka 3 Sep 14, 2022
Neovim integration for Google Keep, built using gkeepapi

Gkeep.nvim Neovim integration for Google Keep, built using gkeepapi Requirements Neovim 0.5 Python 3.6+ A patched font (optional. Used for icons) Tabl

Steven Arcangeli 143 Jan 02, 2023
A very simple and lightweight ToDo app using python that can be used from the command line

A very simple and lightweight ToDo app using python that can be used from the command line

Nilesh Sengupta 2 Jul 20, 2022
CLI tool to build, test, debug, and deploy Serverless applications using AWS SAM

AWS SAM The AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications. It provides shorthand syntax to e

Amazon Web Services 6.2k Jan 08, 2023
A simple weather tool. I made this as a way for me to learn Python, API, and PyPi packaging.

A simple weather tool. I made this as a way for me to learn Python, API, and PyPi packaging.

Clint E. 105 Dec 31, 2022
Colors in Terminal - Python Lang

🎨 Colorate - Python 🎨 About Colorate is an Open Source project that makes it easy to use Python color coding in your projects. After downloading the

0110 Henrique 1 Dec 01, 2021
You'll never want to use cd again.

Jmp Description Have you ever used the cd command? You'll never touch that outdated thing again when you try jmp. Navigate your filesystem with unprec

Grant Holmes 21 Nov 03, 2022
Free and Open-Source Command Line tool for Text Replacement

Sniplet Free and Open Source Text Replacement Tool Description: Sniplet is a work in progress CLI tool which can do text replacement globally in Linux

Veeraraghavan Narasimhan 13 Nov 28, 2022
dbt-subdocs is a python CLI you can used to generate a dbt-docs for a subset of your dbt project

dbt-subdocs dbt-subdocs is a python CLI you can used to generate a dbt-docs for a subset of your dbt project 🤔 Description This project is useful if

Jambe 6 Jan 03, 2023
Very nice SMS & Mail Bomber for Termux and Linux.

Very nice SMS & Mail Bomber for Termux and Linux. Coded with love)))

nordbearbot.dev 5 Nov 06, 2022
Todo list console based application. Todo's save to a seperate file.

Todo list console based application. Todo's save to a seperate file.

1 Dec 24, 2021
A CLI messenger for the Signum community.

A CLI messenger for the Signum community. Built for people who like using terminal for their work and want to communicate with other users in the Signum community.

Jush 5 Mar 18, 2022