An Amazon Product Scraper built using scapy module of python

Overview

Amazon Product Scraper

This is an Amazon Product Scraper built using scapy module of python

Features

it scrape various things

  • Product Title
  • Product Image
  • Product Price
  • Product Rating
  • Product Description
  • Product Reviews
  • Product Brand
  • Product Colour

By default it scrapes Mobile Phones of 5 Pages from Amazon. In case you want to change it to scrape other product, follow the instructions

  1. Open file /amazon_scraper/spiders/amazon_scraper.py
  2. Chnage the urls list at line 16
  3. Update no_of_pages variable to change number of pages to be scraped

Execute Amazon Scraper

there are two ways to execute scraper

First one

you can directly execute run.sh file using shell

sh ./run.sh

Second one

you can execute the following command

scrapy crawl amazon_scraper -o ./data/data.json

It will create data.json file inside the data folder containing all the scraped data in JSON format and all the images will be saved in data/img/full folder.

Sample Data

Already fetched sample data is available in data folder

Troubleshooting

If data.json file doesn't generate in proper format then just delete data.json file and img folder.
Now you good to go ;)

Preresuisites

  • you have to install scrapy
  • you have to install pillow

[MIT]

Owner
Sudhanshu Jha
Sudhanshu Jha
😈 Discord RAGE is a Python tool that allows you to automatically spam messages in Discord

😈 Discord RAGE Python tool that allows you to automatically spam messages in Discord 🏹 Setup Make sure you have Python installed and PIP is added to

Alphalius 4 Jun 12, 2022
Create custom Vanity URLs for Discord without 30 boosts

CustomVanity - Made by udp#6666 aka Apolo - OpenSource Custom Discord Vanity Creator How To Use Open CustomVanity.py Write your server invite code Wri

apolo 17 Aug 23, 2022
Slash util - A simple script to add application command support to discord.py v2.0

slash_util is a simple wrapper around slash commands for discord.py This is writ

Maya 28 Nov 16, 2022
A telegram bot does not allow channels to send messages to the telegram supergroup

Channel Message Handler Getting started Installation $ git clone https://github.com/AbhijithNT/GroupChannelHandler.git Change directory $ cd ChannelMe

Abhijith N T 0 Dec 26, 2021
Discord Blogger Integration Using Blogger API

It's a very simple discord bot created in python using blogger api in order to search and send your website articles in your discord chat in form of an embedded message. It's pretty useful for people

Owen Singh 8 Oct 28, 2022
Programmeertheorie 2022 - Team Trainspotters - RailNL

Trainspotters Vak: Programmeertheorie 2022 Gekozen case: RailNL Teamnaam: Trainspotters Studenten: Mijntje Meijer, Sam Bijhouwer, Maik Larooij To-do's

Maik Larooij 1 Jan 25, 2022
Based Telegram Bot and Userbot To Play Music in Your Telegram Groups With Some Cool Extra Features! 🥰

CallMusicPlus69 This Repo base on! 🤗️ A CallsMusic Based Telegram Bot and Userbot To Play Music in Your Telegram Groups With Some Cool Extra Features

brut✘⁶⁹ // ユスフ 6 Jun 26, 2022
Cryptocurrency Prices Telegram Bot For Python

Cryptocurrency Prices Telegram Bot How to Run Set your telegram bot token as environment variable TELEGRAM_BOT_TOKEN: export TELEGRAM_BOT_TOKEN=your_

Sina Nazem 3 Oct 31, 2022
NekoRobot-2 - Neko is An Anime themed advance Telegram group management bot.

NekoRobot A modular telegram Python bot running on python3 with an sqlalchemy, mongodb database. ╒═══「 Status 」 Maintained Support Group Included Free

Lovely Boy 19 Nov 12, 2022
Практическая работа 6 - Документирование кода

Практическая работа №6 ПСП – правильная скобочная последовательность – последовательность из открывающих «(« и закрывающих «)» круглых скобок. Програм

0 Apr 14, 2022
A repo to automate the booking process for vaccinations

OntarioVaccineFormAutomaker A repo to automate the booking process for vaccinations Requirements Allow ALL sights to be able to know your location (on

Rafid Dewan 7 May 31, 2021
TeslaPy - A Python implementation based on unofficial documentation of the client side interface to the Tesla Motors Owner API

TeslaPy - A Python implementation based on unofficial documentation of the client side interface to the Tesla Motors Owner API, which provides functiona

Tim Dorssers 233 Dec 30, 2022
StringSessionGenerator - A Telegram bot to generate pyrogram and telethon string session

⭐️ String Session Generator ⭐️ Genrate String Session Using this bot. Made by TeamUltronX 🔥 String Session Demo Bot: Environment Variables Mandatory

TheUltronX 1 Dec 31, 2021
This bot can stream audio or video files and urls in telegram voice chats :)

Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot

Anjana Madu 63 Dec 25, 2022
Download archived malware from ActiveState's source code mirror

malware-archivist (ma) Tool to aid security researchers in dissecting malware. Often, repository maintainers will remove malicious packages entirely f

ActiveState Software 28 Dec 12, 2022
Analyzed the data of VISA applicants to build a predictive model to facilitate the process of VISA approvals.

Analyzed the data of Visa applicants, built a predictive model to facilitate the process of visa approvals, and based on important factors that significantly influence the Visa status recommended a s

Jesus 1 Jan 08, 2022
A Python script to create customised Spotify playlists using the JSON, Spotipy Library and Spotify Web API, based on seed tracks in your history.

A Python script to create customised Spotify playlists using the JSON, Spotipy Library and Spotify Web API, based on seed tracks in your history.

Youngseo Park 1 Feb 01, 2022
Set of classes and tools to communicate with a Noso wallet using NosoP

NosoPy Set of classes and tools to communicate with a Noso wallet using NosoP(Noso Protocol). The data that can be retrieved consist of: Node informat

Noso Project 1 Jan 10, 2022
Telegram Link Wayback Bot. This bot archives a web page thrown at itself with wayback Machine (Archive.org).

Telegram Link Wayback Bot. This bot archives a web page thrown at itself with wayback Machine (Archive.org).

Hüzünlü Artemis [HuzunluArtemis] 11 Feb 18, 2022
Automatically pulls specified repository whenever a specified file is pushed. Great for working collaboratively when you need to run something locally.

autopull Simple python tool that allows you to automatically pull from a github repository whenever a file with a specified name is uploaded installat

carreb 0 Sep 27, 2022