Shopee Scraper - A web scraper in python that extract sales, price, avaliable stock, location and more of a given seller in Brazil

Overview

Shopee Scraper

A web scraper in python that extract sales, price, avaliable stock, location and more of a given seller in Brazil.

The project was created in python 3 and requires only 3 libraries that may need to be installed (in case you don't have any of them).

They are: requests, date and time. Date and Time are default libraries for Linux and Mac users, but if you're running Windows, make sure to install them using pip.

You can easily install requests using the following command: $ pip install requests

The script runs based on Shopee's public API. Shopee generates a dynamic page that shows products and its information calling a json file. Since it's an API and it's public, it's easier to just call the json file and extract the data instead of selecting divs, classes and scrolling through the results and using Selenium to simulate a web browser.

How to use it

  1. The first thing you have to do is to find the seller's id. It's present in the product link.

Exemple: https://shopee.com.br/Camisetas-Bandas-Rock-RHCP-Red-Hot-Chili-Peppers-100-Algodao!!-i.409068735.3983196792

  • 409068735 is the seller's id. That's required to run the script.
  • 3983196792 is the product's id
  1. Before running the code, change the file directory where you want to save the csv file generated what will contain all the data extracted.
  • file=open("/YOUR-DIRECTORY/%s-YOUR-FILE-NAME.csv" % data, "a"))
  • The %s- right before the file name prints the date when the csv was generated. It's recommended to keep it that way, in order to track down your files.
  1. Using the terminal, go to the script's folder and run:
  • python3 shopee-scraper.py
  • Type in the seller's id you just got from the product link.
  • The script will scrape 999 products published and the scraper will take 1 sec. per ad. So it may take some time depending on the number of products.

Why I created this project and who I am?

  • I'm a Computer Engineering and Mathematics major in Brazil. I already got a bachelors degree in Marketing and I'm looking for a Data Engineer and Data Scientist position.
  • Currently working for a small company in Brazil as a comercial manager and my main role is to increse the online sales of hydraulic and brass connectors for gas and petroleum
  • I love data and statistics. Finding new possibilities and ways of doing things better and faster through the data is a facinating thing, and quoting Carl Sagan I would say that "it's a pleasure to share a planet and an epoch with you", because the humankind don't even know yet what we're capable of. AI and machine learning will show us a new world, a new age.
  • I really like the feeling of helping companies to make better data-driven decisions on online sales, marketing and purchasing. Solving problems is pretty much the main motivation of any mathematician or engineer
Owner
Paulo DaRosa
Computer Engineer, Mathematician and Marketer.
Paulo DaRosa
A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!

🕳️ CygnusX1 Code by Trong-Dat Ngo. Overviews 🕳️ CygnusX1 is a multithreaded tool 🛠️ , used to search and download images from popular search engine

DatNgo 32 Dec 31, 2022
一些爬虫相关的签名、验证码破解

cracking4crawling 一些爬虫相关的签名、验证码破解,目前已有脚本: 小红书App接口签名(shield)(2020.12.02) 小红书滑块(数美)验证破解(2020.12.02) 海南航空App接口签名(hnairSign)(2020.12.05) 说明: 脚本按目标网站、App命

XNFA 90 Feb 09, 2021
Nekopoi scraper using python3

Features Scrap from url Todo [+] Search by genre [+] Search by query [+] Scrap from homepage Example # Hentai Scraper from nekopoi import Hent

MhankBarBar 9 Apr 06, 2022
Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2

Iceberg Locations Antarctic large iceberg positions derived from ASCAT and OSCAT-2. All data collected here are from the NASA SCP website Overview Thi

Joel Hanson 5 Jul 27, 2022
Fundamentus scrapy

Fundamentus_scrapy Baixa informacões que os outros scrapys do fundamentus não realizam. Para iniciar (python main.py), sera criado um arquivo chamado

Guilherme Silva Uchoa 1 Oct 24, 2021
Introduction to WebScraping Workshop - Semcomp 24 Beta

Extrair informações da internet de forma automatizada. Existem diversas maneiras de fazer isso, nesse tutorial vamos ver algumas delas, por meio de bibliotecas de python.

Luísa Moura 19 Sep 11, 2022
京东茅台抢购最新优化版本,京东秒杀,添加误差时间调整,优化了茅台抢购进程队列

京东茅台抢购最新优化版本,京东秒杀,添加误差时间调整,优化了茅台抢购进程队列

776 Jul 28, 2021
Automatically download and crop key information from the arxiv daily paper.

Arxiv daily 速览 功能:按关键词筛选arxiv每日最新paper,自动获取摘要,自动截取文中表格和图片。 1 测试环境 Ubuntu 16+ Python3.7 torch 1.9 Colab GPU 2 使用演示 首先下载权重baiduyun 提取码:il87,放置于code/Pars

HeoLis 20 Jul 30, 2022
A way to scrape sports streams for use with Jellyfin.

Sportyfin Description Stream sports events straight from your Jellyfin server. Sportyfin allows users to scrape for live streamed events and watch str

axelmierczuk 38 Nov 05, 2022
Raspi-scraper is a configurable python webscraper that checks raspberry pi stocks from verified sellers

Raspi-scraper is a configurable python webscraper that checks raspberry pi stocks from verified sellers.

Louie Cai 13 Oct 15, 2022
This is a python api to scrape search results from a url.

googlescrape Installation Installation is simple! # Stable version pip install googlescrape Examples from googlescrape import client scrapeClient=cli

1 Dec 15, 2022
Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

trafilatura: Web scraping tool for text discovery and retrieval Description Trafilatura is a Python package and command-line tool which seamlessly dow

Adrien Barbaresi 704 Jan 06, 2023
Script used to download data for stocks.

This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the d

Carmelo Gonzales 71 Oct 04, 2022
This is a web scraper, using Python framework Scrapy, built to extract data from the Deals of the Day section on Mercado Livre website.

Deals of the Day This is a web scraper, using the Python framework Scrapy, built to extract data such as price and product name from the Deals of the

David Souza 1 Jan 12, 2022
Download images from forum threads

Forum Image Scraper Downloads images from forum threads Only works with forums which doesn't require a login to view and have an incremental paginatio

9 Nov 16, 2022
This is a webscraper for a specific website

This is a webscraper for a specific website. It is tuned to extract the headlines of that website. With some little adjustments the webscraper is able to extract any part of the website.

Rahul Siyanwal 1 Dec 13, 2021
New World Market Scraper

Bean Seller A New Worlds market scraper. Deployment This must be installed on Windows as it uses the Windows api to do its stuff Install Prerequisites

4 Sep 21, 2022
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

68 Oct 08, 2022
Crawler do site Fundamentus.com com o uso do framework scrapy, tanto da aba detalhada como a de resumo.

Crawler do site Fundamentus.com com o uso do framework scrapy, tanto da aba detalhada como a de resumo. (Todas as infomações)

Guilherme Silva Uchoa 3 Oct 04, 2022
Scraping weather data using Python to receive umbrella reminders

A Python package which scrapes weather data from google and sends umbrella reminders to specified email at specified time daily.

Edula Vinay Kumar Reddy 1 Aug 23, 2022