A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items

Last update: Dec 13, 2021

Overview

combined-shop-scraper

A simple, configurable and expandable combined shop scraper to minimize the costs of ordering several items.

Features

Define an input file components.json listing the components to scrape and their source URLs
Find the cheapest order combination, shipping costs included
Get a price alarm when a single component falls below a defined price
Easily add new shops (basic scraping know-how required). Basic support for notebooksbilliger, cyberport and future-x is included by default

Usage

JSON file definition

The input JSON file is named components.json by default and must be located in the same folder as scraper.py. Its basic structure is:

{
  "component1": {
    "alarm_price": 260,
    "quantity": 1,
    "urls": [
      "https://www.someshop.com/component1",
      "https://www.someshop.com/component1-alternative",
      "https://www.anothershop.com/component1-alternative"]
  },
  "component2": {
    "urls": [
      "https://www.someshop.com/component2",
      "https://www.anothershop.com/component2",
      "https://www.onemoreshop.com/component2"]
  }
}

The component name and at least one URL are mandatory. You can list several URLs from the same shop for one component if that shop offers acceptable alternatives. The quantity of each component defaults to 1, and the alarm price is optional.
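The rules above can be sketched as a small loader. This is an illustration only, not the code in scraper.py: the names load_components, is_alarm and SAMPLE are made up for this example, and the assumption is that a missing alarm price simply disables the alarm.

```python
import json
from typing import Any, Dict, Optional

# Illustrative input only; the URLs are the placeholders used in this README.
SAMPLE = """
{
  "component1": {
    "alarm_price": 260,
    "quantity": 1,
    "urls": ["https://www.someshop.com/component1"]
  },
  "component2": {
    "urls": ["https://www.anothershop.com/component2"]
  }
}
"""


def load_components(text: str) -> Dict[str, Dict[str, Any]]:
    """Parse the JSON text and apply the documented defaults (hypothetical helper)."""
    components: Dict[str, Dict[str, Any]] = {}
    for name, spec in json.loads(text).items():
        urls = spec.get("urls", [])
        if not urls:
            # At least one URL per component is mandatory.
            raise ValueError(f"Component '{name}' needs at least one URL")
        components[name] = {
            "urls": list(urls),
            "quantity": spec.get("quantity", 1),     # defaults to 1
            "alarm_price": spec.get("alarm_price"),  # optional, None if absent
        }
    return components


def is_alarm(component: Dict[str, Any], price: float) -> bool:
    """True if the scraped price is below the component's alarm price."""
    alarm: Optional[float] = component["alarm_price"]
    return alarm is not None and price < alarm


if __name__ == "__main__":
    for name, spec in load_components(SAMPLE).items():
        print(name, spec["quantity"], spec["alarm_price"])
```

With the real script you would read the text from components.json instead of the SAMPLE string.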

Execution

Run scraper.py from within its folder so that components.json can be found. It prints an overview of the ideal order that minimizes the overall cost. The program runs once and does not keep tracking prices in the background. As usual with scraping, be gentle and fair and don't abuse this program.
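To show why shipping matters for the "ideal order", here is a minimal brute-force sketch. It is not necessarily how scraper.py computes its result: the function cheapest_order, the shop names and all prices are invented for this example, and shipping is assumed to be a flat fee charged once per shop used.

```python
from itertools import product
from typing import Dict, List, Tuple

Offer = Tuple[str, float]  # (shop name, unit price); hypothetical shape


def cheapest_order(
    offers: Dict[str, List[Offer]],
    quantities: Dict[str, int],
    shipping: Dict[str, float],
) -> Tuple[float, Dict[str, Offer]]:
    """Try every combination of one offer per component and keep the cheapest."""
    names = list(offers)
    best_total = float("inf")
    best_choice: Dict[str, Offer] = {}
    for combo in product(*(offers[name] for name in names)):
        # Item cost: unit price times quantity (quantity defaults to 1).
        items = sum(
            price * quantities.get(name, 1)
            for name, (_, price) in zip(names, combo)
        )
        # Assumption: each shop used in the order charges its flat fee once.
        shops_used = {shop for shop, _ in combo}
        total = items + sum(shipping[shop] for shop in shops_used)
        if total < best_total:
            best_total = total
            best_choice = dict(zip(names, combo))
    return best_total, best_choice


if __name__ == "__main__":
    offers = {
        "cpu": [("shop_a", 100.0), ("shop_b", 95.0)],
        "ram": [("shop_a", 50.0), ("shop_b", 60.0)],
    }
    shipping = {"shop_a": 5.0, "shop_b": 8.0}
    total, choice = cheapest_order(offers, {}, shipping)
    print(total, choice)
```

In this made-up data the cheapest individual items are split across two shops, but paying shipping twice makes that order more expensive than buying everything from shop_a. The exhaustive search grows quickly with the number of components and offers, so it only suits small inputs.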

Addition of new shops

If you want to add a new shop, you need to edit the file shops.py and:

Enter the significant part of the shop URL in the method Shop._get_shops_dict and define a new class (a child of Shop)
Implement the methods _process_soup and get_shipping_cost for the new class. Use the existing classes as a reference for the data you need to scrape.
Add your new URLs to the input file!
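The steps above might look roughly like the sketch below. Only the names Shop, _get_shops_dict, _process_soup and get_shipping_cost come from this README; the base class shown here is a stand-in, and its attributes, the method signatures, the from_url helper, the ExampleShop class, the page markup and the shipping fee are all assumptions. Check shops.py for the real shape before copying anything.

```python
from typing import Dict, Optional, Type


class Shop:
    """Stand-in for the base class in shops.py (assumed shape, NOT the real code)."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.price: Optional[float] = None

    @staticmethod
    def _get_shops_dict() -> Dict[str, Type["Shop"]]:
        # Step 1: map the significant part of the shop URL to its class.
        # "exampleshop" is a made-up shop used only for illustration.
        return {"exampleshop": ExampleShop}

    @classmethod
    def from_url(cls, url: str) -> "Shop":
        # Hypothetical helper: pick the class whose key occurs in the URL.
        for key, shop_class in cls._get_shops_dict().items():
            if key in url:
                return shop_class(url)
        raise ValueError(f"No shop class registered for {url}")

    def _process_soup(self, soup) -> None:
        raise NotImplementedError

    def get_shipping_cost(self) -> float:
        raise NotImplementedError


class ExampleShop(Shop):
    """Step 2: the new child class with both methods implemented."""

    def _process_soup(self, soup) -> None:
        # Assumes `soup` is a BeautifulSoup-like object and that the page
        # holds the price as <span class="price">1.234,56 €</span>.
        tag = soup.find("span", class_="price")
        text = tag.get_text().replace("€", "").strip()
        # Convert the German number format "1.234,56" to 1234.56.
        self.price = float(text.replace(".", "").replace(",", "."))

    def get_shipping_cost(self) -> float:
        # Assumed flat fee; a real shop may need its own rules here.
        return 4.99
```

With this stand-in, Shop.from_url("https://www.exampleshop.com/item") returns an ExampleShop instance, which mirrors the idea of selecting the shop class from the URL.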

License

See the LICENSE for license details.
