A dead-simple crawler that fetches book information from Douban.

Last update: Jan 10, 2022

Related tags

Web Crawling douban-books-crawler

Overview

Introduction

A dead-simple crawler that fetches book information from Douban.

Prerequisites

Python 3
Install the dependencies listed in requirements.txt (for example, with pip install -r requirements.txt)
(Optional) Install Anaconda to manage the environment

Usage

Run get_tags to fetch all the trending tags.

# This will generate a file tags.csv
python app.py get_tags

Run crawl_books to start crawling the books by the tags from the previous step.

python app.py crawl_books -i tags.csv
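The two steps above can be chained from a small driver script. The following is only a sketch: it assumes it is run from the repository root next to app.py, and the helper names build_commands and run_pipeline are hypothetical, not part of this project. It uses only the two commands documented above.

```python
import subprocess
import sys


def build_commands(tags_file="tags.csv", fetch_tags=True):
    """Return the documented commands in the order they should run.

    Hypothetical helper: set fetch_tags=False to skip get_tags,
    e.g. when tags_file was written by hand.
    """
    python = sys.executable  # reuse the interpreter running this script
    commands = []
    if fetch_tags:
        # Step 1: per the README, this generates tags.csv
        commands.append([python, "app.py", "get_tags"])
    # Step 2: crawl the books for the tags listed in tags_file
    commands.append([python, "app.py", "crawl_books", "-i", tags_file])
    return commands


def run_pipeline(tags_file="tags.csv", fetch_tags=True, dry_run=False):
    """Print each command, and execute it unless dry_run is set."""
    for command in build_commands(tags_file, fetch_tags):
        print(" ".join(command))
        if not dry_run:
            # check=True stops the pipeline if a step exits with an error
            subprocess.run(command, check=True)


# Dry run: only prints the commands, nothing is executed.
run_pipeline(dry_run=True)
```

Calling run_pipeline() without dry_run executes both steps, and stops before crawling if get_tags fails.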

You can also create tags.csv yourself instead of running get_tags. If you do, make sure each tag you list actually returns books on Douban.
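A hand-written tags.csv could be produced with a few lines of Python. This is a sketch under a stated assumption: the file layout is not documented here, so the code assumes one tag per row in a single column with no header. Compare it with a file generated by get_tags before relying on it. The function name write_tags is hypothetical.

```python
import csv


def write_tags(tags, path):
    """Write tags to a CSV file, one tag per row (ASSUMED layout).

    Surrounding whitespace is stripped; empty entries and duplicates
    are dropped while keeping the original order.
    Returns the number of tags written.
    """
    seen = set()
    cleaned = []
    for tag in tags:
        tag = tag.strip()
        if tag and tag not in seen:
            seen.add(tag)
            cleaned.append(tag)

    # newline="" is what the csv module expects for files it writes;
    # UTF-8 keeps Chinese tag names intact.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for tag in cleaned:
            writer.writerow([tag])
    return len(cleaned)
```

For example, write_tags(["小说", "历史", "编程"], "tags.csv") writes three rows, and the resulting file can then be passed to crawl_books with -i.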

License

MIT © mogita

Owner

Yun Wang

GitHub Repository

A utility library to scrape data from TikTok, Instagram, Twitch, Youtube, Twitter or Reddit in one line!

Social Media Scraper A utility library to scrape data from TikTok, Instagram, Twitch, Youtube, Twitter or Reddit in one line! Go to the website » Vie

2 Aug 03, 2022

PyQuery-based scraping micro-framework.

demiurge PyQuery-based scraping micro-framework. Supports Python 2.x and 3.x. Documentation: http://demiurge.readthedocs.org Installing demiurge $ pip

109 Jul 20, 2022

👁️ Tool for Data Extraction and Web Requests.

httpmapper 👁️ Project • Technologies • Installation • How it works • License Project 🚧 For educational purposes. This is a project that I developed,

15 Dec 05, 2021

An automated, headless YouTube Watcher and Scraper

Searches YouTube, queries recommended videos and watches them. All fully automated and anonymised through the Tor network. The project consists of two independently usable components, the YouTube aut

44 Oct 18, 2022

AssistScraper - a program for /r/nba that lists every player a given player assisted and how many assists each one received

5 Nov 25, 2021

Scrape and display grades onto the console

WebScrapeGrades About The Project This Project is a personal project where I learned how to webscrape using python requests. Being able to get request

1 Oct 23, 2021

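A scheduled HITsz epidemic-report script based on GitHub Actions, ready to use out of the box

HITsz Daily Report A script based on GitHub Actions that automatically submits scheduled reports through the "HITsz epidemic system" portal, ready to use out of the box. Thanks to @JellyBeanXiewh for the original script and idea. Thanks to @bugstop for refactoring the script and adding Easy Connect campus proxy access.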

56 Nov 27, 2022

Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings

0 Jan 07, 2022

Telegram group scraper tool

Telegram Group Scrapper

2 Jan 11, 2022

WebScrapping Project - G1 Latest News

Web Scraping with Python This project consists of code that lets the user search for the latest news about any term on the G1 site. For this p

2 Feb 13, 2022

TarkovScrappy - A nifty little bot that lets you know if a queried item might be required for a quest at some point in the land of Tarkov!

TarkovScrappy A nifty little bot that lets you know if a queried item might be required for a quest at some point in the land of Tarkov! Hideout items

2 Apr 11, 2022

This program will help you to properly scrape all data from a specific website

A crawler of doubamovie

Console application for downloading images from Reddit in Python

Creating Scrapy scrapers via the Django admin interface

This is a sport analytics project that combines the knowledge of OOP and Webscraping

Script for scraping user data (id, username, full name, followers, tweets, etc.) via Twitter's search engine.

Explore scraping with BeautifulSoup!

For those who don't want to pay $10/month for high school game footage with ads

This is a web crawler that works on employ email data by gmane.org and visualizes it in different ways.