Collection of code files to scrape different kinds of websites.

Last update: Jun 08, 2022

Related tags

Web Crawling STW-Collection

Overview

STW-Collection

Scrap The Web Collection; blog posts.

This repo contains Scrapy sample code to scrape the following kinds of websites:

Do you want to learn Scrapy? In that case, ScrapScrapy is a good first Scrapy project.
If you want to scrape a simple website without any JavaScript or AJAX calls, have a look at this project. It uses CrawlSpider.
If you want to use Selenium with Scrapy, have a look at this project.
If you want to save to a Django DB as you scrape, refer to this project.
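
To give a feel for what a rule-based crawl of a static site (no JavaScript or AJAX) involves, here is a minimal sketch that uses only the Python standard library. It is not Scrapy code and not taken from this repo: it only imitates the general idea behind CrawlSpider, which is to start at one page, follow the links that match a pattern, and extract an item from every page visited. The names `crawl`, `LinkAndTitleParser`, `FAKE_SITE`, and the `example.test` URLs are hypothetical, and the pages are served from an in-memory dict so the example runs without network access.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTitleParser(HTMLParser):
    """Collects every <a href> and the text of <title> from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(start_url, fetch, allow=r".*"):
    """Breadth-first crawl that follows only links matching `allow`.

    `fetch` is any callable mapping a URL to HTML text (or None).
    """
    pattern = re.compile(allow)
    seen = {start_url}
    queue = deque([start_url])
    items = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkAndTitleParser()
        parser.feed(html)
        items.append({"url": url, "title": parser.title.strip()})
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen and pattern.search(absolute):
                seen.add(absolute)
                queue.append(absolute)
    return items


# Hypothetical static site held in memory (assumption for illustration).
FAKE_SITE = {
    "http://example.test/": '<title>Home</title>'
                            '<a href="/posts/1">one</a><a href="/about">about</a>',
    "http://example.test/posts/1": '<title>Post 1</title><a href="/posts/2">next</a>',
    "http://example.test/posts/2": '<title>Post 2</title><a href="/">home</a>',
    "http://example.test/about": "<title>About</title>",
}

if __name__ == "__main__":
    for item in crawl("http://example.test/", FAKE_SITE.get, allow=r"/posts/"):
        print(item)
```

In a real Scrapy project the fetching, scheduling, and duplicate filtering are handled by the framework, and the link pattern would be expressed as crawl rules on the spider rather than a hand-written queue.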

Owner

Tapasweni Pathak

https://paper.dropbox.com/doc/The-Sequence--A_o_3HyYEgkoBSsxzXpMDRl2Ag-exQXZYWC9EN4RurEJsP7h Busy. No news/emails/anything media from 20 Dec - till release.

GitHub Repository http://tapasweni-pathak.github.io/STW-Collection

A web crawler for recording posts in "sina weibo"

Web Crawler for "sina weibo" A web crawler for recording posts in "sina weibo" Introduction This script helps collect attributes of posts in "sina wei

4 Aug 20, 2022

A high-level distributed crawling framework.

Cola: high-level distributed crawling framework Overview Cola is a high-level distributed crawling framework, used to crawl pages and extract structur

1.5k Jan 04, 2023

Screen scraping and web crawling framework

Pomp Pomp is a screen scraping and web crawling framework. Pomp is inspired by and similar to Scrapy, but has a simpler implementation that lacks the

61 Jun 21, 2021

Library to scrape and clean web pages to create massive datasets.

lazynlp A straightforward library that allows you to crawl, clean up, and deduplicate webpages to create massive monolingual datasets. Using this libr

2.1k Jan 06, 2023

A dead simple crawler to get books information from Douban.

Introduction A dead simple crawler to get books information from Douban. Prerequisites Python 3 Install dependencies from requirements.txt (Optional)

1 Jan 10, 2022

A simple python web scraper.

Dissec A simple python web scraper. It gets a website and its contents and parses them with the help of bs4. Installation To install the requirements,

11 May 06, 2022

Minecraft Item Scraper

Minecraft Item Scraper To run, first ensure you have the BeautifulSoup module: pip install bs4 Then run, python minecraft_items.py folder-to-save-ima

1 Dec 29, 2021

Web crawling framework based on asyncio.

Web crawling framework for everyone. Written with asyncio, uvloop and aiohttp. Requirements Python3.5+ Installation pip install gain pip install uvloo

2k Jan 05, 2023

HappyScrapper - Google News web scraper with Python

HappyScrapper ~ Google News web scraper INSTALLATION ♦ Clone the repository ♦ O

0 Nov 07, 2022

A simple code to fetch comments below an Instagram post and save them to a csv file

fetch_comments A simple code to fetch comments below an Instagram post and save them to a csv file usage First you have to enter your username and pas

2 Jul 14, 2022

mlscraper: Scrape data from HTML pages automatically with Machine Learning

🤖 Scrape data from HTML websites automatically with Machine Learning

798 Dec 29, 2022

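Latest optimized version of the Taobao Moutai purchase-sniping script: Taobao Moutai flash sale, with an optimized thread queue for Moutai purchases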

118 Dec 16, 2022

A tool to easily scrape youtube data using the Google API

YouTube data scraper To easily scrape any data from the youtube homepage, a youtube channel/user, search results, playlists, and a single video itself

7 Dec 03, 2022

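Assorted daily check-ins for iQIYI membership, Tencent Video, Bilibili, Baidu, and more

My-Actions A personal hodgepodge of check-in scripts collected and adapted for GitHub Actions. Please don't fork; a ⭐️ star is enough. Usage: create a new repository and sync the code, go to Settings - Secrets, click the green button (if there is no green button, it is already activated; go straight to the next step), add a new secret and set Secr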

280 Dec 30, 2022

Automatically scrapes all menu items from the Taco Bell website

Automatically scrapes all menu items from the Taco Bell website. Returns them as a pandas DataFrame.

2 Jan 15, 2022

Shopee Scraper - A web scraper in Python that extracts sales, price, available stock, location and more of a given seller in Brazil

Shopee Scraper A web scraper in Python that extracts sales, price, available stock, location and more of a given seller in Brazil. The project was crea

5 Nov 29, 2022

Genshin Impact crawler: scrapes artifact information from the Genshin Impact interface

Snowflake database loading utility with Scrapy integration

Make git downloads from GitHub 1000 times faster for users in China!

Scrapy uses Request and Response objects for crawling web sites.
