爬虫案例合集。包括但不限于《淘宝、京东、天猫、豆瓣、抖音、快手、微博、微信、阿里、头条、pdd、优酷、爱奇艺、携程、12306、58、搜狐、百度指数、维普万方、Zlibraty、Oalib、小说、招标网、采购网、小红书》

Overview

lxSpider

爬虫案例合集。包括但不限于《淘宝、京东、天猫、豆瓣、抖音、快手、微博、微信、阿里、头条、pdd、优酷、爱奇艺、携程、12306、58、搜狐、百度指数、维普万方、Zlibraty、Oalib、小说网站、招标采购网》

简介

  • csdn csdn
  • 时光荏苒,记不清写了多少案例了。作者文章发布在csdn,代码随后往github上更新。csdn部分文章为收费案例,合理订阅。

声明

  • 本库以教学为基准、本库提供的可操作性不得用于任何商业用途和违法违规场景。

  • 作者对任何原因在使用本库中提供的代码和策略时可能对用户自己或他人造成的任何形式的损失和伤害不承担责任。

  • 因本库引起的或与之有关的任何争议,各方应友好协商解决,协商不成的任何后果与作者无关。


专栏

网络爬虫基础 : 适合有python语法基础 准备学爬虫的同学

web逆向基础 : 有爬虫经验即可(包含猿人学爬虫题目解析)

安卓逆向基础 :工具介绍、逆向记录、案例分享

爬虫案例合集 :付费专栏、经典案例、持续更新


目录

博客

推荐

交流

avatar

You might also like...
Releases(快手弹幕采集工具)
  • 快手弹幕采集工具(Jan 30, 2021)

    使用说明:

    • 1、启动dist目录下的run.exe程序。
    • 2、填入主播uid,你的cookie,房间id
    • 3、点击启动后,等待即可,不可重复点击。
    • 4、需要确认主播当前是否还在直播。

    参数获取:

    主播uid: 浏览器上的网址最后一个参数。

    比如网址为: https://live.kuaishou.com/u/yingjia2019

    主播的uid为: yingjia2019

    你的cookie:

    • 1、打开控制台,鼠标右键点击审查元素或者按F12.
    • 2、点击控制台的Network。
    • 3、刷新页面,可已按F5刷新
    • 4、找到和主播uid一样html文件,然后点击右侧的headers
    • 5、鼠标划到最下面找到cookie一行。复制里面的did=web_xxxxxxxxxxxxxx;
    • 6、需要在软件上填入的cookie是 web_xxxxxxxxxxxxxx

    房间id:

    • 1、点击控制台的 Elements,按ctrl+F,打开搜索框。输入: live-stream-id
    • 2、复制 live-stream-id="Zo9Upaz8w90"
    • 3、要输入的房间id是 Zo9Upaz8w90

    运行时最好保持页面打开,关闭页面后过一段时间会导致cookie失效。

    此工具以学习为主,禁止滥用

    Source code(tar.gz)
    Source code(zip)
    default.rar(21.47 MB)
  • 小说下载器(Feb 2, 2021)

    简介

    1、小说下载(优势:速度快,直接从网络上搜集完整txt文件速度快) 2、在线小说爬取(优势:资源全,已上架的小说几乎都能找到)

    特别声明:

    • 本脚本仅用于测试和学习研究,禁止用于商业用途,不能保证其合法性,准确性,完整性和有效性,请根据情况自行判断。

    • 本项目内所有资源文件,禁止任何公众号、自媒体进行任何形式的转载、发布。

    • 本项目内任何脚本问题概不负责,包括但不限于由任何脚本错误导致的任何损失或损害.

    • 请勿将项目的任何内容用于商业或非法目的,否则后果自负。

    • 本项目遵循GPL-3.0 License协议,如果本特别声明与GPL-3.0 License协议有冲突之处,以本特别声明为准。

    Source code(tar.gz)
    Source code(zip)
    default.zip(44.16 MB)
Owner
lx
Every noble work is at first impossible.
lx
Web-scraping - A bot using Python with BeautifulSoup that scraps IRS website by form number and returns the results as json

Web-scraping - A bot using Python with BeautifulSoup that scraps IRS website (prior form publication) by form number and returns the results as json. It provides the option to download pdfs over a ra

1 Jan 04, 2022
A Very simple free proxy list scraper.

Scrappp A Very simple free proxy list scraper, made in python The tool scrape proxy from diffrent sites and api's. Screenshots About the script !!! RE

Joji aka Moncef 12 Oct 27, 2022
Introduction to WebScraping Workshop - Semcomp 24 Beta

Extrair informações da internet de forma automatizada. Existem diversas maneiras de fazer isso, nesse tutorial vamos ver algumas delas, por meio de bibliotecas de python.

Luísa Moura 19 Sep 11, 2022
Incredibly fast crawler designed for OSINT.

Photon Incredibly fast crawler designed for OSINT. Photon Wiki • How To Use • Compatibility • Photon Library • Contribution • Roadmap Key Features Dat

Somdev Sangwan 9.3k Jan 02, 2023
a way to scrape a database of all of the isef projects

ISEF Database This is a simple web scraper which gets all of the projects and abstract information from here. My goal for this is for someone to get i

William Kaiser 1 Mar 18, 2022
This program will help you to properly scrape all data from a specific website

This program will help you to properly scrape all data from a specific website

MD. MINHAZ 0 May 15, 2022
Web Scraping COVID 19 Meta Portal with Python

Web-Scraping-COVID-19-Meta-Portal-with-Python - Requests API and Beautiful Soup to scrape real-time COVID statistics from worldometer website and perform data cleaning and visual analysis in Jupyter

Aarif Munwar Jahan 1 Jan 04, 2022
Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.

Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This

Albert Marrero 1 Jan 12, 2022
Download images from forum threads

Forum Image Scraper Downloads images from forum threads Only works with forums which doesn't require a login to view and have an incremental paginatio

9 Nov 16, 2022
A simple python web scraper.

Dissec A simple python web scraper. It gets a website and its contents and parses them with the help of bs4. Installation To install the requirements,

11 May 06, 2022
Complete pipeline for crawling online newspaper article.

Complete pipeline for crawling online newspaper article. The articles are stored to MongoDB. The whole pipeline is dockerized, thus the user does not need to worry about dependencies. Additionally, d

newspipe 4 May 27, 2022
CreamySoup - a helper script for automated SourceMod plugin updates management.

CreamySoup/"Creamy SourceMod Updater" (or just soup for short), a helper script for automated SourceMod plugin updates management.

3 Jan 03, 2022
Crawler in Python 3.7, 3.8. 3.9. Pypy3

Description Python Crawler written Python 3. (Supports major Python releases Python3.6, Python3.7 and Python 3.8) Installation and Use Setup VirtualEn

Vinit Kumar 2 Mar 12, 2022
DaProfiler allows you to get emails, social medias, adresses, works and more on your target using web scraping and google dorking techniques

DaProfiler allows you to get emails, social medias, adresses, works and more on your target using web scraping and google dorking techniques, based in France Only. The particularity of this program i

Dalunacrobate 347 Jan 07, 2023
Snowflake database loading utility with Scrapy integration

Snowflake Stage Exporter Snowflake database loading utility with Scrapy integration. Meant for streaming ingestion of JSON serializable objects into S

Oleg T. 0 Dec 06, 2021
A Telegram crawler to search groups and channels automatically and collect any type of data from them.

Introduction This is a crawler I wrote in Python using the APIs of Telethon months ago. This tool was not intended to be publicly available for a numb

39 Dec 28, 2022
A modern CSS selector implementation for BeautifulSoup

Soup Sieve Overview Soup Sieve is a CSS selector library designed to be used with Beautiful Soup 4. It aims to provide selecting, matching, and filter

Isaac Muse 151 Dec 23, 2022
Twitter Scraper

Twitter's API is annoying to work with, and has lots of limitations — luckily their frontend (JavaScript) has it's own API, which I reverse–engineered. No API rate limits. No restrictions. Extremely

Tayyab Kharl 45 Dec 30, 2022
Web Scraping Framework

Grab Framework Documentation Installation $ pip install -U grab See details about installing Grab on different platforms here http://docs.grablib.

2.3k Jan 04, 2023
Telegram group scraper tool

Telegram Group Scrapper

Wahyusaputra 2 Jan 11, 2022