0.-Webscrapping-using-python

Scraping Top Repositories for Topics on GitHub
Web scraping is the process of extracting and parsing data from websites in an automated fashion using a computer program. It's a useful technique for creating datasets for research and learning. Follow these steps to build a web scraping project from scratch using Python and its ecosystem of libraries:
Pick a website and describe your objective
Browse through different sites and pick one to scrape. Check the "Project Ideas" section for inspiration.
Identify the information you'd like to scrape from the site. Decide the format of the output CSV file.
Summarize your project idea and outline your strategy in a Jupyter notebook.
Use the requests library to download web pages.
Inspect the website's HTML source and identify the right URLs to download.
Download and save web pages locally using the requests library.
Create a function to automate downloading for different topics/search queries.
Use Beautiful Soup to parse and extract information
Parse and explore the structure of downloaded web pages using Beautiful Soup.
Use the right properties and methods to extract the required information.
Create functions to extract information from the page into lists and dictionaries.
Use a REST API to acquire additional information if required.
Create CSV file(s) with the extracted information.
Create functions for the end-to-end process of downloading, parsing, and saving CSVs.
Execute the function with different inputs to create a dataset of CSV files.
Verify the information in the CSV files by reading them back using Pandas.
Document and share your work
Add proper headings and documentation in your Jupyter notebook.
Write a blog post about your project and share it online.
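
The download, parse, and save steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the `https://github.com/topics/<topic>` URL pattern, the idea that each repository appears as an `<h3>` holding an owner link followed by a repository link, the function names, and `SAMPLE_HTML` are all illustrative. Inspect the live page source and adjust the selectors before relying on it.

```python
import csv

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://github.com"

# Hypothetical markup used to try the parser offline; the real page may differ.
SAMPLE_HTML = """
<article><h3><a href="/octo">octo</a> / <a href="/octo/demo">demo</a></h3></article>
<article><h3>Not a repository</h3></article>
"""


def download_topic_page(topic):
    """Download the HTML of a topic page (the URL pattern is an assumption)."""
    url = f"{BASE_URL}/topics/{topic}"
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.text


def parse_repos(html):
    """Extract owner, repository name, and URL from each repository heading.

    Assumes every repository is an <h3> containing two links:
    the owner first, then the repository.
    """
    soup = BeautifulSoup(html, "html.parser")
    repos = []
    for heading in soup.find_all("h3"):
        links = heading.find_all("a")
        if len(links) < 2:
            continue  # skip headings that do not describe a repository
        repos.append({
            "owner": links[0].get_text(strip=True),
            "repo_name": links[1].get_text(strip=True),
            "repo_url": BASE_URL + links[1]["href"],
        })
    return repos


def write_csv(rows, path):
    """Save the extracted dictionaries as a CSV file with a header row."""
    fieldnames = ["owner", "repo_name", "repo_url"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def scrape_topic(topic, path=None):
    """End-to-end helper: download, parse, and save one topic as CSV."""
    rows = parse_repos(download_topic_page(topic))
    write_csv(rows, path or f"{topic}.csv")
    return rows
```

Calling `parse_repos(SAMPLE_HTML)` exercises the parser without network access. A call such as `scrape_topic("machine-learning")` (a hypothetical topic name) would write `machine-learning.csv`, which can then be read back with `pandas.read_csv` to verify its contents.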

Owner

Dev Aravind D Satprem

A spider/bot built in Python on the Scrapy framework to fetch item information from Amazon.

Web Scraping Project - G1 Latest News

Meme-videos - Scrapes memes and turns them into video compilations

Python scraper that scrapes a torrent website and downloads new movies automatically.

Crawler job that scrapes comments from social media posts and saves them in an S3 bucket.

A web scraper that exports your entire WhatsApp chat history.

Telegram Group Scraper

:arrow_double_down: Dumb downloader that scrapes the web

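A collection of web crawler examples, including but not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, VIP and Wanfang, Zlibraty, Oalib, novels, bidding sites, procurement sites, and Xiaohongshu.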

Library to scrape and clean web pages to create massive datasets.

Source code for web scraping with the Python library Selenium.

A package designed to scrape data from Yahoo Finance.

A web scraping API for MDL (My Drama List) for Python.

Displays market info for the LUNI token on the Terra Blockchain

CRI Scrape is a tool for getting general information about the Italian Red Cross on the GAIA platform.

A simple Flask application to scrape the gogoanime website.

Scrape all the media from an OnlyFans account - Updated regularly

Binance Smart Chain Contract Scraper + Contract Evaluator

A simple proxy scraper that uses the requests module in Python.

A high-level distributed crawling framework.