This script is intended to crawl license information of repositories through the GitHub API.

Last update: Oct 25, 2022

Overview

GithubLicenseCrawler

This script crawls license information for repositories through the GitHub API. Given an input file in requirements.txt format, it returns a CSV file with the license information associated with each listed repository.
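As a rough illustration of the kind of lookup involved, the sketch below queries GitHub's REST endpoint for a repository's license and reads the SPDX identifier from the response. It is not taken from the script itself: the function names (license_url, extract_spdx, fetch_license) and the optional token parameter are hypothetical, and it assumes the repository owner is already known, which the input format does not provide on its own.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def license_url(owner, repo):
    """Build the URL of GitHub's 'get the license for a repository' endpoint."""
    return f"{API_ROOT}/repos/{owner}/{repo}/license"


def extract_spdx(payload):
    """Return the SPDX identifier from a license response, or None if absent."""
    license_info = payload.get("license") or {}
    return license_info.get("spdx_id")


def fetch_license(owner, repo, token=None):
    """Hypothetical helper: fetch the SPDX license id for owner/repo.

    Assumption: 'token' is an optional personal access token; unauthenticated
    requests also work but are subject to a lower rate limit.
    """
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(license_url(owner, repo), headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_spdx(json.load(response))
```

Keeping the URL construction and the response parsing separate from the network call makes both easy to check without touching the API.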

Input

The input file is expected to be in requirements.txt format. For two example repositories, it looks like this:

HeartSeg-Dataset==0.0.1
DeepDive==0.0.1
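The format above can be read with a few lines of Python. This is a minimal sketch, not the script's own parser: parse_requirements is a hypothetical name, and it assumes only simple name==version lines (plus blank lines and # comments), not the full range of requirement specifiers.

```python
def parse_requirements(text):
    """Split requirements.txt-style text into (name, version) pairs.

    Assumption: each entry is either 'name==version' or a bare 'name';
    a bare name yields a version of None.
    """
    entries = []
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, separator, version = line.partition("==")
        entries.append((name.strip(), version.strip() if separator else None))
    return entries


example = "HeartSeg-Dataset==0.0.1\nDeepDive==0.0.1\n"
print(parse_requirements(example))
```

Each resulting name could then be handed to whatever lookup resolves it to a GitHub repository.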

Output

The output file, named licenses.csv, is generated on the fly. Its columns depict:

Running the script should look like this:

Contact and Contribute

[email protected]

The GitHub API is far more powerful than what is used here. Feel free to extend this code or, preferably, contribute directly here.

Owner

schutera

GitHub Repository

🐞 Douban Movie / Douban Book Scarpy

Python 3-based Douban Movie/Douban Book Scrapy crawler for cover downloading, data crawling, and review entry.

A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

Scrapy-soccer-games - Scraping information about soccer games from a few websites

Scrape the 42 Intranet's e-learning videos in a single click

A Python library for automating interaction with websites.

Newsscraper - A simple Python 3 module to get crypto or news articles and their content from various RSS feeds.

Screen scraping and web crawling framework

A Telegram crawler to search groups and channels automatically and collect any type of data from them.

Crawler in Python 3.7, 3.8, 3.9, and PyPy3

This is a sport analytics project that combines the knowledge of OOP and Webscraping

Scrape Twitter for Tweets

WebScrapping Project - G1 Latest News

Taobao and Tmall half-price flash-sale sniping: grab TVs, grab Moutai, and beat the scalpers

Scrapy, a fast high-level web crawling & scraping framework for Python.

A leetcode scraper to compile all questions in leetcode free tier to text file. pdf also available.

NASA APOD Discord Bot - Fetches information from NASA APOD site.

A dead simple crawler to get books information from Douban.

This code will be able to scrape movies from a movie website and also provide download links to newly uploaded movies.

A low-code tool that generates python crawler code based on curl or url

Script used to download data for stocks.