Haphazard scripts for scraping bitcoin/bitcoin data from GitHub

Last update: Oct 12, 2022

Overview

This is a quick-and-dirty tool used to scrape bitcoin/bitcoin pull request and commentary data.

Each output/<pr number> folder contains:

comments.json: an aggregated list of both issue comments and review comments, in GitHub's original format
commits.json: a list of the commit objects belonging to the PR, in GitHub's original format
pr.json: the pull request object, in GitHub's original format
comments_abbrev.csv: an abbreviated representation of each comment, in CSV format
pr_abbrev.csv: an abbreviated representation of the PR, in CSV format
done: a sentinel file recording the datetime at which the PR data was retrieved
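
As a rough illustration of how this layout could be consumed, the sketch below loads the three raw JSON files from one PR folder and checks whether the done sentinel exists. The function name load_pr_folder and the returned dictionary keys are hypothetical; they are not part of the scraper itself.

```python
import json
from pathlib import Path


def load_pr_folder(folder):
    """Load the raw JSON files from one output/<pr number> folder.

    Hypothetical helper: the scraper does not necessarily ship
    anything like this.
    """
    folder = Path(folder)
    data = {}
    for name in ("pr", "commits", "comments"):
        path = folder / f"{name}.json"
        with path.open(encoding="utf-8") as f:
            data[name] = json.load(f)
    # The presence of the sentinel marks the PR as already fetched.
    data["is_done"] = (folder / "done").exists()
    return data
```

Calling load_pr_folder on a PR's folder would return a dict whose "pr" entry is the pull request object and whose "commits" and "comments" entries are lists.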

Limitations

Right now this doesn't properly handle open PRs (or any PR that is expected to be updated): once the done sentinel is created, the data for that PR is never refreshed. This could be fixed by comparing the PR's timestamps against the done sentinel and overwriting the stored data when the PR is newer.
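
A minimal sketch of that timestamp comparison follows. It assumes the done file holds an ISO 8601 datetime (the actual format is not documented here) and that a naive datetime should be read as UTC. It relies on the updated_at field that GitHub includes in pull request objects. The names parse_timestamp and needs_refresh are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(text):
    """Parse an ISO 8601 timestamp; naive values are assumed to be UTC."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    parsed = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def needs_refresh(pr, done_text):
    """Return True if the PR was updated after the sentinel was written.

    pr is a pull request object in GitHub's format; done_text is the
    content of the done file (assumed format, see above).
    """
    return parse_timestamp(pr["updated_at"]) > parse_timestamp(done_text)
```

A scraper could call needs_refresh with a freshly fetched PR object before deciding to skip a folder. Note that updated_at may not capture every kind of change relevant to the stored files, so this is only a starting point.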

Owner

James O'Beirne
