Newsscraper - A simple Python 3 module to get crypto or news articles and their content from various RSS feeds.

Overview

NewsScraper

A simple Python 3 module to get crypto or news articles and their content from various RSS feeds.

🔧 Installation

  1. Clone the repo locally.
  2. Use the package manager pip to install the requirements.
pip install -r requirements.txt

Basic Usage

import NewsScraper

all_data = NewsScraper.fetch_all()
news_data = NewsScraper.fetch_news_data()
crypto_data = NewsScraper.fetch_crypto_data()

fetch_all()

Returns a set of NewsScraper.Result containing fetched results from all available RSS feeds

Can include categories: GLOBAL, US, EU, CRYPTO, BLOCKCHAIN, BTC, ETH, LTC.

fetch_news_data()

Returns a set of NewsScraper.Result containing fetched results from CNN, ABC News, Yahoo News, Fox News RSS feeds

Can include categories: GLOBAL, US, EU.

fetch_crypto_data()

Returns a set of NewsScraper.Result containing fetched results from CoinJournal, Crypto Currency News RSS feeds.

Can include categories: CRYPTO, BLOCKCHAIN, BTC, ETH, LTC.

🔨 Advanced Usage

NewsScraper.Result class

A class used to represent a returned article.

Attributes
  • context : str

    A string describing the category of the article.

    ex. "GLOBAL", "US", "BLOCKCHAIN", "BTC".

  • title : str

    A string containing the name of the article.

  • summary : str

    A string containing the summary of the article.

    NOTE: sometimes it can have the value of "", because the RSS feed didn't provide a summary.

  • content : str

    A string containing the content of the article.

Methods
  • Result.json()

    Returns a dictionary with the attributes of the class formatted in JSON.

    ex.

{
  "context": "global",
  "title": "title of the article",
  "summary": "summary of the article",
  "content": "content of the article"
}

News RSS Feeds

All of these functions return a set of NewsScraper.Result containing fetched results of the described RSS feeds.

fetch_abc()
fetch_cnn()
fetch_yahoo()
fetch_fox_news()

Can include categories: GLOBAL, US, EU.

Alternatively, you can use fetch_news_data() to receive results from all of them.


Crypto RSS Feeds

All of these functions return a set of NewsScraper.Result containing fetched results of the described RSS feeds.

fetch_coinjournal()
fetch_cryptocurrencynews()

Can include categories: CRYPTO, BLOCKCHAIN, BTC, ETH, LTC.

Alternatively, you can use fetch_news_data() to receive results from all of them.

🤝 Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

📝 License

This project is licensed under the MIT license.

Owner
Rokas
Rokas
Footballmapies - Football mapies for learning webscraping and use of gmplot module in python

Footballmapies - Football mapies for learning webscraping and use of gmplot module in python

1 Jan 28, 2022
Generate a repository with mirror links for DriveDroid app

DriveDroid Repository Generator Generate a repository for the app that allow boot a PC using ISO files stored on your Android phone Check also an offi

Evgeny 11 Nov 19, 2022
A dead simple crawler to get books information from Douban.

Introduction A dead simple crawler to get books information from Douban. Pre-requesites Python 3 Install dependencies from requirements.txt (Optional)

Yun Wang 1 Jan 10, 2022
This is a web crawler that works on employ email data by gmane.org and visualizes it in different ways.

crawler_to_visual_gmane Analyzing an EMAIL Archive from gmane and vizualizing the data using the D3 JavaScript library. This is a set of tools that al

Saim Zafar 1 Dec 20, 2021
Python web scrapper

Website scrapper Web scrapping project in Python. Created for learning purposes. Start Install python Update configuration with websites Launch script

Nogueira Vitor 1 Dec 19, 2021
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

IST Research 1.1k Jan 06, 2023
Basic-html-scraper - A complete how to of web scraping with Python for beginners

basic-html-scraper Code from YT Video This video includes a complete how to of w

John 12 Oct 22, 2022
A repository with scraping code and soccer dataset from understat.com.

UNDERSTAT - SHOTS DATASET As many people interested in soccer analytics know, Understat is an amazing source of information. They provide Expected Goa

douglasbc 48 Jan 03, 2023
Web-scraping - A bot using Python with BeautifulSoup that scraps IRS website by form number and returns the results as json

Web-scraping - A bot using Python with BeautifulSoup that scraps IRS website (prior form publication) by form number and returns the results as json. It provides the option to download pdfs over a ra

1 Jan 04, 2022
Scrap the 42 Intranet's elearning videos in a single click

42intra_scraper Scrap the 42 Intranet's elearning videos in a single click. Why you would want to use it ? Adjust speed at your convenience. (The intr

Noufel 5 Oct 27, 2022
a Scrapy spider that utilizes Postgres as a DB, Squid as a proxy server, Redis for de-duplication and Splash to render JavaScript. All in a microservices architecture utilizing Docker and Docker Compose

This is George's Scraping Project To get started cd into the theZoo file and run: chmod +x script.sh then: ./script.sh This will spin up a Postgres co

George Reyes 7 Nov 27, 2022
Collection of code files to scrap different kinds of websites.

STW-Collection Scrap The Web Collection; blog posts. This repo contains Scrapy sample code to scrap the following kind of websites: Do you want to lea

Tapasweni Pathak 15 Jun 08, 2022
download NCERT books using scrapy

download_ncert_books download NCERT books using scrapy Downloading Books: You can either use the spider by cloning this repo and following the instruc

1 Dec 02, 2022
Scraping followers of an instagram account

ScrapInsta A script to scraping data from Instagram Install First of all you can run: pip install scrapinsta After that you need to install these requ

Matheus Kolln 1 Sep 05, 2021
Web Scraping Practica With Python

Web-Scraping-Practica Integrants: Guillem Vidal Pallarols. Lídia Bandrés Solé Fitxers: Aquest document és el primer que trobem. A continuació trobem u

2 Nov 08, 2021
Telegram group scraper tool

Telegram Group Scrapper

Wahyusaputra 2 Jan 11, 2022
Discord webhook spammer with proxy support and proxy scraper

Discord webhook spammer with proxy support and proxy scraper

3 Feb 27, 2022
哔哩哔哩爬取器:以个人为中心

Open Bilibili Crawer 哔哩哔哩是一个信息非常丰富的社交平台,我们基于此构造社交网络。在该网络中,节点包括用户(up主),以及视频、专栏等创作产物;关系包括:用户之间,包括关注关系(following/follower),回复关系(评论区),转发关系(对视频or动态转发);用户对创

Boshen Shi 3 Oct 21, 2021
A Pixiv web crawler module

Pixiv-spider A Pixiv spider module WARNING It's an unfinished work, browsing the code carefully before using it. Features 0004 - Readme.md updated, co

Uzuki 1 Nov 14, 2021
Web scrapping

Project Setup Table of Contents Project Setup Table of Contents Run project locally Install Requirements Run script Run project locally Install Requir

Charles 3 Feb 04, 2022