Web Scraping COVID 19 Meta Portal with Python

Overview

Web-Scraping-COVID-19-Meta-Portal-with-Python

Requests API and Beautiful Soup to scrape real-time COVID statistics from worldometer website and perform data cleaning and visual analysis in Jupyter notebook.

Data Preparation Notebook

In the first module, web scraping techniques using requests, beautifulsoup packages are utilized to collect and manipulate COVID related data from the worldometer website

The notebook has a total of five code blocks.

The first four code blocks provide the following data:

  1. Summary Data for ALL Global COVID Cases
  2. Summary Data for ACTIVE Global COVID Cases
  3. Summary Data for CLOSED Global COVID Cases
  4. Tabular Data for COVID Cases by Country

The fifth and final code block provides an interactive interface for exporting each of these four tables

Data Analysis Notebook

In the second module, data analysis techniques using pandas, numpy, seaborn and statsmodels packages are utilized to collect effective insights from the data and plot necessary graphs. The raw csv data is the same table we collected in Part A of the project taken from the worldometer website regarding COVID cases tabulated by country.

The notebook has a total of twelve code blocks.

  1. Importing a CSV file, reading it and counting no. of rows and columns
  2. Using the to_numeric method to ensure all numerical columns get passed as numeric
  3. Using the describe function to display and analyze basic statistical data on the numerical columns of the imported data
  4. Working with a smaller set of imported data - Top 20 countries with most cases
  5. Horizontal bar chart to analyze total cases in the top 20 countries
  6. Vertical bar chart to analyze total deaths in the top 20 countries with most cases
  7. Distribution plot to analyze spread of data for Deaths/1M Population of the 20 countries
  8. Using the describe function to display basic statistical data on the numerical columns of the REDUCED dataset
  9. Comparing and analyzing mean and standard deviation between population of the Full dataset and the Reduced dataset
  10. Using regression scatter plot to check for data independence between tests/million people and the size of the population
  11. Finding and analyzing correlations between the variables in the dataset
  12. Applying a statistical model to collect useful information about Total Cases and Total Deaths in the full data set

-- Aarif M Jahan -- May 08, 2021

Owner
Aarif Munwar Jahan
Aarif Munwar Jahan
爱奇艺会员,腾讯视频,哔哩哔哩,百度,各类签到

My-Actions 个人收集并适配Github Actions的各类签到大杂烩 不要fork了 ⭐️ star就行 使用方式 新建仓库并同步代码 点击Settings - Secrets - 点击绿色按钮 (如无绿色按钮说明已激活。直接到下一步。) 新增 new secret 并设置 Secr

280 Dec 30, 2022
Extract gene TSS site form gencode/ensembl/gencode database GTF file and export bed format file.

GetTss python Package extract gene TSS site form gencode/ensembl/gencode database GTF file and export bed format file. Install $ pip install GetTss Us

laojunjun 6 Nov 21, 2022
Scrape and display grades onto the console

WebScrapeGrades About The Project This Project is a personal project where I learned how to webscrape using python requests. Being able to get request

Cyrus Baybay 1 Oct 23, 2021
An experiment to deploy a serverless infrastructure for a scrapy project.

Serverless Scrapy project This project aims to evaluate the feasibility of an architecture based on serverless technology for a web crawler using scra

José Ferraz Neto 5 Jul 08, 2022
script to scrape direct download links (ddls) from google drive index.

bhadoo Google Personal/Shared Drive Index scraper. A small script to scrape direct download links (ddls) of downloadable files from bhadoo google driv

sαɴᴊɪᴛ sɪɴʜα 53 Dec 16, 2022
FilmMikirAPI - A simple rest-api which is used for scrapping on the Kincir website using the Python and Flask package

FilmMikirAPI - A simple rest-api which is used for scrapping on the Kincir website using the Python and Flask package

UserGhost411 1 Nov 17, 2022
Parse feeds in Python

feedparser - Parse Atom and RSS feeds in Python. Copyright 2010-2020 Kurt McKee Kurt McKee 1.5k Dec 30, 2022

Simple proxy scraper made by using ProxyScrape's api.

What is Moon? Moon is a lightweight and fast proxy scraper made by using ProxyScrape's api. What can i do with this? You can use proxies for varietys

1 Jul 04, 2022
This repo has the source code for the crawler and data crawled from auto-data.net

This repo contains the source code for crawler and crawled data of cars specifications from autodata. The data has roughly 45k cars

Tô Đức Anh 5 Nov 22, 2022
Automatically download and crop key information from the arxiv daily paper.

Arxiv daily 速览 功能:按关键词筛选arxiv每日最新paper,自动获取摘要,自动截取文中表格和图片。 1 测试环境 Ubuntu 16+ Python3.7 torch 1.9 Colab GPU 2 使用演示 首先下载权重baiduyun 提取码:il87,放置于code/Pars

HeoLis 20 Jul 30, 2022
抖音批量下载用户所有无水印视频

Douyincrawler 抖音批量下载用户所有无水印视频 Run 安装python3, 安装依赖

28 Dec 08, 2022
Get-web-images - A python code that get images from any site

image retrieval This is a python code to retrieve an image from the internet, a

CODE 1 Dec 30, 2021
LSpider 一个为被动扫描器定制的前端爬虫

LSpider LSpider - 一个为被动扫描器定制的前端爬虫 什么是LSpider? 一款为被动扫描器而生的前端爬虫~ 由Chrome Headless、LSpider主控、Mysql数据库、RabbitMQ、被动扫描器5部分组合而成。

Knownsec, Inc. 321 Dec 12, 2022
A scalable frontier for web crawlers

Frontera Overview Frontera is a web crawling framework consisting of crawl frontier, and distribution/scaling primitives, allowing to build a large sc

Scrapinghub 1.2k Jan 02, 2023
京东茅台抢购

截止 2021/2/1 日,该项目已无法使用! 京东:约满即止,仅限京东实名认证用户APP端抢购,2月1日10:00开始预约,2月1日12:00开始抢购(京东APP需升级至8.5.6版本及以上) 写在前面 本项目来自 huanghyw - jd_seckill,作者的项目地址我找不到了,找到了再贴上

abee 73 Dec 03, 2022
A web scraper for nomadlist.com, made to avoid website restrictions.

Gypsylist gypsylist.py is a web scraper for nomadlist.com, made to avoid website restrictions. nomadlist.com is a website with a lot of information fo

Alessio Greggi 5 Nov 24, 2022
中国大学生在线 四史自动答题刷分(现仅支持英雄篇)

中国大学生在线 “四史”学习教育竞答 自动答题 刷分 (现仅支持英雄篇,已更新可用) 若对您有所帮助,记得点个Star 🌟 !!! 中国大学生在线 “四史”学习教育竞答 自动答题 刷分 (现仅支持英雄篇,已更新可用) 🥰 🥰 🥰 依赖 本项目依赖的第三方库: requests 在终端执行以下

XWhite 229 Dec 12, 2022
A leetcode scraper to compile all questions in leetcode free tier to text file. pdf also available.

A leetcode scraper to compile all questions in leetcode free tier to text file, pdf also available. if new questions get added, run again to get new questions.

3 Dec 07, 2021
Automated Linkedin bot that will improve your visibility and increase your network.

LinkedinSpider LinkedinSpider is a small project using browser automating to increase your visibility and network of connections on Linkedin. DISCLAIM

Frederik 2 Nov 26, 2021
An utility library to scrape data from TikTok, Instagram, Twitch, Youtube, Twitter or Reddit in one line!

Social Media Scraper An utility library to scrape data from TikTok, Instagram, Twitch, Youtube, Twitter or Reddit in one line! Go to the website » Vie

2 Aug 03, 2022