Web Scraping OLX with Python and Bsoup.

Related tags

Web CrawlingwebScrap
Overview

webScrap

WebScraping first step.

Authors: Paulo, Claudio M.

First steps in Web Scraping. Project carried out for training in Web Scrapping. The export of information to a structured database (Pandas DataFrame) where the information was obtained by making a request() call from pages with known addresses. Find the information in the 'lxml' code formatted by BeautfullSoup, and finally exported in csv format.

  • How to automate the search for related words in OLX ads.

  • Can I use quartile analysis to find the best product at the best price?

Our Plan

  1. Select the list of related words.

  2. Use requests to download the page.

  3. Use BSsoup to format the downloaded page in lxml.

  4. Create a structured database with date and time of posting, ad title, product value, city and neighborhood where it is being advertised.

  5. Filter the database by removing ads whose ad title does not contain the desired words.

  6. Use the percentile and average value metric to find the average price of advertisements by cities (of Brazilian states).

Current progress

Data scraping was carried out and the database was created to analyze the average value by city.

Database formed by information in OLX Brasil website advertisements.

The code is with variables and comments in Portuguese, and the search for advertisements is carried out with words in the Portuguese language.

graphnumber

References

Freecodecamp.org - Web scraping python tutorial

Owner
claudio paulo
Escrevendo programas em Python, básico de C++, Qt. QML, e IDL, para montar gráficos, manipular tabelas, redução de dados de simulações e instrumentos geofísicos
claudio paulo
A simple python web scraper.

Dissec A simple python web scraper. It gets a website and its contents and parses them with the help of bs4. Installation To install the requirements,

11 May 06, 2022
Crawl BookCorpus

These are scripts to reproduce BookCorpus by yourself.

Sosuke Kobayashi 590 Jan 03, 2023
Scraping followers of an instagram account

ScrapInsta A script to scraping data from Instagram Install First of all you can run: pip install scrapinsta After that you need to install these requ

Matheus Kolln 1 Sep 05, 2021
A simple code to fetch comments below an Instagram post and save them to a csv file

fetch_comments A simple code to fetch comments below an Instagram post and save them to a csv file usage First you have to enter your username and pas

2 Jul 14, 2022
Instagram_scrapper - This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or excel file easily.

Instagram_scrapper This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or exce

Lakhdar Belkharroubi 5 Oct 17, 2022
淘宝、天猫半价抢购,抢电视、抢茅台,干死黄牛党

taobao_seckill 淘宝、天猫半价抢购,抢电视、抢茅台,干死黄牛党 依赖 安装chrome浏览器,根据浏览器的版本找到对应的chromedriver下载安装 web版使用说明 1、抢购前需要校准本地时间,然后把需要抢购的商品加入购物车 2、如果要打包成可执行文件,可使用pyinstalle

2k Jan 05, 2023
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

IST Research 1.1k Jan 06, 2023
Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings

0 Jan 07, 2022
原神爬虫 抓取原神界面圣遗物信息

原神圣遗物半自动爬虫 说明 直接抓取原神界面中的圣遗物数据 目前只适配了背包页面的抓取 准确率:97.5%(普通通用接口,对 40 件随机圣遗物识别,统计完全正确的数量为 39) 准确率:100%(4k 屏幕,普通通用接口,对 110 件圣遗物识别,统计完全正确的数量为 110) 不排除还有小错误的

hwa 28 Oct 10, 2022
Pelican plugin that adds site search capability

Search: A Plugin for Pelican This plugin generates an index for searching content on a Pelican-powered site. Why would you want this? Static sites are

22 Nov 21, 2022
Instagram profile scrapper with python

IG Profile Scrapper Instagram profile Scrapper Just type the username, and boo! :D Instalation clone this repo to your computer git clone https://gith

its Galih 6 Nov 07, 2022
Web scrapping

Project Setup Table of Contents Project Setup Table of Contents Run project locally Install Requirements Run script Run project locally Install Requir

Charles 3 Feb 04, 2022
CreamySoup - a helper script for automated SourceMod plugin updates management.

CreamySoup/"Creamy SourceMod Updater" (or just soup for short), a helper script for automated SourceMod plugin updates management.

3 Jan 03, 2022
This is python to scrape overview and reviews of companies from Glassdoor.

Data Scraping for Glassdoor This is python to scrape overview and reviews of companies from Glassdoor. Please use it carefully and follow the Terms of

Houping 5 Jun 23, 2022
Auto Join: A GitHub action script to automatically invite everyone to the organization who star your repository.

Auto Invite To The Organization By Star A GitHub Action script to automatically invite everyone to your organization that stars your repository. What

Max Base 11 Dec 11, 2022
Scrape puzzle scrambles from csTimer.net

Scroodle Selenium script to scrape scrambles from csTimer.net csTimer runs locally in your browser, so this doesn't strain the servers any more than i

Jason Nguyen 1 Oct 29, 2021
A webdriver-based script for reserving Tsinghua badminton courts.

AutoReserve A webdriver-based script for reserving badminton courts. 使用说明 下载 chromedriver 选择当前Chrome对应版本 安装 selenium pip install selenium 更改场次、金额信息dat

Payne Zhang 4 Nov 09, 2021
Pro Football Reference Game Data Webscraper

Pro Football Reference Game Data Webscraper Code Copyright Yeetzsche This is a simple Pro Football Reference Webscraper that can either collect all ga

6 Dec 21, 2022
Web scrapper para cotizar articulos

WebScrapper Este web scrapper esta desarrollado en python 3.10.0 para buscar en la pagina de cyber puerta articulos dentro del catalogo. El programa t

Jordan Gaona 1 Oct 27, 2021
Web-scraping - Program that scrapes a website for a collection of quotes, picks one at random and displays it

web-scraping Program that scrapes a website for a collection of quotes, picks on

Manvir Mann 1 Jan 07, 2022