Web-Scraping using Selenium

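Why is Selenium needed?

Some websites do not like being scraped and block clients that do not behave like a normal visitor. In that case you need to disguise your web-scraping bot as a human user, which Selenium makes possible by driving a real browser.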

What is a locator, CSS selector, or XPath?

A locator is an address that uniquely identifies a web element within a webpage. Locators are HTML properties of a web element that tell Selenium which element it needs to act on.

There is a diverse range of web elements. The most common amongst them are:

Text Box, Button, Drop Down, Hyperlink, Check Box, Radio Button
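As a minimal sketch of how a locator is handed to Selenium: each locator is a strategy plus a value, and `find_element` takes both. The element names (`username`, `password`, and so on) and the `locate` helper are hypothetical and stand in for a real page. The strategy strings are the values behind Selenium's `By.ID`, `By.NAME`, `By.CSS_SELECTOR` and `By.XPATH` constants, written out here so the snippet loads without Selenium installed.

```python
# Each locator is a (strategy, value) pair. The strategy strings are the
# values of selenium's By.ID, By.NAME, By.CSS_SELECTOR and By.XPATH.
# The element names below are hypothetical, not taken from a real page.
LOCATORS = {
    "username": ("id", "username"),
    "password": ("name", "password"),
    "submit": ("css selector", "button[type='submit']"),
    "forgot_link": ("xpath", "//a[text()='Forgot password?']"),
}


def locate(driver, key):
    """Look up a named locator and ask the driver for the matching element."""
    strategy, value = LOCATORS[key]
    return driver.find_element(strategy, value)
```

With a real browser session, `driver` would be something like `webdriver.Chrome()` after a call to `driver.get(...)`, and `locate(driver, "username").send_keys("me")` would type into the matched text box.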

Types of Locators in Selenium

Photo Credit - www.softwaretestinghelp.com

XPath

XPath is used to locate a web element based on its XML path. XML stands for Extensible Markup Language and is used to store, organize, and transport arbitrary data. It stores data in nested, tagged elements with attributes, much like HTML. Because both are markup languages with the same tree structure, XPath can be used to locate HTML elements.

The fundamental idea behind locating elements with XPath is traversing between elements across the entire page, which lets a user find one element by reference to another.
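The traversal idea can be tried without a browser. The sketch below uses Python's standard-library `xml.etree.ElementTree`, which supports only a subset of XPath, on a small made-up page (the markup is an assumption for illustration). It first walks down from the root, then finds an input by reference to the label that sits next to it.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed page invented for this example.
PAGE = """
<html>
  <body>
    <form id="login">
      <div class="row"><label>Email</label><input name="email"/></div>
      <div class="row"><label>Password</label><input name="password"/></div>
    </form>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)

# Path from the root: step through each tag in turn; find() returns the
# first match in document order.
first_input = root.find("./body/form/div/input")

# Reference to another element: pick the div whose <label> child reads
# "Password", then step to the <input> inside that same div.
password_input = root.find(".//div[label='Password']/input")

print(first_input.get("name"))     # email
print(password_input.get("name"))  # password
```

In Selenium the second lookup would be written as `//div[label='Password']/input` and passed with `By.XPATH`; the browser evaluates full XPath 1.0, so more expressions are available there than in `ElementTree`.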

CSS-Selector

A CSS selector is a combination of an element selector and a selector value that identifies a web element within a web page. The combination of element selector and selector value is known as a selector pattern.
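To make the selector pattern concrete, here is a small sketch that composes patterns from an element selector and a selector value. The helper names (`by_id`, `by_class`, `by_attribute`) and the sample values are hypothetical; the resulting strings use standard CSS syntax and could be passed to Selenium with `By.CSS_SELECTOR`.

```python
def by_id(tag: str, element_id: str) -> str:
    """tag#id: the element with the given id attribute."""
    return f"{tag}#{element_id}"


def by_class(tag: str, class_name: str) -> str:
    """tag.class: elements carrying the given class."""
    return f"{tag}.{class_name}"


def by_attribute(tag: str, attribute: str, value: str, match: str = "=") -> str:
    """tag[attr='value']; match may be '=' (exact), '^=' (prefix),
    '$=' (suffix) or '*=' (substring)."""
    if match not in {"=", "^=", "$=", "*="}:
        raise ValueError(f"unsupported match operator: {match}")
    return f"{tag}[{attribute}{match}'{value}']"


print(by_id("input", "Email"))                          # input#Email
print(by_class("label", "remember"))                    # label.remember
print(by_attribute("input", "type", "submit"))          # input[type='submit']
print(by_attribute("input", "id", "Pass", match="^="))  # input[id^='Pass']
```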

Photo-Credit - www.softwaretestinghelp.com

Primitive types of CSS Selector

Different Types of CSS Selector

Owner

Md Rashidul Islam
