Facebook Group Scraping Using Beautiful Soup & Selenium

Last update: Aug 12, 2022

Overview

Notes

The scraper should only be used for educational purposes
Kindly refrain from scraping sensitive or private information
It is highly recommended to scrape public (and not private) groups
Ask for consent from the group administrator and/or group members before running any code
I am not responsible for any misuse of the code in any shape or form

Facebook Group Scraping Using Beautiful Soup & Selenium

Extract Facebook group posts that are related to a specific topic and write them to a .json file. This project was created in order to gather data needed to build a chatbot for a university's website.

Input

User's Credentials
Facebook Group URL
Number of Scrolls
- Determines how many posts are collected (more scrolls load more posts)
Directory of the Chromedriver
Optional: Specific topic to be searched
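
As an illustration only, the inputs listed above could be grouped into a single object. This is a minimal sketch and not the project's actual code: the class name `ScraperInputs`, its field names, and the assumption that the credentials are an email and a password are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScraperInputs:
    """Hypothetical container for the inputs listed above."""

    email: str              # assumed form of the user's credentials
    password: str           # assumed form of the user's credentials
    group_url: str          # Facebook group URL
    num_scrolls: int        # how many times to scroll the page
    chromedriver_path: str  # location of the Chromedriver
    topic: Optional[str] = None  # optional topic to search for

    def __post_init__(self):
        # Fail early on values that cannot produce any output.
        if self.num_scrolls < 1:
            raise ValueError("num_scrolls must be at least 1")
        if not self.group_url:
            raise ValueError("group_url must not be empty")
```

Keeping the inputs in one place also lines up with the planned "file for inputs only" update, since such an object could be filled from any source.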

What the Scraper Does

Logs into Facebook using the User's Credentials
Enters the group specified by the User
Searches for the topic
Extracts all posts & their comments
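
The scrolling and extraction steps above might look roughly like the sketch below. It is not the project's code: the function names are hypothetical, the CSS selectors are left as parameters because Facebook's markup is not described here, and the login and topic-search steps are omitted. It assumes Selenium 4 and Beautiful Soup 4 are installed.

```python
import time


def make_driver(chromedriver_path):
    """Start Chrome via Selenium 4 (third-party: pip install selenium)."""
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(chromedriver_path))


def scroll_and_get_html(driver, num_scrolls, pause_seconds=2.0):
    """Scroll to the bottom of the page num_scrolls times, then return the HTML."""
    for _ in range(num_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_seconds)  # give newly loaded posts time to render
    return driver.page_source


def extract_posts(html, post_css, text_css, comment_css):
    """Parse posts and their comments from html.

    All three selectors are hypothetical placeholders that the caller must
    supply after inspecting the page. Requires: pip install beautifulsoup4
    """
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    results = []
    for post in soup.select(post_css):
        text_node = post.select_one(text_css)
        if text_node is None:
            continue  # skip containers that hold no post text
        comments = [c.get_text(" ", strip=True) for c in post.select(comment_css)]
        results.append({"post": text_node.get_text(" ", strip=True), "comments": comments})
    return results
```

A typical call sequence under these assumptions would be `driver = make_driver(path)`, then `driver.get(group_url)`, then `html = scroll_and_get_html(driver, num_scrolls)`, and finally `extract_posts(html, ...)` with selectors chosen for the current page layout.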

Scraper Output

.json file that includes:

Each post
The comments replying to it

Format of file:

{
    "tag": "Topic 1",
    "patterns": [ "Post text" ],
    "responses": [
        "Comment 1",
        "Comment 2",
        "Comment 3"
    ]
}
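
A record in this format can be produced with Python's standard `json` module. The sketch below is illustrative: the helper names `build_record` and `write_records` are hypothetical, and storing the records as one JSON list per file is an assumption, since the format above shows only a single record.

```python
import json


def build_record(tag, post_text, comments):
    """Build one record: the post goes in "patterns", its comments in "responses"."""
    return {"tag": tag, "patterns": [post_text], "responses": list(comments)}


def write_records(records, path):
    """Write records to path as a JSON list (assumed file layout)."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-English post text readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=4)
```

For example, `write_records([build_record("Topic 1", "Post text", ["Comment 1"])], "posts.json")` would create a file holding one such record.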

Setup Requirements

Make sure Chrome is installed
Install Chromedriver (the version must match your installed Chrome version) and place it in the same directory as the script
Enter inputs required by the code
Run the code

Updates

Scrape comments found in "view more comments"
Add a file for inputs only
Add comments to the code
Add an option to scrape the general group discussions and not specific topics

Owner

Fatima Ghadieh
