Facebook Group Scraping Using Beautiful Soup & Selenium

Last update: Aug 12, 2022

Overview

Notes

The scraper should only be used for educational purposes
Kindly refrain from scraping sensitive or private information
It is highly recommended to scrape public (and not private) groups
Ask for consent from the group administrator and/or group members before running any code
I am not responsible for any misuse of the code in any shape or form

Facebook Group Scraping Using Beautiful Soup & Selenium

Extract Facebook group posts that are related to a specific topic and write them to a .json file. This project was created in order to gather data needed to build a chatbot for a university's website.

Input

User's Credentials
Facebook Group URL
Number of Scrolls (determines how many posts are collected)
Directory of the ChromeDriver executable
Optional: Specific topic to be searched
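
As an illustration only, the inputs listed above could be grouped into one object and checked before the browser starts. The class name `ScraperInputs`, its field names, and the validation rules are assumptions made for this sketch, not part of the project's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScraperInputs:
    """Hypothetical container for the inputs the scraper asks for."""
    email: str                    # user's credentials
    password: str
    group_url: str                # Facebook group URL
    scrolls: int                  # number of scrolls (more scrolls, more posts)
    chromedriver_path: str        # where the ChromeDriver executable lives
    topic: Optional[str] = None   # optional topic to search for

    def __post_init__(self):
        # Fail early on values that cannot work (assumed rules).
        if self.scrolls < 1:
            raise ValueError("scrolls must be at least 1")
        if not self.group_url.startswith("https://"):
            raise ValueError("group_url should be a full https:// URL")
```

Keeping the credentials in one place also makes it easier to load them from a separate, untracked file rather than hard-coding them in the script.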

What the Scraper Does

Logs into Facebook using the User's Credentials
Enters the group specified by the User
Searches for the topic
Extracts all posts & their comments
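
The four steps above can be sketched roughly as follows. This is not the project's actual code: the login URL, the element locators (`email`, `pass`, `login`), the `/search/?q=` URL pattern, and all function names are assumptions, and Facebook's markup changes often enough that they may need adjusting. The sketch assumes Selenium 4 and Beautiful Soup 4; the CSS selectors for posts and comments are left to the caller because no stable selector is known here.

```python
from urllib.parse import quote_plus

LOGIN_URL = "https://www.facebook.com/login"  # assumed login page


def group_search_url(group_url, topic):
    """Build a search URL inside a group (the /search/?q= pattern is an assumption)."""
    return f"{group_url.rstrip('/')}/search/?q={quote_plus(topic)}"


def collect_page_source(email, password, group_url, chromedriver_path,
                        scrolls, topic=None, pause=2.0):
    """Log in, open the group (or its search results), scroll, and return the HTML."""
    # Imported here so the URL helper above works without Selenium installed.
    import time
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome(service=Service(executable_path=chromedriver_path))
    try:
        driver.get(LOGIN_URL)
        # The locators below are assumptions about Facebook's login form.
        driver.find_element(By.ID, "email").send_keys(email)
        driver.find_element(By.ID, "pass").send_keys(password)
        driver.find_element(By.NAME, "login").click()
        time.sleep(pause)

        driver.get(group_search_url(group_url, topic) if topic else group_url)
        for _ in range(scrolls):
            # Each scroll to the bottom asks the page to load more posts.
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            time.sleep(pause)
        return driver.page_source
    finally:
        driver.quit()


def extract_posts(html, post_selector, text_selector, comment_selector):
    """Pair each post's text with its comments; all selectors are caller-supplied."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    results = []
    for post in soup.select(post_selector):
        text_node = post.select_one(text_selector)
        if text_node is None:
            continue  # skip containers that hold no post text
        comments = [c.get_text(" ", strip=True) for c in post.select(comment_selector)]
        results.append((text_node.get_text(" ", strip=True), comments))
    return results
```

Fixed `time.sleep` pauses keep the sketch short; Selenium's explicit waits (`WebDriverWait`) are usually more reliable when page load times vary.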

Scraper Output

.json file that includes:

Each post
The comments replying to it

Format of file:

{
    "tag": "Topic 1",
    "patterns": ["Post text"],
    "responses": [
        "Comment 1",
        "Comment 2",
        "Comment 3"
    ]
}
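
A minimal sketch of how records in this format might be built and written with the standard `json` module. The helper names `build_record` and `write_records` are hypothetical, and storing all records as one JSON list is an assumption; the format above only shows a single object.

```python
import json
from pathlib import Path


def build_record(tag, post_text, comments):
    """Shape one post and its comments like the format shown above."""
    return {"tag": tag, "patterns": [post_text], "responses": list(comments)}


def write_records(records, path):
    """Write all records to a .json file as a list (assumed top-level layout)."""
    # ensure_ascii=False keeps non-English post text readable in the file.
    text = json.dumps(records, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


if __name__ == "__main__":
    record = build_record("Topic 1", "Post text", ["Comment 1", "Comment 2"])
    write_records([record], "posts.json")
```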

Setup Requirements

Make sure Chrome is installed
Install ChromeDriver and place it in the same directory as the script
Enter inputs required by the code
Run the code
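
Before running the scraper, a small check like the one below can confirm that the driver sits next to the script. The function name `find_chromedriver` and the two file names it looks for are assumptions for this sketch. Note that the ChromeDriver version generally needs to match the installed Chrome version.

```python
from pathlib import Path


def find_chromedriver(script_dir):
    """Return the driver's path if it is in script_dir, otherwise None."""
    # "chromedriver.exe" covers Windows; "chromedriver" covers Linux and macOS.
    for name in ("chromedriver", "chromedriver.exe"):
        candidate = Path(script_dir) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    driver_path = find_chromedriver(Path(__file__).resolve().parent)
    if driver_path is None:
        print("ChromeDriver not found next to this script")
    else:
        print(f"Using ChromeDriver at {driver_path}")
```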

Updates

Scrape comments found in "view more comments"
Add a file for inputs only
Add comments to the code
Add an option to scrape the general group discussions and not specific topics

Owner

Fatima Ghadieh
