fjg00/Facebook-Group-Post-Scraper

Notes

  • Use the scraper for educational purposes only
  • Do not scrape sensitive or private information
  • Ask for consent from the group administrator and/or group members before running any code
  • I am not responsible for any misuse of the code

Facebook Group Scraping Using Beautiful Soup & Selenium

Extract Facebook group posts related to a specific topic and write them to a .json file. This project was created to gather the data needed to build a chatbot for a university's website.

Input

  • User's Credentials
  • Facebook Group URL
  • Number of Scrolls
    • Determines how many posts are loaded and collected
  • Directory of the Chromedriver
  • Optional: Specific topic to be searched
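
One way to keep the inputs listed above together is a small container with basic validation. This is only a sketch: the class name `ScraperInputs`, its field names, and the validation rules are assumptions made for illustration and do not come from the project's code.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


@dataclass
class ScraperInputs:
    """Hypothetical container for the scraper's inputs."""
    email: str
    password: str
    group_url: str
    num_scrolls: int
    chromedriver_dir: Path
    topic: Optional[str] = None  # None when no specific topic is given

    def validate(self) -> None:
        """Raise ValueError if an input is obviously unusable."""
        parsed = urlparse(self.group_url)
        if parsed.scheme not in ("http", "https") or "facebook.com" not in parsed.netloc:
            raise ValueError(f"not a Facebook URL: {self.group_url!r}")
        if "/groups/" not in parsed.path:
            raise ValueError(f"not a group URL: {self.group_url!r}")
        if self.num_scrolls < 1:
            raise ValueError("num_scrolls must be at least 1")
```

Calling `validate()` before starting the browser lets a mistyped URL or scroll count fail early instead of partway through a scraping session.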

What the Scraper Does

  • Logs into Facebook using the User's Credentials
  • Enters the group specified by the User
  • Searches for the topic
  • Extracts all posts & their comments
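
The extraction step can be sketched with Beautiful Soup on a saved page. The markup and the class names `post`, `post-text`, and `comment` below are hypothetical stand-ins; Facebook's real markup is different and changes often, so the selectors would need to be adapted. In a live run, the HTML would presumably come from Selenium's `driver.page_source` after logging in and scrolling.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only to illustrate the parsing step.
SAMPLE_HTML = """
<div class="post">
  <div class="post-text">When is the registration deadline?</div>
  <div class="comment">It is at the end of the month.</div>
  <div class="comment">Check the registrar's page.</div>
</div>
<div class="post">
  <div class="post-text">Is the library open on weekends?</div>
</div>
"""


def extract_posts(html):
    """Return a list of {"text": ..., "comments": [...]} dicts, one per post."""
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    for post in soup.find_all("div", class_="post"):
        text_node = post.find("div", class_="post-text")
        if text_node is None:
            continue  # skip containers that hold no post text
        comments = [
            c.get_text(strip=True)
            for c in post.find_all("div", class_="comment")
        ]
        posts.append({"text": text_node.get_text(strip=True), "comments": comments})
    return posts
```

A post without comments yields an empty `comments` list rather than being dropped, which keeps every post available for the output file.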

Scraper Output

.json file that includes:

  • Each post
  • The comments replying to it

Format of file:

{
    "tag": "Topic 1",
    "patterns": [ "Post text" ],
    "responses": [
        "Comment 1",
        "Comment 2",
        "Comment 3"
    ]
}
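
A record in this format can be built and written with Python's standard `json` module. The helper names `to_record` and `write_records` are hypothetical, and storing all records as one JSON list in the file is an assumption; the format above only shows a single record.

```python
import json


def to_record(tag, post_text, comments):
    """Build one record: the post is the pattern, its comments are the responses."""
    return {
        "tag": tag,
        "patterns": [post_text],
        "responses": list(comments),
    }


def write_records(records, path):
    """Write all records to a .json file as a list (an assumed layout)."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-English post text readable in the file
        json.dump(records, f, ensure_ascii=False, indent=2)
```

For example, `write_records([to_record("Topic 1", "Post text", ["Comment 1", "Comment 2"])], "posts.json")` produces a file containing one record shaped like the one above.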

Setup Requirements

  1. Make sure Chrome is installed
  2. Install ChromeDriver (the version that matches your Chrome) and place it in the same directory as the script
  3. Install the Selenium library (from an Anaconda prompt: conda install -c anaconda selenium)
  4. Install Beautiful Soup (from an Anaconda prompt: conda install -c anaconda beautifulsoup4)
  5. Enter the inputs required by the code
  6. Run the code
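
Before running the scraper, the setup steps above can be checked with a short script. This is a sketch rather than part of the project: the function name `missing_requirements` is hypothetical, and it only verifies that the two Python packages are importable and that a ChromeDriver file exists at the given path. It does not check that Chrome itself is installed or that the versions match.

```python
import importlib.util
from pathlib import Path


def missing_requirements(chromedriver_path):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # "bs4" is the import name of the beautifulsoup4 package
    for module in ("selenium", "bs4"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"Python package not importable: {module}")
    if not Path(chromedriver_path).is_file():
        problems.append(f"ChromeDriver not found at: {chromedriver_path}")
    return problems
```

Printing the returned list, for example with `print(missing_requirements("chromedriver"))`, shows at a glance which setup step still needs attention.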

Updates

  • Scrape comments found in "view more comments"
  • Add a file for inputs only
  • Add comments to the code
  • Add an option to scrape the general group discussions and not specific topics

About

Extract Facebook group posts and write them to a .json file without using Facebook API.
