Scrapping Connections' info on Linkedin

Last update: Feb 11, 2022

Overview

Scrap It!

! Disclaimer:

THIS CODE HAS BEEN IMPLEMENTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE INTERVIEW PROCESS OF MCI.IR AND INTERVIEWEES WERE SUPPOSED TO PUSH THE CODE ON THEIR GITHUB. CONTACT ME TO REMOVE THIS REPOSITORY, IN CASE IT IS AGAINST YOUR TOS.
IF ANY CONNECTION IS NOT OK TO THEIR CONTACT INFO BE HERE, CONTACT ME TO REMOVE THEM ASAP.

Functionalities:

This script automatically:

opens your Linkedin profile
accesses your connections page
crawls the page for grabbing their profile links
scraps each person's information and dumps it to Sqlite db
and simultaneously logs all necessary level of info into Linkedin.log

DataFlowDiagram

Enlisted desing patterns are (but not limited to):

Creator
Low Coupling
High Cohesion
Indirection
Modularization
Information Expert

Log/DB files:

Further develepments notes:

Check out other DBs that supports multithreading which anable us dumpping all information rows at once
change IP per request (You can find its code on my "Social Media Computing course" repository)
Sometimes you need to scroll down manually when "connection" page is being loaded. You can add one line code to scroll down for you.

References:

https://www.linkedin.com/pulse/how-easy-scraping-data-from-linkedin-profiles-david-craven

https://www.geeksforgeeks.org/scrape-linkedin-using-selenium-and-beautiful-soup-in-python/

https://stackoverflow.com/questions/28883769/remove-odd-indexed-elements-from-list-in-python#:~:text=Fun%20fact%3A%20to%20remove%20all,remove(x)%20.

https://stackoverflow.com/questions/34759787/fetch-all-href-link-using-selenium-in-python

https://www.tutorialspoint.com/fetch-all-href-link-using-selenium-in-python

https://stackoverflow.com/questions/64717302/deprecationwarning-executable-path-has-been-deprecated-selenium-python

https://chromedriver.chromium.org/home

https://www.youtube.com/watch?v=-ARI4Cz-awo

Scrapping Connections' info on Linkedin

Related tags

Overview

Scrap It!

Functionalities:

DataFlowDiagram

Enlisted desing patterns are (but not limited to):

Log/DB files:

Further develepments notes:

References:

Owner

MohammadReza Ardestani

京东抢茅台，秒杀成功很多次讨论，天猫抢购，赚钱交流等。

A Powerful Spider(Web Crawler) System in Python.

tweet random sand cat pictures

A dead simple crawler to get books information from Douban.

This is a webscraper for a specific website

A tool can scrape product in aliexpress: Title, Price, and URL Product.

京东茅台抢购 2021年4月最新版

Web Scraping COVID 19 Meta Portal with Python

Visual scraping for Scrapy

An utility library to scrape data from TikTok, Instagram, Twitch, Youtube, Twitter or Reddit in one line!

IGLS - Instagram Like Scraper CLI tool

Grab the changelog from releases on Github

Automatically scrapes all menu items from the Taco Bell website

A Python Covid-19 cases tracker that scrapes data off the web and presents the number of Cases, Recovered Cases, and Deaths that occurred because of the pandemic.

Instagram_scrapper - This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or excel file easily.

Rottentomatoes, Goodreads and IMDB sites crawler. Semantic Web final project.

Library to scrape and clean web pages to create massive datasets.

Ebay Webscraper for Getting Average Product Price

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

robobrowser - A simple, Pythonic library for browsing the web without a standalone web browser.