Scrape the 42 Intranet's elearning videos in a single click

Last update: Oct 27, 2022

42intra_scraper

Scrape the 42 Intranet's elearning videos in a single click.

Why would you want to use it?

Adjust the playback speed at your convenience (the intra doesn't allow this).
Working in a remote location where the internet is hit or miss? Download what you need and you'll have it on your computer.
Have a friend who is on freeze and can't access the intra's resources? You can download the videos, compress them and send them via Drive.

How to use it:

git clone [email protected]:Dovalich/42intra_scraper.git

pip3 install -r requirements.txt

python3 intra_scraper.py

Then all you have to do is follow the prompts the program gives you:

Enter your 42 intranet username.
Enter your 42 intranet password.
Enter the elearning link you want to scrape, for example https://elearning.intra.42.fr/tags/38/notions
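
For illustration only, the three prompts above could be collected along these lines. This is a sketch, not the tool's actual code: the function name prompt_credentials and its ask/ask_secret parameters are hypothetical. It uses getpass from the standard library so the password is not echoed to the terminal, and it returns the values without writing them anywhere.

```python
import getpass


def prompt_credentials(ask=input, ask_secret=getpass.getpass):
    """Ask for username, password and elearning link, in that order.

    `ask` and `ask_secret` are injectable (hypothetical parameters) so the
    function can be exercised without a real terminal.
    """
    username = ask("42 intranet username: ").strip()
    # getpass hides what is typed; the value only lives in memory.
    password = ask_secret("42 intranet password: ")
    link = ask("elearning link to scrape: ").strip()
    return username, password, link


if __name__ == "__main__":
    user, _password, url = prompt_credentials()
    print(f"Would log in as {user} and scrape {url}")
```

Nothing here is saved to disk; the values are simply handed back to the caller.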

Here's a short tutorial GIF:

How does it work?

It's fairly simple.

The program makes a POST request to the intranet using your credentials (via the requests module).
Once logged in, it recursively searches for any links in the middle of the page (the ones that contain videos).
Once it finds a video link, it downloads the video in the quality you chose (SD or HD).
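
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in intra_scraper.py: LOGIN_URL and the form field names are placeholders, the helper names (LinkCollector, login, find_videos) are hypothetical, and the crawl follows every same-host link rather than only the links in the middle of the page. Quality selection and the download itself are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Placeholder: the real sign-in URL and form fields are assumptions here.
LOGIN_URL = "https://intra.example/login"


class LinkCollector(HTMLParser):
    """Collects <a href> targets and <video>/<source> src values from a page."""

    def __init__(self):
        super().__init__()
        self.page_links = []
        self.video_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.page_links.append(attrs["href"])
        elif tag in ("video", "source") and attrs.get("src"):
            self.video_links.append(attrs["src"])


def login(session, username, password):
    """POST the credentials once; the session keeps the cookies afterwards."""
    response = session.post(
        LOGIN_URL, data={"username": username, "password": password}
    )
    response.raise_for_status()
    return session


def find_videos(fetch_html, url, seen=None):
    """Recursively follow same-host links from `url` and return video URLs.

    `fetch_html` is any callable mapping a URL to its HTML text.
    """
    seen = set() if seen is None else seen
    if url in seen:
        return []
    seen.add(url)

    parser = LinkCollector()
    parser.feed(fetch_html(url))

    videos = [urljoin(url, src) for src in parser.video_links]
    for href in parser.page_links:
        next_url = urljoin(url, href)
        # Stay on the same host so the crawl cannot wander off-site.
        if urlparse(next_url).netloc == urlparse(url).netloc:
            videos.extend(find_videos(fetch_html, next_url, seen))
    return videos
```

With the requests module, session would be a requests.Session() and fetch_html could be lambda u: session.get(u).text, so the cookies obtained at login are reused for every page.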

Note

As you can see in the code, I don't store your username and password. In fact, I only use them once to log in. But be careful when using these types of scripts. You should always read the source code before giving away sensitive information.

If you have feedback on the code please let me know! 👨‍🎓

And feel free to use it however you want.

Owner

Noufel
