An automated, headless YouTube Watcher and Scraper

Last update: Oct 18, 2022

Overview

Authors: Christian C., Moritz M., Luca S.
Related Projects: YouTube Uploader, Twitch Compilation Creator, Neural Networks

About

The tool searches YouTube, queries recommended videos, and watches them, fully automated and anonymised through the Tor network. The project consists of two independently usable components: the YouTube automation written in Python and the dockerized Tor Browser.

This project is for educational purposes only. Using Tor to watch YouTube videos is strongly discouraged, especially for botting purposes. Please inform yourself about the Tor network before using it extensively.

Setup

YouTube Automation

This project uses Poetry to install its dependencies. See the Poetry documentation for installation instructions for your operating system.

Make sure Python 3.8 is installed. Otherwise, step 3 will report that no compatible Python version is available.

1. Clone or download this repository.
2. Navigate to the root of the repository.
3. Run `poetry install` to create a virtual environment with Poetry.
4. Start a browser: either run the dockerized browser with `docker-compose up`, or install geckodriver for a local Firefox or ChromeDriver for Chromium. Make sure geckodriver or ChromeDriver is in a directory on your `$PATH`.
5. Run `poetry run python main.py` to start the program. Alternatively, run `poetry shell` followed by `python main.py`. By default this connects to the dockerized browser. To automate a different browser, use the `--browser` command line option with `chrome` or `firefox`.
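
Before running against a local browser, it can help to confirm that the driver executables from step 4 are actually discoverable. The following is a minimal sketch using only the standard library; the `DRIVERS` mapping and the `find_drivers` helper are hypothetical names for illustration and are not part of this repository.

```python
import shutil

# Hypothetical mapping from the --browser choices to the driver executable
# each local browser needs. These names are illustrative, not taken from main.py.
DRIVERS = {"firefox": "geckodriver", "chrome": "chromedriver"}


def find_drivers():
    """Map each local browser choice to the driver path found on PATH, or None."""
    return {browser: shutil.which(exe) for browser, exe in DRIVERS.items()}


if __name__ == "__main__":
    for browser, path in find_drivers().items():
        status = path if path else "not found on PATH"
        print(f"{browser}: {status}")
```

A `None` entry means the corresponding driver is not on `$PATH`, so `--browser` with that value would likely fail to start.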

Dockerized Tor Browser

Running the container requires Docker and docker-compose.

1. Clone or download this repository.
2. Navigate to the root of the repository.
3. Run `docker-compose up`. The image is built automatically before startup.
4. Selenium can now connect to the browser on port 4444. In Python, the connection can be established as follows:
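```
from selenium import webdriver

# `options` is not defined in this snippet; see main.py for the full setup.
# Recent Selenium 4 releases no longer accept desired_capabilities; pass options=options there.
driver = webdriver.Remote(
    command_executor="http://127.0.0.1:4444/wd/hub",
    desired_capabilities=options,
)
```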
See main.py for more information.
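
Because the container takes a moment to start, a script may try to connect before port 4444 is open. The sketch below polls the port with the standard library until it accepts a TCP connection. `wait_for_port` and its parameters are hypothetical and not part of this repository, and an open port only indicates that something is listening, not that Selenium is fully ready.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=4444, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a listener is bound to the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port()` right after `docker-compose up` and only creating the remote driver when it returns `True` avoids an immediate connection error.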

Run Parameters

Parameters shown in square brackets are optional and use a default value when omitted. The usage line shows -s/--search-terms without brackets, so it appears to be required. The same definitions are printed by running main.py --help

usage: main.py [-h] [-B {docker,chrome,firefox}] [-t] [--disable-tor] -s SEARCH_TERMS [-c CHANNEL_URL]

optional arguments:
  -h, --help            show this help message and exit
  -B {docker,chrome,firefox}, --browser {docker,chrome,firefox}
                        Select the driver/browser to use for executing the script.
  -t, --enable-tor      Enables Tor usage by connecting to a proxy on localhost:9050. Only usable with the docker
                        executor.
  --disable-tor         Disables the Tor proxy.
  -s SEARCH_TERMS, --search-terms SEARCH_TERMS
                        This argument declares a list of search terms which get viewed.
  -c CHANNEL_URL, --channel-url CHANNEL_URL
                        Channel URL if not declared it uses Golden Gorillas channel URL as default.
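
For readers who want to see how a command line like the one above can be declared, here is a sketch that approximates the help output with `argparse`. It is a reconstruction from the help text, not the code in main.py. Several details are assumptions: that both Tor flags write to one `tor` destination which starts disabled, that the search terms arrive as one comma-separated string (handled by the hypothetical `split_terms`), and that the channel default is left as `None` because the real default URL is not given here.

```python
import argparse


def build_parser():
    """Approximate the documented CLI; not the parser used in main.py."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument(
        "-B", "--browser",
        choices=["docker", "chrome", "firefox"],
        default="docker",  # the setup section says Docker is the default
        help="Select the driver/browser to use for executing the script.",
    )
    # Assumption: both Tor flags share one destination and Tor starts off.
    parser.add_argument("-t", "--enable-tor", dest="tor", action="store_true",
                        help="Enable the Tor proxy on localhost:9050.")
    parser.add_argument("--disable-tor", dest="tor", action="store_false",
                        help="Disable the Tor proxy.")
    parser.set_defaults(tor=False)
    parser.add_argument("-s", "--search-terms", required=True,
                        help="Search terms to view.")
    # Assumption: None stands in for the default channel URL, which is not listed here.
    parser.add_argument("-c", "--channel-url", default=None,
                        help="Channel URL.")
    return parser


def split_terms(raw):
    """Assumption: terms are passed as one comma-separated string."""
    return [term.strip() for term in raw.split(",") if term.strip()]


if __name__ == "__main__":
    # Parse an explicit example argument list instead of the real command line.
    args = build_parser().parse_args(["-s", "lofi, python tutorial"])
    print(args.browser, args.tor, split_terms(args.search_terms), args.channel_url)
```

Marking `-s` with `required=True` is what makes argparse print it without brackets in the usage line, which matches the output shown above.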
