An automated, headless YouTube Watcher and Scraper

Overview

Logo

MIT License Code style: black
Platform: YouTube Uses Docker Automation supporting Firefox and Chrome
Uses Tor. And an emoji of an onion, get it? Firefox supported Chrome supported

An automated, headless YouTube Watcher and Scraper

Authors: Christian C., Moritz M., Luca S.
Related Projects: YouTube Uploader, Twitch Compilation Creator, Neural Networks


About

Searches YouTube, queries recommended videos and watches them. All fully automated and anonymised through the Tor network. The project consists of two independently usable components, the YouTube automation written in Python and the dockerized Tor Browser.

This project is for educational purposes only. Using Tor to watch YouTube videos is strongly discouraged, especially for Botting purposes. Please inform yourself about the Tor network, before using it extensively.

Setup

YouTube Automation

This project requires Poetry to install the required dependencies. Check out this link to install Poetry on your operating system.

Make sure you have installed Python 3.8! Otherwise Step 3 will let you know that you have no compatible Python version installed.

  1. Clone/Download this repository
  2. Navigate to the root of the repository
  3. Run poetry install to create a virtual environment with Poetry
  4. Either run the dockerized Browser with docker-compose up, install geckodriver for a local Firefox or ChromeDriver for Chromium. Ensure that geckodriver/ChromeDriver are in a location in your $PATH.
  5. Run poetry run python main.py to run the program. Alternatively you can run poetry shell followed by python main.py. By default this connects to the dockerized Browser. To automate a different Browser use the --browser [chrome/firefox] command line option.

Dockerized Tor Browser

Running the Container requires Docker and docker-compose.

  1. Clone/Download this repository

  2. Navigate to the root of the repository

  3. Run docker-compose up. The image will be built automatically before startup.

  4. Selenium can now connect to the browser via port 4444. In Python the connection can be established with the following command.

    driver = webdriver.Remote(
        command_executor="http://127.0.0.1:4444/wd/hub",
        desired_capabilities=options,
    )

    See main.py for more information.

Run Parameters

All of these parameters are optional and a default value will be used if they are not defined. You can also get these definitions by running main.py --help

usage: main.py [-h] [-B {docker,chrome,firefox}] [-t] [--disable-tor] -s SEARCH_TERMS [-c CHANNEL_URL]

optional arguments:
  -h, --help            show this help message and exit
  -B {docker,chrome,firefox}, --browser {docker,chrome,firefox}
                        Select the driver/browser to use for executing the script.
  -t, --enable-tor      Enables Tor usage by connecting to a proxy on localhost:9050. Only usable with the docker
                        executor.
  --disable-tor         Disables the Tor proxy.
  -s SEARCH_TERMS, --search-terms SEARCH_TERMS
                        This argument declares a list of search terms which get viewed.
  -c CHANNEL_URL, --channel-url CHANNEL_URL
                        Channel URL if not declared it uses Golden Gorillas channel URL as default.
Google Developer Profile Badge Scraper

Google Developer Profile Badge Scraper It is a Google Developer Profile Web Scraper which scrapes for specific badges in a user's Google Developer Pro

Hemant Sachdeva 2 Feb 22, 2022
Scrapes Every Email Address of Every Society in Every University

society-email-scrape Site Live at https://kcsoc.github.io/society-email-scrape/ How to automatically generate new data Go to unis.yml Add your uni Cre

Krishna Consciousness Society 18 Dec 14, 2022
UdemyBot - A Simple Udemy Free Courses Scrapper

UdemyBot - A Simple Udemy Free Courses Scrapper

Gautam Kumar 112 Nov 12, 2022
Poolbooru gelscraper - a simple python script for scraping images off gelbooru pools.

poolbooru_gelscraper a simple python script for scraping images off gelbooru pools. modules required:requests_html, and os by default saves files with

savantshuia 1 Jan 02, 2022
Pelican plugin that adds site search capability

Search: A Plugin for Pelican This plugin generates an index for searching content on a Pelican-powered site. Why would you want this? Static sites are

22 Nov 21, 2022
Audio media crawler for lbry.

Audio media crawler for lbry. Requirements Python 3.8 Poetry 1.1.7 Elasticsearch 7.14.0 Lbry-sdk 0.99.0 Development This project uses poetry as a depe

Hound.fm 4 Dec 03, 2022
Scrapping Connections' info on Linkedin

Scrapping Connections' info on Linkedin

MohammadReza Ardestani 1 Feb 11, 2022
Nekopoi scraper using python3

Features Scrap from url Todo [+] Search by genre [+] Search by query [+] Scrap from homepage Example # Hentai Scraper from nekopoi import Hent

MhankBarBar 9 Apr 06, 2022
Web Content Retrieval for Humans™

Lassie Lassie is a Python library for retrieving basic content from websites. Usage import lassie lassie.fetch('http://www.youtube.com/watch?v

Mike Helmick 570 Dec 19, 2022
爱奇艺会员,腾讯视频,哔哩哔哩,百度,各类签到

My-Actions 个人收集并适配Github Actions的各类签到大杂烩 不要fork了 ⭐️ star就行 使用方式 新建仓库并同步代码 点击Settings - Secrets - 点击绿色按钮 (如无绿色按钮说明已激活。直接到下一步。) 新增 new secret 并设置 Secr

280 Dec 30, 2022
A simple Discord scraper for discord bots

A simple Discord scraper for discord bots. That includes sending an guild members ids to an file, Mass inviter for joining servers your bot is in and Fetching all the servers of the bot (w/MemberCoun

3zg 1 Jan 06, 2022
Scrap-mtg-top-8 - A top 8 mtg scraper using python

Scrap-mtg-top-8 - A top 8 mtg scraper using python

1 Jan 24, 2022
A tool for scraping and organizing data from NewsBank API searches

nbscraper Overview This simple tool automates the process of copying, pasting, and organizing data from NewsBank API searches. Curerntly, nbscrape onl

0 Jun 17, 2021
Linkedin webscraping - Linkedin web scraping with python

linkedin_webscraping This is the first step of a full project called "LinkedIn J

Pedro Dib 4 Apr 24, 2022
抖音批量下载用户所有无水印视频

Douyincrawler 抖音批量下载用户所有无水印视频 Run 安装python3, 安装依赖

28 Dec 08, 2022
Explore scraping with BeautifulSoup!

beautifulsoup-scrape Explore scraping with BeautifulSoup! Part One: Start from Shakespeare As my professor is a poet (yes, and he teaches me data and

Chuqin 2 Oct 05, 2022
京东抢茅台,秒杀成功很多次讨论,天猫抢购,赚钱交流等。

Jd_Seckill 特别声明: 请添加个人微信:19972009719 进群交流讨论 目前群里很多人抢到【扫描微信添加群就好,满200关闭群,有喜欢薅信用卡羊毛的也可以找我交流】 本仓库发布的jd_seckill项目中涉及的任何脚本,仅用于测试和学习研究,禁止用于商业用途,不能保证其合法性,准确性

50 Jan 05, 2023
Scrapy-soccer-games - Scraping information about soccer games from a few websites

scrapy-soccer-games Esse projeto tem por finalidade pegar informação de tabela d

Caio Alves 2 Jul 20, 2022
A high-level distributed crawling framework.

Cola: high-level distributed crawling framework Overview Cola is a high-level distributed crawling framework, used to crawl pages and extract structur

Xuye (Chris) Qin 1.5k Jan 04, 2023
This repo has the source code for the crawler and data crawled from auto-data.net

This repo contains the source code for crawler and crawled data of cars specifications from autodata. The data has roughly 45k cars

Tô Đức Anh 5 Nov 22, 2022