Automatically detect changes made to the official Telegram sites.

Overview

🕷 Telegram Web Crawler

This project is developed to automatically detect changes made to the official Telegram sites. This is necessary for anticipating future updates and other things (new vacancies, API updates, etc).

Name Commits Status
Site updates tracker Commits Fetch new content of tracked links to files
Site links tracker Commits Generate or update list of tracked links
  • passing – new changes
  • failing – no changes

You should to subscribe to channel with alerts to stay updated. Copy of Telegram websites stored here.

GitHub pretty diff

How it works

  1. Link crawling runs as often as possible. Starts crawling from the home page of the site. Detects relative and absolute sub links and recursively repeats the operation. Writes a list of unique links for future content comparison. Additionally, there is the ability to add links by hand to help the script find more hidden (links to which no one refers) links. To manage exceptions, there is a system of rules for the link crawler.

  2. Content crawling is launched as often as possible and uses the existing list of links collected in step 1. Going through the base it gets contains and builds a system of subfolders and files. Removes all dynamic content from files.

  3. Using of GitHub Actions. Works without own servers. You can just fork this repository and own tracker system by yourself. Workflows launch scripts and commit changes. All file changes are tracked by the GIT and beautifully displayed on the GitHub. GitHub Actions should be built correctly only if there are changes on the Telegram website. Otherwise, the workflow should fail. If build was successful, we can send notifications to Telegram channel and so on.

FAQ

Q: How often is "as often as possible"?

A: TLTR: content update action runs every ~10 minutes. More info:

Q: Why there is 2 separated crawl scripts instead of one?

A: Because the previous idea was to update tracked links once at hour. It was so comfortably to use separated scripts and workflows. After Telegram 7.7 update, I realised that find new blog posts so slowly is bad idea.

Q: Why alert for sending alerts have while loop?

A: Because GitHub API doesn't return information about commit immediately after push to repository. Therefore, script are waiting for information to appear...

Q: Why are you using GitHab Personal Access Token in action/checkout workflow`s step?

A: To have ability to trigger other workflows by on push trigger. More info:

Q: Why are you using GitHab PAT in make_and_send_alert.py?

A: To increase limits of GitHub API.

TODO list

  • add storing history of content using hashes;
  • add storing hashes of image, svg, video.

Example of link crawler rules configuration

CRAWL_RULES = {
    # every rule is regex
    # empty string means match any url
    # allow rules with higher priority than deny
    'translations.telegram.org': {
        'allow': {
            r'^[^/]*$',  # root
            r'org/[^/]*/$',  # 1 lvl sub
            r'/en/[a-z_]+/$'  # 1 lvl after /en/
        },
        'deny': {
            '',  # all
        }
    },
    'bugs.telegram.org': {
        'deny': {
            '',    # deny all sub domain
        },
    },
}

Current hidden urls list

HIDDEN_URLS = {
    # 'corefork.telegram.org', # disabled

    'telegram.org/privacy/gmailbot',
    'telegram.org/tos',
    'telegram.org/tour',
    'telegram.org/evolution',

    'desktop.telegram.org/changelog',
}

License

Licensed under the MIT License.

Owner
Il'ya
Telegram: https://t.me/MarshalX
Il'ya
To dynamically change the split direction in I3/Sway so as to split new windows automatically based on the width and height of the focused window

To dynamically change the split direction in I3/Sway so as to split new windows automatically based on the width and height of the focused window Insp

Ritin George 6 Mar 11, 2022
Instagram GiftShop Scam Killer

Instagram GiftShop Scam Killer A basic tool for Windows which kills acess to any giftshop scam from Instagram. Protect your Instagram account from the

1 Mar 31, 2022
The Simple Google Colab Notebook to Download Files from Direct Link to Google Drive with custom name and bulk link support.

Direct Link to Google Drive (Advanced! 🔥 ) The Most Advanced yet Simple Google Colab Notebook to Download Files from Direct Link to Google Drive. 🆕

Dr.Caduceus 14 Jul 26, 2022
Github repository started notify 💕

Github repository started notify 💕

4 Aug 06, 2022
A CLI tool to transfer, sync, and backup playlists on music streaming services

unitunes A command-line interface tool to manage playlists across music streaming services. Introduction unitunes manages playlists across streaming s

Victor Tao 50 Jan 07, 2023
A Discord webhook spammer made in Python.

A Python made Discord webhook spammer usually used for token loggers to spam them/delete them original by cattyn I only made it so u can change the avatar to whatever u want instead of it being hardc

notperry1234567890 15 Dec 15, 2021
Infrastructure template and Jupyter notebooks for running RoseTTAFold on AWS Batch.

AWS RoseTTAFold Infrastructure template and Jupyter notebooks for running RoseTTAFold on AWS Batch. Overview Proteins are large biomolecules that play

AWS Samples 20 May 10, 2022
A Python module for communicating with the Twilio API and generating TwiML.

twilio-python The default branch name for this repository has been changed to main as of 07/27/2020. Documentation The documentation for the Twilio AP

Twilio 1.6k Jan 05, 2023
A battle-tested Django 2.1 project template with configurations for AWS, Heroku, App Engine, and Docker.

For information on how to use this project template, check out the wiki. {{ project_name }} Table of Contents Requirements Local Setup Local Developme

Lionheart Software 64 Jun 15, 2022
One destination for all the developer's learning resources.

DevResources One destination for all the developer's learning resources. Find all of your learning resources under one roof and add your own. Live ✨ Y

Gaurav Sharma 33 Oct 21, 2022
Eva Maria Telegram Bot

Eva Maria Bot Features Auto Filter Manuel Filter IMDB Admin Commands Broadcast Index IMDB search Inline Search Random pics ids and User info Stats, Us

Eva Maria TG 477 Dec 31, 2022
Telegram bot that sends new offers from otomoto.pl

Telegram bot that sends new offers under certain filters from otomoto.pl How to use this bot? Install requirements with pip install -r requirements.tx

Mikhail Zanka 1 Feb 14, 2022
This is a TG Video Compress BoT. Product by BINARY Tech

🌀 Video Compressor Bot Product by BINARY Tech Deploy to Heroku The Hard Way virtualenv -p python3 VENV . ./VENV/bin/activate pip install -r requireme

1 Jan 04, 2022
Please Do Not Throw Sausage Pizza Away - Side Scrolling Up The OSI Stack

Please Do Not Throw Sausage Pizza Away - Side Scrolling Up The OSI Stack

John Capobianco 2 Jan 25, 2022
Python SDK for interacting with the Frame.io API.

python-frameio-client Frame.io Frame.io is a cloud-based collaboration hub that allows video professionals to share files, comment on clips real-time,

Frame.io 37 Dec 21, 2022
List of twitch bots n bigots

This is a collection of bot account names NamelistMASTER contains all the names we reccomend you ban in your channel Sometimes people get on that list

62 Sep 05, 2021
AutomaTik is an automation system for MikroTik devices with simplicity and security in mind.

AutomaTik Installation AutomaTik is an automation system for MikroTik devices with simplicity and security in mind. Winbox is the main tool for MikroT

Osman Kazdal 4 Dec 05, 2022
Watches your earnings on EarnApp and notifies you when you earned balance or received an payout.

EarnApp-Earning-Monitor Watches your earnings on EarnApp and notifies you when you earned balance or received an payout. Installation Install Python3

Yariya 21 Oct 17, 2022
Asynchronous Guilded API wrapper for Python

Welcome to guilded.py, a discord.py-esque asynchronous Python wrapper for the Guilded API. If you know discord.py, you know guilded.py. Documentation

shay 115 Dec 30, 2022
1.本项目采用Python Flask框架开发提供(应用管理,实例管理,Ansible管理,LDAP管理等相关功能)

op-devops-api 1.本项目采用Python Flask框架开发提供(应用管理,实例管理,Ansible管理,LDAP管理等相关功能) 后端项目配套前端项目为:op-devops-ui jenkinsManager 一.插件python-jenkins bug修复 (1).插件版本 pyt

3 Nov 12, 2021