A bot that keeps per-user message statistics for a group and generates word clouds from the chat content

Overview

Current version

v2.2

Changelog

For questions or feedback, please join the group below

Telegram discussion and feedback group: click to join

Demo

(demo screenshot)

Requirements

Memory: at least 1 GB

Installation

Installing with Docker

Official Docker installation guide: click to visit
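
If you prefer a single command over the linked guide, Docker also publishes a convenience install script; this is a minimal sketch assuming a common Linux distribution with curl and systemd.

# Optional: install Docker via the official convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh

# Start the daemon and enable it at boot
systemctl enable --now docker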

cd /root

# Pull the Redis image
docker pull redis

# Create the entrypoint.sh entry script
echo '#! /bin/sh
cd /root/word_cloud_bot && python3 main.py >> output 2>&1 &
tail -f /dev/null' > /root/entrypoint.sh

# Download the Dockerfile
wget -O /root/Dockerfile https://github.com/devourbots/word_cloud_bot/raw/master/Dockerfile

# List all available time zones
timedatectl list-timezones

Find your time zone, for example:
Shanghai   Asia/Shanghai
New York   America/New_York

# Edit the Dockerfile
vi /root/Dockerfile

# On line 7, change the time zone to match your server; the original line is:
RUN ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
To switch to New York local time, for example, the line becomes:
RUN ln -s /usr/share/zoneinfo/America/New_York /etc/localtime

# On line 10, set your bot TOKEN
After modification:
RUN sed -i '1c TOKEN = "1749418611:AAGcpouQ4EWSDITLQXFozHjMgT_-MsVSmDM"' /root/word_cloud_bot/config.py


# Build the image from the Dockerfile
docker build . -t world_cloud_bot:latest

# Run the Redis image; do this step first
docker run -d -p 6379:6379 redis:latest
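
A safer alternative (my own suggestion, assuming the bot reaches Redis at 127.0.0.1, which is the typical setup when both run on the same host with the bot on --net=host): publish the Redis port on the loopback interface only, so it is never exposed to the outside in the first place.

# Alternative: bind the published Redis port to the loopback interface only
docker run -d -p 127.0.0.1:6379:6379 redis:latest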

# Important!
Please block external access to port 6379 on your server! If your hosting provider offers security groups (Alibaba Cloud, Tencent Cloud, AWS, etc.), you can close port 6379 from the console.
If your provider does not support custom security groups, look up how to close the port with your distribution's firewall; a way to verify this is given below.
Do not take any chances with this!
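
For reference, a rough firewall sketch (my own addition, worth double-checking on your setup): ports published with docker -p are handled in Docker's own iptables chains, so a plain ufw or INPUT-chain rule may not actually block them; a rule in the DOCKER-USER chain, or the loopback-only binding shown above, is more reliable.

# Drop external traffic to the published Redis port via Docker's DOCKER-USER chain
iptables -I DOCKER-USER -p tcp --dport 6379 -j DROP

# This rule is not persistent across reboots; save it with your distribution's tooling
# (for example, iptables-persistent on Debian/Ubuntu)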

# Run the bot; do this step after Redis is up
docker run -d --net=host world_cloud_bot:latest

Port check tool; make sure port 6379 shows as closed
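
If you would rather verify from the command line than with a web-based port checker, run one of these from a different machine (a sketch; your.server.ip is a placeholder, and nmap or nc needs to be installed there):

# The port should report as closed or filtered
nmap -p 6379 your.server.ip

# Or with netcat; a refused or timed-out connection is what you want
nc -zv -w 5 your.server.ip 6379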

Usage

Use the /start command to test connectivity between the bot and the Redis database

Use the /rank command to trigger a word cloud on demand; the per-group limit on manual triggers per hour can be set in config.py

Add the bot to your group and set it as an administrator (due to Bot API limitations, the bot can only receive all users' ordinary chat messages after it has been granted admin rights; it needs no other permissions, so you can turn them all off)

All chat content is cleared automatically every day; it is used only for local word segmentation and for nothing else

Restricting the bot to your own groups

For how to edit files inside a Docker container, please search Google yourself

If you do not want other people to use your bot, change EXCLUSIVE_MODE = 0 to EXCLUSIVE_MODE = 1 in config.py (one way to make this edit inside the container is sketched below)
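
One possible way to do the in-container edit (a sketch, not the only way; <container_id> is a placeholder taken from docker ps, and the file is edited on the host so no editor needs to exist inside the image):

# Find the bot container's ID
docker ps

# Copy the config out, edit it on the host, copy it back, then restart
docker cp <container_id>:/root/word_cloud_bot/config.py ./config.py
vi ./config.py
docker cp ./config.py <container_id>:/root/word_cloud_bot/config.py
docker restart <container_id>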

Edit /root/word_cloud_bot/func.py and, around line 94, add your own group IDs to the list. Do not change the EXCLUSIVE_MODE = 1 here; be careful to distinguish it from the setting in config.py!

For example, if my two group IDs are -127892174935 and -471892571924,

then after the change it reads:

if EXCLUSIVE_MODE == 1 and chat_id not in ["-127892174935", "-471892571924"]:
    print(chat_id + " 为未认证群组,取消入库")
    return

Making the /rank command available to regular users

Edit /root/word_cloud_bot/config.py and change RANK_COMMAND_MODE = 1 to RANK_COMMAND_MODE = 0
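
A non-interactive variant of the same edit (again a sketch; <container_id> is a placeholder, and sed is assumed to be present in the image):

# Flip the flag inside the running container, then restart it
docker exec <container_id> sed -i 's/RANK_COMMAND_MODE = 1/RANK_COMMAND_MODE = 0/' /root/word_cloud_bot/config.py
docker restart <container_id>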

Push frequency

By default, statistics reports are pushed three times a day at 11:00, 18:00, and 23:30 local time, and the day's statistics are cleared at 23:59. If you want more frequent pushes, edit /root/word_cloud_bot/main.py and add entries following the existing format; the related Docker operations are not repeated here.
