DocuMiner
A production-ready pipeline for text mining and subject indexing
Want to Contribute?
More code and documentation coming soon.
Authors
Open Source Club
A production-ready pipeline for text mining and subject indexing
More code and documentation coming soon.
Open Source Club
colormate Python script text formatting package What is colormate? colormate is a python library that lets you add text formatting to your scripts, it
google-translate Build a translation program similar to Google Translate with Python programming language and QT library Different parts of the progra
VenSipher VenSipher is a free medium through which text can be enciphered. It can convert any text into an unrecognizable secret text that can only be
Summarize Written by: Ling Li Ya This is a text summarizing tool written in Python. User Guide Some things to note: The application is accessible here
FuzzyWuzzy Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.
A query extract python package
Python Slugify A Python slugify application that handles unicode. Overview Best attempt to create slugs from unicode strings while keeping it DRY. Not
The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.
Username reconnaisance tool that checks the availability of a specified username on over 200 websites. Installation & Usage Clone from Github: $ git c
flomo-word-cloud 从flomo导出的笔记中生成词云 如何使用? 将本项目克隆到你的电脑上,使用如下的命令,安装所需python库 pip install -r requirements.txt 在项目里新建一个file文件夹,把所有从flomo导出的html文件放入其中 运行main
"Complexity" of Flags of the countries of the world Flags (png) from: https://flagcdn.com/w2560.zip https://flagpedia.net/download/images run: chmod +
PLY (Python Lex-Yacc) Copyright (C) 2001-2020 David M. Beazley (Dabeaz LLC) All rights reserved. Redistribution and use in source and binary forms, wi
deepl4subtitle Deeplを使って字幕ファイル(.sbv)を翻訳します。タイムスタンプも含めて出力しますが、翻訳時はタイムスタンプは文の一部とは切り離されるので、.sbvファイルをそのまま翻訳機に突っ込むよりも高精度な翻訳ができるはずです。 つかいかた 入力する.sbvファイルの前処理
BaseCrack Decoder For Base Encoding Schemes BaseCrack is a tool written in Python that can decode all alphanumeric base encoding schemes. This tool ca
kohtaaminen Meeting, rendezvous, confluence (Finnish kohtaaminen) mark up, down, and up again. Given a zip file containing a tree of html and media fi
MnSRank Maiden & Spell community player ranking based on tournament data. Why? 2021 just ended and this seemed like a cool idea. Elo doesn't work well
Indonesian Text Summarization Using FastAPI This is REST-API for Indonesian Text Summarization using Non-Negative Matrix Factorization for the algorit
python-sqlparse - Parse SQL statements sqlparse is a non-validating SQL parser for Python. It provides support for parsing, splitting and formatting S
snowflake-search-terms-in-ddl-views Search for terms(word / table / field name or any) under Snowflake schema names Version : 1.0v How to use ? Run th
cdvpp Chilean Digital Vaccination Pass Parser (CDVPP) parses digital vaccination passes from PDF files Reads a Digital Vaccination Pass PDF file as in