Takes a string and runs it through a series of languages in Google Translate a requested number of times, returning nonsense.

Overview

PythonTextObfuscator


Requirements:

Python 3

For the Selenium Obfuscator:

    -Selenium
    
    -Firefox
    
    -Geckodriver

In the Selenium Obfuscator:

-The major benefit is that you can translate Excel documents. The downside is that after about 10 document translations, Google blocks your IP address for a while.

-Translation is generally slower and more limited with Selenium, because a browser tab is used to scrape the data. Beware of RAM usage as well.

-May no longer be supported in the future due to these drawbacks.

In the Urllib Obfuscator:

-Translation is generally faster and uses very few resources, as only HTML is downloaded through a request. Multiprocessing also allows simultaneous requests and can be used to its full extent without worrying about RAM usage.

-The text can be split in one of two ways before translation:

    -Split by length is faster and uses fewer requests (better for longer texts).

    -Split by newline is slower and uses more requests, but adds much more translation variety.

-Reminder: Since Google has a URL request limit, you'll need to switch VPN locations when the request limit is hit.

    -Don't worry too much though, as it takes quite a lot of requests to get to that point, and the block only lasts for around an hour.
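
The trade-off between the two split options can be illustrated with a minimal sketch. The function names `split_by_length` and `split_by_newline` and the 20-character limit are assumptions made for this example, not the project's actual code; each resulting piece is assumed to cost one request.

```python
def split_by_length(text: str, max_chars: int) -> list[str]:
    """Cut text into consecutive pieces of at most max_chars characters."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def split_by_newline(text: str) -> list[str]:
    """Return one piece per line, so every line can be translated on its own."""
    return text.split("\n")


sample = "first line\nsecond line\nthird line"
by_length = split_by_length(sample, 20)  # 20 is an arbitrary limit for the demo
by_newline = split_by_newline(sample)

# Fewer pieces means fewer requests; more pieces means more variety.
print(len(by_length), "pieces when splitting by length")
print(len(by_newline), "pieces when splitting by newline")
```

With a realistic limit of a few thousand characters, splitting by length produces far fewer pieces than splitting by newline, which is why it suits longer texts.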

Comments
  • Attempt to decode JSON with unexpected mimetype: text/plain

    I'm not sure what's causing this, as the last time I tried this release, this issue was not present. If it's accessing content server-side, then it might be that the server has had a config change resulting in it returning a different mimetype?

    I get the error message below consistently in the console, with %2E being added to the end of the URL each time. It does seem like some translation does happen; in this case, I inputted "Test", and the URL ended with "Hlola".

    https://translate.alefvanoon.xyz/api/v1/zu/mi/Hlola%2E 0, message='Attempt to decode JSON with unexpected mimetype: text/plain; charset=utf-8', url=URL('https://translate.alefvanoon.xyz/api/v1/zu/mi/Hlola')

    From what I've gathered looking online, the issue lies in either line 13, line 469, or both.

    return (await response.json())['translation'].replace('/','⁄')

    text = (await response.json())['translation'].replace('/','⁄')

    Some of the solutions online referred to adding "content_type=None" or "content_type='text/plain'" into the brackets after "json", but this only seemed to cause further issues for me.

    opened by UltraHylia 2
  • Program Freezes Up and Looping Error

    When you have Chinese (Simplified) and/or Chinese (Traditional) enabled in the language selector, the program can freeze and an error loops in the console. It happens no matter what other languages are enabled.

    https://user-images.githubusercontent.com/60769253/197659506-38871035-e311-4710-9eb9-ac2d7387841f.mp4

    opened by DerpTaco99921 0
Releases(v0.4)
  • v0.4(Feb 2, 2022)

    Rebuilt from the ground up with a new GUI and translation method.

    Changes:

    -Improved GUI.

    -Translations are retrieved from Lingva, a front-end to Google Translate, which avoids being blocked for making too many requests.

    -Translations are done in an asynchronous function using aiohttp instead of a process pool, which is optimal for large bulk translations.

    -Removed Selenium obfuscation.
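
The asynchronous approach can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the host and the `/api/v1/<source>/<target>/<text>` URL shape are taken from the URL quoted in the issue report, `fake_translate` is a hypothetical stand-in for the aiohttp request so the sketch runs offline, and starting from `"en"` is an arbitrary choice.

```python
import asyncio
import random
from urllib.parse import quote

# Host copied from the issue report; any Lingva instance could be substituted.
LINGVA_HOST = "https://translate.alefvanoon.xyz"


def lingva_url(source: str, target: str, text: str) -> str:
    """Build a request URL in the /api/v1/<source>/<target>/<text> shape."""
    return f"{LINGVA_HOST}/api/v1/{source}/{target}/{quote(text, safe='')}"


async def fake_translate(source: str, target: str, text: str) -> str:
    """Hypothetical stand-in for the HTTP call: tags the text instead of translating."""
    await asyncio.sleep(0)  # yield control, as a real network request would
    return f"{text}>{target}"


async def obfuscate(text, languages, hops, translate=fake_translate, rng=None):
    """Pass text through `hops` randomly chosen languages, one after another."""
    rng = rng or random.Random()
    source = "en"  # assumed starting language
    for _ in range(hops):
        target = rng.choice([lang for lang in languages if lang != source])
        text = await translate(source, target, text)
        source = target
    return text


async def obfuscate_all(pieces, languages, hops):
    """Run every piece concurrently; gather returns results in input order."""
    return await asyncio.gather(*(obfuscate(p, languages, hops) for p in pieces))


results = asyncio.run(obfuscate_all(["Hello", "World"], ["zu", "mi", "fr"], 3))
print(results)
```

Because each piece spends most of its time waiting on the network, a single event loop can keep many requests in flight without the memory cost of one process per request.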

    Additions:

    -Importing and saving text files.

    -Language Selector to activate or deactivate any individual language.

    -Language setting for the result.

    -Three different split methods:

        -Initial
            -Text is split by length before being passed into the obfuscate function.
            -Faster as fewer requests are made.
            -Different languages for each piece.
            -Tabs not preserved.

        -Continuous
            -Text is split by length inside the obfuscate function.
            -Faster as fewer requests are made.
            -Same languages for each piece.
            -Tabs not preserved.

        -Newline
            -Text is split by newlines and tabs.
            -Slower as more requests are made.
            -Every single line is translated with different languages.
            -Tabs preserved.

    -Translation Generator which creates a .csv file containing multiple translations of the same text:

        -Repeat mode obfuscates the original text each time, adding the result in each new column.

        -Continue mode obfuscates the result of the previous obfuscation each time, adding the result in each new column.
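
The difference between the Translation Generator's Repeat and Continue modes can be shown with a small sketch. Everything here is hypothetical: `fake_obfuscate` simply reverses the text so the effect is visible without any translation, `generate_row` is a name invented for the example, and placing the original text in the first column is an assumption about the .csv layout.

```python
import csv
import io


def fake_obfuscate(text: str) -> str:
    """Hypothetical stand-in for one obfuscation run: reverses the text."""
    return text[::-1]


def generate_row(text, columns, mode, obfuscate=fake_obfuscate):
    """Return the original text followed by `columns` obfuscation results."""
    row = [text]
    current = text
    for _ in range(columns):
        # Repeat always restarts from the original; Continue feeds on the last result.
        source = text if mode == "repeat" else current
        current = obfuscate(source)
        row.append(current)
    return row


repeat_row = generate_row("abc", 3, "repeat")
continue_row = generate_row("abc", 3, "continue")

buffer = io.StringIO()  # stands in for the .csv file on disk
writer = csv.writer(buffer)
writer.writerow(repeat_row)
writer.writerow(continue_row)
print(buffer.getvalue())
```

With a real obfuscation function, Repeat mode yields independent variations of the same text, while Continue mode drifts further from the original with every column.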

    Source code(tar.gz)
    Source code(zip)
    Python.Text.Obfuscator.v0.4.zip(15.75 KB)
  • v0.3.1c-r2(Dec 23, 2021)

  • v0.3.1c(Dec 23, 2021)

    Newlines no longer get messed up in the Urllib Obfuscator.

    Added a choice to split by length or by newlines:

    -Split by length is faster and uses fewer requests (better for longer texts).

    -Split by newline is slower and uses more requests, but adds much more translation variety.

    Reminder: Since Google has a URL request limit, you'll need to switch VPN locations when the request limit is hit.

    Source code(tar.gz)
    Source code(zip)
    Python.Text.Obfuscator.v0.3.1c.zip(51.63 KB)
  • v0.3.1b(Dec 23, 2021)

  • v0.3.1a(Dec 23, 2021)

  • v0.3(Dec 23, 2021)

    I made massive improvements to the speed of the obfuscation thanks to learning about urllib.

    For example, I translated the same ~2300-character string of text 10 times in the old and new versions; the old one took 38.8 seconds while the new one took only 6.8 seconds.

    In addition, the number of characters it can handle is far higher, as it doesn't require Firefox tabs to be open and eating up RAM.

    As a test, I translated the entire Among Us Wikipedia page 50 times (with a character count of over 60 thousand!), and it took only 114 seconds to finish. Using the old obfuscator I wouldn't be able to translate more than half that amount, and it would take ages to complete (10 minutes or more).

    Unfortunately, the Excel Obfuscator is removed in this version until I can figure out how to get it to work with urllib. If I can't, I'll probably add it back with Selenium.

    At least if you couldn't get Selenium to work on your computer for the previous versions, you don't have to worry about installing it for this one.

    Source code(tar.gz)
    Source code(zip)
    Python.Text.Obfuscator.v0.3.zip(5.73 KB)
  • v0.2.2(Dec 23, 2021)

  • v0.2.1b(Dec 23, 2021)

  • v0.2.1a(Dec 23, 2021)

    Fixed TimeoutExceptions for string translation (textbox input) obfuscation. You can now do as many translations as you want without worrying about encountering an error. The same goes for the number of characters (as long as your PC can handle it, of course). Excel translations remain unchanged, since I can't do anything about Google's document translation limit, so just switch VPN locations as usual after 10 translations with the Excel Obfuscator.

    Source code(tar.gz)
    Source code(zip)
    Python.Text.Obfuscator.v0.2.1.zip(5.88 KB)
  • v0.2(Dec 23, 2021)

  • v0.1b(Dec 23, 2021)

  • v0.1a(Dec 23, 2021)
