Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

Overview

TextBlob: Simplified Text Processing

Homepage: https://textblob.readthedocs.io/

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

from textblob import TextBlob

text = '''
The titular threat of The Blob has always struck me as the ultimate movie
monster: an insatiably hungry, amoeba-like mass able to penetrate
virtually any safeguard, capable of--as a doomed doctor chillingly
describes it--"assimilating flesh on contact.
Snide comparisons to gelatin be damned, it's a concept with the most
devastating of potential consequences, not unlike the grey goo scenario
proposed by technological theorists fearful of
artificial intelligence run rampant.
'''

blob = TextBlob(text)
blob.tags           # [('The', 'DT'), ('titular', 'JJ'),
                    #  ('threat', 'NN'), ('of', 'IN'), ...]

blob.noun_phrases   # WordList(['titular threat', 'blob',
                    #            'ultimate movie monster',
                    #            'amoeba-like mass', ...])

for sentence in blob.sentences:
    print(sentence.sentiment.polarity)
# 0.060
# -0.341

TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.

Features

  • Noun phrase extraction
  • Part-of-speech tagging
  • Sentiment analysis
  • Classification (Naive Bayes, Decision Tree)
  • Tokenization (splitting text into words and sentences)
  • Word and phrase frequencies
  • Parsing
  • n-grams
  • Word inflection (pluralization and singularization) and lemmatization
  • Spelling correction
  • Add new models or languages through extensions
  • WordNet integration

Get it now

$ pip install -U textblob
$ python -m textblob.download_corpora

Examples

See more examples at the Quickstart guide.

Documentation

Full documentation is available at https://textblob.readthedocs.io/.

Requirements

  • Python >= 2.7 or >= 3.5

License

MIT licensed. See the bundled LICENSE file for more details.

Comments
  • HTTP Error 503: Service Unavailable while using detect_language() and translate() from textblob

    python:3.5 textblob:0.15.1

    It seems this happened before and was fixed in #148.

    Detailed logs:

      File "/usr/local/lib/python3.5/site-packages/textblob/blob.py", line 562, in detect_language
        return self.translator.detect(self.raw)
      File "/usr/local/lib/python3.5/site-packages/textblob/translate.py", line 72, in detect
        response = self._request(url, host=host, type_=type_, data=data)
      File "/usr/local/lib/python3.5/site-packages/textblob/translate.py", line 92, in _request
        resp = request.urlopen(req)
      File "/usr/local/lib/python3.5/urllib/request.py", line 163, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/local/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/usr/local/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/local/lib/python3.5/urllib/request.py", line 504, in error
        result = self._call_chain(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 696, in http_error_302
        return self.parent.open(new, timeout=req.timeout)
      File "/usr/local/lib/python3.5/urllib/request.py", line 472, in open
        response = meth(req, response)
      File "/usr/local/lib/python3.5/urllib/request.py", line 582, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/local/lib/python3.5/urllib/request.py", line 510, in error
        return self._call_chain(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 444, in _call_chain
        result = func(*args)
      File "/usr/local/lib/python3.5/urllib/request.py", line 590, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)

    bug please-help 
    opened by craigchen1990 21
  • Language Detection Not Working (HTTP Error 503: Service Unavailable)

    from textblob import TextBlob
    txt = u"Test Language Detection"
    b = TextBlob(txt)
    b.detect_language()

    It is giving "HTTPError: HTTP Error 503: Service Unavailable"

    Python Version: 2.7.6
    TextBlob Version: 0.11.1
    OS: Ubuntu 14.04 LTS & CentOS 6.8

    opened by manurajhada 20
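
The 503 in these reports is raised by urllib while talking to a remote HTTP endpoint, so one possible mitigation is to retry the call a few times before giving up. The sketch below is only an illustration under that assumption: `call_with_retry` is a hypothetical helper, not part of TextBlob, and retrying will not help if the remote service rejects the requests permanently.

```python
import time
import urllib.error


def call_with_retry(func, retries=3, delay=1.0, retry_codes=(503,)):
    """Call func(); retry on the given HTTP status codes with a fixed delay.

    Hypothetical helper for illustration; not part of TextBlob.
    """
    for attempt in range(retries):
        try:
            return func()
        except urllib.error.HTTPError as err:
            # Re-raise for other status codes, or once the attempts are used up.
            if err.code not in retry_codes or attempt == retries - 1:
                raise
            time.sleep(delay)


# Usage sketch, assuming `blob` is a TextBlob instance:
# language = call_with_retry(lambda: blob.detect_language())
```

A fixed delay keeps the sketch short; an increasing delay between attempts is a common refinement.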
  • ModuleNotFoundError: No module named '_sqlite3'

    Hello,

    I'm migrating my script from my Mac to an AWS Linux instance. I upgraded the AWS instance to Python 3.6 before importing packages, including textblob. Now I get this error and cannot find where it's coming from. I'm not the greatest Python programmer, but I did have it running perfectly on my Mac before installing it on AWS.

    Here's the entire Traceback:

    Traceback (most recent call last):
      File "wikiparser20170801.py", line 8, in <module>
        from textblob import TextBlob
      File "/usr/local/lib/python3.6/site-packages/textblob/__init__.py", line 9, in <module>
        from .blob import TextBlob, Word, Sentence, Blobber, WordList
      File "/usr/local/lib/python3.6/site-packages/textblob/blob.py", line 28, in <module>
        import nltk
      File "/usr/local/lib/python3.6/site-packages/nltk/__init__.py", line 137, in <module>
        from nltk.stem import *
      File "/usr/local/lib/python3.6/site-packages/nltk/stem/__init__.py", line 29, in <module>
        from nltk.stem.snowball import SnowballStemmer
      File "/usr/local/lib/python3.6/site-packages/nltk/stem/snowball.py", line 26, in <module>
        from nltk.corpus import stopwords
      File "/usr/local/lib/python3.6/site-packages/nltk/corpus/__init__.py", line 66, in <module>
        from nltk.corpus.reader import *
      File "/usr/local/lib/python3.6/site-packages/nltk/corpus/reader/__init__.py", line 105, in <module>
        from nltk.corpus.reader.panlex_lite import *
      File "/usr/local/lib/python3.6/site-packages/nltk/corpus/reader/panlex_lite.py", line 15, in <module>
        import sqlite3
      File "/usr/local/lib/python3.6/sqlite3/__init__.py", line 23, in <module>
        from sqlite3.dbapi2 import *
      File "/usr/local/lib/python3.6/sqlite3/dbapi2.py", line 27, in <module>
        from _sqlite3 import *
    ModuleNotFoundError: No module named '_sqlite3'

    opened by arnieadm35 17
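
The traceback ends inside the standard library's own sqlite3 package, which suggests the interpreter lacks the `_sqlite3` extension module rather than anything being wrong in TextBlob. One common cause, assumed here and not confirmed by the report, is a Python built from source without the SQLite development headers installed. A quick check can be sketched as follows; `has_module` is a hypothetical helper name.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if this interpreter can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


# Show which interpreter is running and whether it ships the C extension
# that the standard-library sqlite3 package depends on.
print(sys.executable)
print("_sqlite3 available:", has_module("_sqlite3"))
```

If the check reports that the module is missing, the interpreter itself would need to be rebuilt or replaced; reinstalling TextBlob or NLTK would not change the result.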
  • correct() returns empty object

    I tried using spell checking, but the correct() method returns an empty object. The following shows the method call in a terminal:

    >>> from textblob import TextBlob
    >>> b = TextBlob("I havv goood speling!")
    >>> b.correct()
    TextBlob("")
    >>> print(b.correct())
    
    >>> 
    

    I couldn't find a fix to this. I'm running Python 2.7.6 on Linux.

    opened by shubhams 14
  • Translation not working - NotTranslated: Translation API returned the input string unchanged.

    Hi, The translation is not working. thanks in advance,

    In [1]: from textblob import TextBlob

    In [2]: en_blob = TextBlob(u'Simple is better than complex.')

    In [3]: en_blob.translate(to='es')

    NotTranslated                             Traceback (most recent call last)
    in <module>()
    ----> 1 en_blob.translate(to='es')

    /usr/local/lib/python2.7/dist-packages/textblob-0.11.0-py2.7.egg/textblob/blob.pyc in translate(self, from_lang, to)
        507             from_lang = self.translator.detect(self.string)
        508         return self.__class__(self.translator.translate(self.raw,
    --> 509                           from_lang=from_lang, to_lang=to))
        510
        511     def detect_language(self):

    /usr/local/lib/python2.7/dist-packages/textblob-0.11.0-py2.7.egg/textblob/translate.pyc in translate(self, source, from_lang, to_lang, host, type_)
         43             return self._get_translation_from_json5(json5)
         44         else:
    ---> 45             raise NotTranslated('Translation API returned the input string unchanged.')
         46
         47     def detect(self, source, host=None, type_=None):

    NotTranslated: Translation API returned the input string unchanged.

    opened by edgaralts 13
  • Add Greedy Average Perceptron POS tagger

    Hi,

    I'm preparing a pull request for you, for a new POS tagger. This is the first time I've tried to contribute to someone else's project, so probably there'll be some weird teething pain stuff. Also I spend all day writing research code, so maybe parts of my style are atrocious :p.

    The two main files are:

    https://github.com/syllog1sm/TextBlob/blob/feature/greedy_ap_tagger/text/taggers.py
    https://github.com/syllog1sm/TextBlob/blob/feature/greedy_ap_tagger/text/_perceptron.py

    I'm not quite done, but it's passing tests and its numbers are much better than the taggers you currently have hooks for:

    NLTKTagger:       94.0 / 3m52
    PatternTagger:    93.5 / 26s
    PerceptronTagger: 96.8 / 16s

    Accuracy figures refer to sections 22-24 of the Wall Street Journal, a common English evaluation. There's a table of some accuracies from the literature here: http://aclweb.org/aclwiki/index.php?title=POS_Tagging_(State_of_the_art) . Speeds refer to time taken to tag the 129,654 words of input, including initialisation, on my Macbook Air.

    If you check out that link, you'll see that the tagger's about 1% short of the pace for state-of-the-art accuracy. My Cython implementation has slightly better results, about 97.1, and it's a fair bit faster too. It's not very difficult to add some of the extra features to the Python implementation, or to improve its efficiency. Or we could hook in the Cython implementation, although that comes with much more baggage.

    I think it's nice having the tagger in ~200 lines of pure Python though, with no dependencies. It should be fairly language independent too --- I'll run some tests to see how it does.

    opened by syllog1sm 13
  • Error in translation

    I sometimes get a "URL not found" error:

      File "/usr/local/lib/python3.8/dist-packages/textblob/blob.py", line 546, in translate
        return self.__class__(self.translator.translate(self.raw,
      File "/usr/local/lib/python3.8/dist-packages/textblob/translate.py", line 54, in translate
        response = self._request(url, host=host, type_=type_, data=data)
      File "/usr/local/lib/python3.8/dist-packages/textblob/translate.py", line 92, in _request
        resp = request.urlopen(req)
      File "/usr/lib/python3.8/urllib/request.py", line 222, in urlopen
        return opener.open(url, data, timeout)
      File "/usr/lib/python3.8/urllib/request.py", line 531, in open
        response = meth(req, response)
      File "/usr/lib/python3.8/urllib/request.py", line 640, in http_response
        response = self.parent.error(
      File "/usr/lib/python3.8/urllib/request.py", line 569, in error
        return self._call_chain(*args)
      File "/usr/lib/python3.8/urllib/request.py", line 502, in _call_chain
        result = func(*args)
      File "/usr/lib/python3.8/urllib/request.py", line 649, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 404: Not Found

    opened by mannan291 12
  • NaiveBayesClassifier taking too long

    Hi, I have a small dataset of 1000 tweets that I've classified as pos/neg for training. When I tried to use it with NaiveBayesClassifier(), it took about 10-15 minutes to return a result. Is there a way to save the trained classifier, like a dump, and reuse it for further classifications?

    Thanks

    opened by canivel 12
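
One general-purpose way to keep a trained Python object between runs is the standard-library `pickle` module. The sketch below is not a TextBlob API: `save_model` and `load_model` are hypothetical helper names, and it assumes the classifier object is picklable and has already been trained (for example, by classifying one item) before it is saved.

```python
import pickle


def save_model(model, path):
    """Serialize a trained model object to a file."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Load a model saved by save_model(). Only load files you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Usage sketch, assuming `cl` is an already-trained classifier object:
# save_model(cl, "classifier.pkl")
# cl = load_model("classifier.pkl")
```

Unpickling executes code embedded in the file, so this approach is only appropriate for files produced by a trusted source.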
  • Deploying TextBlob on remote server

    Hi,

    I am trying to deploy TextBlob on a remote server hosted on Heroku. To my knowledge, Heroku uses pip freeze > requirements.txt to understand the dependencies and install them on the remote server.

    The code works perfectly on my local machine, but on the remote server it looks for the NLTK corpus and throws an exception.

    How do I install TextBlob dependencies on the remote server?

    I am using virtualenv

    opened by seekshreyas 11
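
A requirements file lists Python packages only, so the corpora fetched by `python -m textblob.download_corpora` still have to be downloaded on the server as a separate step. A minimal sketch of running that same command from Python with the server's own interpreter; the function names are hypothetical, and where to call this in a given platform's build process is left open.

```python
import subprocess
import sys


def corpora_download_command(python=sys.executable):
    """Build the corpora download command for the given interpreter."""
    return [python, "-m", "textblob.download_corpora"]


def download_corpora():
    """Run the download; raises CalledProcessError if the command fails."""
    subprocess.run(corpora_download_command(), check=True)


# Call download_corpora() once during the build or release step,
# after the packages from requirements.txt have been installed.
```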
  • since 0.7.1 having trouble with the package

    On both my mac and linux machines I have the same problem with 0.7.1

    >>> from text.blob import TextBlob
    Traceback (most recent call last):
      File "", line 1, in <module>
      File "text.py", line 5, in <module>
        from text.blob import TextBlob
    ImportError: No module named blob

    my sys.path does not contain the textblob module

    >>> import sys
    >>> for p in sys.path:
    ...     print p
    ...

    /Library/Python/2.7/site-packages/ipython-2.0.0_dev-py2.7.egg
    /Library/Python/2.7/site-packages/matplotlib-1.3.0-py2.7-macosx-10.8-intel.egg
    /Library/Python/2.7/site-packages/numpy-1.9.0.dev_fde3dee-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/pandas-0.12.0_485_g02612c3-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/pymc-2.3a-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/scikit_learn-0.14_git-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/scipy-0.14.0.dev_4938da3-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/statsmodels-0.6.0-py2.7-macosx-10.8-x86_64.egg
    /Library/Python/2.7/site-packages/readline-6.2.4.1-py2.7-macosx-10.7-intel.egg
    /Library/Python/2.7/site-packages/nose-1.3.0-py2.7.egg
    /Library/Python/2.7/site-packages/six-1.4.1-py2.7.egg
    /Library/Python/2.7/site-packages/pyparsing-1.5.7-py2.7.egg
    /Library/Python/2.7/site-packages/pytz-2013.7-py2.7.egg
    /Library/Python/2.7/site-packages/pyzmq-13.1.0-py2.7-macosx-10.6-intel.egg
    /Library/Python/2.7/site-packages/pika-0.9.13-py2.7.egg
    /Library/Python/2.7/site-packages/Jinja2-2.7.1-py2.7.egg
    /Library/Python/2.7/site-packages/MarkupSafe-0.18-py2.7-macosx-10.8-intel.egg
    /Library/Python/2.7/site-packages/patsy-0.2.1-py2.7.egg
    /Library/Python/2.7/site-packages/Pygments-1.6-py2.7.egg
    /Library/Python/2.7/site-packages/Sphinx-1.2b3-py2.7.egg
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python27.zip
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages
    /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-old
    /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload
    /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/PyObjC
    /Library/Python/2.7/site-packages

    despite it being there. I have uninstalled and reinstalled and tried all sorts of things:

    mbpdar:deaas daren$ ls /Library/Python/2.7/site-packages/te*
    /Library/Python/2.7/site-packages/text:
    /Library/Python/2.7/site-packages/textblob-0.7.1-py2.7.egg-info:

    I've verified that __init__.py doesn't have odd characters. If I change to the /Library/Python/2.7/site-packages/text folder, I am able to import:

    mbpdar:deaas daren$ cd /Library/Python/2.7/site-packages/text
    mbpdar:text daren$ python
    Python 2.7.2 (default, Oct 11 2012, 20:14:37)
    [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from text.blob import TextBlob

    I cannot figure out what changed that might cause this.

    Thanks in advance Daren

    opened by darenr 11
  • (after 0.5.1) - AttributeError: 'module' object has no attribute 'compat'

    Traceback (most recent call last):
      File "sentiment.py", line 1, in <module>
        from text.blob import TextBlob
      File "/usr/local/lib/python2.7/dist-packages/text/blob.py", line 149, in <module>
        @nltk.compat.python_2_unicode_compatible
    AttributeError: 'module' object has no attribute 'compat'

    bug 
    opened by ghost 11
  • Getting wrong value

    from textblob import TextBlob
    
    text = "Hi, I'm from Canada"
    text2 = TextBlob(text)
    Correct = text2.correct()
    print(Correct)
    

    Hi, when I run the above code I get the output:

    I, I"m from Canada

    which is wrong. Am I doing something wrong here? Please help.

    opened by Mank0o 0
  • Joining TextBlobs / Sentence

    Not sure if this is a bug or a feature request. I do enjoy that I can concatenate your objects like strings. Now I wanted to concatenate a list, but ran into the following:

    " ".join(storedSentences)
    

    Getting: TypeError: sequence item 0: expected str instance, Sentence found

    Unfortunately, the other way does not work either:

    Sentence(" ").join(storedSentences)
    

    TypeError: sequence item 0: expected str instance, Sentence found

    Maybe I am doing it wrong?

    PS: Great library! Especially the TextBlobs Are Like Python Strings! makes things really easy, thanks for implementing that :)

    opened by thomasf1 0
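
The TypeError quoted in this report comes from `str.join`, which accepts only genuine `str` instances. A possible workaround, sketched here with a hypothetical stand-in class instead of TextBlob's own `Sentence`, is to convert each item with `str()` before joining. This assumes that `str()` on the real objects returns their text.

```python
def join_as_text(items, sep=" "):
    """Join arbitrary objects by their str() form."""
    return sep.join(str(item) for item in items)


class FakeSentence:
    """Hypothetical stand-in for a string-like object that is not a str."""

    def __init__(self, raw):
        self.raw = raw

    def __str__(self):
        return self.raw


stored = [FakeSentence("Hello world."), FakeSentence("Goodbye.")]
print(join_as_text(stored))
# Hello world. Goodbye.
```

The result is a plain `str`; it would need to be wrapped again if a TextBlob object is wanted.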
  • Modify TextBlob sentiment prediction algorithm

    I am trying to work on a use-case which requires predicting the polarity but the result is not accurate. Our main focus is on the -ve inputs but it is unable to find it with confidence. I tried to go through the github code base and understand how exactly the sentiment is predicted by the algo but was unable to get a clear picture.

    So I have 3 questions:

    1. Can we modify and retrain the algorithm by passing more training data? If yes, how can we do that?

    2. TextBlob sentiment analysis is using Naive Bayes, but what I want to understand is which steps happen after passing the data to tb = TextBlob(data) and then calling tb.sentiment on it. I would really appreciate a detailed list of the steps, including preprocessing, etc.

    3. I am performing the following preprocessing steps before passing the data to TextBlob:

      • removing numbers, dates, months, urls, hashtags, mentions, etc
      • lowercasing,
      • removing punctuation marks
      • stop word removal and converting -ve words like don't to just not as do is a stop word, etc

      Can you suggest whether removing or adding any of the above steps will lead to greater confidence and accuracy in polarity prediction?

    opened by Deepankar-98 0
  • Errors occurred when using Naive Bayes for sentiment classification

    1. As the title says, when I use the Naive Bayes classifier for sentiment classification, the process is automatically killed by the system once the amount of data exceeds 10,000 items. There is no problem when the amount of data is small.

    2. How do you save a trained naïve Bayes model?

    opened by yaoysyao 0
  • Detecting language  / get HTTP Error 400?

    Hello, I am trying to detect the language of a word with this code:

    from textblob import TextBlob
    b = TextBlob("bonjour")
    print(b.detect_language())
    

    But unfortunately i get this error:

    $ python exmplTextBlob.py
    Traceback (most recent call last):
      File "C:\Users\Polzi\Documents\DEV\Python-Diverses\Textblob\exmplTextBlob.py", line 4, in <module>
        print(b.detect_language())
      File "C:\Users\Polzi\Documents\DEV\.venv\test\lib\site-packages\textblob\blob.py", line 597, in detect_language
        return self.translator.detect(self.raw)
      File "C:\Users\Polzi\Documents\DEV\.venv\test\lib\site-packages\textblob\translate.py", line 76, in detect
        response = self._request(url, host=host, type_=type_, data=data)
      File "C:\Users\Polzi\Documents\DEV\.venv\test\lib\site-packages\textblob\translate.py", line 96, in _request
        resp = request.urlopen(req)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
        return opener.open(url, data, timeout)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 523, in open
        response = meth(req, response)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 632, in http_response
        response = self.parent.error(
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 561, in error
        return self._call_chain(*args)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
        result = func(*args)
      File "C:\Users\Polzi\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 641, in http_error_default
        raise HTTPError(req.full_url, code, msg, hdrs, fp)
    urllib.error.HTTPError: HTTP Error 400: Bad Request
    
    

    Why is that? Am I doing anything wrong?

    opened by Rapid1898-code 2
  • Python 3.9 compatibility

    Hello, I want to thank you for the project. Checking PyPI, I found that it is listed as compatible up to Python 3.8; however, I am on 3.9 and it works properly. I would like to know how the PyPI listing can be updated. I would also like to ask whether TextBlob will support 3.11, which will be out soon. Thanks.

    opened by xasg 0
Owner
Steven Loria
Always a student, forever a junior developer