A framework for detecting, highlighting and correcting grammatical errors in natural language text.

Overview


Gramformer

Human- and machine-generated text often suffers from grammatical and/or typographical errors: spelling, punctuation, grammar or word-choice mistakes. Gramformer is a library that exposes 3 separate interfaces to a family of algorithms to detect, highlight and correct grammar errors. To make sure the recommended corrections and highlights are of high quality, it comes with a quality estimator. You can use Gramformer in one or more of the areas mentioned under the "Usecases for Gramformer" section below, or for any other use case you see fit. Gramformer stands on the shoulders of giants: it combines some of the top-notch research in grammar correction. Note: It works at the sentence level and has been trained on 128-length sentences, so it is not (yet) suitable for long prose or paragraphs (stay tuned for upcoming releases).


Usecases for Gramformer

Area 1: Post-processing machine generated text

Machine language generation is becoming mainstream, and so will post-processing of machine-generated text.

  • Conditioned text generation output (Text2Text generation).
    • NMT: Machine-translated output.
    • ASR or STT: Speech-to-text output.
    • HTR: Handwritten text recognition output.
    • Paraphrase generation output.
  • Controlled text generation output (text generation with PPLM) [TBD].
  • Free-form text generation output (text generation) [TBD].

Area 2: Human-In-The-Loop (HITL) text

  • Most supervised NLU (chatbot and conversational) systems need humans/experts to enter or edit text that needs to be grammatically correct; otherwise, the quality of HITL data can degrade the model over time.

Area 3: Assisted writing for humans

  • Integrating into custom text editors of your apps. (A poor man's Grammarly, if you will.)

Area 4: Custom platform integration

As of today, grammatical safety nets for authoring social content (posts or comments) or text in messaging platforms are either minimal (word-level correction) or non-existent. The onus is on the author to install tools like Grammarly to proofread.

  • Messaging and social platforms can highlight/correct grammatical errors automatically without altering the meaning or intent.

Installation

pip install git+https://github.com/PrithivirajDamodaran/Gramformer.git@v0.1

Quick Start

Correcter - [Available now]

from gramformer import Gramformer
import torch

def set_seed(seed):
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

set_seed(1212)


gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 

influent_sentences = [
    "Matt like fish",
    "the collection of letters was original used by the ancient Romans",
    "We enjoys horror movies",
    "Anna and Mike is going skiing",
    "I walk to the store and I bought milk",
    "We all eat the fish and then made dessert",
    "I will eat fish for dinner and drank milk",
    "what be the reason for everyone leave the company",
]   

for influent_sentence in influent_sentences:
    corrected_sentence = gf.correct(influent_sentence)
    print("[Input] ", influent_sentence)
    print("[Correction] ", corrected_sentence[0])
    print("-" * 100)

[Input]  Matt like fish
[Correction]  Matt likes fish
----------------------------------------------------------------------------------------------------
[Input]  the collection of letters was original used by the ancient Romans
[Correction]  The collection of letters was originally used by the ancient Romans.
----------------------------------------------------------------------------------------------------
[Input]  We enjoys horror movies
[Correction]  We enjoy horror movies
----------------------------------------------------------------------------------------------------
[Input]  Anna and Mike is going skiing
[Correction]  Anna and Mike are going skiing
----------------------------------------------------------------------------------------------------
[Input]  I walk to the store and I bought milk
[Correction]  I walked to the store and bought milk.
----------------------------------------------------------------------------------------------------
[Input]  We all eat the fish and then made dessert
[Correction]  We all ate the fish and then made dessert
----------------------------------------------------------------------------------------------------
[Input]  I will eat fish for dinner and drank milk
[Correction]  I'll eat fish for dinner and drink milk.
----------------------------------------------------------------------------------------------------
[Input]  what be the reason for everyone leave the company
[Correction]  what can be the reason for everyone to leave the company.
----------------------------------------------------------------------------------------------------
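
Since Gramformer works at the sentence level, longer text has to be split into sentences before it is handed to the corrector. The sketch below shows one minimal way to do that. It is not part of the Gramformer API: `split_sentences` and `correct_paragraph` are hypothetical helpers, the regex splitter is deliberately naive (it will break on abbreviations such as "e.g."), and `correct_fn` stands in for any single-sentence corrector.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def correct_paragraph(text: str, correct_fn: Callable[[str], str]) -> str:
    """Apply a single-sentence corrector to each sentence and re-join the results."""
    return " ".join(correct_fn(sentence) for sentence in split_sentences(text))


def capitalize_first(sentence: str) -> str:
    """Stand-in corrector used only for this demo; it is NOT a grammar model."""
    return sentence[:1].upper() + sentence[1:]


print(correct_paragraph("matt like fish. we enjoys horror movies.", capitalize_first))
# prints: Matt like fish. We enjoys horror movies.
```

With the `gf` object from the Quick Start, `correct_fn` could be something like `lambda s: gf.correct(s)[0]`, assuming `gf.correct` returns an indexable result as in the example above.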

Challenge with generative models

While Gramformer aims to post-process outputs from generative models, Gramformer itself is a generative model. So the question arises: who will post-process the Gramformer outputs? (I know, very meta :-)). In general, all generative models tend to generate spurious text sometimes, which we cannot control. So, to make sure the Gramformer grammar corrections (and highlights) are as accurate as possible, a quality estimator (QE) will be added. It can estimate an error-correction quality score and use that as a filter on the Top-N candidates to return only the best ones based on the score.
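
The filtering idea can be sketched independently of any particular scoring model. The snippet below is an illustration only, not Gramformer's quality estimator: `select_best_candidates` is a hypothetical helper, and the scores are hand-assigned toy values standing in for whatever a real QE model would produce.

```python
from typing import Callable, List, Tuple


def select_best_candidates(
    candidates: List[str],
    score_fn: Callable[[str], float],
    top_k: int = 1,
    min_score: float = 0.0,
) -> List[Tuple[str, float]]:
    """Score each candidate, drop those below min_score, return the top_k best."""
    scored = [(candidate, score_fn(candidate)) for candidate in candidates]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy scores for three candidate corrections of "Matt like fish".
# These numbers are made up for the demo; a real estimator would compute them.
toy_scores = {
    "Matt likes fish": 0.9,
    "Matt liked the fish": 0.6,
    "Matt is like a fish": 0.2,
}

best = select_best_candidates(list(toy_scores), toy_scores.get, top_k=1, min_score=0.5)
print(best)
# prints: [('Matt likes fish', 0.9)]
```

Keeping the scorer as a plain callable means the ranking logic stays the same whichever quality model ends up producing the scores.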

Correcter with quality estimator (QE) - [Coming soon!]

from gramformer import Gramformer
gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
corrected_sentence = gf.correct(<your input sentence>, filter_by_quality=True, max_candidates=3)

Highlighter - [Coming soon !]

from gramformer import Gramformer
gf = Gramformer(models = 1, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
highlighted_sentence = gf.highlight(<your input sentence>)
[Input]  Matt like fish
[Highlight]  Matt <e> like </e> fish
----------------------------------------------------------------------------------------------------
[Input]  the collection of letters was original used by the ancient Romans
[Highlight]  the collection of letters was <e> original used </e> by the ancient Romans
----------------------------------------------------------------------------------------------------
[Input]  We enjoys horror movies
[Highlight]  We <e> enjoys horror </e> movies
----------------------------------------------------------------------------------------------------
[Input]  Anna and Mike is going skiing
[Highlight]  Anna and Mike <e> is going </e> skiing
----------------------------------------------------------------------------------------------------
[Input]  I walk to the store and I bought milk
[Highlight]  I <e> walk to </e> the store and I bought milk
----------------------------------------------------------------------------------------------------
[Input]  We all eat the fish and then made dessert
[Highlight]  We all <e> eat the </e> fish and then made dessert
----------------------------------------------------------------------------------------------------
[Input]  I will eat fish for dinner and drank milk
[Highlight]  I will eat fish for dinner and <e> drank milk </e> 
----------------------------------------------------------------------------------------------------
[Input]  what be the reason for everyone leave the company
[Highlight]  <e> what be </e> the reason <e> for everyone </e> <e> leave the </e> company
----------------------------------------------------------------------------------------------------
[Input]  One of the most important issue is the lack of parking spaces at the local mall.
[Highlight]  One of the most important <e> issue is </e> the lack of parking spaces at the local mall.
----------------------------------------------------------------------------------------------------
[Input]  The survey we performed recently showed that most of customers are satisfied.
[Highlight]  The survey we performed recently showed that most <e> of customers </e> are satisfied.
----------------------------------------------------------------------------------------------------
[Input]  I’ve loved classical music ever since I was child.
[Highlight]  I’ve loved classical music ever since I <e> was child </e>.
----------------------------------------------------------------------------------------------------
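
Until the highlighter model is available, a rough approximation of this markup can be derived by diffing an input sentence against its correction. The sketch below uses the standard-library `difflib` for a word-level diff. It is not how the highlighter model works, `highlight_diff` is a hypothetical helper, and its spans will generally be narrower than the model output shown above.

```python
import difflib


def highlight_diff(source: str, corrected: str) -> str:
    """Wrap source words that differ from the corrected sentence in <e> ... </e>."""
    src_words = source.split()
    cor_words = corrected.split()
    matcher = difflib.SequenceMatcher(a=src_words, b=cor_words, autojunk=False)

    out = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(src_words[i1:i2])
        elif tag in ("replace", "delete"):
            out.append("<e> " + " ".join(src_words[i1:i2]) + " </e>")
        # "insert": the correction added words that have no counterpart in the
        # source, so there is nothing to wrap; this sketch ignores that case.
    return " ".join(out)


print(highlight_diff("Matt like fish", "Matt likes fish"))
# prints: Matt <e> like </e> fish
```

Because insertions are skipped, a missing article (as in "since I was child") would go unmarked; handling that case would need a convention for marking a gap between words.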

Detector - [Coming soon !]

from gramformer import Gramformer
gf = Gramformer(models = 0, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
grammar_fluency_score = gf.detect(<your input sentence>)

Models

Model | Type | Return | Status
prithivida/grammar_error_detector | Classifier | Label | TBD (prithivida/parrot_fluency_on_BERT can be repurposed here, but I would recommend you wait :-))
prithivida/grammar_error_highlighter | Seq2Seq | Grammar errors enclosed in <e> and </e> | Beta
prithivida/grammar_error_correcter | Seq2Seq | The corrected sentence | Beta

Dataset

  • The first idea is to generate the dataset using the techniques mentioned in the first paper highlighted in the References section. You can use the technique on any one of the publicly available Wikipedia edits datasets. Write some rules to filter only the grammatical edits, do some cleanup, and that's it, Bob's your uncle :-).
  • The second, and possibly very complicated and $$$, way is to get some 200M synthetic sentences. This is based on the last paper under the References section. Not recommended, but by all means knock yourself out if you are interested :-)
  • The third source is to repurpose the GEC Task data.
  • I combined sources 1 and 3 to get my training data (still working on source 2, will keep you posted).
  • I ended up with ~1M records, which came down to ~1/2M records after some heuristics-based filtering.
  • It took ~12 hours to train each of the above models.

Benchmark

TBD (I will shortly benchmark Gramformer models against the following publicly available models: salesken/grammar_correction and flexudy/t5-small-wav2vec2-grammar-fixer.)

References

Citation

TBD

Comments
  • [Spacy error] Can't find model 'en'


    Hello I have successfully installed the Gramformer on my windows PC. but when I run, it gives the following error.

    Traceback (most recent call last):
      File "main.py", line 27, in <module>
        grammar_correction = Gramformer(models = 1, use_gpu=True)
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\gramformer\gramformer.py", line 8, in __init__
        self.annotator = errant.load('en')
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\errant\__init__.py", line 16, in load
        nlp = nlp or spacy.load(lang, disable=["ner"])
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\spacy\__init__.py", line 30, in load
        return util.load_model(name, **overrides)
      File "~~\.conda\envs\nlp-transformer\lib\site-packages\spacy\util.py", line 175, in load_model
        raise IOError(Errors.E050.format(name=name))
    OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.
    
    opened by muzamil47 3
  • Commercial use issue


    Hey @PrithivirajDamodaran

    The readme states that Gramformer versions above 1.0 are allowed for commercial use - however, this is not currently the case as the grammar_error_correcter_v1 model has been trained using the non-commercial WI&Locness data, even though the documentation states otherwise:

    The grammar_error_correcter_v1 model is actually identical to the previous grammar_error_correcter model which is trained using the non-commercial WI&Locness data – they have identical weights, which you can verify with this script

    As the models are the same, this means that both models have been trained using the non-commercial WI&Locness data, and the grammar_error_correcter_v1 model along with Gramformer v1.1 and v1.2 should not be allowed for commercial use.

    Could you please update the readme to clarify this, or upload a new model that has not been trained using WI&Locness?

    Thanks

    question 
    opened by SimonHFL 2
  • Use corrector for highligher


    Hi @PrithivirajDamodaran

    This is a great framework. Is it possible (for now) to use model corrector (model=2) for the highlighter(model=1)? After getting some correction, match it to the input and give prefix and suffix () for the mismatch?

    Thanks

    question 
    opened by ilhamsyahids 2
  • Error loading the tokenizer in transformers==4.4.2


    I'm getting error when initializing the class object, specifically at tokenizer loading:

    In [6]: correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
    ---------------------------------------------------------------------------
    Exception                                 Traceback (most recent call last)
    <ipython-input-6-d34dd9c5fe99> in <module>
    ----> 1 correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        414             tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
        415             if tokenizer_class_fast and (use_fast or tokenizer_class_py is None):
    --> 416                 return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
        417             else:
        418                 if tokenizer_class_py is not None:
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_base.py in from_pretrained(cls, pretrained_model_name_or_path, *init_inputs, **kwargs)
       1703
       1704         return cls._from_pretrained(
    -> 1705             resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs
       1706         )
       1707
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_base.py in _from_pretrained(cls, resolved_vocab_files, pretrained_model_name_or_path, init_configuration, *init_inputs, **kwargs)
       1774         # Instantiate tokenizer.
       1775         try:
    -> 1776             tokenizer = cls(*init_inputs, **init_kwargs)
       1777         except OSError:
       1778             raise OSError(
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/models/t5/tokenization_t5_fast.py in __init__(self, vocab_file, tokenizer_file, eos_token, unk_token, pad_token, extra_ids, additional_special_tokens, **kwargs)
        134             extra_ids=extra_ids,
        135             additional_special_tokens=additional_special_tokens,
    --> 136             **kwargs,
        137         )
        138
    
    ~/anaconda3/envs/npe/lib/python3.6/site-packages/transformers/tokenization_utils_fast.py in __init__(self, *args, **kwargs)
         85         if fast_tokenizer_file is not None and not from_slow:
         86             # We have a serialization from tokenizers which let us directly build the backend
    ---> 87             fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
         88         elif slow_tokenizer is not None:
         89             # We need to convert a slow tokenizer to build the backend
    
    Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 1 column 329667
    

    transformers==4.4.2.

    The installation package didn't specify the transformers version that this library is using. What should be the correct version? Or is it version independent and it's something else?

    opened by zhangyilun 2
  • Figma Gramformer Plugin


    Figma is used in creating a lot of digital interfaces today, a Gramformer Figma plugin would go a long way. I'll be willing to design the interface for the plugin but I don't know how to make the plugin itself. I hope someone takes this up. This is a link to get started https://www.figma.com/plugin-docs/setup/

    enhancement 
    opened by ayoolafelix 2
  • README.md get_edits and get_highlight example small fixes


    Hi there, when I copy and pasted the examples in the README locally I noticed they were bugging out for the edits and highlights (were only pulling the first char of the sentence for errant). Providing the full sentence seemed to get the desired output.

    opened by parisac 1
  • Training dataset


    Hi Prithiviraj,

    Is there any chance you'd be able to release the training dataset you used to train the Gramformer huggingface model? I see that there are some details on the slices of data that you brought together in the Readme, but it would be useful to be able to use the same data that you used.

    The main reason I'm asking is I'd like to create a model that can take correct text and add grammatical errors to it. So I was thinking I could take the dataset you used to train Gramformer and use the inverse to train a model that does the inverse. I can go through the data prep process as you did, but it would definitely be easier if I were able to reuse yours, and it might be useful for reproducibility for others as well.

    invalid question 
    opened by d4buss 1
  • OSError: Can't load config for 'prithivida/grammar_error_correcter'


    Hi, I have been using your code for the last few days. Suddenly, it started to crash.

    Have a look at the code and error given below:

    Code (Link: https://huggingface.co/prithivida/grammar_error_correcter_v1):

    from gramformer import Gramformer
    import torch
    
    def set_seed(seed):
      torch.manual_seed(seed)
      if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    
    set_seed(1212)
    
    
    gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all 
    
    influent_sentences = [
        "Matt like fish",
        "the collection of letters was original used by the ancient Romans",
        "We enjoys horror movies",
        "Anna and Mike is going skiing",
        "I walk to the store and I bought milk",
        "We all eat the fish and then made dessert",
        "I will eat fish for dinner and drank milk",
        "what be the reason for everyone leave the company",
    ]   
    
    for influent_sentence in influent_sentences:
        corrected_sentence = gf.correct(influent_sentence)
        print("[Input] ", influent_sentence)
        print("[Correction] ",corrected_sentence[0])
        print("-" *100)
    

    Error

    404 Client Error: Not Found for url: https://huggingface.co/prithivida/grammar_error_correcter/resolve/main/config.json
    ---------------------------------------------------------------------------
    HTTPError                                 Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        491                 use_auth_token=use_auth_token,
    --> 492                 user_agent=user_agent,
        493             )
    
    7 frames
    /usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in cached_path(url_or_filename, cache_dir, force_download, proxies, resume_download, user_agent, extract_compressed_file, force_extract, use_auth_token, local_files_only)
       1278             use_auth_token=use_auth_token,
    -> 1279             local_files_only=local_files_only,
       1280         )
    
    /usr/local/lib/python3.7/dist-packages/transformers/file_utils.py in get_from_cache(url, cache_dir, force_download, proxies, etag_timeout, resume_download, user_agent, use_auth_token, local_files_only)
       1441             r = requests.head(url, headers=headers, allow_redirects=False, proxies=proxies, timeout=etag_timeout)
    -> 1442             r.raise_for_status()
       1443             etag = r.headers.get("X-Linked-Etag") or r.headers.get("ETag")
    
    /usr/local/lib/python3.7/dist-packages/requests/models.py in raise_for_status(self)
        942         if http_error_msg:
    --> 943             raise HTTPError(http_error_msg, response=self)
        944 
    
    HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/prithivida/grammar_error_correcter/resolve/main/config.json
    
    During handling of the above exception, another exception occurred:
    
    OSError                                   Traceback (most recent call last)
    <ipython-input-10-0f43e537fe87> in <module>
         10 
         11 
    ---> 12 gf = Gramformer(models = 2, use_gpu=False) # 0=detector, 1=highlighter, 2=corrector, 3=all
         13 
         14 influent_sentences = [
    
    /usr/local/lib/python3.7/dist-packages/gramformer/gramformer.py in __init__(self, models, use_gpu)
         14 
         15     if models == 2:
    ---> 16         self.correction_tokenizer = AutoTokenizer.from_pretrained(correction_model_tag)
         17         self.correction_model     = AutoModelForSeq2SeqLM.from_pretrained(correction_model_tag)
         18         self.correction_model     = self.correction_model.to(device)
    
    /usr/local/lib/python3.7/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        400         kwargs["_from_auto"] = True
        401         if not isinstance(config, PretrainedConfig):
    --> 402             config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
        403 
        404         use_fast = kwargs.pop("use_fast", True)
    
    /usr/local/lib/python3.7/dist-packages/transformers/models/auto/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
        428         """
        429         kwargs["_from_auto"] = True
    --> 430         config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        431         if "model_type" in config_dict:
        432             config_class = CONFIG_MAPPING[config_dict["model_type"]]
    
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        502                 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a {CONFIG_NAME} file\n\n"
        503             )
    --> 504             raise EnvironmentError(msg)
        505 
        506         except json.JSONDecodeError:
    
    OSError: Can't load config for 'prithivida/grammar_error_correcter'. Make sure that:
    
    - 'prithivida/grammar_error_correcter' is a correct model identifier listed on 'https://huggingface.co/models'
    
    - or 'prithivida/grammar_error_correcter' is the correct path to a directory containing a config.json file
    ![Screenshot from 2021-07-01 18-36-07](https://user-images.githubusercontent.com/4704211/124133526-5a9da900-da9b-11eb-9733-61df46ab01e1.png)
    
    

    Possible Solution:

    Rename this link from: https://huggingface.co/prithivida/grammar_error_correcter/ to: https://huggingface.co/prithivida/grammar_error_correcter_v1/

    Please help me fix this. thank you

    opened by Nomiluks 1
  • Inference Issue !!!


    OSError                                   Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        241 if resolved_config_file is None:
    --> 242 raise EnvironmentError
        243 config_dict = cls._dict_from_json_file(resolved_config_file)

    OSError:

    During handling of the above exception, another exception occurred:

    OSError                                   Traceback (most recent call last)
    3 frames
    in ()
    ----> 1 correction_tokenizer = AutoTokenizer.from_pretrained("prithivida/grammar_error_correcter")
          2 correction_model = AutoModelForSeq2SeqLM.from_pretrained("prithivida/grammar_error_correcter")
          3 print("[Gramformer] Grammar error correction model loaded..")
          4
          5

    /usr/local/lib/python3.7/dist-packages/transformers/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
        204 config = kwargs.pop("config", None)
        205 if not isinstance(config, PretrainedConfig):
    --> 206 config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
        207
        208 if "bert-base-japanese" in str(pretrained_model_name_or_path):

    /usr/local/lib/python3.7/dist-packages/transformers/configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
        201
        202 """
    --> 203 config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
        204
        205 if "model_type" in config_dict:

    /usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
        249 f"- or '{pretrained_model_name_or_path}' is the correct path to a directory containing a {CONFIG_NAME} file\n\n"
        250 )
    --> 251 raise EnvironmentError(msg)
        252
        253 except json.JSONDecodeError:

    OSError: Can't load config for 'prithivida/grammar_error_correcter'. Make sure that:

    - 'prithivida/grammar_error_correcter' is a correct model identifier listed on 'https://huggingface.co/models'

    - or 'prithivida/grammar_error_correcter' is the correct path to a directory containing a config.json file

    Solutions for this issue????

    invalid 
    opened by sabhi27 1
  • How to train Gramformer on non-English languages.


    Hey @PrithivirajDamodaran , Great work on building Gramformer, ive played with it and the results are amazing.

    I work on pushing nlp forward in under represented languages, and hence i humbly request you to please tell me how do i train gramformer on non-English sentences ?

    I checked out your HuggingFace page 'https://huggingface.co/prithivida/grammar_error_correcter' but coudn't find any resources on how to train gramformer from scratch. If you could help me in training Gramformer on non-English langauages it would really mean a lot to me. Do let me know.

    Thanks

    question 
    opened by StephennFernandes 1
  • pip install is erroring out,


    I am unable to do pip install of the package, here is the error:

    Collecting git+https://github.com/PrithivirajDamodaran/Gramformer.git@v0.1
      Cloning https://github.com/PrithivirajDamodaran/Gramformer.git (to revision v0.1) to c:\users\sumit\appdata\local\temp\pip-req-build-sw54k_0h
    ERROR: Error [WinError 2] The system cannot find the file specified while executing command git clone -q https://github.com/PrithivirajDamodaran/Gramformer.git 'C:\Users\Sumit\AppData\Local\Temp\pip-req-build-sw54k_0h'
    ERROR: Cannot find command 'git' - do you have 'git' installed and in your PATH?

    I also tried directly downloading the repo and tried executing the package. Model is not present in location(correction_model_tag = "prithivida/grammar_error_correcter"). Any way to download the pretrain model.

    opened by ranjan-sumit 1
  • OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.


    OSError                                   Traceback (most recent call last)
    ~\AppData\Local\Temp\ipykernel_9376\2706950954.py in <module>
         25
         26
    ---> 27 gf = Gramformer(models = 1, use_gpu=False) # 1=corrector, 2=detector
         28
         29 influent_sentences = [

    ~\anaconda3_9\envs\python37\lib\site-packages\gramformer\gramformer.py in __init__(self, models, use_gpu)
          7 import errant
          8 #self.annotator = errant.load('en_core_web_sm')
    ----> 9 self.annotator = errant.load('en') # en is deprecated from spacy 3.0 onwards
         10
         11 if use_gpu:

    ~\anaconda3_9\envs\python37\lib\site-packages\errant\__init__.py in load(lang, nlp)
         17
         18 # Load spacy
    ---> 19 nlp = nlp or spacy.load(lang, disable=["ner"])
         20
         21 # Load language edit merger

    ~\anaconda3_9\envs\python37\lib\site-packages\spacy\__init__.py in load(name, **overrides)
         28 if depr_path not in (True, False, None):
         29 warnings.warn(Warnings.W001.format(path=depr_path), DeprecationWarning)
    ---> 30 return util.load_model(name, **overrides)
         31
         32

    ~\anaconda3_9\envs\python37\lib\site-packages\spacy\util.py in load_model(name, **overrides)
        173 elif hasattr(name, "exists"): # Path or Path-like to model data
        174 return load_model_from_path(name, **overrides)
    --> 175 raise IOError(Errors.E050.format(name=name))
        176
        177

    OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

    opened by vky2998 2
  • Word limit


    The model is having trouble with long sentences. Specially if the words in the sentences are in upper case. It outputs only limited sentence as an output and the rest neglected sentence is shown as error.

    opened by Talib6509 0
  • Gramformer Highlight function not working


    Hello... I'm trying to get the edits between two sentences, but the highlight function is not working. Has anybody faced the same issue? Many thanks in advance

    opened by NourAlMerey 0
  • Suggestions to improve the grammar results for short sentences


    Hello..!

    I have used Gramformer model and I think this could be quite useful for checking and correcting some grammar points, especially for correcting singular/plural, verb forms and tenses, and spelling. However, some other grammar points (like correcting sentence structure, comparative/superlative forms, pronoun cases, etc.) seem to be still tricky.

    Note: I need to use the model on short sentences.

    The biggest challenge I faced in my case is: (Please suggest how to avoid it or improve it or changing some parameters...) 1 - Since it corrects grammar by generating text, most of the time it completely changes the sentence and rephrase it. How can we avoid this.

    whose bags you can bring? --> Which bags you can bring? (Just a sample, and sometime it generates totally changed verbose sentence)

    2 - Every time I give the same sentence as input, it generates different outputs:

    I go can there: three outputs in three different run ("I go, there"., "can I go there?", "I go back there.")

    Thanks!

    opened by muzamil47 0
Releases(v1.4)
  • v1.4(Aug 10, 2021)

    ⚡️ Features added/changed

    ✅ Correct API uses a ranker to sort good quality corrections.
    ✅ Highlight API returns sents w/errors marked up as readable tags.
    ✅ Edit API returns error types, positions, and respective corrections.
    ✅ The latest model checkpoint has been refreshed w/more data.

    License update to MIT.

Owner
Prithivida
Applied NLP, XAI for NLP and Data Engineering

THE PROJECT IS ARCHIVED Forks: https://github.com/orsinium/forks It's a Flake8 wrapper to make it cool. Lint md, rst, ipynb, and more. Shareable and r

Life4 232 Dec 16, 2022
MonkeyType as a pytest plugin.

MonkeyType as a pytest plugin.

Marius van Niekerk 36 Nov 24, 2022
Pylint plugin for improving code analysis for when using Django

pylint-django About pylint-django is a Pylint plugin for improving code analysis when analysing code using Django. It is also used by the Prospector t

Python Code Quality Authority 544 Jan 06, 2023
OpenStack Hacking Style Checks. Mirror of code maintained at opendev.org.

Introduction hacking is a set of flake8 plugins that test and enforce the OpenStack StyleGuide Hacking pins its dependencies, as a new release of some

Mirrors of opendev.org/openstack 224 Jan 05, 2023
mypy plugin to type check Kubernetes resources

kubernetes-typed mypy plugin to dynamically define types for Kubernetes objects. Features Type checking for Custom Resources Type checking forkubernet

Artem Yarmoliuk 16 Oct 10, 2022
Stubs with type annotations for ordered-set Python library

ordered-set-stubs - stubs with type annotations for ordered-set Python library Archived - now type annotations are the part of the ordered-set library

Roman Inflianskas 2 Feb 06, 2020
A static type analyzer for Python code

pytype - 🦆 ✔ Pytype checks and infers types for your Python code - without requiring type annotations. Pytype can: Lint plain Python code, flagging c

Google 4k Dec 31, 2022
Custom Python linting through AST expressions

bellybutton bellybutton is a customizable, easy-to-configure linting engine for Python. What is this good for? Tools like pylint and flake8 provide, o

H. Chase Stevens 249 Dec 31, 2022
❄️ A flake8 plugin to help you write better list/set/dict comprehensions.

flake8-comprehensions A flake8 plugin that helps you write better list/set/dict comprehensions. Requirements Python 3.6 to 3.9 supported. Installation

Adam Johnson 398 Dec 23, 2022
A framework for detecting, highlighting and correcting grammatical errors on natural language text.

Gramformer Human and machine generated text often suffer from grammatical and/or typographical errors. It can be spelling, punctuation, grammatical or

Prithivida 1.3k Jan 08, 2023