An open-source NLP research library, built on PyTorch.

Overview

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.


Getting Started Using the Library

If you're interested in using AllenNLP for model development, we recommend you check out the AllenNLP Guide. When you're ready to start your project, we've created a couple of template repositories that you can use as a starting place:

  • If you want to use allennlp train and config files to specify experiments, use this template. We recommend this approach.
  • If you'd prefer to use Python code to configure your experiments and run your training loop, use this template. A few things are currently a little harder in this setup (loading a saved model, and using distributed training), but otherwise it's functionally equivalent to the config-files setup.

In addition, there are external tutorials and other posts on the AI2 AllenNLP blog.

Plugins

AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional allennlp subcommands.
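
A minimal sketch of what such a plugin module might look like (the module and reader names here are hypothetical): a class registered through AllenNLP's Registrable machinery becomes referenceable by name from config files once the plugin module is imported.

# my_project_plugin.py: a hypothetical plugin module.
# Registering the class makes "my-custom-reader" usable as a "type" in
# `allennlp train` config files once this module is loaded as a plugin.
from allennlp.data import DatasetReader

@DatasetReader.register("my-custom-reader")
class MyCustomReader(DatasetReader):
    def _read(self, file_path: str):
        # Yield one Instance per record of the input file (details omitted).
        ...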

There is an ecosystem of open-source plugins, some of which are maintained by the AllenNLP team here at AI2, and some of which are maintained by the broader community.

Plugin              Maintainer        CLI  Description
allennlp-models     AI2               No   A collection of state-of-the-art models
allennlp-semparse   AI2               No   A framework for building semantic parsers
allennlp-server     AI2               Yes  A simple demo server for serving models
allennlp-optuna     Makoto Hiramatsu  Yes  Optuna integration for hyperparameter optimization

AllenNLP will automatically find any official AI2-maintained plugins that you have installed, but for AllenNLP to find personal or third-party plugins you've installed, you also have to create either a local plugins file named .allennlp_plugins in the directory where you run the allennlp command, or a global plugins file at ~/.allennlp/plugins. The file should list the plugin modules that you want to be loaded, one per line.
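
For example, a .allennlp_plugins file that loads the hypothetical my_project_plugin module from the sketch above would contain just:

my_project_plugin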

To test that your plugins can be found and imported by AllenNLP, you can run the allennlp test-install command. Each discovered plugin will be logged to the terminal.

For more information about plugins, see the plugins API docs. And for information on how to create a custom subcommand to distribute as a plugin, see the subcommand API docs.

Package Overview

allennlp           An open-source NLP research library, built on PyTorch
allennlp.commands  Functionality for the CLI
allennlp.common    Utility modules that are used across the library
allennlp.data      A data processing module for loading datasets and encoding strings as integers for representation in matrices
allennlp.modules   A collection of PyTorch modules for use with text
allennlp.nn        Tensor utility functions, such as initializers and activation functions
allennlp.training  Functionality for training models
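
As a rough orientation, here is a sketch of representative imports from the packages above; these names exist in recent AllenNLP releases, but verify them against the API docs for your installed version.

# Representative imports from the packages listed above (verify against the
# API documentation for the AllenNLP version you have installed).
from allennlp.common import Params                    # configuration objects
from allennlp.data import DatasetReader, Vocabulary   # datasets and vocabularies
from allennlp.modules import Seq2VecEncoder, TextFieldEmbedder
from allennlp.nn import util as nn_util               # tensor utilities
from allennlp.training import GradientDescentTrainer  # the default trainer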

Installation

AllenNLP requires Python 3.6.1 or later and PyTorch. It's recommended that you install the PyTorch ecosystem before installing AllenNLP by following the instructions on pytorch.org.

The preferred way to install AllenNLP is via pip. Just run pip install allennlp.

⚠️ If you're using Python 3.7 or greater, you should ensure that you don't have the PyPI version of dataclasses installed after running the above command, as this could cause issues on certain platforms. You can quickly check this by running pip freeze | grep dataclasses. If you see something like dataclasses==0.6 in the output, then just run pip uninstall -y dataclasses.

If you need pointers on setting up an appropriate Python environment or would like to install AllenNLP using a different method, see below.

We support AllenNLP on Mac and Linux environments. We presently do not support Windows but are open to contributions.

Installing via pip

Setting up a virtual environment

Conda can be used to set up a virtual environment with the version of Python required for AllenNLP. If you already have a Python 3 environment you want to use, you can skip to the 'installing via pip' section.

  1. Download and install Conda.

  2. Create a Conda environment with Python 3.7 (3.6 or 3.8 would work as well):

    conda create -n allennlp python=3.7
    
  3. Activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to use AllenNLP:

    conda activate allennlp
    

Installing the library and dependencies

Installing the library and dependencies is simple using pip.

pip install allennlp

Looking for bleeding-edge features? You can install nightly releases directly from PyPI.

AllenNLP installs a script when you install the Python package, so you can run allennlp commands just by typing allennlp into a terminal. For example, you can now test your installation with allennlp test-install.

You may also want to install allennlp-models, which contains the NLP constructs to train and run our officially supported models, many of which are hosted at https://demo.allennlp.org.

pip install allennlp-models
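
As a hedged sketch of what loading and querying one of those models can look like (the archive path below is a placeholder, and the JSON keys a predictor expects depend on the specific model):

from allennlp.predictors import Predictor

# "model.tar.gz" stands in for any trained AllenNLP archive: a local path or a
# URL to one of the official allennlp-models archives.
predictor = Predictor.from_path("model.tar.gz")

# Many sentence-level models accept a single "sentence" field; check the
# predictor for the model you load.
print(predictor.predict_json({"sentence": "AllenNLP is built on PyTorch."}))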

Installing using Docker

Docker provides a virtual machine with everything set up to run AllenNLP, whether you will leverage a GPU or just run on a CPU. Docker provides more isolation and consistency, and also makes it easy to distribute your environment to a compute cluster.

AllenNLP provides official Docker images with the library and all of its dependencies installed.

Once you have installed Docker, you should also install the NVIDIA Container Toolkit if you have GPUs available.

Then run the following command to get an environment that will run on GPU:

mkdir -p $HOME/.allennlp/
docker run --rm --gpus all -v $HOME/.allennlp:/root/.allennlp allennlp/allennlp:latest

You can test the Docker environment with

docker run --rm --gpus all -v $HOME/.allennlp:/root/.allennlp allennlp/allennlp:latest test-install 

If you don't have GPUs available, just omit the --gpus all flag.

Building your own Docker image

For various reasons you may need to create your own AllenNLP Docker image, such as if you need a different version of PyTorch. To do so, just run make docker-image from the root of your local clone of AllenNLP.

By default this builds an image with the tag allennlp/allennlp, but you can change this to anything you want by setting the DOCKER_TAG flag when you call make. For example, make docker-image DOCKER_TAG=my-allennlp.

If you want to use a different version of PyTorch, set the flag DOCKER_TORCH_VERSION to something like torch==1.7.0 or torch==1.7.0+cu110 -f https://download.pytorch.org/whl/torch_stable.html. The value of this flag will be passed directly to pip install.
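
For example, to build an image against CUDA 11.0 wheels you might run something like:

make docker-image DOCKER_TAG=my-allennlp DOCKER_TORCH_VERSION='torch==1.7.0+cu110 -f https://download.pytorch.org/whl/torch_stable.html'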

After building the image you should be able to see it listed by running docker images allennlp.

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
allennlp/allennlp   latest              b66aee6cb593        5 minutes ago       2.38GB

Installing from source

You can also install AllenNLP by cloning our git repository:

git clone https://github.com/allenai/allennlp.git
cd allennlp

Create a Python 3.7 or 3.8 virtual environment, and install AllenNLP in editable mode by running:

pip install --editable .
pip install -r dev-requirements.txt

This will make allennlp available on your system but it will use the sources from the local clone you made of the source repository.

You can test your installation with allennlp test-install. See https://github.com/allenai/allennlp-models for instructions on installing allennlp-models from source.

Running AllenNLP

Once you've installed AllenNLP, you can run the command-line interface with the allennlp command (whether you installed from pip or from source). allennlp has various subcommands such as train, evaluate, and predict. To see the full usage information, run allennlp --help.

You can test your installation by running allennlp test-install.
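
For example, a typical training run looks something like this (the config path and output directory are placeholders):

allennlp train my_experiment.jsonnet --serialization-dir /tmp/my_model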

Issues

Everyone is welcome to file issues with either feature requests, bug reports, or general questions. As a small team with our own internal goals, we may ask for contributions if a prompt fix doesn't fit into our roadmap. To keep things tidy we will often close issues we think are answered, but don't hesitate to follow up if further discussion is needed.

Contributions

The AllenNLP team at AI2 (@allenai) welcomes contributions from the greater AllenNLP community, and, if you would like to get a change into the library, this is likely the fastest approach. If you would like to contribute a larger feature, we recommend first creating an issue with a proposed design for discussion. This will prevent you from spending significant time on an implementation which has a technical limitation someone could have pointed out early on. Small contributions can be made directly in a pull request.

Pull requests (PRs) must have one approving review and no requested changes before they are merged. As AllenNLP is primarily driven by AI2 (@allenai) we reserve the right to reject or revert contributions that we don't think are good additions.

Citing

If you use AllenNLP in your research, please cite AllenNLP: A Deep Semantic Natural Language Processing Platform.

@inproceedings{Gardner2017AllenNLP,
  title={AllenNLP: A Deep Semantic Natural Language Processing Platform},
  author={Matt Gardner and Joel Grus and Mark Neumann and Oyvind Tafjord
    and Pradeep Dasigi and Nelson F. Liu and Matthew Peters and
    Michael Schmitz and Luke S. Zettlemoyer},
  year={2017},
  Eprint = {arXiv:1803.07640},
}

Team

AllenNLP is an open-source project backed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering. To learn more about who specifically contributed to this codebase, see our contributors page.

Issues
  • Add support for pretrained embedding extension in fine-tuning.

    @matt-gardner, I am working on the final piece (follow-up to #2387): if a pretrained file was used in the Embedding construction during training, use the same file for extension during fine-tuning. This is still rough and seems kind of hacky, but if the high-level approach seems reasonable, I can refactor things.

    Notes:

    For this to work, the Embedding used during fine-tuning needs access to the pretrained embedding file that was used by the Embedding during training.

    • First, for embedding params that have a pretrained_file, I add it to the files_to_archive dict by default. This is necessary for the files to be available at fine-tuning time.
    • Second, during training, I store the pretrained_file that was used as an attribute on the instance (similar to what we did with vocab_namespace in the previous PR).

    However, this isn't enough, because the file named by _pretrained_file won't be available at the same location at fine-tuning time; instead it will be in the serialization dir's fta as saved by the archive_model function. To fix this, I make a mapping from the original filename (used during training) to the replacement filename (available in the serialization fta during fine-tuning) and allow it to be passed in extend_vocab calls.

    opened by HarshTrivedi 53
  • Consider removing CallbackTrainer

    The callback trainer hasn't really worked, because callbacks that we've tried to add have required setting state on the CallbackTrainer itself, which makes them hard to add. Given this, we are just maintaining 2 Trainers unnecessarily, which slows us down.

    Happy to hear reasons why we should keep the callback trainer, or whether people have found it particularly useful!

    opened by DeNeutoy 44
  • add BERT token embedder

    This is ready for review. In addition to the included unit tests, I trained two NER models using these embeddings (unfortunately, I realized this morning that I used the uncased BERT model, which seems like a bad idea for NER):

    (1) only BERT embeddings: https://beaker-internal.allenai.org/ex/ex_rnk3mcplnpjz/tasks (2) BERT embeddings + character embeddings: https://beaker-internal.allenai.org/ex/ex_nrq8d5vw5cb2/tasks

    (apologies to non-AI2 people for the beaker-internal links)

    As discussed offline, because of the positional encodings the BERT embedding has a max sequence length and will crash if you feed it longer sequences. This implementation simply truncates longer sequences and logs a warning. I left a TODO to come up with something better.

    opened by joelgrus 39
  • Shuffling + bucketing are incompatible with lazy dataset reading

    System (please complete the following information):

    • OS: OS X
    • Python version: 3.6.10
    • AllenNLP version: hash 4749fc3
    • PyTorch version (if you installed it yourself): 1.4.0

    Question: Is it possible to shuffle a lazily-read dataset with the new dataloaders?

    Hi!

    I'm working with a dataset that won't fit in main memory, so I'd like to lazily read it. However, it seems like the PyTorch DataLoader (understandably) gets mad at me when I set "lazy": true while still providing a BatchSampler:

    ValueError: DataLoader with IterableDataset: expected unspecified batch_sampler option, but got batch_sampler=<allennlp.data.samplers.bucket_batch_sampler.BucketBatchSampler object at 0x7f70493ba820>
    

    So I removed the BatchSampler and set "shuffle": true in the data loader, but it also complains that:

    ValueError: DataLoader with IterableDataset: expected unspecified shuffle option, but got shuffle=True
    

    I guess this makes sense, since the loader doesn't get random access to the dataset. Is there any way to lazily read and still shuffle data? I'm not sure if this was always the behavior, or if this is new with the new dataloaders...can someone remind me?

    opened by nelson-liu 36
  • Seq2Seq model decomposition

    Hi, this is a work-in-progress pull request for my attempt to decompose the monolithic seq2seq model and enable a generic decoder module. The initial issue discussion can be found at https://github.com/allenai/allennlp/issues/2097, and there are work notes at https://github.com/epwalsh/allennlp/pull/3. I am looking for some feedback on my changes. One of the questions is how we should support backward compatibility, since module parameters have changed.

    opened by generall 36
  • Filter Warnings when pytest

    fix #1672

    By default pytest will display some warnings from user code and third-party libraries, as recommended by PEP-0506. This helps users keep their code modern and avoid breakages when deprecated warnings are effectively removed.

    (quoted from the pytest documentation)

    There still remain some warnings:

    /usr/local/lib/python3.6/site-packages/nbconvert/exporters/exporter_locator.py:28
    [02:56:20][Step 3/10]   /usr/local/lib/python3.6/site-packages/nbconvert/exporters/exporter_locator.py:28: DeprecationWarning: `nbconvert.exporters.exporter_locator` is deprecated in favor of `nbconvert.exporters.base` since nbconvert 5.0.
    [02:56:20][Step 3/10]     DeprecationWarning)
    [02:56:20][Step 3/10] 
    [02:56:20][Step 3/10] /usr/local/lib/python3.6/site-packages/tornado/web.py:1747
    [02:56:20][Step 3/10]   /usr/local/lib/python3.6/site-packages/tornado/web.py:1747: DeprecationWarning: @asynchronous is deprecated, use coroutines instead
    [02:56:20][Step 3/10]     DeprecationWarning)
    [02:56:20][Step 3/10] 
    [02:56:20][Step 3/10] allennlp/data/token_indexers/openai_transformer_byte_pair_indexer.py:25
    [02:56:20][Step 3/10]   /local/deploy/agent3/work/98197cf33cb401e5/allennlp/data/token_indexers/openai_transformer_byte_pair_indexer.py:25: DeprecationWarning: invalid escape sequence \?
    [02:56:20][Step 3/10]     text = re.sub('''(-+|~+|!+|"+|;+|\?+|\++|,+|\)+|\(+|\\+|\/+|\*+|\[+|\]+|}+|{+|\|+|_+)''', r' \1 ', text)
    ...
    [02:56:20][Step 3/10] allennlp/tests/predictors/srl_test.py::TestSrlPredictor::test_uses_named_inputs
    [02:56:20][Step 3/10]   /usr/local/bin/pytest:11: DeprecationWarning: [W002] Tokenizer.from_list is now deprecated. Create a new Doc object instead and pass in the strings as the `words` keyword argument, for example:
    [02:56:20][Step 3/10]   from spacy.tokens import Doc
    [02:56:20][Step 3/10]   doc = Doc(nlp.vocab, words=[...])
    [02:56:20][Step 3/10]     sys.exit(main())
    ...
    [02:56:20][Step 3/10] allennlp/tests/semparse/worlds/text2sql_world_test.py::TestText2SqlWorld::test_variable_free_world_cannot_parse_as_statements
    [02:56:20][Step 3/10]   <unknown>:1: DeprecationWarning: invalid escape sequence \s
    [02:56:20][Step 3/10]   <unknown>:1: DeprecationWarning: invalid escape sequence \s
    [02:56:20][Step 3/10]   <unknown>:1: DeprecationWarning: invalid escape sequence \d
    

    which is possibly related to

    I think it is better to fix these (using new methods instead of deprecated ones), but I have no idea what to do about DeprecationWarning: nbconvert.exporters.exporter_locator is deprecated in favor of... or /usr/local/lib/python3.6/site-packages/tornado/web.py:1747: DeprecationWarning: @asynchronous is deprecated, use coroutines instead.

    opened by WrRan 34
  • Train a model with transformer embeddings and additional_special_tokens

    Checklist

    • [x] I have verified that the issue exists against the master branch of AllenNLP.
    • [x] I have read the relevant section in the contribution guide on reporting bugs.
    • [x] I have checked the issues list for similar or identical bug reports.
    • [x] I have checked the pull requests list for existing proposed fixes.
    • [x] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the master branch.
    • [x] I have included in the "Description" section below a traceback from any exceptions related to this bug.
    • [x] I have included in the "Related issues or possible duplicates" section beloew all related issues and possible duplicate issues (If there are none, check this box anyway).
    • [x] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
    • [x] I have included in the "Environment" section below the output of pip freeze.
    • [x] I have included in the "Steps to reproduce" section below a minimally reproducible example.

    Description

    Hi there! I'm trying to train a transformer-based text classifier model in AllenNLP, but I need to add 5 additional special tokens in a way that is compatible with the tokenizers lib. I tried adding them to the jsonnet AllenNLP config file and then to the transformer's model path, but neither worked; each approach had a different problem, as described below.

    Python traceback:

    2020-09-30 23:56:17,398 - INFO - allennlp.training.trainer - Epoch 0/9
    2020-09-30 23:56:17,398 - INFO - allennlp.training.trainer - Worker 0 memory usage MB: 10065.304
    2020-09-30 23:56:17,484 - WARNING - allennlp.common.util - unable to check gpu_memory_mb() due to occasional failure, continuing
    Traceback (most recent call last):
      File "/media/discoD/repositorios/allennlp/allennlp/common/util.py", line 415, in gpu_memory_mb
        encoding="utf-8",
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 411, in check_output
        **kwargs).stdout
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 488, in run
        with Popen(*popenargs, **kwargs) as process:
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 800, in __init__
        restore_signals, start_new_session)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 1482, in _execute_child
        restore_signals, start_new_session, preexec_fn)
      File "/media/discoD/pycharm-community-2019.2/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_monkey.py", line 526, in new_fork_exec
        return getattr(_posixsubprocess, original_name)(args, *patch_fork_exec_executable_list(args, other_args))
    OSError: [Errno 12] Cannot allocate memory
    2020-09-30 23:56:17,489 - INFO - allennlp.training.trainer - Training
      0%|          | 0/11817 [00:00<?, ?it/s]/pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    ...
    ...
    ...
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [102,0,0], thread: [30,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [102,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
      0%|          | 0/11817 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 443, in _train_worker
        metrics = train_loop.run()
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 505, in run
        return self.trainer.train()
      File "/media/discoD/repositorios/allennlp/allennlp/training/trainer.py", line 872, in train
        train_metrics = self._train_epoch(epoch)
      File "/media/discoD/repositorios/allennlp/allennlp/training/trainer.py", line 594, in _train_epoch
        batch_outputs = self.batch_outputs(batch, for_training=True)
      File "/media/discoD/repositorios/allennlp/allennlp/training/trainer.py", line 479, in batch_outputs
        output_dict = self._pytorch_model(**batch)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/repositorios/allennlp/allennlp/models/basic_classifier.py", line 121, in forward
        embedded_text = self._text_field_embedder(tokens)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/repositorios/allennlp/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 88, in forward
        token_vectors = embedder(**tensors, **forward_params_values)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/repositorios/allennlp/allennlp/modules/token_embedders/pretrained_transformer_embedder.py", line 184, in forward
        transformer_output = self.transformer_model(**parameters)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 762, in forward
        output_hidden_states=output_hidden_states,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 439, in forward
        output_attentions,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 371, in forward
        hidden_states, attention_mask, head_mask, output_attentions=output_attentions,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 315, in forward
        hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 221, in forward
        mixed_query_layer = self.query(hidden_states)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 91, in forward
        return F.linear(input, self.weight, self.bias)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/functional.py", line 1676, in linear
        output = input.matmul(weight.t())
    RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
    python-BaseException
    THCudaCheck FAIL file=/pytorch/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=710 : device-side assert triggered
    

    Related issues or possible duplicates

    • None

    Environment

    OS: Linux

    Python version: 3.7.7

    Output of pip freeze:

    allennlp==1.1.0
    allennlp-models==1.1.0
    -e [email protected]:allenai/[email protected]#egg=allennlp_server
    attrs==19.3.0
    backcall==0.2.0
    bleach==3.1.5
    blis==0.4.1
    boto3==1.14.31
    botocore==1.17.31
    cachetools==4.1.1
    catalogue==1.0.0
    certifi==2020.6.20
    chardet==3.0.4
    click==7.1.2
    conllu==4.1
    cycler==0.10.0
    cymem==2.0.3
    cytoolz==0.10.1
    decorator==4.4.2
    defusedxml==0.6.0
    docutils==0.15.2
    eland==7.7.0a1
    elasticsearch-dsl==7.2.1
    en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz
    entrypoints==0.3
    filelock==3.0.12
    fire==0.3.1
    Flask==1.1.2
    Flask-Cors==3.0.8
    ftfy==5.8
    future==0.18.2
    gevent==20.6.2
    greenlet==0.4.16
    h5py==2.10.0
    idna==2.10
    importlib-metadata==1.7.0
    iniconfig==1.0.1
    ipykernel==5.3.4
    ipython==7.16.1
    ipython-genutils==0.2.0
    ipywidgets==7.5.1
    itsdangerous==1.1.0
    jedi==0.17.2
    jellyfish==0.8.2
    Jinja2==2.11.2
    jmespath==0.10.0
    joblib==0.16.0
    jsonnet==0.16.0
    jsonpickle==1.4.1
    jsonschema==3.2.0
    jupyter-client==6.1.6
    jupyter-core==4.6.3
    Keras==2.4.3
    kiwisolver==1.2.0
    MarkupSafe==1.1.1
    matplotlib==3.3.0
    mistune==0.8.4
    mkl-fft==1.1.0
    mkl-random==1.1.1
    mkl-service==2.3.0
    more-itertools==8.4.0
    murmurhash==1.0.2
    nbconvert==5.6.1
    nbformat==5.0.7
    networkx==2.4
    nltk==3.5
    notebook==6.0.3
    numpy==1.18.5
    olefile==0.46
    overrides==3.1.0
    packaging==20.4
    pandas==1.1.0
    pandocfilters==1.4.2
    parso==0.7.1
    pexpect==4.8.0
    pickleshare==0.7.5
    Pillow==7.2.0
    plac==1.1.3
    pluggy==0.13.1
    preshed==3.0.2
    prometheus-client==0.8.0
    prompt-toolkit==3.0.5
    protobuf==3.12.4
    ptyprocess==0.6.0
    py==1.9.0
    py-rouge==1.1
    pydot==1.4.1
    pyemd==0.5.1
    Pygments==2.6.1
    pyparsing==2.4.7
    Pyphen==0.9.5
    pyrsistent==0.16.0
    pytest==6.0.1
    python-dateutil==2.8.1
    pytz==2020.1
    PyYAML==5.3.1
    pyzmq==19.0.1
    regex==2020.7.14
    requests==2.24.0
    s3transfer==0.3.3
    sacremoses==0.0.43
    scikit-learn==0.23.1
    scipy==1.5.2
    seaborn==0.11.0
    Send2Trash==1.5.0
    sentencepiece==0.1.91
    seqeval==0.0.12
    six==1.15.0
    spacy==2.3.2
    srsly==1.0.2
    tensorboardX==2.1
    termcolor==1.1.0
    terminado==0.8.3
    testpath==0.4.4
    thinc==7.4.1
    threadpoolctl==2.1.0
    tokenizers==0.8.1rc1
    toml==0.10.1
    toolz==0.10.0
    torch==1.6.0+cu101
    torchvision==0.7.0+cu101
    tornado==6.0.4
    tqdm==4.48.0
    traitlets==4.3.3
    transformers==3.0.2
    urllib3==1.25.10
    visualise-spacy-tree==0.0.6
    wasabi==0.7.1
    wcwidth==0.2.5
    webencodings==0.5.1
    Werkzeug==1.0.1
    widgetsnbextension==3.5.1
    word2number==1.1
    zipp==3.1.0
    zope.event==4.4
    zope.interface==5.1.0
    

    Steps to reproduce

    First I tried adding the 5 additional special tokens directly in the jsonnet model config, like this:

        "token_indexers": {
                "tokens": {
                    "type": "pretrained_transformer",
                    "model_name": transformer_model,
                    "max_length": transformer_dim,
                    "tokenizer_kwargs": {"additional_special_tokens": [['<REL_SEP>'], ['[['], [']]'], ['<<'], ['>>']], "max_len": transformer_dim}
                }
         },
    

    But I ran into a problem at allennlp.common.cached_transformers.get_tokenizer, because cache_key = (model_name, frozenset(kwargs.items())) tries to use the "tokenizer_kwargs" value as part of the cache key, but it can't hash the additional_special_tokens list, throwing the following exception:

    TypeError: unhashable type: 'list'

    Traceback (most recent call last):
      File "/media/discoD/pycharm-community-2019.2/plugins/python-ce/helpers/pydev/pydevd.py", line 1465, in _exec
        runpy._run_module_as_main(module_name, alter_argv=False)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/media/discoD/repositorios/allennlp/allennlp/__main__.py", line 38, in <module>
        run()
      File "/media/discoD/repositorios/allennlp/allennlp/__main__.py", line 34, in run
        main(prog="allennlp")
      File "/media/discoD/repositorios/allennlp/allennlp/commands/__init__.py", line 94, in main
        args.func(args)
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 118, in train_model_from_args
        file_friendly_logging=args.file_friendly_logging,
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 177, in train_model_from_file
        file_friendly_logging=file_friendly_logging,
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 238, in train_model
        file_friendly_logging=file_friendly_logging,
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 433, in _train_worker
        local_rank=process_rank,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 599, in from_params
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 626, in from_params
        kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 197, in create_kwargs
        cls.__name__, param_name, annotation, param.default, params, **extras
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 306, in pop_and_construct_arg
        return construct_arg(class_name, name, popped_params, annotation, default, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 340, in construct_arg
        return annotation.from_params(params=popped_params, **subextras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 599, in from_params
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 626, in from_params
        kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 197, in create_kwargs
        cls.__name__, param_name, annotation, param.default, params, **extras
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 306, in pop_and_construct_arg
        return construct_arg(class_name, name, popped_params, annotation, default, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 387, in construct_arg
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 340, in construct_arg
        return annotation.from_params(params=popped_params, **subextras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 599, in from_params
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 628, in from_params
        return constructor_to_call(**kwargs)  # type: ignore
      File "/media/discoD/repositorios/allennlp/allennlp/data/token_indexers/pretrained_transformer_indexer.py", line 58, in __init__
        model_name, tokenizer_kwargs=tokenizer_kwargs
      File "/media/discoD/repositorios/allennlp/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py", line 71, in __init__
        model_name, add_special_tokens=False, **tokenizer_kwargs
      File "/media/discoD/repositorios/allennlp/allennlp/common/cached_transformers.py", line 101, in get_tokenizer
        cache_key = (model_name, frozenset(kwargs.items()))
    TypeError: unhashable type: 'list'
    

    I couldn't find a way to make passing the tokens this way work, so I ended up downloading the BERT model to my local disk and adding the tokenizer config files to the same path (the vocab size of my BERT model is 29794, so the last index is 29793). The contents of the files I changed are in the "Example source" section below.

    After debugging, it looks like this config was at least enough to get the BERT tokenizer to recognize the 5 tokens and tokenize the training data accordingly, but then I ran into another issue once training actually began (the one pasted in the "Python traceback" section of this issue).

    It looks like this error is due to the fact that the transformer model's embedding layer wasn't resized to the new vocabulary size, which would be accomplished with code like model.resize_token_embeddings(len(tokenizer)). I didn't find any code in the AllenNLP lib that does something like this, so I'm thinking this is the cause of the issue.

    Is there another way to accomplish this using AllenNLP that I'm not aware of? Looks like both ways to expand the vocab size should be possible.

    Example source:

    added_tokens.json:

    {"<REL_SEP>": 29794, "[[": 29795, "]]": 29796, "<<": 29797, ">>": 29798}

    special_tokens_map.json:

    {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "additional_special_tokens": ["<REL_SEP>", "[[", "]]", "<<", ">>"]}

    tokenizer_config.json:

    {"do_lower_case": false, "additional_special_tokens": ["<REL_SEP>", "[[", "]]", "<<", ">>"]}

    Thanks!

    bug Contributions welcome 
    opened by pvcastro 32
  • [Contribution] DeepSpeed Integration

    DeepSpeed background

    DeepSpeed is a distributed training engine for PyTorch, primarily for training very large language models with significantly less memory. For example, the 17.7 billion parameter Turing-NLG was trained with DeepSpeed's ZeRO optimizer.

    Proposal

    It seems like a natural fit to have a way to use this with AllenNLP for large, distributed experiments, and it shouldn't require any major changes to integrate. Their training loop looks like:

    # https://www.deepspeed.ai/getting-started/#training
    model_engine, optimizer, _, _ = deepspeed.initialize(args=cmd_args,
                                                         model=model,
                                                         model_parameters=params)
    for step, batch in enumerate(data_loader):
        #forward() method
        loss = model_engine(batch)
    
        #runs backpropagation
        model_engine.backward(loss)
    
        #weight update
        model_engine.step()
    

    In terms of where it would fit into the library, I think a standalone DeepSpeedTrainer(Trainer) subclass would make sense. It should be fairly similar to GradientDescentTrainer (minus stuff that DeepSpeed handles itself, like gradient accumulation). It could then be initialized from a config file by the user as per usual.

    I know not having dependencies on other libraries is a point of emphasis. It should be possible to include this without adding deepspeed as a dependency, letting the user install it independently, by doing something like:

    # allennlp.training.__init__.py
    # ...
    try:
      from allennlp.training.deepspeed_trainer import DeepSpeedTrainer
    except ImportError:
      pass # maybe a warning here or something
    

    Initial results

    I was able to get a prototype up and running pretty easily. I didn't subclass GradientDescentTrainer (I had a lot of trouble doing that, for whatever reason), but I just copied and pasted the code and started ripping stuff out as I went.

    I set up a training experiment for a basic classifier on the first 10k instances of SST using RoBERTa-base across two GPUs. The GradientDescentTrainer completed an epoch in 20.40s, using 8936MB / 10202MB of GPU memory. The DeepSpeedTrainer prototype completed an epoch in 46.91s, using just 4184MB / 4348MB of GPU memory (less than half!). I don't know why it took so much longer, but I strongly suspect it's something I implemented wrong myself.

    The repo for this prototype is here.

    Potential obstacles

    • I have plenty of time to implement this if it would be a useful addition, but I have little to no idea what I'm doing; I'm not particularly experienced with heavy distributed training.
    • With all due respect, their library could maybe be a bit better documented and is seriously challenging to install and get everything compiled just right.
      • That said, a lot of the latter point might be a product of my setup. I'm working on SLURM instead of a personal VM which makes using their Docker image or installing from source harder than it really is.

    Next steps

    I think this could be a useful addition if (1) it's really halving GPU memory for transformer models and (2) it can be implemented non-intrusively. If you guys agree, I can move my prototype code from my repository into an actual PR.

    Feature request 
    opened by jacobdanovitch 32
  • Installing allennlp through pip using conda virtual environment fails

    Describe the bug

    I am trying to install allennlp through pip under a conda virtual environment; however, it fails and leaves error messages like this:

    Building wheels for collected packages: overrides, jsonnet, nltk, parsimonious, numpydoc, msgpack, regex, ujson, dill, jsondiff, PyYAML, wrapt, cytoolz, future, toolz
      Running setup.py bdist_wheel for overrides ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f7/27/b8/b4f46c59426a11e7f2d4e472b870ec14c21b4beab2e1afa725
      Running setup.py bdist_wheel for jsonnet ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/jsonnet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-t66ososi --python-tag cp37:
      running bdist_wheel
      running build
      running build_ext
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/desugarer.cpp -o core/desugarer.o
      core/desugarer.cpp: In member function ‘void Desugarer::desugar(AST*&, unsigned int)’:
      core/desugarer.cpp:612:51: warning: this statement may fall through [-Wimplicit-fallthrough=]
                       case BOP_MANIFEST_UNEQUAL: invert = true;
                                                  ~~~~~~~^~~~~~
      core/desugarer.cpp:613:17: note: here
                       case BOP_MANIFEST_EQUAL: {
                       ^~~~
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/formatter.cpp -o core/formatter.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/libjsonnet.cpp -o core/libjsonnet.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/lexer.cpp -o core/lexer.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/parser.cpp -o core/parser.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/pass.cpp -o core/pass.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/static_analysis.cpp -o core/static_analysis.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/string_utils.cpp -o core/string_utils.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/vm.cpp -o core/vm.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 third_party/md5/md5.cpp -o third_party/md5/md5.o
      building '_jsonnet' extension
      creating build
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/python
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Iinclude -Ithird_party/md5 -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c python/_jsonnet.c -o build/temp.linux-x86_64-3.7/python/_jsonnet.o
      python/_jsonnet.c: In function ‘cpython_native_callback’:
      python/_jsonnet.c:147:19: warning: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘const long unsigned int’} [-Wsign-compare]
           for (i = 0; i < ctx->argc; ++i) {
                         ^
      creating build/lib.linux-x86_64-3.7
      g++ -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/python/_jsonnet.o core/desugarer.o core/formatter.o core/libjsonnet.o core/lexer.o core/parser.o core/pass.o core/static_analysis.o core/string_utils.o core/vm.o third_party/md5/md5.o -o build/lib.linux-x86_64-3.7/_jsonnet.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/python/_jsonnet.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'g++' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for jsonnet
      Running setup.py clean for jsonnet
      Running setup.py bdist_wheel for nltk ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f1/98/72/c2ba4734bc46df30b9c3bd3eb037c52ab8ae0110f8fa15200a
      Running setup.py bdist_wheel for parsimonious ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f1/a4/4b/7cac60fa74b7c16017cd9c67ab65736d3d9318064ae65e0ee0
      Running setup.py bdist_wheel for numpydoc ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/11/76/d4/16c19c2378616c3389916bc6d7b1134b72bfe6f7abd9f80243
      Running setup.py bdist_wheel for msgpack ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/3f/78/5a/92a8797deabe61189baf597a855e9529f6b20a391d9924d968
      Running setup.py bdist_wheel for regex ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/regex/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-6v75m_hn --python-tag cp37:
      /home/ichn/anaconda3/envs/torch/lib/python3.7/site-packages/setuptools/dist.py:470: UserWarning: Normalizing '2018.01.10' to '2018.1.10'
        normalized_version,
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-3.7
      copying regex_3/regex.py -> build/lib.linux-x86_64-3.7
      copying regex_3/_regex_core.py -> build/lib.linux-x86_64-3.7
      copying regex_3/test_regex.py -> build/lib.linux-x86_64-3.7
      running build_ext
      building '_regex' extension
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/regex_3
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c regex_3/_regex.c -o build/temp.linux-x86_64-3.7/regex_3/_regex.o
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c regex_3/_regex_unicode.c -o build/temp.linux-x86_64-3.7/regex_3/_regex_unicode.o
      gcc -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/regex_3/_regex.o build/temp.linux-x86_64-3.7/regex_3/_regex_unicode.o -o build/lib.linux-x86_64-3.7/_regex.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/regex_3/_regex.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'gcc' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for regex
      Running setup.py clean for regex
      Running setup.py bdist_wheel for ujson ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/ujson/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-97buh6vk --python-tag cp37:
      Warning: 'classifiers' should be a list, got type 'filter'
      running bdist_wheel
      running build
      running build_ext
      building 'ujson' extension
      creating build
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/python
      creating build/temp.linux-x86_64-3.7/lib
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./python/ujson.c -o build/temp.linux-x86_64-3.7/./python/ujson.o -D_GNU_SOURCE
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./python/objToJSON.c -o build/temp.linux-x86_64-3.7/./python/objToJSON.o -D_GNU_SOURCE
      ./python/objToJSON.c: In function ‘PyUnicodeToUTF8’:
      ./python/objToJSON.c:154:18: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
           char *data = PyUnicode_AsUTF8AndSize(obj, &len);
                        ^~~~~~~~~~~~~~~~~~~~~~~
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./python/JSONtoObj.c -o build/temp.linux-x86_64-3.7/./python/JSONtoObj.o -D_GNU_SOURCE
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./lib/ultrajsonenc.c -o build/temp.linux-x86_64-3.7/./lib/ultrajsonenc.o -D_GNU_SOURCE
      ./lib/ultrajsonenc.c:156:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0x000f) >> 0];
                             ^~~~~~~~~~
      ./lib/ultrajsonenc.c:155:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0x00f0) >> 4];
                             ^~~~~~~~~~
      ./lib/ultrajsonenc.c:154:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0x0f00) >> 8];
                             ^~~~~~~~~~
      ./lib/ultrajsonenc.c:153:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0xf000) >> 12];
                             ^~~~~~~~~~
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./lib/ultrajsondec.c -o build/temp.linux-x86_64-3.7/./lib/ultrajsondec.o -D_GNU_SOURCE
      creating build/lib.linux-x86_64-3.7
      gcc -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/./python/ujson.o build/temp.linux-x86_64-3.7/./python/objToJSON.o build/temp.linux-x86_64-3.7/./python/JSONtoObj.o build/temp.linux-x86_64-3.7/./lib/ultrajsonenc.o build/temp.linux-x86_64-3.7/./lib/ultrajsondec.o -o build/lib.linux-x86_64-3.7/ujson.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/./python/ujson.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'gcc' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for ujson
      Running setup.py clean for ujson
      Running setup.py bdist_wheel for dill ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f6/d1/a7/c90dbb9c5613295c70d96d60e78c3e2b283143fddbbd57e14d
      Running setup.py bdist_wheel for jsondiff ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/8f/c9/36/f9e8aea16af567ce91abbe6b8b6b650877b9e17ce8aa97fb42
      Running setup.py bdist_wheel for PyYAML ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/11/c5/f8/4e054145468ca00fd2ab4a6c20bf7e09ec57b879572c865ee6
      Running setup.py bdist_wheel for wrapt ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/10/a6/59/eab55ff1e60d10ca0404baf6e7b8baf52908091133608bf289
      Running setup.py bdist_wheel for cytoolz ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/cytoolz/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-ip_zgny_ --python-tag cp37:
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-3.7
      creating build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/_signatures.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/__init__.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/utils_test.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/_version.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/compatibility.py -> build/lib.linux-x86_64-3.7/cytoolz
      creating build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/curried/operator.py -> build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/curried/__init__.py -> build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/curried/exceptions.py -> build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/dicttoolz.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/itertoolz.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/utils.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/recipes.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/functoolz.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/dicttoolz.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/__init__.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/recipes.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/utils.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/functoolz.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/itertoolz.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/cpython.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      creating build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_none_safe.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_recipes.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_curried.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_tlz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_itertoolz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_functoolz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/dev_skip_test.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_embedded_sigs.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_utils.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_docstrings.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_inspect_args.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_doctests.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_curried_toolzlike.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_serialization.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_compatibility.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_signatures.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_dev_skip_test.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_dicttoolz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      running build_ext
      building 'cytoolz.dicttoolz' extension
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/cytoolz
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c cytoolz/dicttoolz.c -o build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o
      gcc -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o -o build/lib.linux-x86_64-3.7/cytoolz/dicttoolz.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'gcc' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for cytoolz
      Running setup.py clean for cytoolz
      Running setup.py bdist_wheel for future ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/3f/66/fe/9c4fd5c707a9f26993ba157f0752d84d5c7e26aedbefb84f76
      Running setup.py bdist_wheel for toolz ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/73/ad/e1/f8fe78eeb9e2b31ea8396419d92adc107c553ff7eb47ad12d9
    Successfully built overrides nltk parsimonious numpydoc msgpack dill jsondiff PyYAML wrapt future toolz
    Failed to build jsonnet regex ujson cytoolz
    Installing collected packages: overrides, jsonnet, wcwidth, ftfy, singledispatch, nltk, regex, murmurhash, ujson, cymem, dill, idna, chardet, urllib3, requests, msgpack, msgpack-numpy, tqdm, wrapt, preshed, toolz, cytoolz, plac, thinc, spacy, sqlparse, itsdangerous, click, werkzeug, MarkupSafe, Jinja2, flask, flask-cors, editdistance, flaky, cycler, pytz, kiwisolver, python-dateutil, pyparsing, matplotlib, greenlet, gevent, atomicwrites, py, pluggy, attrs, more-itertools, pytest, responses, jsonpickle, aws-xray-sdk, cookies, xmltodict, pbr, mock, jsondiff, jmespath, docutils, botocore, s3transfer, boto3, PyYAML, pyaml, boto, websocket-client, docker-pycreds, docker, asn1crypto, cryptography, future, ecdsa, pycryptodome, python-jose, moto, parsimonious, Pygments, alabaster, sphinxcontrib-websupport, imagesize, snowballstemmer, babel, packaging, sphinx, numpydoc, protobuf, tensorboardX, h5py, conllu, scipy, scikit-learn, unidecode, pytorch-pretrained-bert, colorama, pyasn1, rsa, awscli, allennlp
      Running setup.py install for jsonnet ... error
        Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/jsonnet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-3ccckdjt/install-record.txt --single-version-externally-managed --compile:
        running install
        running build
        running build_ext
        make: 'core/desugarer.o' is up to date.
        make: 'core/formatter.o' is up to date.
        make: 'core/libjsonnet.o' is up to date.
        make: 'core/lexer.o' is up to date.
        make: 'core/parser.o' is up to date.
        make: 'core/pass.o' is up to date.
        make: 'core/static_analysis.o' is up to date.
        make: 'core/string_utils.o' is up to date.
        make: 'core/vm.o' is up to date.
        make: 'third_party/md5/md5.o' is up to date.
        building '_jsonnet' extension
        creating build
        creating build/temp.linux-x86_64-3.7
        creating build/temp.linux-x86_64-3.7/python
        gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Iinclude -Ithird_party/md5 -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c python/_jsonnet.c -o build/temp.linux-x86_64-3.7/python/_jsonnet.o
        python/_jsonnet.c: In function ‘cpython_native_callback’:
        python/_jsonnet.c:147:19: warning: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘const long unsigned int’} [-Wsign-compare]
             for (i = 0; i < ctx->argc; ++i) {
                           ^
        creating build/lib.linux-x86_64-3.7
        g++ -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/python/_jsonnet.o core/desugarer.o core/formatter.o core/libjsonnet.o core/lexer.o core/parser.o core/pass.o core/static_analysis.o core/string_utils.o core/vm.o third_party/md5/md5.o -o build/lib.linux-x86_64-3.7/_jsonnet.cpython-37m-x86_64-linux-gnu.so
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        build/temp.linux-x86_64-3.7/python/_jsonnet.o: file not recognized: file format not recognized
        collect2: error: ld returned 1 exit status
        error: command 'g++' failed with exit status 1
        
        ----------------------------------------
    Command "/home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/jsonnet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-3ccckdjt/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-grk1qblh/jsonnet/
    

    To Reproduce

    I am using conda 5.3.1 under Arch Linux with pytorch==1.0.0 preinstalled as part of the environment, and the following gcc version:

    (torch) ➜  ~ g++ --version
    g++ (GCC) 8.2.1 20181127
    Copyright (C) 2018 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    

    I also tried installing gcc through conda and then re-running

    pip install allennlp

    but it fails as well.

    Expected behavior

    allennlp is installed successfully

    System (please complete the following information):

    • Linux
    • Python version: 3.7.2
    • AllenNLP version: I installed from master
    • PyTorch version: 1.0.0
    opened by ichn-hu 32
  • Multi-GPU training hangs

    Multi-GPU training hangs

    Checklist

    • [x] I have verified that the issue exists against the main branch of AllenNLP.
    • [x] I have read the relevant section in the contribution guide on reporting bugs.
    • [x] I have checked the issues list for similar or identical bug reports.
    • [x] I have checked the pull requests list for existing proposed fixes.
    • [x] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the main branch.
    • [x] I have included in the "Description" section below a traceback from any exceptions related to this bug.
    • [x] I have included in the "Related issues or possible duplicates" section beloew all related issues and possible duplicate issues (If there are none, check this box anyway).
    • [x] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
    • [x] I have included in the "Environment" section below the output of pip freeze.
    • [x] I have included in the "Steps to reproduce" section below a minimally reproducible example.

    Description

    I am trying to run multi-GPU training (using 4 GPUs), but it hangs after a few iterations (roughly 15). This happens both with my custom model and with models in allennlp-models (I tried roberta-large).

    Related issues or possible duplicates

    • None

    Environment

    OS: Deep Learning AMI (Ubuntu 18.04) Version 42.1 -- AWS EC2 p3.8xlarge

    Python version: Python 3.8 installed via Anaconda

    Steps to reproduce

    I have installed allennlp-models and changed the configuration file reported above as follows:

    local transformer_model = "roberta-base";
    local transformer_dim = 768;
    
    {
      "dataset_reader":{
        "type": "boolq",
        "token_indexers": {
          "tokens": {
            "type": "pretrained_transformer",
            "model_name": transformer_model,
          }
        },
        "tokenizer": {
          "type": "pretrained_transformer",
          "model_name": transformer_model,
        }
      },
      "train_data_path": "https://storage.googleapis.com/allennlp-public-data/BoolQ.zip!BoolQ/train.jsonl",
      "validation_data_path": "https://storage.googleapis.com/allennlp-public-data/BoolQ.zip!BoolQ/val.jsonl",
      "test_data_path": "https://storage.googleapis.com/allennlp-public-data/BoolQ.zip!BoolQ/test.jsonl",
      "model": {
        "type": "basic_classifier",
        "text_field_embedder": {
          "token_embedders": {
            "tokens": {
              "type": "pretrained_transformer",
              "model_name": transformer_model,
            }
          }
        },
        "seq2vec_encoder": {
           "type": "bert_pooler",
           "pretrained_model": transformer_model,
           "dropout": 0.1,
        },
        "namespace": "tags",
        "num_labels": 2,
      },
      "data_loader": {
        "batch_sampler": {
          "type": "bucket",
          "sorting_keys": ["tokens"],
          "batch_size" : 4
        }
      },
      "distributed": {
          "cuda_devices": [0,1,2,3]
      },
      "trainer": {
        "num_epochs": 10,
        "num_gradient_accumulation_steps": 2,
        "validation_metric": "+accuracy",
        "learning_rate_scheduler": {
          "type": "slanted_triangular",
          "num_epochs": 10,
          "num_steps_per_epoch": 3088,
          "cut_frac": 0.06
        },
        "optimizer": {
          "type": "huggingface_adamw",
          "lr": 1e-5,
          "weight_decay": 0.1,
        }
      },
    }
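
    Before digging into the hang itself, a quick environment sanity check can rule out the most common multi-GPU setup problems. This is only a diagnostic sketch, not a fix; it just confirms that the devices listed under "cuda_devices" are actually visible and that the NCCL backend is available.

    # Diagnostic sketch only: verify the devices in "cuda_devices": [0, 1, 2, 3]
    # are visible and that NCCL (used for multi-GPU training) is available.
    import torch
    import torch.distributed as dist

    print("torch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    print("visible GPUs:", torch.cuda.device_count())  # should be >= 4 for cuda_devices [0, 1, 2, 3]
    if dist.is_available():
        print("NCCL available:", dist.is_nccl_available())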
    

    @epwalsh Any ideas?

    bug 
    opened by aleSuglia 31
  • [Suggestion] Callbacks in Trainer

    [Suggestion] Callbacks in Trainer

    Hello,

    First off, thank you so much for creating and maintaining this amazing library! I'm really loving allennlp so far, but there's one aspect that I think could use some improvement: the Trainer. I think the adoption of Callbacks would make the code much more readable, maintainable, and extendable.

    Is your feature request related to a problem? Please describe.

    • The Trainer code feels bloated and is hard to navigate due to the sheer amount of bookkeeping that is taking place regarding tensorboard, checkpointing, etc.
    • Adding extra behavior to the trainer (e.g. custom logging, adding semi-supervised steps, performing custom sanity checks during training) requires modifying the training code. It would be better if the user could simply inject this behavior using an external class.

    Describe the solution you'd like

    Frameworks like Keras and fast.ai have adopted the Callback pattern to deal with this problem. In fact, checkpointing and tensorboard logging are both implemented as callbacks in Keras. Refactoring the trainer to route much of this bookkeeping through a callback system would make custom training behavior easier to implement.
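
    To make the proposal concrete, here is a minimal sketch of the kind of hook-based interface being suggested. All names here (Callback, on_epoch_end, MetricsLogger, etc.) are hypothetical illustrations, not an existing AllenNLP API.

    # Hypothetical sketch of a callback interface; not AllenNLP's actual API.
    from typing import Any, Dict


    class Callback:
        """Every hook is a no-op by default; subclasses override what they need."""

        def on_epoch_start(self, trainer: Any) -> None:
            pass

        def on_batch_end(self, trainer: Any, batch_loss: float) -> None:
            pass

        def on_epoch_end(self, trainer: Any, metrics: Dict[str, float]) -> None:
            pass


    class MetricsLogger(Callback):
        """Custom behavior injected without modifying the trainer itself."""

        def on_epoch_end(self, trainer: Any, metrics: Dict[str, float]) -> None:
            print({name: round(value, 4) for name, value in metrics.items()})


    # Inside the trainer, bookkeeping (tensorboard, checkpointing, ...) would then
    # reduce to dispatch, e.g.: for cb in self._callbacks: cb.on_epoch_end(self, metrics)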

    Describe alternatives you've considered

    I can't think of any better alternatives at the moment: the fact that both Keras and fast.ai, which put heavy emphasis on the end-user experience, have adopted this pattern seems to be a strong indication of its merits. That being said, I imagine the allennlp team has considered callbacks at some point, so if there is a reason you are avoiding this pattern, I would love to know why!

    If Callbacks seem like a good addition, I'd love to file a PR on this issue, but since this would be a major refactoring I won't be able to finish it any time soon. Thanks!

    opened by keitakurita 31
  • Unable to `pip install allennlp-models`. Torch version and blis compile issues.

    Unable to `pip install allennlp-models`. Torch version and blis compile issues.

    Checklist

    • [X] I have verified that the issue exists against the main branch of AllenNLP.
    • [X] I have read the relevant section in the contribution guide on reporting bugs.
    • [X] I have checked the issues list for similar or identical bug reports.
    • [X] I have checked the pull requests list for existing proposed fixes.
    • [X] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the main branch.
    • [X] I have included in the "Description" section below a traceback from any exceptions related to this bug.
    • [X] I have included in the "Related issues or possible duplicates" section beloew all related issues and possible duplicate issues (If there are none, check this box anyway).
    • [X] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
    • [X] I have included in the "Environment" section below the output of pip freeze.
    • [X] I have included in the "Steps to reproduce" section below a minimally reproducible example.

    Description

    After installing PyTorch (1.11) and AllenNLP (2.9.2) via pip in a conda env, I am unable to pip install allennlp-models. I get one of two errors, detailed below.

    Python traceback:

    ## Error 1
    
    ERROR: Cannot install allennlp-models==1.2.1, allennlp-models==1.2.2, allennlp-models==1.3.0, allennlp-models==1.4.0, allennlp-models==1.4.1, allennlp-models==1.5.0, allennlp-models==2.0.0, allennlp-models==2.0.1, allennlp-models==2.1.0, allennlp-models==2.2.0, allennlp-models==2.3.0, allennlp-models==2.4.0, allennlp-models==2.5.0, allennlp-models==2.6.0, allennlp-models==2.7.0, allennlp-models==2.8.0 and allennlp-models==2.9.0 because these package versions have conflicting dependencies.
    
    The conflict is caused by:
        allennlp-models 2.9.0 depends on torch<1.11.0 and >=1.7.0
        allennlp-models 2.8.0 depends on torch<1.11.0 and >=1.7.0
        allennlp-models 2.7.0 depends on torch<1.10.0 and >=1.7.0
        allennlp-models 2.6.0 depends on torch<1.10.0 and >=1.7.0
        allennlp-models 2.5.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.4.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.3.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.2.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.1.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 2.0.1 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 2.0.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.5.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.4.1 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.4.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.3.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.2.2 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.2.1 depends on torch<1.8.0 and >=1.7.0
    
    To fix this you could try to:
    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict
    
    ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
    
    ## Error 2
    
    Compiler gcc
                building 'blis.cy' extension
                creating build/temp.linux-x86_64-3.10
                creating build/temp.linux-x86_64-3.10/blis
                gcc -pthread -B /home/brochstilley/miniforge3/envs/allennlp-api/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/brochstilley/miniforge3/envs/allennlp-api/include -fPIC -O2 -isystem /home/brochstilley/miniforge3/envs/allennlp-api/include -fPIC -I/tmp/pip-install-_etaqpbt/blis_322118b0f1e54b949f5124bb925aef68/include -I/tmp/pip-install-_etaqpbt/blis_322118b0f1e54b949f5124bb925aef68/blis/_src/include/linux-x86_64 -I/home/brochstilley/miniforge3/envs/allennlp-api/include/python3.10 -c blis/cy.c -o build/temp.linux-x86_64-3.10/blis/cy.o -std=c99
                gcc: error: blis/cy.c: No such file or directory
                gcc: fatal error: no input files
                compilation terminated.
                error: command '/usr/bin/gcc' failed with exit code 1
                [end of output]
          
            note: This error originates from a subprocess, and is likely not a problem with pip.
          error: legacy-install-failure
          
          × Encountered error while trying to install package.
          ╰─> blis
          
          note: This is an issue with the package mentioned above, not pip.
          hint: See above for output from the failure.
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: subprocess-exited-with-error
    
    × pip subprocess to install build dependencies did not run successfully.
    │ exit code: 1
    ╰─> See above for output.
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
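
    Error 1 above is a plain resolver conflict: every allennlp-models release listed pins torch below 1.11, while the environment (see pip freeze below) has torch==1.11.0. A small illustrative check of that constraint, using the packaging library:

    # Illustrative only: reproduce the resolver's complaint from Error 1.
    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    installed = Version("1.11.0")                # torch==1.11.0 from `pip freeze`
    required = SpecifierSet(">=1.7.0,<1.11.0")   # allennlp-models 2.9.0's torch pin

    print(installed in required)  # False -> pip reports ResolutionImpossible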
    

    Related issues or possible duplicates

    • None

    Environment

    OS: Linux (KDE Neon (ubuntu))

    Python version: 3.10.4

    Output of pip freeze:

    allennlp==2.9.2
    argon2-cffi==21.3.0
    argon2-cffi-bindings==21.2.0
    asttokens==2.0.5
    attrs==21.4.0
    backcall==0.2.0
    backports.csv==1.0.7
    base58==2.1.1
    beautifulsoup4==4.10.0
    bleach==4.1.0
    blis==0.7.7
    boto3==1.21.33
    botocore==1.24.33
    cached-path==1.1.1
    cachetools==5.0.0
    catalogue==2.0.7
    certifi==2021.10.8
    cffi==1.15.0
    chardet==4.0.0
    charset-normalizer==2.0.12
    checklist==0.0.11
    cheroot==8.6.0
    CherryPy==18.6.1
    click==8.0.4
    cryptography==36.0.2
    cymem==2.0.6
    debugpy==1.6.0
    decorator==5.1.1
    defusedxml==0.7.1
    dill==0.3.4
    docker-pycreds==0.4.0
    entrypoints==0.4
    executing==0.8.3
    fairscale==0.4.6
    fastjsonschema==2.15.3
    feedparser==6.0.8
    filelock==3.6.0
    Flask==2.1.1
    future==0.18.2
    gitdb==4.0.9
    GitPython==3.1.27
    google-api-core==2.7.1
    google-auth==2.6.2
    google-cloud-core==2.2.3
    google-cloud-storage==2.2.1
    google-crc32c==1.3.0
    google-resumable-media==2.3.2
    googleapis-common-protos==1.56.0
    h5py==3.6.0
    huggingface-hub==0.4.0
    idna==3.3
    iniconfig==1.1.1
    ipykernel==6.12.1
    ipython==8.2.0
    ipython-genutils==0.2.0
    ipywidgets==7.7.0
    iso-639==0.4.5
    itsdangerous==2.1.2
    jaraco.classes==3.2.1
    jaraco.collections==3.5.1
    jaraco.context==4.1.1
    jaraco.functools==3.5.0
    jaraco.text==3.7.0
    jedi==0.18.1
    Jinja2==3.1.1
    jmespath==1.0.0
    joblib==1.1.0
    jsonnet==0.18.0
    jsonschema==4.4.0
    jupyter==1.0.0
    jupyter-client==7.2.1
    jupyter-console==6.4.3
    jupyter-core==4.9.2
    jupyterlab-pygments==0.1.2
    jupyterlab-widgets==1.1.0
    langcodes==3.3.0
    lmdb==1.3.0
    lxml==4.8.0
    MarkupSafe==2.1.1
    matplotlib-inline==0.1.3
    mistune==0.8.4
    more-itertools==8.12.0
    munch==2.5.0
    murmurhash==1.0.6
    nbclient==0.5.13
    nbconvert==6.4.5
    nbformat==5.3.0
    nest-asyncio==1.5.5
    nltk==3.7
    notebook==6.4.10
    numpy @ file:///home/conda/feedstock_root/build_artifacts/numpy_1649059883087/work
    packaging==21.3
    pandocfilters==1.5.0
    parso==0.8.3
    pathtools==0.1.2
    pathy==0.6.1
    patternfork-nosql==3.6
    pdfminer.six==20220319
    pexpect==4.8.0
    pickleshare==0.7.5
    Pillow @ file:///home/conda/feedstock_root/build_artifacts/pillow_1648857107578/work
    pluggy==1.0.0
    portend==3.1.0
    preshed==3.0.6
    prometheus-client==0.13.1
    promise==2.3
    prompt-toolkit==3.0.29
    protobuf==3.20.0
    psutil==5.9.0
    ptyprocess==0.7.0
    pure-eval==0.2.2
    py==1.11.0
    pyasn1==0.4.8
    pyasn1-modules==0.2.8
    pycparser==2.21
    pydantic==1.8.2
    Pygments==2.11.2
    pyparsing==3.0.7
    pyrsistent==0.18.1
    pytest==7.1.1
    python-dateutil==2.8.2
    python-docx==0.8.11
    pytz==2022.1
    PyYAML==6.0
    pyzmq==22.3.0
    qtconsole==5.3.0
    QtPy==2.0.1
    regex==2022.3.15
    requests==2.27.1
    rsa==4.8
    s3transfer==0.5.2
    sacremoses==0.0.49
    scikit-learn==1.0.2
    scipy==1.8.0
    Send2Trash==1.8.0
    sentencepiece==0.1.96
    sentry-sdk==1.5.8
    setproctitle==1.2.2
    sgmllib3k==1.0.0
    shortuuid==1.0.8
    six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
    smart-open==5.2.1
    smmap==5.0.0
    soupsieve==2.3.1
    spacy==3.2.4
    spacy-legacy==3.0.9
    spacy-loggers==1.0.2
    srsly==2.4.2
    stack-data==0.2.0
    tempora==5.0.1
    tensorboardX==2.5
    termcolor==1.1.0
    terminado==0.13.3
    testpath==0.6.0
    thinc==8.0.15
    threadpoolctl==3.1.0
    tokenizers==0.11.6
    tomli==2.0.1
    torch==1.11.0
    torchaudio==0.11.0
    torchvision==0.12.0
    tornado==6.1
    tqdm==4.63.2
    traitlets==5.1.1
    transformers==4.17.0
    typer==0.4.1
    typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1644850595256/work
    urllib3==1.26.9
    wandb==0.12.11
    wasabi==0.9.1
    wcwidth==0.2.5
    webencodings==0.5.1
    Werkzeug==2.1.1
    widgetsnbextension==3.6.0
    yaspin==2.1.0
    zc.lockfile==2.0
    
    

    Steps to reproduce

    Example source:

    conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
    pip install allennlp
    pip install allennlp[all]
    pip install allennlp-models
    

    bug 
    opened by brochington 4
  • FSDP Accelerator auto_wrap ?

    FSDP Accelerator auto_wrap ?

    I am currently looking at the discussion here (https://github.com/allenai/allennlp/discussions/5433) and the code at https://github.com/allenai/allennlp/blob/1caf0dafa3bc8d0bb309a46e2ccb12f714923260/allennlp/nn/parallel/fairscale_fsdp_accelerator.py#L126-L127

    It seems like you have to manually wrap each individual unit of partition.

    Looking at the FairScale tutorial (https://fairscale.readthedocs.io/en/latest/tutorials/oss.html), there is an auto_wrap function that automatically wraps each submodule for you. This is incredibly convenient if you just want to wrap a huge pretrained transformer embedder yourself.

    Is there a possibility of providing an option to auto_wrap modules?
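
    For context, the convenience being asked for is roughly the following: walk the module tree and wrap every sufficiently large child automatically. This is only a conceptual sketch of the idea, not FairScale's implementation or its exact API; the wrapper argument stands in for something like FullyShardedDataParallel.

    # Conceptual sketch of "auto-wrapping": wrap every large submodule automatically.
    import torch.nn as nn


    def auto_wrap_sketch(module: nn.Module, wrapper, min_params: int = 100_000_000) -> nn.Module:
        for name, child in list(module.named_children()):
            child = auto_wrap_sketch(child, wrapper, min_params)
            if sum(p.numel() for p in child.parameters()) >= min_params:
                child = wrapper(child)  # e.g. wrapper=FullyShardedDataParallel
            setattr(module, name, child)
        return module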

    Feature request 
    opened by vikigenius 3
  • Don't cache reinit_modules

    Don't cache reinit_modules

    Fixes https://github.com/allenai/allennlp/pull/5505#issuecomment-1007540627

    Changes proposed in this pull request:

    • Don't cache transformers when reinit_modules is provided (a rough sketch of this caching behavior follows this list).
    • Removes reinit_modules from the transformer spec
    • Always load a new model when reinit_modules is not None
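
    Conceptually, this is the usual "bypass the cache when an argument makes the result non-reusable" pattern. A rough, hypothetical sketch of that behavior (not the actual cached_transformers code):

    # Hypothetical sketch: models with re-initialized weights are never cached,
    # because the cache key (the model name) no longer identifies the weights.
    from typing import Optional, Tuple

    _CACHE: dict = {}


    def _load(model_name: str, reinit_modules: Tuple[int, ...] = ()) -> dict:
        # Stand-in for actually loading (and optionally re-initializing) weights.
        return {"name": model_name, "reinitialized": reinit_modules}


    def get(model_name: str, reinit_modules: Optional[Tuple[int, ...]] = None) -> dict:
        if reinit_modules is not None:
            return _load(model_name, tuple(reinit_modules))  # always load fresh
        if model_name not in _CACHE:
            _CACHE[model_name] = _load(model_name)
        return _CACHE[model_name]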

    Before submitting

    • [X] I've read and followed all steps in the Making a pull request section of the CONTRIBUTING docs.
    • [X] I've updated or added any relevant docstrings following the syntax described in the Writing docstrings section of the CONTRIBUTING docs.
    • [ ] If this PR fixes a bug, I've added a test that will fail without my fix.
    • [ ] If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

    After submitting

    • [ ] All GitHub Actions jobs for my pull request have passed.
    • [ ] codecov/patch reports high test coverage (at least 90%). You can find this under the "Actions" tab of the pull request once the other checks have finished.
    opened by JohnGiorgi 4
  • Why not add transformer tokens to vocabulary in the init phase

    Why not add transformer tokens to vocabulary in the init phase

    I ran into a problem yesterday when I wanted to get the vocab_size of the vocabulary namespace transformer_tags, which I specify in PretrainedTransformerIndexer. I found that this namespace isn't defined yet when I want to use it in the Model init phase. To figure out the problem, I read this part of the source code. I found that the operation _add_encoding_to_vocabulary_if_needed is called in tokens_to_indices, which isn't called until _train_epoch starts.

    ...
    class PretrainedTransformerIndexer
    ...
       def _add_encoding_to_vocabulary_if_needed(self, vocab: Vocabulary) -> None:
            """
            Copies tokens from ```transformers``` model's vocab to the specified namespace.
            """
            if self._added_to_vocabulary:
                return
    
            vocab.add_transformer_vocab(self._tokenizer, self._namespace)
    
            self._added_to_vocabulary = True
    
        @overrides
        def count_vocab_items(self, token: Token, counter: Dict[str, Dict[str, int]]):
            # If we only use pretrained models, we don't need to do anything here.
            pass
    
        @overrides
        def tokens_to_indices(self, tokens: List[Token], vocabulary: Vocabulary) -> IndexedTokenList:
            self._add_encoding_to_vocabulary_if_needed(vocabulary)
    ...
    

    This means the transformer_tags namespace I defined in the vocabulary can only be used in the forward pass and afterwards. This causes a problem: I can't define a linear layer that transforms the model output to vocab_size logits. Of course, I can use the from_pretrained_transformer constructor to get the same namespace so that it is usable in the init phase. If this behavior is intentional, I wonder what the purpose of _add_encoding_to_vocabulary_if_needed in PretrainedTransformerIndexer is. Why not call _add_encoding_to_vocabulary_if_needed in the __init__ method of PretrainedTransformerIndexer so that the specified namespace can be used from the Model init phase?
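
    As noted above, one workaround is to build the vocabulary from the pretrained transformer up front, so the namespace already exists when the Model's __init__ runs. A rough sketch, assuming the from_pretrained_transformer constructor mentioned above accepts the model name and a target namespace:

    # Sketch of the workaround mentioned above (argument names are assumptions).
    from allennlp.data import Vocabulary

    vocab = Vocabulary.from_pretrained_transformer(
        model_name="bert-base-uncased",   # illustrative model name
        namespace="transformer_tags",
    )
    # The namespace now exists before Model.__init__, so e.g. an output projection
    # can be sized with the vocabulary size of that namespace.
    print(vocab.get_vocab_size("transformer_tags"))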

    Looking forward to your kind reply. Thanks!

    Contributions welcome question 
    opened by Zessay 2
  • Add a command to init AllenNLP project like spring-boot-starter

    Add a command to init AllenNLP project like spring-boot-starter

    Is your feature request related to a problem? Please describe.

    I'm a newbie to AllenNLP, but I've fallen in love with it for handling my own NLP tasks. However, each time I start a new project I have to copy code from a previous AllenNLP project, since I can't remember the file architecture. Then there is a lot of old, task-specific code that is redundant for the new project and has to be deleted. So I need a tool, or just a command (e.g. allennlp init), to initialize a clean template AllenNLP project that contains the basic modules needed to run AllenNLP code: a model, configuration, a data processor, a demo data folder, etc. I think this feature would make it much more pleasant for researchers to work with AllenNLP.

    Describe the solution you'd like

    Add a command to initialize a minimal AllenNLP project, just like spring-boot-starter for Java web developers.

    Describe alternatives you've considered

    Alternatively, provide a demo AllenNLP code project that I can easily access.

    Additional context

    None

    Contributions welcome Feature request 
    opened by unikcc 3
  • Add support for transformers LayoutLMv2.

    Add support for transformers LayoutLMv2.

    Is your feature request related to a problem? Please describe.

    On the current version 2.7.0 of allennlp and version 4.11.3 of transformers, layoutlmv2 is not supported:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/root/allennlp/allennlp/modules/token_embedders/pretrained_transformer_mismatched_embedder.py", line 80, in __init__
        self._matched_embedder = PretrainedTransformerEmbedder(
      File "/root/allennlp/allennlp/modules/token_embedders/pretrained_transformer_embedder.py", line 123, in __init__
        tokenizer = PretrainedTransformerTokenizer(
      File "/root/allennlp/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py", line 79, in __init__
        self._reverse_engineer_special_tokens("a", "b", model_name, tokenizer_kwargs)
      File "/root/allennlp/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py", line 112, in _reverse_engineer_special_tokens
        dummy_output = tokenizer_with_special_tokens.encode_plus(
      File "/root/anaconda3/envs/alenlayout/lib/python3.8/site-packages/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py", line 430, in encode_plus
        return self._encode_plus(
      File "/root/anaconda3/envs/alenlayout/lib/python3.8/site-packages/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py", line 639, in _encode_plus
        batched_output = self._batch_encode_plus(
      File "/root/anaconda3/envs/alenlayout/lib/python3.8/site-packages/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py", line 493, in _batch_encode_plus
        encodings = self._tokenizer.encode_batch(
    TypeError: PreTokenizedInputSequence must be Union[List[str], Tuple[str]]
    

    The error occurs because they added boxes as the second argument of the fast layoutlm_v2 tokenizer, which breaks the reverse engineering of the special tokens in allennlp's pretrained_transformer_tokenizer.

    Describe the solution you'd like

    Ideally, naming the arguments in the tokenizer_with_special_tokens.encode_plus call of pretrained_transformer_tokenizer should do the trick, but I'm worried about repercussions for other tokenizers that have different argument names (those not based on BERT, maybe?). Moreover, since layoutlm_v2 added a few inputs to the model (images and boxes), modifications would also be needed in _unfold_long_sequences, _fold_long_sequences, and forward of the pretrained_transformer_embedder and pretrained_transformer_mismatched_embedder to account for the additional inputs.
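
    To illustrate the keyword-argument part of the proposal with a standard Hugging Face tokenizer (this is just an illustration of the calling convention, not the proposed patch itself):

    # Illustration only: keyword arguments are robust to tokenizers (like LayoutLMv2's)
    # that insert extra positional parameters such as `boxes` in the second slot.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    positional = tokenizer.encode_plus("a", "b", add_special_tokens=True)
    keyword = tokenizer.encode_plus(text="a", text_pair="b", add_special_tokens=True)

    assert positional["input_ids"] == keyword["input_ids"]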

    If it's okay with you, I'd like to work on it.

    Contributions welcome Feature request 
    opened by HOZHENWAI 1
Releases(v2.9.2)
  • v2.9.2(Mar 21, 2022)

    What's new

    Fixed ✅

    • Removed unnecessary dependencies
    • Restored functionality of the CLI in the absence of the now-optional checklist package

    Commits

    f6866f95 Fix CLI and install instructions in case optional checklists is not present (#5589) e1c6935c Update torch requirement from <1.11.0,>=1.6.0 to >=1.6.0,<1.12.0 (#5595) 5f5f8c30 Updated the docs for PytorchSeq2VecWrapper to specify that mask is required (#5386) 2426ce3d Dependencies (#5593) 2d9fe79f Bump fairscale from 0.4.5 to 0.4.6 (#5590) ab37da7b Update transformers requirement from <4.17,>=4.1 to >=4.1,<4.18 (#5583)

    Source code(tar.gz)
    Source code(zip)
  • v2.9.1(Mar 9, 2022)

    What's new

    Fixed ✅

    • Updated dependencies, especially around doc creation.
    • Running the test suite out-of-tree (e.g. after installation) is now possible by pointing the environment variable ALLENNLP_SRC_DIR to the sources.
    • Silenced a warning that happens when you inappropriately clone a tensor.
    • Added more clarification to the Vocabulary documentation around min_pretrained_embeddings and only_include_pretrained_words.
    • Fixed bug with type mismatch caused by latest release of cached-path that now returns a Path instead of a str.

    Added 🎉

    • We can now transparently read compressed input files during prediction.
    • LZMA compression is now supported.
    • Added a way to give JSON blobs as input to dataset readers in the evaluate command.
    • Added the argument sub_module in PretrainedTransformerMismatchedEmbedder

    Changed ⚠️

    • You can automatically include all words from a pretrained file when building a vocabulary by setting the value in min_pretrained_embeddings to -1 for that particular namespace.

    Commits

    3547bfb8 pin cached-path tighter, make sure our cached-path wrapper still returns str (#5587) 99c93439 Clarify Vocabulary documentation, add -1 option for min_pretrained_embeddings (#5581) 3fa51933 Makes the evaluate command work for the multitask case (Second Edition) (#5579) 9f03803b Add "sub_module" argument in PretrainedTransformerMismatchedEmbedder (#5580) 92e54cce Open Compressed (#5578) 5b3352ce Clone warns (#5575) 9da4b0fe Add Wassterstein Distance calculation option for fairness metrics (#5546) b8f92f03 Update mkdocs-material requirement from <8.2.0,>=5.5.0 to >=5.5.0,<8.3.0 (#5572) a21c0b4c Update filelock requirement from <3.5,>=3.3 to >=3.3,<3.7 (#5571) 6614077b Make tests runnable out-of-tree for help with conda-packaging (#5560) e6792133 Fix CITATION.cff and add automatic validation of your citation metadata (#5561) efa9f1d0 try to unpin nltk (#5563) d01179b7 Small typo fix (#5555) 3c2299aa tighten test_sampled_equals_unsampled_when_biased_against_non_sampled_positions bound (#5549) e463084b Bump black from 21.12b0 to 22.1.0 (#5554) 8226e87d Making checklist optional (#5507) a76bf1e3 Update transformers requirement from <4.16,>=4.1 to >=4.1,<4.17 (#5553)

    Source code(tar.gz)
    Source code(zip)
  • v2.9.0(Jan 27, 2022)

    What's new

    Added 🎉

    • Added an Evaluator class to make comparing source, target, and predictions easier.
    • Added a way to resize the vocabulary in the T5 module
    • Added an argument reinit_modules to cached_transformers.get() that allows you to re-initialize the pretrained weights of a transformer model, using layer indices or regex strings.
    • Added attribute _should_validate_this_epoch to GradientDescentTrainer that controls whether validation is run at the end of each epoch.
    • Added ShouldValidateCallback that can be used to configure the frequency of validation during training.
    • Added a MaxPoolingSpanExtractor. This SpanExtractor represents each span by a component-wise max-pooling operation.

    Fixed ✅

    • Fixed the docstring information for the FBetaMultiLabelMeasure metric.
    • Various fixes for Python 3.9
    • Fixed the name that the push-to-hf command uses to store weights.
    • FBetaMultiLabelMeasure now works with multiple dimensions
    • Support for inferior operating systems when making hardlinks
    • Use , as a separator for filenames in the evaluate command, thus allowing for URLs (eg. gs://...) as input files.
    • Removed a spurious error message "'torch.cuda' has no attribute '_check_driver'" that would appear in the logs when a ConfigurationError for a missing GPU was raised.
    • Load model on CPU post training to save GPU memory.
    • Fixed a bug in ShouldValidateCallback that leads to validation occurring after the first epoch regardless of validation_start value.
    • Fixed a bug in ShouldValidateCallback that leads to validation occurring every validation_interval + 1 epochs, instead of every validation_interval epochs.
    • Fixed a bug in ShouldValidateCallback that leads to validation never occurring at the end of training.

    Removed 👋

    • Removed Tango components, since they now live at https://github.com/allenai/tango.
    • Removed dependency on the overrides package

    Commits

    dd5a010e Evaluator (#5445) 0b54fb0d Bump fairscale from 0.4.4 to 0.4.5 (#5545) 2deacfe5 Fix should validate callback train end (#5542) 2cdb8742 Bump mypy from 0.910 to 0.931 (#5538) a91946ae Keep NLTK down. They broke the download of omw. (#5540) 73a5cfc1 Removes stuff that now lives in the tango repo (#5482) 1278f16d Move changes from #5534 to correct place. (#5535) a7117035 Fix ShouldValidateCallback (#5536) b0b3ad4b Update mkdocs-material requirement from <8.1.0,>=5.5.0 to >=5.5.0,<8.2.0 (#5503) a3d71254 Max out span extractor (#5520) 515fe9b7 Configure validation frequency (#5534) d7e0c877 Update transformers requirement from <4.15,>=4.1 to >=4.1,<4.16 (#5528) 42332476 Bump fairscale from 0.4.3 to 0.4.4 (#5525) 71f2d797 fix 'check_for_gpu' (#5522) 06ec7f9a Reinit layers of pretrained transformer in cached_transformers.get() (#5505) ec1fb69f add missing nltk download in CI (#5529) ab4f7b5c Fix model loading on GPU post training (#5518) 3552842f Fix moving average args not rendering properly in docs (#5516) 87ad0061 Update transformers requirement from <4.13,>=4.1 to >=4.1,<4.15 (#5515) 39f4f4c1 tick version for nightly releases 38436d89 Use comma as filename separator (#5506) e0ee7f43 Dimensions in FBetaMultiLabelMeasure (#5501) d77ba3d6 Hardlink or copy (#5502) dbcbcf10 Add installation instructions through conda-forge (#5498) ebad9eeb Bump black from 21.11b1 to 21.12b0 (#5496) 82b1f4f8 Use the correct filename when uploading models to the HF Hub (#5499) 19f6c8f9 Resize T5 Vocab (#5497) c557d512 enforce reading in utf-8 encoding (#5476) 1caf0daf Removes dependency on the overrides package (#5490) b99376fe Python 3.9 (#5489) 666eaa56 Update mkdocs-material requirement from <7.4.0,>=5.5.0 to >=5.5.0,<8.1.0 (#5486) 64b2c078 Bump fairscale from 0.4.2 to 0.4.3 (#5474) 0a794c6b Fix metric docstring (#5475) f86ff9f4 Bump black from 21.10b0 to 21.11b1 (#5473) a7f6cdf1 update cached-path (#5477) 844acfa9 Update filelock requirement from <3.4,>=3.3 to >=3.3,<3.5 (#5469) 05fc7f62 Bump fairscale from 0.4.0 to 0.4.2 (#5461) 923dbde0 Bump black from 21.9b0 to 21.10b0 (#5453) 09e22aa6 Update spacy requirement from <3.2,>=2.1.0 to >=2.1.0,<3.3 (#5460) 54b92ae7 HF now raises ValueError (#5464)

    Source code(tar.gz)
    Source code(zip)
  • v2.8.0(Nov 1, 2021)

    What's new

    Added 🎉

    • Added support to push models directly to the Hugging Face Hub with the command allennlp push-to-hf.
    • More default tests for the TextualEntailmentSuite.

    Changed ⚠️

    • The behavior of --overrides has changed. Previously the final configuration params were simply taken as the union over the original params and the --overrides params. But now you can use --overrides to completely replace any part of the original config. For example, passing --overrides '{"model":{"type":"foo"}}' will completely replace the "model" part of the original config. However, when you just want to change a single field in the JSON structure without removing / replacing adjacent fields, you can still use the "dot" syntax. For example, --overrides '{"model.num_layers":3}' will only change the num_layers parameter of the "model" part of the config, leaving everything else unchanged. (A small sketch after this list illustrates the two forms.)
    • Integrated cached_path library to replace existing functionality in common.file_utils. This introduces some improvements without any breaking changes.
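
    A small, pure-Python sketch of the two --overrides forms described above (this is only an illustration of the semantics, not AllenNLP's actual override handling):

    import copy


    def apply_overrides(params: dict, overrides: dict) -> dict:
        result = copy.deepcopy(params)
        for key, value in overrides.items():
            if "." in key:
                # "Dot" syntax: change a single field, leave siblings untouched.
                *path, leaf = key.split(".")
                node = result
                for part in path:
                    node = node[part]
                node[leaf] = value
            else:
                # Plain key: completely replace that part of the config.
                result[key] = value
        return result


    original = {"model": {"type": "bar", "num_layers": 2}}
    print(apply_overrides(original, {"model": {"type": "foo"}}))   # {'model': {'type': 'foo'}}
    print(apply_overrides(original, {"model.num_layers": 3}))      # {'model': {'type': 'bar', 'num_layers': 3}}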

    Fixed ✅

    • Fixed the implementation of PairedPCABiasDirection in allennlp.fairness.bias_direction, where the difference vectors should not be centered when performing the PCA.

    Commits

    7213d520 Update transformers requirement from <4.12,>=4.1 to >=4.1,<4.13 (#5452) 1b022270 bug fix (#5447) 0d8c0fc5 Update torch requirement from <1.10.0,>=1.6.0 to >=1.6.0,<1.11.0 (#5442) 0c79807c Checklist update (#5438) ebd6b5ba integrate cached_path (#5418) dcd8d9e9 Update mkdocs-material requirement from <7.3.0,>=5.5.0 to >=5.5.0,<7.4.0 (#5419) 362349b5 Registrable _to_params default functionality (#5403) 17ef1aa2 fix a bug when using fp16 training & gradient clipping (#5426) a63e28c2 Update transformers requirement from <4.11,>=4.1 to >=4.1,<4.12 (#5422) 603552fc Add utility function and command to push models to 🤗 Hub (#5370) e5d332a5 Update filelock requirement from <3.1,>=3.0 to >=3.0,<3.2 (#5421) 44155ac6 Make --overrides more flexible (#5399) 43fd9825 Fix PairedPCABiasDirection (#5396) 7785068a Bump black from 21.7b0 to 21.9b0 (#5408) a09d057c Update transformers requirement from <4.10,>=4.1 to >=4.1,<4.11 (#5393) 527e43d9 require Python>=3.7 (#5400) 5338bd8b Add scaling to tqdm bar when downloading files (#5397)

    Source code(tar.gz)
    Source code(zip)
  • v2.7.0(Sep 1, 2021)

    What's new

    Added 🎉

    • Added support to evaluate multiple datasets and produce corresponding output files in the evaluate command.
    • Added more documentation to the learning rate schedulers to include a sample config object for how to use it.
    • Moved the pytorch learning rate schedulers wrappers to their own file called pytorch_lr_schedulers.py so that they will have their own documentation page.
    • Added a module allennlp.nn.parallel with a new base class, DdpAccelerator, which generalizes PyTorch's DistributedDataParallel wrapper to support other implementations. Two implementations of this class are provided. The default is TorchDdpAccelerator (registered at "torch"), which is just a thin wrapper around DistributedDataParallel. The other is FairScaleFsdpAccelerator, which wraps FairScale's FullyShardedDataParallel. You can specify the DdpAccelerator in the "distributed" section of a configuration file under the key "ddp_accelerator".
    • Added a module allennlp.nn.checkpoint with a new base class, CheckpointWrapper, for implementations of activation/gradient checkpointing. Two implementations are provided. The default implementation is TorchCheckpointWrapper (registered as "torch"), which exposes PyTorch's checkpoint functionality. The other is FairScaleCheckpointWrapper which exposes the more flexible checkpointing functionality from FairScale.
    • The Model base class now takes a ddp_accelerator parameter (an instance of DdpAccelerator) which will be available as self.ddp_accelerator during distributed training. This is useful when, for example, instantiating submodules in your model's __init__() method by wrapping them with self.ddp_accelerator.wrap_module(). See allennlp.modules.transformer.t5 for an example; a rough sketch also follows this list.
    • We now log batch metrics to tensorboard and wandb.
    • Added Tango components, to be explored in detail in a later post
    • Added ScaledDotProductMatrixAttention, and converted the transformer toolkit to use it
    • Added tests to ensure that all Attention and MatrixAttention implementations are interchangeable
    • Added a way for AllenNLP Tango to read and write datasets lazily.
    • Added a way to remix datasets flexibly
    • Added from_pretrained_transformer_and_instances constructor to Vocabulary
    • TransformerTextField now supports __len__.
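
    A rough sketch of the self.ddp_accelerator.wrap_module() pattern described above, under the assumption that wrap_module returns the wrapped submodule; the model class, layer sizes, and everything else in this snippet are purely illustrative.

    # Rough sketch only; assumes wrap_module returns the wrapped submodule.
    import torch
    from allennlp.data import Vocabulary
    from allennlp.models import Model


    class SketchModel(Model):
        def __init__(self, vocab: Vocabulary, ddp_accelerator=None) -> None:
            super().__init__(vocab, ddp_accelerator=ddp_accelerator)
            encoder = torch.nn.Linear(1024, 1024)  # stand-in for a large submodule
            # During distributed training the configured DdpAccelerator (e.g. the
            # FairScale FSDP one) wraps the submodule; otherwise use it as-is.
            self.encoder = (
                self.ddp_accelerator.wrap_module(encoder) if self.ddp_accelerator else encoder
            )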

    Fixed ✅

    • Fixed a bug in ConditionalRandomField: transitions and tag_sequence tensors were not initialized on the desired device causing high CPU usage (see https://github.com/allenai/allennlp/issues/2884)
    • Fixed a misspelling: the parameter contructor_extras in Lazy() is now correctly called constructor_extras.
    • Fixed broken links in allennlp.nn.initializers docs.
    • Fixed bug in BeamSearch where last_backpointers was not being passed to any Constraints.
    • TransformerTextField can now take tensors of shape (1, n) like the tensors produced from a HuggingFace tokenizer.
    • tqdm lock is now set inside MultiProcessDataLoading when new workers are spawned to avoid contention when writing output.
    • ConfigurationError is now pickleable.
    • Checkpointer cleaning was fixed to work on Windows Paths
    • Multitask models now support TextFieldTensor in heads, not just in the backbone.
    • Fixed the signature of ScaledDotProductAttention to match the other Attention classes
    • allennlp commands will now catch SIGTERM signals and handle them similar to SIGINT (keyboard interrupt).
    • The MultiProcessDataLoader will properly shutdown its workers when a SIGTERM is received.
    • Fixed the way names are applied to Tango Step instances.
    • Fixed a bug in calculating loss in the distributed setting.
    • Fixed a bug when extending a sparse sequence by 0 items.

    Changed ⚠️

    • The type of the grad_norm parameter of GradientDescentTrainer is now Union[float, bool], with a default value of False. False means gradients are not rescaled and the gradient norm is never even calculated. True means the gradients are still not rescaled but the gradient norm is calculated and passed on to callbacks. A float value means gradients are rescaled.
    • TensorCache now supports more concurrent readers and writers.
    • We no longer log parameter statistics to tensorboard or wandb by default.

    Commits

    48af9d34 Multiple datasets and output files support for the evaluate command (#5340) 60213cd7 Tiny tango tweaks (#5383) 28950215 improve signal handling and worker cleanup (#5378) b41cb3eb Fix distributed loss (#5381) 6355f073 Fix Checkpointer cleaner regex on Windows (#5361) 27da04cf Dataset remix (#5372) 75af38e0 Create Vocabulary from both pretrained transformers and instances (#5368) 5dc80a65 Adds a dataset that can be read and written lazily (#5344) 01e8a35a Improved Documentation For Learning Rate Schedulers (#5365) 8370cfa3 skip loading t5-base in CI (#5371) 13de38d1 Log batch metrics (#5362) 1f5c6e5b Use our own base images to build allennlp Docker images (#5366) bffdbfd1 Bugfix: initializing all tensors and parameters of the ConditionalRandomField model on the proper device (#5335) d45a2dab Make sure that all attention works the same (#5360) c1edaef8 Update google-cloud-storage requirement (#5357) 524244b6 Update wandb requirement from <0.12.0,>=0.10.0 to >=0.10.0,<0.13.0 (#5356) 90bf33b8 small fixes for tango (#5350) 2e11a15e tick version for nightly releases 311f1104 Tango (#5162) 1df2e517 Bump fairscale from 0.3.8 to 0.3.9 (#5337) b72bbfc9 fix constraint bug in beam search, clean up tests (#5328) ec3e2943 Create CITATION.cff (#5336) 8714aa0b This is a desperate attempt to make TensorCache a little more stable (#5334) fd429b2b Update transformers requirement from <4.9,>=4.1 to >=4.1,<4.10 (#5326) 1b5ef3a0 Update spacy requirement from <3.1,>=2.1.0 to >=2.1.0,<3.2 (#5305) 1f20513d TextFieldTensor in multitask models (#5331) 76f2487b set tqdm lock when new workers are spawned (#5330) 67add9d9 Fix ConfigurationError deserialization (#5319) 42d85298 allow TransformerTextField to take input directly from HF tokenizer (#5329) 64043ac6 Bump black from 21.6b0 to 21.7b0 (#5320) 32750550 Update mkdocs-material requirement from <7.2.0,>=5.5.0 to >=5.5.0,<7.3.0 (#5327) 5b1da908 Update links in initializers documentation (#5317) ca656fc6 FairScale integration (#5242)

    Source code(tar.gz)
    Source code(zip)
  • v2.6.0(Jul 19, 2021)

    What's new

    Added 🎉

    • Added on_backward training callback which allows for control over backpropagation and gradient manipulation.
    • Added AdversarialBiasMitigator, a Model wrapper to adversarially mitigate biases in predictions produced by a pretrained model for a downstream task.
    • Added which_loss parameter to ensure_model_can_train_save_and_load in ModelTestCase to specify which loss to test.
    • Added **kwargs to Predictor.from_path(). These key-word argument will be passed on to the Predictor's constructor.
    • The activation layer in the transformer toolkit now can be queried for its output dimension.
    • TransformerEmbeddings now takes, but ignores, a parameter for the attention mask. This is needed for compatibility with some other modules that get called the same way and use the mask.
    • TransformerPooler can now be instantiated from a pretrained transformer module, just like the other modules in the transformer toolkit.
    • TransformerTextField, for cases where you don't care about AllenNLP's advanced text handling capabilities.
    • Added TransformerModule._post_load_pretrained_state_dict_hook() method. Can be used to modify missing_keys and unexpected_keys after loading a pretrained state dictionary. This is useful when tying weights, for example.
    • Added an end-to-end test for the Transformer Toolkit.
    • Added vocab argument to BeamSearch, which is passed to each constraint in constraints (if provided).

    Fixed ✅

    • Fixed missing device mapping in the allennlp.modules.conditional_random_field.py file.
    • Fixed Broken link in allennlp.fairness.fairness_metrics.Separation docs
    • Ensured all allennlp submodules are imported with allennlp.common.plugins.import_plugins().
    • Fixed IndexOutOfBoundsException in MultiOptimizer when checking if optimizer received any parameters.
    • Removed confusing zero mask from VilBERT.
    • Ensured ensure_model_can_train_save_and_load is consistently random.
    • Fixed weight tying logic in T5 transformer module. Previously input/output embeddings were always tied. Now this is optional, and the default behavior is taken from the config.tie_word_embeddings value when instantiating from_pretrained_module().
    • Implemented slightly faster label smoothing.
    • Fixed the docs for PytorchTransformerWrapper
    • Fixed recovering training jobs with models that expect get_metrics() to not be called until they have seen at least one batch.
    • Made the Transformer Toolkit compatible with transformers that don't start their positional embeddings at 0.
    • Weights & Biases training callback ("wandb") now works when resuming training jobs.

    Changed ⚠️

    • Changed behavior of MultiOptimizer so that while a default optimizer is still required, an error is not thrown if the default optimizer receives no parameters.
    • Made the epsilon parameter for the layer normalization in token embeddings configurable.

    Removed 👋

    • Removed TransformerModule._tied_weights. Weights should now just be tied directly in the __init__() method. You can also override TransformerModule._post_load_pretrained_state_dict_hook() to remove keys associated with tied weights from missing_keys after loading a pretrained state dictionary.

    Commits

    ef5400d5 make W&B callback resumable (#5312) 96293407 Update google-cloud-storage requirement (#5309) f8fad9fc Provide vocab as param to constraints (#5321) 56e1f49d Fix training Conditional Random Fields on GPU (#5313) (#5315) 3c1ac032 Update wandb requirement from <0.11.0,>=0.10.0 to >=0.10.0,<0.12.0 (#5316) 7d4a6726 Transformer Toolkit fixes (#5303) aaa816f7 Faster label smoothing (#5294) 436c52d5 Docs update for PytorchTransformerWrapper (#5295) 3d92ac43 Update google-cloud-storage requirement (#5296) 5378533f Fixes recovering when the model expects metrics to be ready (#5293) 7428155a ensure torch always up-to-date in CI (#5286) 3f307ee3 Update README.md (#5288) 672485fb only run CHANGELOG check when source files are modified (#5287) c6865d79 use smaller snapshot for HFHub integration test ad54d48f Bump mypy from 0.812 to 0.910 (#5283) 42d96dfa typo: missing "if" in drop_last doc (#5284) a246e277 TransformerTextField (#5280) 82053a98 Improve weight tying logic in transformer module (#5282) c936da9f Update transformers requirement from <4.8,>=4.1 to >=4.1,<4.9 (#5281) e8f816dd Update google-cloud-storage requirement (#5277) 86504e6b Making model test case consistently random (#5278) 5a7844b5 add kwargs to Predictor.from_path() (#5275) 8ad562e4 Update transformers requirement from <4.7,>=4.1 to >=4.1,<4.8 (#5273) c8b8ed36 Transformer toolkit updates (#5270) 6af9069d update Python environment setup in GitHub Actions (#5272) f1f51fc9 Adversarial bias mitigation (#5269) af101d67 Removes confusing zero mask from VilBERT (#5264) a1d36e67 Update torchvision requirement from <0.10.0,>=0.8.1 to >=0.8.1,<0.11.0 (#5266) e5468d96 Bump black from 21.5b2 to 21.6b0 (#5255) b37686f6 Update torch requirement from <1.9.0,>=1.6.0 to >=1.6.0,<1.10.0 (#5267) 5da5b5ba Upload code coverage reports from different jobs, other CI improvements (#5257) a6cfb122 added on_backward trainer callback (#5249) 8db45e87 Ensure all relevant allennlp submodules are imported with import_plugins() (#5246) 57df0e37 [Docs] Fixes broken link in Fairness_Metrics (#5245) 154f75d7 Bump black from 21.5b1 to 21.5b2 (#5236) 7a5106d5 tick version for nightly release

    Source code(tar.gz)
    Source code(zip)
  • v2.5.0(Jun 3, 2021)

    🆕 AllenNLP v2.5.0 comes with a few big new features and improvements 🆕

    There is a whole new module allennlp.fairness that contains implementations of fairness metrics, bias metrics, and bias mitigation tools for your models thanks to @ArjunSubramonian. For a great introduction, check out the corresponding chapter of the guide: https://guide.allennlp.org/fairness.

    Another major addition is the allennlp.confidence_checks.task_checklists submodule, thanks to @AkshitaB, which provides an automated way to run behavioral tests of your models using the checklist library.

    BeamSearch also has several important new features, including an easy way to add arbitrary constraints, thanks to @danieldeutsch.

    See below for a comprehensive list of updates 👇

    What's new

    Added 🎉

    • Added TaskSuite base class and command line functionality for running checklist test suites, along with implementations for SentimentAnalysisSuite, QuestionAnsweringSuite, and TextualEntailmentSuite. These can be found in the allennlp.confidence_checks.task_checklists module.
    • Added BiasMitigatorApplicator, which wraps any Model and mitigates biases by finetuning on a downstream task.
    • Added allennlp diff command to compute a diff on model checkpoints, analogous to what git diff does on two files.
    • Meta data defined by the class allennlp.common.meta.Meta is now saved in the serialization directory and archive file when training models from the command line. This is also now part of the Archive named tuple that's returned from load_archive().
    • Added nn.util.distributed_device() helper function.
    • Added allennlp.nn.util.load_state_dict helper function.
    • Added a way to avoid downloading and loading pretrained weights in modules that wrap transformers such as the PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder. You can do this by setting the parameter load_weights to False. See PR #5172 for more details.
    • Added SpanExtractorWithSpanWidthEmbedding, putting specific span embedding computations into the _embed_spans method and leaving the common code in SpanExtractorWithSpanWidthEmbedding to unify the arguments, and modified BidirectionalEndpointSpanExtractor, EndpointSpanExtractor and SelfAttentiveSpanExtractor accordingly. Now, SelfAttentiveSpanExtractor can also embed span widths.
    • Added a min_steps parameter to BeamSearch to set a minimum length for the predicted sequences.
    • Added the FinalSequenceScorer abstraction to calculate the final scores of the generated sequences in BeamSearch.
    • Added shuffle argument to BucketBatchSampler which allows for disabling shuffling.
    • Added allennlp.modules.transformer.attention_module which contains a generalized AttentionModule. SelfAttention and T5Attention both inherit from this.
    • Added a Constraint abstract class to BeamSearch, which allows for incorporating constraints on the predictions found by BeamSearch, along with a RepeatedNGramBlockingConstraint constraint implementation, which allows for preventing repeated n-grams in the output from BeamSearch.
    • Added DataCollator for performing dynamic operations on each batch.
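
    The new min_steps option is easiest to see on a toy example. The sketch below is hypothetical: the random step function and the five-token vocabulary are made up, and it only assumes the usual BeamSearch.search(start_predictions, start_state, step) calling convention.

        import torch
        from allennlp.nn.beam_search import BeamSearch

        num_classes, end_index = 5, 4  # toy vocabulary; id 4 plays the role of the end symbol

        def step(last_predictions, state, timestep):
            # A real model would condition on `state` and `last_predictions`;
            # here we just return random log-probabilities of the right shape.
            log_probs = torch.randn(last_predictions.size(0), num_classes).log_softmax(dim=-1)
            return log_probs, state

        beam_search = BeamSearch(
            end_index=end_index,
            max_steps=20,
            beam_size=3,
            min_steps=5,  # the end symbol cannot be predicted before step 5
        )

        start_predictions = torch.zeros(2, dtype=torch.long)  # a batch of two sequences
        top_k_predictions, log_probabilities = beam_search.search(start_predictions, {}, step)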

    Changed ⚠️

    • Use dist_reduce_sum in distributed metrics.
    • Allow Google Cloud Storage paths in cached_path ("gs://...").
    • Renamed nn.util.load_state_dict() to read_state_dict to avoid confusion with torch.nn.Module.load_state_dict().
    • TransformerModule.from_pretrained_module now only accepts a pretrained model ID (e.g. "bert-base-cased") instead of an actual torch.nn.Module. Other parameters to this method have changed as well.
    • Print the first batch to the console by default.
    • Renamed sanity_checks to confidence_checks (sanity_checks is deprecated and will be removed in AllenNLP 3.0).
    • Trainer callbacks can now store and restore state in case a training run gets interrupted.
    • VilBERT backbone now rolls and unrolls extra dimensions to handle input with > 3 dimensions.
    • BeamSearch is now a Registrable class.

    Fixed ✅

    • When PretrainedTransformerIndexer folds long sequences, it no longer loses the information from token type ids.
    • Fixed documentation for GradientDescentTrainer.cuda_device.
    • Re-starting a training run from a checkpoint in the middle of an epoch now works correctly.
    • When using the "moving average" weights smoothing feature of the trainer, training checkpoints would also get smoothed, with strange results for resuming a training job. This has been fixed.
    • When re-starting an interrupted training job, the trainer will now read out the data loader even for epochs and batches that can be skipped. We do this to try to get any random number generators used by the reader or data loader into the same state as they were the first time the training job ran.
    • Fixed the potential for a race condition with cached_path() when extracting archives, though the race condition is still possible when used with force_extract=True.
    • Fixed wandb callback to work in distributed training.
    • Fixed tqdm logging into multiple files with allennlp-optuna.

    Commits

    b92fd9a7 Contextualized bias mitigation (#5176) aa52a9a0 Checklist fixes (#5239) 62067973 Fix tqdm logging into multiple files with allennlp-optuna (#5235) b0aa1d45 Generalize T5 modules (#5166) 5b111d08 tick version for nightly release 39d7e5ae Make BeamSearch Registrable (#5231) c0142320 Add constraints to beam search (#5216) 98dae7f4 Emergency fix. I forgot to take this out. c5bff8ba Fixes Checkpointing (#5220) 3d5799d8 Roll backbone (#5229) babc450d Added DataCollator for dynamic operations for each batch. (#5221) d97ed401 Bump checklist from 0.0.10 to 0.0.11 (#5222) 12155c40 fix race condition when extracting files with cached_path (#5227) d6629772 cancel redundant GH Actions workflows (#5226) 2d8f3904 Fix W&B callback for distributed training (#5223) 59df2ad3 Update nr-interface requirement from <0.0.4 to <0.0.6 (#5213) 3e1b553b Bump black from 20.8b1 to 21.5b1 (#5195) d2840cba save meta data with model archives (#5209) bd941c6f added shuffle disable option in BucketBatchSampler (#5212) 3585c9fe Implementing abstraction to score final sequences in BeamSearch (#5208) 79d16af1 Add a min_steps parameter to BeamSearch (#5207) cf113d70 Changes and improvements to how we initialize transformer modules from pretrained models (#5200) cccb35de Rename sanity_checks to confidence_checks (#5201) db8ff675 Update transformers requirement from <4.6,>=4.1 to >=4.1,<4.7 (#5199) fd5c9e4c Bias Metrics (#5139) d9b19b69 Bias Mitigation and Direction Methods (#5130) 74737373 add diff command (#5109) d85c5c3a Explicitly pass serialization directory and local rank to trainer in train command (#5180) 96c3caf9 fix nltk downloads in install (#5189) b1b455a2 improve contributing guide / PR template (#5185) 7a260da9 fix cuda_device docs (#5188) 0bf590df Update Makefile (#5183) 3335700c Default print first batch (#5175) b533733a Refactor span extractors and unify forward. (#5160) 01b232fb Allow google cloud storage locations for cached_path (#5173) eb2ae30e Update README.md (#5165) 55efa683 fix dataclasses import (#5169) a463e0e7 Add way of skipping pretrained weights download (#5172) c71bb460 improve err msg for PolynomialDecay LR scheduler (#5143) 530dae43 Simplify metrics (#5154) 12f5b0f5 Run some slow tests on the self-hosted runner (#5161) 90915800 Fixes token type ids for folded sequences (#5149) 10400e02 Run checklist suites in AllenNLP (#5065) d11359ed make dist_reduce_sum work for tensors (#5147) 9184fbcb Fixes Backbone / Model MRO inconsistency (#5148)

    Source code(tar.gz)
    Source code(zip)
  • v2.4.0(Apr 23, 2021)

    What's new

    Added 🎉

    • Added a T5 implementation to modules.transformer.

    Changed ⚠️

    • Weights & Biases callback can now work in anonymous mode (i.e. without the WANDB_API_KEY environment variable).

    Fixed ✅

    • The GradientDescentTrainer no longer leaves stray model checkpoints around when it runs out of patience.
    • Fixed cached_path() for "hf://" files.

    Commits

    7c5cc98a Don't orphan checkpoints when we run out of patience (#5142) 6ec64596 allow W&B anon mode (#5110) 4e862a54 T5 (#4969) 7fc5a91f fix cached_path for hub downloads (#5141) f877fdc3 Fairness Metrics (#5093)

    Source code(tar.gz)
    Source code(zip)
  • v2.3.1(Apr 20, 2021)

    What's new

    Added 🎉

    • Added support for the HuggingFace Hub as an alternative way to handle loading files through cached_path(). Hub downloads should be made through the hf:// URL scheme (see the sketch after this list).
    • Add new dimension to the interpret module: influence functions via the InfluenceInterpreter base class, along with a concrete implementation: SimpleInfluence.
    • Added a quiet parameter to the MultiProcessDataLoader that disables tqdm progress bars.
    • The test for distributed metrics now takes a parameter specifying how often you want to run it.
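
    As a rough illustration of the new hf:// scheme, the snippet below resolves a Hub-hosted file to a local cached copy. The repository and filename are placeholders, not real artifacts.

        from allennlp.common.file_utils import cached_path

        # Download (and cache) a file hosted on the HuggingFace Hub.
        # "someuser/some-model" and "config.json" are hypothetical placeholders.
        local_path = cached_path("hf://someuser/some-model/config.json")
        print(local_path)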

    Changed ⚠️

    • Updated CONTRIBUTING.md to remind readers to upgrade pip and setuptools to avoid spaCy installation issues.

    Fixed ✅

    • Fixed a bug with the ShardedDatasetReader when used with multi-process data loading (https://github.com/allenai/allennlp/issues/5132).

    Commits

    a84b9b1a Add cached_path support for HF hub (#5052) 24ec7db4 fix #5132 (#5134) 2526674f Update CONTRIBUTING.md (#5133) c2ffb101 Add influence functions to interpret module (#4988) 0c7d60bc Take the number of runs in the test for distributed metrics (#5127) 8be3828f fix docs CI

    Source code(tar.gz)
    Source code(zip)
  • v2.3.0(Apr 14, 2021)

    What's new

    Added 🎉

    • Ported the following Huggingface LambdaLR-based schedulers: ConstantLearningRateScheduler, ConstantWithWarmupLearningRateScheduler, CosineWithWarmupLearningRateScheduler, CosineHardRestartsWithWarmupLearningRateScheduler.
    • Added a new sub_token_mode parameter to the pretrained_transformer_mismatched_embedder class to support first sub-token embedding.
    • Added a way to run a multi task model with a dataset reader as part of allennlp predict.
    • Added a new eval_mode parameter to PretrainedTransformerEmbedder. If it is set to True, the transformer is always run in evaluation mode, which, e.g., disables dropout and does not update batch normalization statistics (see the sketch after this list).
    • Added additional parameters to the W&B callback: entity, group, name, notes, and wandb_kwargs.
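
    A minimal sketch of the new eval_mode parameter (the model name is just an example):

        from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

        # With eval_mode=True the wrapped transformer stays in evaluation mode even
        # while the surrounding model trains: dropout is disabled and batch-norm
        # statistics are frozen.
        embedder = PretrainedTransformerEmbedder("bert-base-uncased", eval_mode=True)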

    Changed ⚠️

    • Sanity checks in the GradientDescentTrainer can now be turned off by setting the run_sanity_checks parameter to False.
    • Allow the order of examples in the task cards to be specified explicitly.
    • The histogram_interval parameter is now deprecated in TensorboardWriter; please use distribution_interval instead.
    • Memory usage is no longer logged to TensorBoard during training; use ConsoleLoggerCallback instead.
    • If you use the min_count parameter of the Vocabulary, but you specify a namespace that does not exist, the vocabulary creation will raise a ConfigurationError.
    • Documentation updates made to SoftmaxLoss regarding padding and the expected shapes of the input and output tensors of forward.
    • Moved the data preparation script for coref into allennlp-models.
    • If a transformer is not in cache but has override weights, the transformer's pretrained weights are no longer downloaded, that is, only its config.json file is downloaded.
    • SanityChecksCallback now raises SanityCheckError instead of AssertionError when a check fails.
    • jsonpickle removed from dependencies.
    • Improved the error message from Registrable.by_name() when the name passed does not match any registered subclasses. The error message will include a suggestion if there is a close match between the name passed and a registered name.

    Fixed ✅

    • Fixed a bug where some Activation implementations could not be pickled due to involving a lambda function.
    • Fixed __str__() method on ModelCardInfo class.
    • Fixed a stall when using distributed training and gradient accumulation at the same time
    • Fixed an issue where using the from_pretrained_transformer Vocabulary constructor in distributed training via the allennlp train command would result in the data being iterated through unnecessarily.
    • Fixed a bug regarding token indexers with the InterleavingDatasetReader when used with multi-process data loading.
    • Fixed a warning from transformers when using max_length in the PretrainedTransformerTokenizer.

    Removed 👋

    • Removed the stride parameter to PretrainedTransformerTokenizer. This parameter had no effect.

    Commits

    c80e1751 improve error message from Registrable class (#5125) aca16237 Update docstring for basic_classifier (#5124) 059a64fc remove jsonpickle from dependencies (#5121) 5fdce9ad fix bug with interleaving dataset reader (#5122) 6e1f34cb Predicting with a dataset reader on a multitask model (#5115) b34df73e specify 'truncation' to avoid transformers warning (#5120) 0ddd3d35 Add eval_mode argument to pretrained transformer embedder (#5111) 99415e36 additional W&B params (#5114) 6ee12123 Adding a metadata field to the basic classifier (#5104) 2e8c3e2f Add link to gallery and demo in README (#5103) de611008 Distributed training with gradient accumulation (#5100) fe2d6e5a vocab fix (#5099) d906175d Update transformers requirement from <4.5,>=4.1 to >=4.1,<4.6 (#5102) 99da3156 fix str method of ModelCardInfo (#5096) 29f00ee2 Added new parameter 'sub_token_mode' to 'pretrained_transformer_mismatched_embedder' class to support first sub-token embedding (#4363) (#5087) 6021f7d4 Avoid from_pretrained download of model weights (#5085) c3fb97eb add SanityCheckError class (#5092) decb875b Bring back run_sanity_checks parameter (#5091) 913fb8a4 Update mkdocs-material requirement from <7.1.0,>=5.5.0 to >=5.5.0,<7.2.0 (#5074) f82d3f11 remove lambdas from activations (#5083) bb703494 Replace master references with main in issue template (#5084) 87504c42 Ported Huggingface LambdaLR-based schedulers (#5082) 63a3b489 set transformer to evaluation mode (#5073) 542ce5d9 Move coref prep script (#5078) bf8e71e9 compare namespace in counter and min_count (#3644) 4baf19ab Arjuns/softmax loss documentation update (#5075) 59b92106 Allow example categories to be ordered (#5059) 3daa0baf tick version for nightly bb77bd10 fix date in CHANGELOG

    Source code(tar.gz)
    Source code(zip)
  • v2.2.0(Mar 26, 2021)

    What's new

    Added 🎉

    • Added WandBCallback class for Weights & Biases integration, registered as a callback under the name "wandb".
    • Added TensorBoardCallback to replace the TensorBoardWriter. Registered as a callback under the name "tensorboard".
    • Added NormalizationBiasVerification and SanityChecksCallback for model sanity checks.
    • SanityChecksCallback runs by default from the allennlp train command. It can be turned off by setting trainer.enable_default_callbacks to false in your config.
    • Added new method on Field class: .human_readable_repr() -> Any, and new method on Instance class: .human_readable_dict() -> JsonDict (@leo-liuzy).
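
    A small, hedged sketch of the new human-readable helpers; the exact output shown in the comments is indicative, not guaranteed.

        from allennlp.data import Instance
        from allennlp.data.fields import LabelField

        # A tiny instance with a single label field.
        instance = Instance({"label": LabelField("positive")})
        print(instance.human_readable_dict())           # e.g. {"label": "positive"}
        print(instance["label"].human_readable_repr())  # e.g. "positive"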

    Removed 👋

    • Removed TensorBoardWriter. Please use the TensorBoardCallback instead.

    Changed ⚠️

    • Use attributes of ModelOutputs object in PretrainedTransformerEmbedder instead of indexing (@JohnGiorgi).
    • Added support for PyTorch version 1.8 and torchvision version 0.9 (@nelson-liu).
    • Model.get_parameters_for_histogram_tensorboard_logging is deprecated in favor of Model.get_parameters_for_histogram_logging.

    Fixed ✅

    • Makes sure tensors that are stored in TensorCache always live on CPUs.
    • Fixed a bug where FromParams objects wrapped in Lazy() couldn't be pickled.
    • Fixed a bug where the ROUGE metric couldn't be pickled.
    • Fixed a bug reported in https://github.com/allenai/allennlp/issues/5036: we now keep our spaCy POS tagger on (@leo-liuzy).

    Commits

    c5c9df58 refactor LogWriter, add W&B integration (#5061) 385124ad Keep Spacy PoS tagger on by default (#5066) 15b532fb Update transformers requirement from <4.4,>=4.1 to >=4.1,<4.5 (#5057) 3aafb927 clarify how predictions_to_labeled_instances work for targeted or non-targeted hotflip attack (#4957) b897e57c ensure ROUGE metric can be pickled (#5051) 91e4af94 fix pickle bug for Lazy FromParams (#5049) 5b57be29 Adding normalization bias verification (#4990) ce71901a Update torchvision requirement from <0.9.0,>=0.8.1 to >=0.8.1,<0.10.0 (#5041) 7f609901 Update torch requirement from <1.8.0,>=1.6.0 to >=1.6.0,<1.9.0 (#5037) 96415b2b Use HF Transformers output types (#5035) 0c36019c clean up (#5034) d2bf35d1 Add methods for human readable representation of fields and instances (#4986) a8b80069 Makes sure serialized tensors live on CPUs (#5026) a0edfae9 Add options to log inputs in trainer (#4970)


    Thanks to @nelson-liu for making sure we stay on top of releases! 😜

    Source code(tar.gz)
    Source code(zip)
  • v1.5.0(Mar 1, 2021)

    What's new

    Added 🎉

    • Added a way to specify extra parameters to the predictor in an allennlp predict call.
    • Added a way to initialize a Vocabulary from transformers models.
    • Support spaCy v3

    Changed ⚠️

    • Updated Paper and Dataset classes in ModelCard.

    Commits

    55ac96a0 re-write docs commit history on releases (#4968) c61178fa Update spaCy to 3.0 (#4953) be595dfd Ensure mean absolute error metric returns a float (#4983) 25562234 raise on HTTP errors in cached_path (#4984) e1839cfe Inputs to the FBetaMultiLabel metric were copied and pasted wrong (#4975) b5b72a06 Add method to vocab to instantiate from a pretrained transformer (#4958) 025a0b28 Allows specifying extra arguments for predictors (#4947) 24c9c995 adding ModelUsage, rearranging fields (#4952)

    Source code(tar.gz)
    Source code(zip)
  • v2.1.0(Feb 24, 2021)

    What's new

    Changed ⚠️

    • The coding_scheme parameter is now deprecated in Conll2003DatasetReader; please use convert_to_coding_scheme instead.
    • Support spaCy v3

    Added 🎉

    • Added ModelUsage to ModelCard class.
    • Added a way to specify extra parameters to the predictor in an allennlp predict call.
    • Added a way to initialize a Vocabulary from transformers models (see the sketch after this list).
    • Added the ability to use Predictors with multitask models through the new MultiTaskPredictor.
    • Added an example for fields of type ListField[TextField] to apply_token_indexers API docs.
    • Added text_key and label_key parameters to TextClassificationJsonReader class.
    • Added MultiOptimizer, which allows you to use different optimizers for different parts of your model.
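
    A minimal sketch of building a Vocabulary from a pretrained transformer; the model name is an example and the "tokens" namespace is assumed to be the default.

        from allennlp.data import Vocabulary

        # The resulting vocabulary mirrors the transformer tokenizer's ids.
        vocab = Vocabulary.from_pretrained_transformer("bert-base-uncased")
        print(vocab.get_vocab_size("tokens"))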

    Fixed ✅

    • @Registrable.register(...) decorator no longer masks the decorated class's annotations
    • Ensured that MeanAbsoluteError always returns a float metric value instead of a Tensor.
    • Learning rate schedulers that rely on metrics from the validation set were broken in v2.0.0. This brings that functionality back.
    • Fixed a bug where the MultiProcessDataLoader would crash when num_workers > 0, start_method = "spawn", max_instances_in_memory not None, and batches_per_epoch not None.
    • Fixed documentation and validation checks for FBetaMultiLabelMetric.
    • Fixed handling of HTTP errors when fetching remote resources with cached_path(). Previously the content would be cached even when certain errors - like 404s - occurred. Now an HTTPError will be raised whenever the HTTP response is not OK.
    • Fixed a bug where the MultiTaskDataLoader would crash when num_workers > 0
    • Fixed an import error that happens when PyTorch's distributed framework is unavailable on the system.

    Commits

    7c6adeff Fix worker_info bug when num_workers > 0 (#5013) 9d88f8c5 Fixes predictors in the multitask case (#4991) 678518a0 Less opaque registrable annotations (#5010) 4b5fad46 Regex optimizer (#4981) f091cb9c fix error when torch.distributed not available (#5011) 5974f54e Revert "drop support for Python 3.6 (#5012)" (#5016) bdb0e20a Update mkdocs-material requirement from <6.3.0,>=5.5.0 to >=5.5.0,<7.1.0 (#5015) d535de67 Bump mypy from 0.800 to 0.812 (#5007) 099786cf Update responses requirement, remove pin on urllib3 (#4783) b8cfb95c re-write docs commit history on releases (#4968) c5c9edf0 Add text_key and label_key to TextClassificationJsonReader (#5005) a02f67da drop support for Python 3.6 (#5012) 0078c595 Update spaCy to 3.0 (#4953) be9537f6 Update CHANGELOG.md 828ee101 Update CHANGELOG.md 1cff6ad9 update README (#4993) f8b38075 Add ListField example to apply token indexers (#4987) 7961b8b7 Ensure mean absolute error metric returns a float (#4983) da4dba15 raise on HTTP errors in cached_path (#4984) d4926f5e Inputs to the FBetaMultiLabel metric were copied and pasted wrong (#4975) d2ae540d Update transformers requirement from <4.3,>=4.1 to >=4.1,<4.4 (#4967) bf8eeafe Add method to vocab to instantiate from a pretrained transformer (#4958) 9267ce7c Resize transformers word embeddings layer for additional_special_tokens (#4946) 52c23dd2 Introduce convert_to_coding_scheme and make coding_scheme deprecated in CoNLL2003DatasetReader (#4960) c418f84b Fixes recording validation metrics for learning rate schedulers that rely on it (#4959) 4535f5c8 adding ModelUsage, rearranging fields (#4952) 1ace4bbb fix bug with MultiProcessDataLoader (#4956) 6f222919 Allows specifying extra arguments for predictors (#4947) 2731db12 tick version for nightly release

    Source code(tar.gz)
    Source code(zip)
  • v2.0.1(Jan 29, 2021)

    What's new

    A couple of minor fixes and additions since the 2.0 release.

    Added 🎉

    • Added tokenizer_kwargs and transformer_kwargs arguments to PretrainedTransformerBackbone

    Changed ⚠️

    • GradientDescentTrainer now creates the serialization_dir when it's instantiated, if it doesn't already exist.

    Fixed ✅

    • common.util.sanitize now handles sets.

    Commits

    caa497f3 Update GradientDescentTrainer to automatically create directory for serialization_dir (#4940) cd96d953 Sanitize set (#4945) f0ae9f3c Adding tokenizer_kwargs argument to PretrainedTransformerBackbone constructor. (#4944) 501b0ab4 Fixing papers and datasets (#4919) fa625ec0 Adding missing transformer_kwargs arg that was recently added to PretrainedTransformerEmbedder (#4941) 96ea4839 Add missing "Unreleased" section to CHANGELOG

    Source code(tar.gz)
    Source code(zip)
  • v1.4.1(Jan 29, 2021)

    What's new

    Note: This release is mainly for the AllenNLP demo.

    Changed ⚠️

    • Updated Paper and Dataset classes in ModelCard.

    Commits

    14b717c8 Update GradientDescentTrainer to automatically create directory for serialization_dir (#4940) e262352f Fixing papers and datasets (#4919)

    Source code(tar.gz)
    Source code(zip)
  • v2.0.0(Jan 27, 2021)

    AllenNLP v2.0.0 Release Notes

    The 2.0 release of AllenNLP represents a major engineering effort that brings several exciting new features to the library, as well as a focus on performance.

    If you're upgrading from AllenNLP 1.x, we encourage you to read our comprehensive upgrade guide.

    Main new features

    AllenNLP gets eyes 👀

    One of the most exciting areas in ML research is multimodal learning, and AllenNLP is now taking its first steps in this direction with support for 2 tasks and 3 datasets in the vision + text domain. Check out our ViLBERT for VQA and Visual Entailment models, along with the VQAv2, Visual Entailment, and GQA dataset readers in allennlp-models.

    Transformer toolkit

    The transformer toolkit offers a collection of modules for experimenting with various transformer architectures, such as SelfAttention, TransformerEmbeddings, TransformerLayer, etc. It also simplifies taking apart the pretrained weights of an existing transformer and recombining them in different ways. For instance, one can pull out the first 8 layers of bert-base-uncased to separately encode two text inputs, combine the representations in some way, and then use the last 4 layers on the combined representation (more examples can be found in allennlp.modules.transformer).

    The toolkit also contains modules for bimodal architectures such as ViLBERT. Modules include BiModalEncoder, which encodes two modalities separately, and performs bi-directional attention (BiModalAttention) using a connection layer (BiModalConnectionLayer). The VisionTextModel class is an example of a model that uses these bimodal layers.

    Multi-task learning

    2.0 adds support for multi-task learning throughout the AllenNLP system. In multi-task learning, the model consists of a backbone that is shared across tasks, and tends to be the larger part of the model, plus multiple task-specific heads that use the output of the backbone to make predictions for their task. This way, the backbone sees many more training examples than you might have available for a single task, and can thus produce better representations, which benefits all tasks. The canonical example is BERT, where the backbone is the transformer stack and there are multiple model heads that do classification, tagging, masked-token prediction, etc. AllenNLP 2.0 helps you build such models by giving you these abstractions: the MultiTaskDatasetReader can read datasets for multiple tasks at once, the MultiTaskDataLoader loads the instances from the reader and makes batches, and the trainer feeds these batches to a MultiTaskModel, which consists of a Backbone and multiple Heads. If you want to look at the details of how this works, we have an example config available at https://github.com/allenai/allennlp-models/blob/main/training_config/vision/vilbert_multitask.jsonnet.
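
    To make the moving pieces concrete, here is a schematic of how such an experiment config fits together, written as a Python dict rather than Jsonnet. The reader, backbone, and head type names are hypothetical placeholders, and the "homogeneous_roundrobin" scheduler name is an assumption; consult the linked vilbert_multitask.jsonnet for a real, working example.

        # Skeleton of a multi-task experiment config (placeholder component names).
        multitask_config = {
            "dataset_reader": {
                "type": "multitask",
                "readers": {
                    "task_a": {"type": "task_a_reader"},  # hypothetical reader
                    "task_b": {"type": "task_b_reader"},  # hypothetical reader
                },
            },
            "train_data_path": {
                "task_a": "/path/to/task_a/train.json",   # placeholder paths
                "task_b": "/path/to/task_b/train.json",
            },
            "data_loader": {
                "type": "multitask",
                "scheduler": {"type": "homogeneous_roundrobin", "batch_size": 16},
            },
            "model": {
                "type": "multitask",
                "backbone": {"type": "shared_backbone"},  # hypothetical backbone
                "heads": {
                    "task_a": {"type": "task_a_head"},    # hypothetical head
                    "task_b": {"type": "task_b_head"},    # hypothetical head
                },
            },
            "trainer": {"optimizer": {"type": "adam", "lr": 1e-5}, "num_epochs": 3},
        }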

    Changes since v2.0.0rc1

    Added 🎉

    • The TrainerCallback constructor accepts serialization_dir provided by Trainer. This can be useful for logger callbacks that need to store files in the run directory.
    • TrainerCallback.on_start() is fired at the start of training.
    • The TrainerCallback event methods now accept **kwargs. This makes it easier to maintain backwards compatibility of callbacks in the future; e.g., we may decide to pass the exception/traceback object to on_end() in case of failure, and older callbacks can simply ignore the argument instead of raising a TypeError.
    • Added a TensorBoardCallback which wraps the TensorBoardWriter.

    Changed ⚠️

    • TrainerCallback.on_epoch() does not fire with epoch=-1 at the start of training. Instead, TrainerCallback.on_start() should be used for these cases.
    • TensorBoardBatchMemoryUsage is converted from BatchCallback into TrainerCallback.
    • TrackEpochCallback is converted from EpochCallback into TrainerCallback.
    • Trainer can accept callbacks simply with name callbacks instead of trainer_callbacks.
    • TensorboardWriter renamed to TensorBoardWriter, and removed as an argument to the GradientDescentTrainer. In order to enable TensorBoard logging during training, you should utilize the TensorBoardCallback instead.

    Removed 👋

    • Removed EpochCallback, BatchCallback in favour of TrainerCallback. The metaclass-wrapping implementation is removed as well.
    • Removed the tensorboard_writer parameter to GradientDescentTrainer. You should use the TensorBoardCallback now instead.

    Fixed ✅

    • Now Trainer always fires TrainerCallback.on_end() so all the resources can be cleaned up properly.
    • Fixed the misspelling, changed TensoboardBatchMemoryUsage to TensorBoardBatchMemoryUsage.
    • We now set a value for epoch so that the variable is bound when TrainerCallback.on_end() fires. Previously this could have led to an error when trying to recover a run after it had finished training.

    Commits since v2.0.0rc1

    15300823 Log to TensorBoard through a TrainerCallback in GradientDescentTrainer (#4913) 8b95316b ci quick fix fa1dc7b8 Add link to upgrade guide to README (#4934) 7364da03 Fix parameter name in the documentation 00e3ff27 tick version for nightly release 67fa291c Merging vision into main (#4800) 65e50b30 Bump mypy from 0.790 to 0.800 (#4927) a7445357 fix mkdocs config (#4923) ed322eba A helper for distributed reductions (#4920) 9ab2bf03 add CUDA 10.1 Docker image (#4921) d82287e5 Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872) 4183a49c Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880) 54e85eee disable codecov annotations (#4902) 2623c4bf Making TrackEpochCallback an EpochCallback (#4893) 1d21c759 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867) ec197c3b Create pull_request_template.md (#4891) 9cf41b2f fix navbar link 9635af82 rename 'master' -> 'main' (#4887) d0a07fb3 docs: fix simple typo, multplication -> multiplication (#4883) d1f032d8 Moving modelcard and taskcard abstractions to main repo (#4881) 1fff7cae Update docker torch version (#4873) d2aea979 Fix typo in str (#4874) 6a8d425f add CombinedLearningRateScheduler (#4871) a3732d00 Fix cache volume (#4869) 832901e8 Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)

    Source code(tar.gz)
    Source code(zip)
  • v1.4.0(Jan 27, 2021)

    What's new

    Added 🎉

    • Added a FileLock class to common.file_utils. This is just like the FileLock from the filelock library, except that it adds an optional flag read_only_ok: bool, which, when set to True, changes the behavior so that a warning is emitted instead of an exception when write permissions are lacking on an existing file lock. This makes it possible to use the FileLock class on a read-only file system (see the sketch after this list).
    • Added a new learning rate scheduler: CombinedLearningRateScheduler. This can be used to combine different LR schedulers, using one after the other.
    • Added an official CUDA 10.1 Docker image.
    • Moving ModelCard and TaskCard abstractions into the main repository.
    • Added a util function allennlp.nn.util.dist_reduce(...) for handling distributed reductions. This is especially useful when implementing a distributed Metric.
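
    A minimal sketch of the read_only_ok flag (the lock path is a placeholder):

        from allennlp.common.file_utils import FileLock

        # On a read-only file system, acquiring this lock emits a warning instead
        # of raising, because read_only_ok=True.
        with FileLock("/read-only/mount/some_resource.lock", read_only_ok=True):
            pass  # read the protected resource here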

    Changed ⚠️

    • 'master' branch renamed to 'main'
    • Torch version bumped to 1.7.1 in Docker images.

    Fixed ✅

    • Fixed typo with LabelField string representation: removed trailing apostrophe.
    • Vocabulary.from_files and cached_path will issue a warning, instead of failing, when a lock on an existing resource can't be acquired because the file system is read-only.
    • TrackEpochCallback is now an EpochCallback.

    Commits

    4de78ac0 Make CI run properly on the 1.x branch 65e50b30 Bump mypy from 0.790 to 0.800 (#4927) a7445357 fix mkdocs config (#4923) ed322eba A helper for distributed reductions (#4920) 9ab2bf03 add CUDA 10.1 Docker image (#4921) d82287e5 Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872) 4183a49c Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880) 54e85eee disable codecov annotations (#4902) 2623c4bf Making TrackEpochCallback an EpochCallback (#4893) 1d21c759 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867) ec197c3b Create pull_request_template.md (#4891) 9cf41b2f fix navbar link 9635af82 rename 'master' -> 'main' (#4887) d0a07fb3 docs: fix simple typo, multplication -> multiplication (#4883) d1f032d8 Moving modelcard and taskcard abstractions to main repo (#4881) 1fff7cae Update docker torch version (#4873) d2aea979 Fix typo in str (#4874) 6a8d425f add CombinedLearningRateScheduler (#4871) a3732d00 Fix cache volume (#4869) 832901e8 Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)

    Source code(tar.gz)
    Source code(zip)
  • v2.0.0rc1(Jan 22, 2021)

    This is the first (and hopefully only) release candidate for AllenNLP 2.0. Please note that this is a release candidate, and the APIs are still subject to change until the final 2.0 release. We'll provide a detailed writeup with the final 2.0 release, including a migration guide. In the meantime, here are the headline features of AllenNLP 2.0:

    • Support for models that combine language and vision features
    • Transformer Toolkit, a suite of classes and components that make it easy to experiment with transformer architectures
    • A framework for multitask training
    • Revamped data loading, for improved performance and flexibility

    What's new

    Added 🎉

    • Added TensorCache class for caching tensors on disk
    • Added abstraction and concrete implementation for image loading
    • Added abstraction and concrete implementation for GridEmbedder
    • Added abstraction and demo implementation for an image augmentation module.
    • Added abstraction and concrete implementation for region detectors.
    • A new high-performance default DataLoader: MultiProcessDataLoader.
    • A MultiTaskModel and abstractions to use with it, including Backbone and Head. The MultiTaskModel first runs its inputs through the Backbone, then passes the result (and whatever other relevant inputs it got) to each Head that's in use.
    • A MultiTaskDataLoader, with a corresponding MultiTaskDatasetReader, and a couple of new configuration objects: MultiTaskEpochSampler (for deciding what proportion to sample from each dataset at every epoch) and a MultiTaskScheduler (for ordering the instances within an epoch).
    • Transformer toolkit to plug and play with modular components of transformer architectures.
    • Added a command to count the number of instances we're going to be training with
    • Added a FileLock class to common.file_utils. This is just like the FileLock from the filelock library, except that it adds an optional flag read_only_ok: bool, which when set to True changes the behavior so that a warning will be emitted instead of an exception when lacking write permissions on an existing file lock. This makes it possible to use the FileLock class on a read-only file system.
    • Added a new learning rate scheduler: CombinedLearningRateScheduler. This can be used to combine different LR schedulers, using one after the other.
    • Added an official CUDA 10.1 Docker image.
    • Moving ModelCard and TaskCard abstractions into the main repository.
    • Added a util function allennlp.nn.util.dist_reduce(...) for handling distributed reductions. This is especially useful when implementing a distributed Metric.

    Changed ⚠️

    • DatasetReaders are now always lazy. This means there is no lazy parameter in the base class, and the _read() method should always be a generator (a minimal sketch follows this list).
    • The DataLoader now decides whether to load instances lazily or not. With the PyTorchDataLoader this is controlled with the lazy parameter, but with the MultiProcessDataLoader this is controlled by the max_instances_in_memory setting.
    • ArrayField is now called TensorField, and implemented in terms of torch tensors, not numpy.
    • Improved nn.util.move_to_device function by avoiding an unnecessary recursive check for tensors and adding a non_blocking optional argument, which is the same argument as in torch.Tensor.to().
    • If you are trying to create a heterogeneous batch, you now get a better error message.
    • Readers using the new vision features now explicitly log how they are featurizing images.
    • master_addr and master_port renamed to primary_addr and primary_port, respectively.
    • is_master parameter for training callbacks renamed to is_primary.
    • master branch renamed to main
    • Torch version bumped to 1.7.1 in Docker images.
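
    Since _read() must now be a generator, a minimal reader looks roughly like the sketch below. The registered name and the one-sentence-per-line format are made up for illustration.

        from typing import Iterable

        from allennlp.data import DatasetReader, Instance
        from allennlp.data.fields import TextField
        from allennlp.data.token_indexers import SingleIdTokenIndexer
        from allennlp.data.tokenizers import WhitespaceTokenizer


        @DatasetReader.register("one_sentence_per_line")  # hypothetical name
        class OneSentencePerLineReader(DatasetReader):
            def __init__(self, **kwargs) -> None:
                super().__init__(**kwargs)
                self.tokenizer = WhitespaceTokenizer()
                self.token_indexers = {"tokens": SingleIdTokenIndexer()}

            def _read(self, file_path: str) -> Iterable[Instance]:
                # _read is a generator: instances are yielded lazily, one per line,
                # and the DataLoader decides how many to keep in memory.
                with open(file_path) as data_file:
                    for line in data_file:
                        tokens = self.tokenizer.tokenize(line.strip())
                        yield Instance({"text": TextField(tokens, self.token_indexers)})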

    Removed 👋

    • Removed nn.util.has_tensor.

    Fixed ✅

    • The build-vocab command no longer crashes when the resulting vocab file is in the current working directory.
    • Fixed typo with LabelField string representation: removed trailing apostrophe.
    • Vocabulary.from_files and cached_path will issue a warning, instead of failing, when a lock on an existing resource can't be acquired because the file system is read-only.
    • TrackEpochCallback is now an EpochCallback.

    Commits

    9a4a424d Moves vision models to allennlp-models (#4918) 412896bc fix merge conflicts ed322eba A helper for distributed reductions (#4920) 9ab2bf03 add CUDA 10.1 Docker image (#4921) d82287e5 Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872) 54973947 Multitask example (#4898) 0f00d4d4 resolve _read type (#4916) 5229da83 Toolkit decoder (#4914) 4183a49c Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880) d7c9eab3 improve worker error handling in MultiProcessDataLoader (#4912) 94dd9cc7 rename 'master' -> 'primary' for distributed training (#4910) c9585afd fix imports in file_utils 03c7ffb5 Merge branch 'main' into vision effcc4e5 improve data loading docs (#4909) 2f545701 remove PyTorchDataLoader, add SimpleDataLoader for testing (#4907) 31ec6a59 MultiProcessDataLoader takes PathLike data_path (#4908) 5e3757b4 rename 'multi_process_*' -> 'multiprocess' for consistency (#4906) df36636e Data loading cuda device (#4879) aedd3be1 Toolkit: Cleaning up TransformerEmbeddings (#4900) 54e85eee disable codecov annotations (#4902) 2623c4bf Making TrackEpochCallback an EpochCallback (#4893) 1d21c759 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867) ec197c3b Create pull_request_template.md (#4891) 15d32da1 Make GQA work (#4884) fbab0bd9 import MultiTaskDataLoader to data_loaders/init.py (#4885) d1cc1469 Merge branch 'main' into vision abacc01b Adding f1 score (#4890) 9cf41b2f fix navbar link 9635af82 rename 'master' -> 'main' (#4887) d0a07fb3 docs: fix simple typo, multplication -> multiplication (#4883) d1f032d8 Moving modelcard and taskcard abstractions to main repo (#4881) f62b819f Make images easier to find for Visual Entailment (#4878) 1fff7cae Update docker torch version (#4873) 7a7c7ea8 Only cache, no featurizing (#4870) d2aea979 Fix typo in str (#4874) 1c72a302 Merge branch 'master' into vision 6a8d425f add CombinedLearningRateScheduler (#4871) 85d38ff6 doc fixes c4e3f77f Switch to torchvision for vision components 👀, simplify and improve MultiProcessDataLoader (#4821) 3da8e622 Merge branch 'master' into vision a3732d00 Fix cache volume (#4869) 832901e8 Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854) 147fefe6 Merge branch 'master' into vision 87e35360 Make tests work again (#4865) d16a5c78 Merge remote-tracking branch 'origin/master' into vision 457e56ef Merge branch 'master' into vision c8521d80 Toolkit: Adding documentation and small changes for BiModalAttention (#4859) ddbc7404 gqa reader fixes during vilbert training (#4851) 50e50df6 Generalizing transformer layers (#4776) 52fdd755 adding multilabel option (#4843) 78871195 Other VQA datasets (#4834) e729e9a4 Added GQA reader (#4832) 52e9dd92 Visual entailment model code (#4822) 01f3a2db Merge remote-tracking branch 'origin/master' into vision 3be6c975 SNLI_VE dataset reader (#4799) b659e665 VQAv2 (#4639) c787230c Merge remote-tracking branch 'origin/master' into vision db2d1d38 Merge branch 'master' into vision 6bf19246 Merge branch 'master' into vision 167bcaae remove vision push trigger 75914650 Merge remote-tracking branch 'origin/master' into vision 22d4633c improve independence of vision components (#4793) 98018cca fix merge conflicts c7803150 fix merge conflicts 5d22ce69 Merge remote-tracking branch 'origin/master' into vision 602399c0 update with master ffafaf64 Multitask data loading and scheduling (#4625) 7c47c3a5 Merge branch 'master' into vision 12c8d1bf Generalizing 
self attention (#4756) 63f61f0c Merge remote-tracking branch 'origin/master' into vision b48347be Merge remote-tracking branch 'origin/master' into vision 81892db4 fix failing tests 98edd253 update torch requirement 8da35081 update with master cc53afec separating TransformerPooler as a new module (#4730) 4ccfa885 Transformer toolkit: BiModalEncoder now has separate num_attention_heads for both modalities (#4728) 91631ef9 Transformer toolkit (#4577) 677a9cec Merge remote-tracking branch 'origin/master' into vision 2985236f This should have been part of the previously merged PR c5d264ae Detectron NLVR2 (#4481) e39a5f62 Merge remote-tracking branch 'origin/master' into vision f1e46fdc Add MultiTaskModel (#4601) fa22f731 Merge remote-tracking branch 'origin/master' into vision 41872ae4 Merge remote-tracking branch 'origin/master' into vision f886fd06 Merge remote-tracking branch 'origin/master' into vision 191b641e make existing readers work with multi-process loading (#4597) d7124d4b fix len calculation for new data loader (#4618) 87463612 Merge branch 'master' into vision 319794a1 remove duplicate padding calculations in collate fn (#4617) de9165e1 rename 'node_rank' to 'global_rank' in dataset reader 'DistributedInfo' (#4608) 3d114197 Formatting updates for new version of black (#4607) cde06e62 Changelog 1b08fd62 ensure models check runs on right branch 44c8791c ensure vision CI runs on each commit (#4582) 95e82532 Merge branch 'master' into vision e74a7365 new data loading (#4497) 6f820050 Merge remote-tracking branch 'origin/master' into vision a7d45de1 Initializing a VilBERT model from a pre-trained transformer (#4495) 3833f7a5 Merge branch 'master' into vision 71d7cb4e Merge branch 'master' into vision 31379611 Merge remote-tracking branch 'origin/master' into vision 6cc508d7 Merge branch 'master' into vision f87df839 Merge remote-tracking branch 'origin/master' into vision 0bbe84b4 An initial VilBERT model for NLVR2 (#4423)

    Source code(tar.gz)
    Source code(zip)
  • v1.3.0(Dec 15, 2020)

    What's new

    Added 🎉

    • Added links to source code in docs.
    • Added get_embedding_layer and get_text_field_embedder to the Predictor class, to allow specifying embedding layers for non-AllenNLP models.
    • Added Gaussian Error Linear Unit (GELU) as an Activation.

    Changed ⚠️

    • Renamed module allennlp.data.tokenizers.token to allennlp.data.tokenizers.token_class to avoid this bug.
    • transformers dependency updated to version 4.0.1.

    Fixed ✅

    • Fixed a lot of instances where tensors were first created and then sent to a device with .to(device). Instead, these tensors are now created directly on the target device.
    • Fixed issue with GradientDescentTrainer when constructed with validation_data_loader=None and learning_rate_scheduler!=None.
    • Fixed a bug when removing all handlers in root logger.
    • ShardedDatasetReader now inherits parameters from base_reader when required.
    • Fixed an issue in FromParams where parameters in the params object used to a construct a class were not passed to the constructor if the value of the parameter was equal to the default value. This caused bugs in some edge cases where a subclass that takes **kwargs needs to inspect kwargs before passing them to its superclass.
    • Improved the band-aid solution for segmentation faults and the "ImportError: dlopen: cannot load any more object with static TLS" by adding a transformers import.
    • Added safety checks for extracting tar files

    Commits

    d408f416 log import errors for default plugins (#4866) f2a53310 Adds a safety check for tar files (#4858) 84a36a06 Update transformers requirement from <3.6,>=3.4 to >=4.0,<4.1 (#4831) fdad31aa Add ability to specify the embedding layer if the model does not use TextFieldEmbedder (#4836) 41c52245 Improve the band-aid solution for seg faults and the static TLS error (#4846) 63b6d163 fix FromParams bug (#4841) 6c3238ec rename token.py -> token_class.py (#4842) cec92098 Several micro optimizations (#4833) 48a48652 Add GELU activation (#4828) 3e623658 Bugfix for attribute inheritance in ShardedDatasetReader (#4830) 458c4c2b fix the way handlers are removed from the root logger (#4829) 5b306585 Fix bug in GradientDescentTrainer when validation data is absent (#4811) f353c6ce add link to source code in docs (#4807) 0a832713 No Docker auth on PRs (#4802) ad8e8a09 no ssh setup on PRs (#4801)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.2(Nov 17, 2020)

    What's new

    Added 🎉

    • Added Docker builds for other torch-supported versions of CUDA.
    • Added allennlp-semparse as an official, default plugin.

    Fixed ✅

    • GumbelSampler now sorts the beams by their true log prob.

    Commits

    023d9bcc Prepare for release v1.2.2 7b0826c1 push commit images for both CUDA versions 3cad5b41 fix AUC test (#4795) efde092d upgrade ssh-agent action (#4797) ec37dd46 Docker builds for other CUDA versions, improve CI (#4796) 0d8873cf doc link quickfix e4cc95ce improve plugin section in README (#4789) d99f7f8a ensure Gumbel sorts beams by true log prob (#4786) 9fe8d900 Makes the transformer cache work with custom kwargs (#4781) 1e7492d7 Update transformers requirement from <3.5,>=3.4 to >=3.4,<3.6 (#4784) f27ef38b Fixes pretrained embeddings for transformers that don't have end tokens (#4732)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.1(Nov 11, 2020)

    What's new

    Added 🎉

    • Added an optional seed parameter to ModelTestCase.set_up_model which sets the random seed for random, numpy, and torch.
    • Added support for a global plugins file at ~/.allennlp/plugins.
    • Added more documentation about plugins.
    • Added sampler class and parameter in beam search for non-deterministic search, with several implementations, including MultinomialSampler, TopKSampler, TopPSampler, and GumbelMaxSampler. Utilizing GumbelMaxSampler will give Stochastic Beam Search.
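
    A small sketch of plugging a sampler into beam search; it assumes the sampler classes live alongside BeamSearch in allennlp.nn.beam_search, and the end-symbol id is a toy value.

        from allennlp.nn.beam_search import BeamSearch, TopPSampler

        # Sample from the top-p (nucleus) of each next-token distribution
        # instead of always expanding the arg-max candidates.
        beam_search = BeamSearch(
            end_index=4,  # toy end-symbol id
            beam_size=5,
            sampler=TopPSampler(p=0.9),
        )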

    Changed ⚠️

    • Pass batch metrics to BatchCallback.

    Fixed ✅

    • Fixed a bug where forward hooks were not cleaned up with saliency interpreters if there was an exception.
    • Fixed the computation of saliency maps in the Interpret code when using mismatched indexing. Previously, we would compute gradients from the top of the transformer, after aggregation from wordpieces to tokens, which gives results that are not very informative. Now, we compute gradients with respect to the embedding layer, and aggregate wordpieces to tokens separately.
    • Fixed the heuristics for finding embedding layers in the case of RoBERTa. An update in the transformers library broke our old heuristic.
    • Fixed typo with registered name of ROUGE metric. Previously was rogue, fixed to rouge.
    • Fixed default masks that were erroneously created on the CPU even when a GPU is available.

    Commits

    04247faa support global plugins file, improve plugins docs (#4779) 9f7cc248 Add sampling strategies to beam search (#4768) f6fe8c6d pin urllib3 in dev reqs for responses (#4780) 764bbe2e Pass batch metrics to BatchCallback (#4764) dc3a4f67 clean up forward hooks on exception (#4778) fcc3a70b Fix: typo in metric, rogue -> rouge (#4777) b89320cd Set the device for an auto-created mask (#4774) 92a844a7 RoBERTa embeddings are no longer a type of BERT embeddings (#4771) 23f0a8a6 Ensure cnn_encoder respects masking (#4746) b4f1a7ab add seed option to ModelTestCase.set_up_model (#4769) b7cec515 Made Interpret code handle mismatched cases better (#4733) 9759b15f allow TextFieldEmbedder to have EmptyEmbedder that may not be in input (#4761)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.0(Oct 29, 2020)

    What's new

    Changed ⚠️

    • Enforced stricter typing requirements around the use of Optional[T] types.
    • Changed the behavior of Lazy types in from_params methods. Previously, if you defined a Lazy parameter like foo: Lazy[Foo] = None in a custom from_params classmethod, then foo would actually never be None. This behavior is now different. If no params were given for foo, it will be None. You can also now set default values for foo like foo: Lazy[Foo] = Lazy(Foo). Or, if you want a default value but also want to allow for None values, you can write it like this: foo: Optional[Lazy[Foo]] = Lazy(Foo). (See the sketch after this list.)
    • Added support for PyTorch version 1.7.
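
    A minimal sketch of the new default behavior for Lazy parameters:

        from allennlp.common import FromParams, Lazy, Params


        class Foo(FromParams):
            def __init__(self, a: int = 1) -> None:
                self.a = a


        class Bar(FromParams):
            # Because the default is Lazy(Foo), omitting "foo" from the params
            # yields a usable Lazy[Foo] rather than None.
            def __init__(self, foo: Lazy[Foo] = Lazy(Foo)) -> None:
                self.foo = foo.construct()


        bar = Bar.from_params(Params({}))
        print(bar.foo.a)  # 1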

    Fixed ✅

    • Made it possible to instantiate TrainerCallback from config files.
    • Fixed the remaining broken internal links in the API docs.
    • Fixed a bug where Hotflip would crash with a model that had multiple TokenIndexers and the input used rare vocabulary items.
    • Fixed a bug where BeamSearch would fail if max_steps was equal to 1.

    Commits

    7f85c74e fix docker build (#4762) cc9ac0f2 ensure dataclasses not installed in CI (#4754) 812ac570 Fix hotflip bug where vocab items were not re-encoded correctly (#4759) aeb6d362 revert samplers and fix bug when max_steps=1 (#4760) baca7545 Make returning token type id default in transformers intra word tokenization. (#4758) 5d6670ce Update torch requirement from <1.7.0,>=1.6.0 to >=1.6.0,<1.8.0 (#4753) 0ad228d4 a few small doc fixes (#4752) 71a98c2a stricter typing for Optional[T] types, improve handling of Lazy params (#4743) 27edfbf8 Add end+trainer callbacks to Trainer.from_partial_objects (#4751) b792c834 Fix device mismatch bug for categorical accuracy metric in distributed training (#4744)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.0rc1(Oct 22, 2020)

    What's new

    Added 🎉

    • Added a warning when batches_per_epoch for the validation data loader is inherited from the train data loader.
    • Added a build-vocab subcommand that can be used to build a vocabulary from a training config file.
    • Added tokenizer_kwargs argument to PretrainedTransformerMismatchedIndexer.
    • Added tokenizer_kwargs and transformer_kwargs arguments to PretrainedTransformerMismatchedEmbedder.
    • Added official support for Python 3.8.
    • Added a script: scripts/release_notes.py, which automatically prepares markdown release notes from the CHANGELOG and commit history.
    • Added a flag --predictions-output-file to the evaluate command, which tells AllenNLP to write the predictions from the given dataset to the file as JSON lines.
    • Added the ability to ignore certain missing keys when loading a model from an archive. This is done by adding a class-level variable called authorized_missing_keys to any PyTorch module that a Model uses. If defined, authorized_missing_keys should be a list of regex string patterns.
    • Added FBetaMultiLabelMeasure, a multi-label Fbeta metric. This is a subclass of the existing FBetaMeasure.
    • Added the ability to pass additional keyword arguments to cached_transformers.get(), which will be passed on to AutoModel.from_pretrained().
    • Added an overrides argument to Predictor.from_path() (see the sketch after this list).
    • Added a cached-path command.
    • Added a function inspect_cache to common.file_utils that prints useful information about the cache. This can also be used from the cached-path command with allennlp cached-path --inspect.
    • Added a function remove_cache_entries to common.file_utils that removes any cache entries matching the given glob patterns. This can be used from the cached-path command with allennlp cached-path --remove some-files-*.
    • Added logging for the main process when running in distributed mode.
    • Added a TrainerCallback object to support state sharing between batch and epoch-level training callbacks.
    • Added support for .tar.gz in PretrainedModelInitializer.
    • Added classes: nn/samplers/samplers.py with MultinomialSampler, TopKSampler, and TopPSampler for sampling indices from log probabilities
    • Made BeamSearch registrable.
    • Added top_k_sampling and top_p_sampling BeamSearch implementations.
    • Pass serialization_dir to Model and DatasetReader.
    • Added an optional include_in_archive parameter to the top-level of configuration files. When specified, include_in_archive should be a list of paths relative to the serialization directory which will be bundled up with the final archived model from a training run.
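
    A hedged sketch of the new overrides argument to Predictor.from_path(); the archive path and the overridden key are placeholders.

        from allennlp.predictors import Predictor

        # Load an archived model but override part of its config at load time.
        predictor = Predictor.from_path(
            "/path/to/model.tar.gz",
            overrides={"dataset_reader": {"max_instances": 100}},
        )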

    Changed ⚠️

    • Subcommands that don't require plugins will no longer cause plugins to be loaded or have an --include-package flag.
    • Allow overrides to be JSON string or dict.
    • transformers dependency updated to version 3.1.0.
    • When cached_path is called on a local archive with extract_archive=True, the archive is now extracted into a unique subdirectory of the cache root instead of a subdirectory of the archive's directory. The extraction directory is also unique to the modification time of the archive, so if the file changes, subsequent calls to cached_path will know to re-extract the archive.
    • Removed the truncation_strategy parameter to PretrainedTransformerTokenizer. The way we're calling the tokenizer, the truncation strategy takes no effect anyways.
    • Don't use initializers when loading a model, as it is not needed.
    • Distributed training will now automatically search for a local open port if the master_port parameter is not provided.
    • In training, save model weights before evaluation.
    • allennlp.common.util.peak_memory_mb renamed to peak_cpu_memory, and allennlp.common.util.gpu_memory_mb renamed to peak_gpu_memory, and they both now return the results in bytes as integers. Also, the peak_gpu_memory function now utilizes PyTorch functions to find the memory usage instead of shelling out to the nvidia-smi command. This is more efficient and also more accurate because it only takes into account the tensor allocations of the current PyTorch process.
    • Make sure weights are first loaded to the cpu when using PretrainedModelInitializer, preventing wasted GPU memory.
    • Load dataset readers in load_archive.
    • Updated AllenNlpTestCase docstring to remove reference to unittest.TestCase

    Removed 👋

    • Removed common.util.is_master function.

    Fixed ✅

    • Fixed a bug where the reported batch_loss metric was incorrect when training with gradient accumulation.
    • Class decorators now displayed in API docs.
    • Fixed up the documentation for the allennlp.nn.beam_search module.
    • Ignore *args when constructing classes with FromParams.
    • Ensured some consistency in the types of the values that metrics return.
    • Fix a PyTorch warning by explicitly providing the as_tuple argument (leaving it as its default value of False) to Tensor.nonzero().
    • Remove temporary directory when extracting model archive in load_archive at end of function rather than via atexit.
    • Fixed a bug where using cached_path() offline could return a cached resource's lock file instead of the cache file.
    • Fixed a bug where cached_path() would fail if passed a cache_dir with the user home shortcut ~/.
    • Fixed a bug in our doc building script where markdown links did not render properly if the "href" part of the link (the part inside the ()) was on a new line.
    • Changed how gradients are zeroed out with an optimization. See this video from NVIDIA at around the 9 minute mark.
    • Fixed a bug where parameters to a FromParams class that are dictionaries wouldn't get logged when an instance is instantiated from_params.
    • Fixed a bug in distributed training where the vocab would be saved from every worker, when it should have been saved by only the local master process.
    • Fixed a bug in the calculation of rouge metrics during distributed training where the total sequence count was not being aggregated across GPUs.
    • Fixed allennlp.nn.util.add_sentence_boundary_token_ids() to use device parameter of input tensor.
    • Be sure to close the TensorBoard writer even when training doesn't finish.
    • Fixed the docstring for PyTorchSeq2VecWrapper.

    Commits

    01644caf Pass serialization_dir to Model, DatasetReader, and support include_in_archive (#4713) 1f29f352 Update transformers requirement from <3.4,>=3.1 to >=3.1,<3.5 (#4741) 6bb9ce9a warn about batches_per_epoch with validation loader (#4735) 00bb6c59 Be sure to close the TensorBoard writer (#4731) 3f23938b Update mkdocs-material requirement from <6.1.0,>=5.5.0 to >=5.5.0,<6.2.0 (#4738) 10c11cea Fix typo in PretrainedTransformerMismatchedEmbedder docstring (#4737) 0e64b4d3 fix docstring for PyTorchSeq2VecWrapper (#4734) 006bab48 Don't use PretrainedModelInitializer when loading a model (#4711) ce14bdc0 Allow usage of .tar.gz with PretrainedModelInitializer (#4709) c14a056d avoid defaulting to CPU device in add_sentence_boundary_token_ids() (#4727) 24519fd9 fix typehint on checkpointer method (#4726) d3c69f75 Bump mypy from 0.782 to 0.790 (#4723) cccad29a Updated AllenNlpTestCase docstring (#4722) 3a85e359 add reasonable timeout to gpu checks job (#4719) 1ff0658c Added logging for the main process when running in distributed mode (#4710) b099b69c Add top_k and top_p sampling to BeamSearch (#4695) bc6f15ac Fixes rouge metric calculation corrected for distributed training (#4717) ae7cf85b automatically find local open port in distributed training (#4696) 321d4f48 TrainerCallback with batch/epoch/end hooks (#4708) 001e1f76 new way of setting env variables in GH Actions (#4700) c14ea40e Save checkpoint before running evaluation (#4704) 40bb47ad Load weights to cpu with PretrainedModelInitializer (#4712) 327188b8 improve memory helper functions (#4699) 90f00379 fix reported batch_loss (#4706) 39ddb523 CLI improvements (#4692) edcb6d34 Fix a bug in saving vocab during distributed training (#4705) 3506e3fd ensure parameters that are actual dictionaries get logged (#4697) eb7f2568 Add StackOverflow link to README (#4694) 17c3b84b Fix small typo (#4686) e0b2e265 display class decorators in API docs (#4685) b9a92842 Update transformers requirement from <3.3,>=3.1 to >=3.1,<3.4 (#4684) d9bdaa95 add build-vocab command (#4655) ce604f1f Update mkdocs-material requirement from <5.6.0,>=5.5.0 to >=5.5.0,<6.1.0 (#4679) c3b5ed74 zero grad optimization (#4673) 9dabf3fa Add missing tokenizer/transformer kwargs (#4682) 9ac6c76c Allow overrides to be JSON string or dict (#4680) 55cfb47b The truncation setting doesn't do anything anymore (#4672) 990c9c17 clarify conda Python version in README.md 97db5387 official support for Python 3.8 🐍 (#4671) 1e381bb0 Clean up the documentation for beam search (#4664) 11def8ea Update bug_report.md 97fe88d2 Cached path command (#4652) c9f376bf Update transformers requirement from <3.2,>=3.1 to >=3.1,<3.3 (#4663) e5e3d020 tick version for nightly releases b833f905 fix multi-line links in docs (#4660) d7c06fe7 Expose from_pretrained keyword arguments (#4651) 175c76be fix confusing distributed logging info (#4654) fbd2ccca fix numbering in RELEASE_GUIDE 2d5f24bd improve how cached_path extracts archives (#4645) 824f97d4 smooth out release process (#4648) c7b7c008 Feature/prevent temp directory retention (#4643) de5d68bc Fix tensor.nonzero() function overload warning (#4644) e8e89d5a add flag for saving predictions to 'evaluate' command (#4637) e4fd5a0c Multi-label F-beta metric (#4562) f0e7a78c Create Dependabot config file (#4635) 0e33b0ba Return consistent types from metrics (#4632) 2df364ff Update transformers requirement from <3.1,>=3.0 to >=3.0,<3.2 (#4621) 6d480aae Improve handling of **kwargs in FromParams (#4629) bf3206a2 Workaround for Python not finding imports in 
spawned processes (#4630)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0(Sep 8, 2020)

    Highlights

    Version 1.1 was mainly focused on bug fixes, but there are a few important new features, such as gradient checkpointing with pretrained transformer embedders and official support for automatic mixed precision (AMP) training through PyTorch's native torch.cuda.amp module.

    Details

    Added

    • Predictor.capture_model_internals() now accepts a regex specifying which modules to capture.
    • Added the option to specify requires_grad: false within an optimizer's parameter groups.
    • Added the file-friendly-logging flag back to the train command. Also added this flag to the predict, evaluate, and find-learning-rate commands.
    • Added an EpochCallback to track current epoch as a model class member.
    • Added the option to enable or disable gradient checkpointing for transformer token embedders via boolean parameter gradient_checkpointing.
    • Added a method to ModelTestCase for running basic model tests when you aren't using config files.
    • Added some convenience methods for reading files.
    • cached_path() can now automatically extract and read files inside of archives.
    • Added the ability to pass an archive file instead of a local directory to Vocab.from_files.
    • Added the ability to pass an archive file instead of a glob to ShardedDatasetReader.
    • Added a new "linear_with_warmup" learning rate scheduler.
    • Added a check in ShardedDatasetReader that ensures the base reader doesn't implement manual distributed sharding itself.
    • Added an option to PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set last_layer_only to False (see the sketch after this list).
    • Training metrics now include batch_loss and batch_reg_loss in addition to the loss aggregated over all batches.
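
    As a quick illustration of two of the options above (the scalar mix via last_layer_only, and gradient checkpointing), here is a minimal sketch. The constructor argument names are taken from these notes and should be checked against your installed version:

    from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

    # Sketch only: both keyword arguments are described in the "Added" list above.
    embedder = PretrainedTransformerEmbedder(
        "bert-base-uncased",
        last_layer_only=False,        # use a scalar mix of all hidden layers
        gradient_checkpointing=True,  # trade extra compute for lower memory use
    )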

    Changed

    • Upgraded PyTorch requirement to 1.6.
    • Beam search now supports multi-layer decoders.
    • Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (GradientDescentTrainer) now takes a use_amp: bool parameter instead of the old opt_level: str parameter.
    • Not specifying a cuda_device now automatically determines whether to use a GPU or not.
    • Discovered plugins are logged so you can see what was loaded.
    • allennlp.data.DataLoader is now an abstract registrable class. The default implementation remains the same, but was renamed to allennlp.data.PyTorchDataLoader.
    • BertPooler can now unwrap and re-wrap extra dimensions if necessary.

    Removed

    • Removed the opt_level parameter to Model.load and load_archive. In order to use AMP with a loaded model now, just run the model's forward pass within torch's autocast context.
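
    For example, here is a minimal sketch of running inference under AMP with a loaded model (the archive path is a placeholder, and torch >= 1.6 is assumed):

    import torch
    from allennlp.models.archival import load_archive

    archive = load_archive("model.tar.gz", cuda_device=0)  # placeholder path
    model = archive.model.eval()

    # Instead of the removed opt_level, wrap the forward pass in torch's autocast.
    # `batch` stands in for a dict of input tensors you have already prepared.
    with torch.no_grad(), torch.cuda.amp.autocast():
        outputs = model(**batch)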

    Fixed

    • Fixed handling of some edge cases when constructing classes with FromParams where the class accepts **kwargs.
    • Fixed division by zero error when there are zero-length spans in the input to a PretrainedTransformerMismatchedIndexer.
    • Improved robustness of cached_path when extracting archives so that the cache won't be corrupted if a failure occurs during extraction.
    • Fixed a bug with the average and evalb_bracketing_score metrics in distributed training.
    • Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.
    • Fixed how truncation was handled with PretrainedTransformerTokenizer. Previously, if max_length was set to None, the tokenizer would still do truncation if the transformer model had a default max length in its config. Also, when max_length was set to a non-None value, several warnings would appear for certain transformer models around the use of the truncation parameter.
    • Fixed evaluation of all metrics when using distributed training.
    • Added a py.typed marker. Fixed type annotations in allennlp.training.util.
    • Fixed problem with automatically detecting whether tokenization is necessary. This affected primarily the Roberta SST model.
    • Improved help text for using the --overrides command line flag.
    • Removed unnecessary warning about deadlocks in DataLoader.
    • Fixed testing models that only return a loss when they are in training mode.
    • Fixed a bug in FromParams that caused silent failure in case of the parameter type being Optional[Union[...]].
    • Fixed a bug where the program crashes if evaluation_data_loader is a AllennlpLazyDataset.
    • Reduced the amount of log messages produced by allennlp.common.file_utils.
    • Fixed a bug where PretrainedTransformerEmbedder parameters appeared to be trainable in the log output even when train_parameters was set to False.
    • Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances in distributed training.
    • Fixed checking equality of ArrayFields.
    • Fixed a bug where NamespaceSwappingField did not work correctly with .empty_field().
    • Put more sensible defaults on the huggingface_adamw optimizer.
    • Simplified logging so that all logging output always goes to one file.
    • Fixed interaction with the python command line debugger.
    • Log the grad norm properly even when we're not clipping it.
    • Fixed a bug where PretrainedModelInitializer failed to initialize a model with a 0-dim tensor.
    • Fixed a bug with the layer unfreezing schedule of the SlantedTriangular learning rate scheduler.
    • Fixed a regression with logging in the distributed setting. Only the main worker should write log output to the terminal.
    • Pinned the version of boto3 for package managers (e.g. poetry).
    • Fixed issue #4330 by updating the tokenizers dependency.
    • Fixed a bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader in case it does not have a tokenizer.
    • reg_loss is now only returned for models that have some regularization penalty configured.
    • Fixed a bug that prevented cached_path from downloading assets from GitHub releases.
    • Fixed a bug that erroneously increased last label's false positive count in calculating fbeta metrics.
    • Tqdm output now looks much better when the output is being piped or redirected.
    • Small improvements to how the API documentation is rendered.
    • Only show validation progress bar from main process in distributed training.

    Commits

    dcc9cdc7 Prepare for release v1.1.0 aa750bec fix Average metric (#4624) e1aa57cf improve robustness of cached_path when extracting archives (#4622) 711afaa7 Fix division by zero when there are zero-length spans in MismatchedEmbedder. (#4615) be97943a Improve handling of **kwargs in FromParams (#4616) 187b24e5 add more tutorial links to README (#4613) e840a589 s/logging/logger/ (#4609) dbc3c3ff Added batched versions of scatter and fill to util.py (#4598) 2c54cf8b reformat for new version of black (#4605) 2dd335e4 batched_span_select now guarantees element order in each span (#4511) 62f554ff specify module names by a regex in predictor.capture_model_internals() (#4585) f464aa38 Bump markdown-include from 0.5.1 to 0.6.0 (#4586) d01cdff9 Update RELEASE_PROCESS.md to include allennlp-models (#4587) 3aedac97 Prepare for release v1.1.0rc4 87a61ad9 Bug fix in distributed metrics (#4570) 71a9a90d upgrade actions to [email protected] (#4573) bd9ee6a4 Give better usage info for overrides parameter (#4575) 0a456a75 Fix boolean and categorical accuracy for distributed (#4568) 85112746 add actions workflow for closing stale issues (#4561) de413065 Static type checking fixes (#4545) 5a07009b Fix RoBERTa SST (#4548) 351941f3 Only pin mkdocs-material to minor version, ignore specific patch version (#4556) 0ac13a4f fix CHANGELOG 3b86f588 Prepare for release v1.1.0rc3 44d28476 Metrics in distributed setting (#4525) 1d619659 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547) 5b977809 tick version for nightly releases b32608e3 add gradient checkpointing for transformer token embedders (#4544) f639336a Fix logger being created twice (#4538) 660fdaf2 Fix handling of max length with transformer tokenizers (#4534) 15e288f5 EpochCallBack for tracking epoch (#4540) 9209bc91 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533) bfecdc3e Ensure len(self.evaluation_data_loader) is not called (#4531) 5bc3b732 Fix typo in warning in file_utils (#4527) e80d7687 pin torch >= 1.6 73220d71 Prepare for release v1.1.0rc2 9415350d Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519) 146bd9ee Remove link to self-attention modules. (#4512) 24012823 add back file-friendly-logging flag (#4509) 54e5c83e closes #4494 (#4508) fa39d498 ensure call methods are rendered in docs (#4522) e53d1858 Bug fix for case when param type is Optional[Union...] 
(#4510) 14f63b77 Make sure we have a bool tensor where we expect one (#4505) 18a4eb34 add a requires_grad option to param groups (#4502) 6c848dfb Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507) d73f8a91 More BART changes (#4500) 1cab3bfe Update beam_search.py (#4462) 478bf46c remove deadlock warning in DataLoader (#4487) 714334ad Fix reported loss: Bug fix in batch_loss (#4485) db20b1fb use longer tqdm intervals when output being redirected (#4488) 53eeec10 tick version for nightly releases d693cf1c PathLike (#4479) 2f878322 only show validation progress bar from main process (#4476) 9144918d Fix reported loss (#4477) 5c970833 fix release link in CHANGELOG and formatting in README 4eb97953 Prepare for release v1.1.0rc1 f195440b update 'Models' links in README (#4475) 9c801a3c add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472) 69d2f03d Clean up Tqdm bars when output is being piped or redirected (#4470) 7b188c93 fixed bug that erronously increased last label's false positive count (#4473) 64db027d Skip ETag check if OSError (#4469) b9d011ef More BART changes (#4468) 7a563a8f add option to use scalar mix of all transformer layers (#4460) d00ad668 Minor tqdm and logging clean up (#4448) 6acf2058 Fix regloss logging (#4449) 8c32ddfd Fixing bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader (#4456) b9a91646 Update transformers requirement from <2.12,>=2.10 to >=2.10,<3.1 (#4446) 181ef5d2 pin boto3 to resolve some dependency issues (#4453) c75a1ebd ensure base reader of ShardedDatasetReader doesn't implement sharding itself (#4454) 8a05ad43 Update CONTRIBUTING.md (#4447) 5b988d63 ensure only rank 0 worker writes to terminal (#4445) 8482f022 fix bug with SlantedTriangular LR scheduler (#4443) e46a578e Update transformers requirement from <2.11,>=2.10 to >=2.10,<2.12 (#4411) 8229aca3 Fix pretrained model initialization (#4439) 60deece9 Fix type hint in text_field.py (#4434) 23e549e4 More multiple-choice changes (#4415) 6d0a4fd2 generalize DataLoader (#4416) acd99952 Automatic file-friendly logging (#4383) 637dbb15 fix README, pin mkdocs, update mkdocs-material (#4412) 9c4dfa54 small fix to pretrained transformer tokenizer (#4417) 84988b81 Log plugins discovered and filter out transformers "PyTorch version ... 
available" log message (#4414) 54c41fcc Adds the ability to automatically detect whether we have a GPU (#4400) 96ff5851 Changes from my multiple-choice work (#4368) eee15ca8 Assign an empty mapping array to empty fields of NamespaceSwappingField (#4403) aa2943e5 Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398) 7fa7531c fix eq method of ArrayField (#4401) e104e441 Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394) b6fd6978 fix sharded dataset reader (#4396) 30e5dbfc Bump mypy from 0.781 to 0.782 (#4395) b0ba2d4c update version 1d07cc75 Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389) ffc51843 ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371) 20afe6ce Add Optuna integrated badge to README.md (#4361) ba79f146 Bump mypy from 0.780 to 0.781 (#4390) 85e531c2 Update README.md (#4385) c2ecb7a2 Add a method to ModelTestCase for use without config files (#4381) 6852deff pin some doc building requirements (#4386) bf422d56 Add github template for using your own python run script (#4380) ebde6e85 Bump overrides from 3.0.0 to 3.1.0 (#4375) e52b7518 ensure transformer params are frozen at initialization when train_parameters is false (#4377) 3e8a9ef6 Add link to new template repo for config file development (#4372) 4f70bc93 tick version for nightly releases 63a5e158 Update spacy requirement from <2.3,>=2.1.0 to >=2.1.0,<2.4 (#4370) ef7c75b8 reduce amount of log messages produced by file_utils (#4366)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc4(Aug 20, 2020)

    Changes since v1.1.0rc3

    Added

    • Added a workflow to GitHub Actions that will automatically close unassigned stale issues and ping the assignees of assigned stale issues.

    Fixed

    • Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.

    Commits

    87a61ad9 Bug fix in distributed metrics (#4570) 71a9a90d upgrade actions to [email protected] (#4573) bd9ee6a4 Give better usage info for overrides parameter (#4575) 0a456a75 Fix boolean and categorical accuracy for distributed (#4568) 85112746 add actions workflow for closing stale issues (#4561) de413065 Static type checking fixes (#4545) 5a07009b Fix RoBERTa SST (#4548) 351941f3 Only pin mkdocs-material to minor version, ignore specific patch version (#4556)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc3(Aug 12, 2020)

    Changes since v1.1.0rc2

    Fixed

    • Fixed how truncation was handled with PretrainedTransformerTokenizer. Previously, if max_length was set to None, the tokenizer would still do truncation if the transformer model had a default max length in its config. Also, when max_length was set to a non-None value, several warnings would appear for certain transformer models around the use of the truncation parameter.
    • Fixed evaluation of all metrics when using distributed training.

    Commits

    0ac13a4f fix CHANGELOG 3b86f588 Prepare for release v1.1.0rc3 44d28476 Metrics in distributed setting (#4525) 1d619659 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547) 5b977809 tick version for nightly releases b32608e3 add gradient checkpointing for transformer token embedders (#4544) f639336a Fix logger being created twice (#4538) 660fdaf2 Fix handling of max length with transformer tokenizers (#4534) 15e288f5 EpochCallBack for tracking epoch (#4540) 9209bc91 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533) bfecdc3e Ensure len(self.evaluation_data_loader) is not called (#4531) 5bc3b732 Fix typo in warning in file_utils (#4527) e80d7687 pin torch >= 1.6

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc2(Jul 31, 2020)

    What's new since v1.1.0rc1

    Changed

    • Upgraded PyTorch requirement to 1.6.
    • Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (GradientDescentTrainer) now takes a use_amp: bool parameter instead of the old opt_level: str parameter.

    Fixed

    • Removed unnecessary warning about deadlocks in DataLoader.
    • Fixed testing models that only return a loss when they are in training mode.
    • Fixed a bug in FromParams that caused silent failure in case of the parameter type being Optional[Union[...]].

    Added

    • Added the option to specify requires_grad: false within an optimizer's parameter groups.
    • Added the file-friendly-logging flag back to the train command. Also added this flag to the predict, evaluate, and find-learning-rate commands.

    Removed

    • Removed the opt_level parameter to Model.load and load_archive. In order to use AMP with a loaded model now, just run the model's forward pass within torch's autocast context.

    Commits

    73220d71 Prepare for release v1.1.0rc2 9415350d Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519) 146bd9ee Remove link to self-attention modules. (#4512) 24012823 add back file-friendly-logging flag (#4509) 54e5c83e closes #4494 (#4508) fa39d498 ensure call methods are rendered in docs (#4522) e53d1858 Bug fix for case when param type is Optional[Union...] (#4510) 14f63b77 Make sure we have a bool tensor where we expect one (#4505) 18a4eb34 add a requires_grad option to param groups (#4502) 6c848dfb Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507) d73f8a91 More BART changes (#4500) 1cab3bfe Update beam_search.py (#4462) 478bf46c remove deadlock warning in DataLoader (#4487) 714334ad Fix reported loss: Bug fix in batch_loss (#4485) db20b1fb use longer tqdm intervals when output being redirected (#4488) 53eeec10 tick version for nightly releases d693cf1c PathLike (#4479) 2f878322 only show validation progress bar from main process (#4476) 9144918d Fix reported loss (#4477) 5c970833 fix release link in CHANGELOG and formatting in README

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc1(Jul 14, 2020)

    This is the first pre-release candidate for version 1.1. There will probably be at least one more candidate before the final 1.1 release.

    What's new since v1.0.0

    Fixed

    • Reduced the amount of log messages produced by allennlp.common.file_utils.
    • Fixed a bug where PretrainedTransformerEmbedder parameters appeared to be trainable in the log output even when train_parameters was set to False.
    • Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances in distributed training.
    • Fixed checking equality of ArrayFields.
    • Fixed a bug where NamespaceSwappingField did not work correctly with .empty_field().
    • Put more sensible defaults on the huggingface_adamw optimizer.
    • Simplified logging so that all logging output always goes to one file.
    • Fixed interaction with the python command line debugger.
    • Log the grad norm properly even when we're not clipping it.
    • Fixed a bug where PretrainedModelInitializer failed to initialize a model with a 0-dim tensor.
    • Fixed a bug with the layer unfreezing schedule of the SlantedTriangular learning rate scheduler.
    • Fixed a regression with logging in the distributed setting. Only the main worker should write log output to the terminal.
    • Pinned the version of boto3 for package managers (e.g. poetry).
    • Fixed issue #4330 by updating the tokenizers dependency.
    • Fixed a bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader in case it does not have a tokenizer.
    • reg_loss is now only returned for models that have some regularization penalty configured.
    • Fixed a bug that prevented cached_path from downloading assets from GitHub releases.
    • Fixed a bug that erroneously increased last label's false positive count in calculating fbeta metrics.
    • Tqdm output now looks much better when the output is being piped or redirected.
    • Small improvements to how the API documentation is rendered.

    Added

    • A method to ModelTestCase for running basic model tests when you aren't using config files.
    • Added some convenience methods for reading files.
    • Added an option to file_utils.cached_path to automatically extract archives.
    • Added the ability to pass an archive file instead of a local directory to Vocab.from_files.
    • Added the ability to pass an archive file instead of a glob to ShardedDatasetReader.
    • Added a new "linear_with_warmup" learning rate scheduler.
    • Added a check in ShardedDatasetReader that ensures the base reader doesn't implement manual distributed sharding itself.
    • Added an option to PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set last_layer_only to False.
    • cached_path() can now read files inside of archives.
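
    As an illustration only: the URL below is a placeholder, and both the extract_archive flag and the "archive!member" form are assumptions based on the bullets above, so check the file_utils API docs for your version:

    from allennlp.common.file_utils import cached_path

    # Download (or reuse the cache), extract the archive, and get the local directory.
    data_dir = cached_path("https://example.com/data.tar.gz", extract_archive=True)

    # Resolve a single file inside the archive.
    train_file = cached_path("https://example.com/data.tar.gz!train.jsonl", extract_archive=True)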

    Changed

    • Not specifying a cuda_device now automatically determines whether to use a GPU or not.
    • Discovered plugins are logged so you can see what was loaded.
    • allennlp.data.DataLoader is now an abstract registrable class. The default implementation remains the same, but was renamed to allennlp.data.PyTorchDataLoader.
    • BertPooler can now unwrap and re-wrap extra dimensions if necessary.
    • Switched to a new transformers dependency. Only version >=3.0 is now supported.

    Commits

    4eb97953 Prepare for release v1.1.0rc1 f195440b update 'Models' links in README (#4475) 9c801a3c add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472) 69d2f03d Clean up Tqdm bars when output is being piped or redirected (#4470) 7b188c93 fixed bug that erronously increased last label's false positive count (#4473) 64db027d Skip ETag check if OSError (#4469) b9d011ef More BART changes (#4468) 7a563a8f add option to use scalar mix of all transformer layers (#4460) d00ad668 Minor tqdm and logging clean up (#4448) 6acf2058 Fix regloss logging (#4449) 8c32ddfd Fixing bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader (#4456) b9a91646 Update transformers requirement from <2.12,>=2.10 to >=2.10,<3.1 (#4446) 181ef5d2 pin boto3 to resolve some dependency issues (#4453) c75a1ebd ensure base reader of ShardedDatasetReader doesn't implement sharding itself (#4454) 8a05ad43 Update CONTRIBUTING.md (#4447) 5b988d63 ensure only rank 0 worker writes to terminal (#4445) 8482f022 fix bug with SlantedTriangular LR scheduler (#4443) e46a578e Update transformers requirement from <2.11,>=2.10 to >=2.10,<2.12 (#4411) 8229aca3 Fix pretrained model initialization (#4439) 60deece9 Fix type hint in text_field.py (#4434) 23e549e4 More multiple-choice changes (#4415) 6d0a4fd2 generalize DataLoader (#4416) acd99952 Automatic file-friendly logging (#4383) 637dbb15 fix README, pin mkdocs, update mkdocs-material (#4412) 9c4dfa54 small fix to pretrained transformer tokenizer (#4417) 84988b81 Log plugins discovered and filter out transformers "PyTorch version ... available" log message (#4414) 54c41fcc Adds the ability to automatically detect whether we have a GPU (#4400) 96ff5851 Changes from my multiple-choice work (#4368) eee15ca8 Assign an empty mapping array to empty fields of NamespaceSwappingField (#4403) aa2943e5 Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398) 7fa7531c fix eq method of ArrayField (#4401) e104e441 Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394) b6fd6978 fix sharded dataset reader (#4396) 30e5dbfc Bump mypy from 0.781 to 0.782 (#4395) b0ba2d4c update version 1d07cc75 Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389) ffc51843 ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371) 20afe6ce Add Optuna integrated badge to README.md (#4361) ba79f146 Bump mypy from 0.780 to 0.781 (#4390) 85e531c2 Update README.md (#4385) c2ecb7a2 Add a method to ModelTestCase for use without config files (#4381) 6852deff pin some doc building requirements (#4386) bf422d56 Add github template for using your own python run script (#4380) ebde6e85 Bump overrides from 3.0.0 to 3.1.0 (#4375) e52b7518 ensure transformer params are frozen at initialization when train_parameters is false (#4377) 3e8a9ef6 Add link to new template repo for config file development (#4372) 4f70bc93 tick version for nightly releases 63a5e158 Update spacy requirement from <2.3,>=2.1.0 to >=2.1.0,<2.4 (#4370) ef7c75b8 reduce amount of log messages produced by file_utils (#4366)

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Jun 16, 2020)

    The 1.0 version of AllenNLP is the culmination of more than 500 commits over the course of several months of work from our engineering team. The AllenNLP library has had wide-reaching appeal so far in its lifetime, and this 1.0 release represents an important maturity milestone. While we will continue to move fast to keep up with the ever-changing state of the art, we will be increasingly conscious of the effect future API changes have on our existing user base.

    This release touches almost every aspect of the library, ranging from improving documentation to adding new natural-language processing components, to adjusting our APIs so they serve the community for the long haul. While we cannot summarize everything in these release notes, here are some of the main milestones for the 1.0 release.

    1. We are releasing several new models, such as:
       a. TransformerQA, a reading comprehension model (paper, demo)
       b. An improved coreference model, with a 17% absolute improvement (architecture paper/embedder paper, demo)
       c. The NMN reading comprehension model (paper, demo)
       d. The RoBERTa models for textual entailment, or NLI (paper, demo)

    2. We have new introductory material in the form of an interactive guide, showing how to use library components and our experiment framework. The guide's goal is to provide a comprehensive introduction to AllenNLP for people with a good understanding of machine learning, Python, and some PyTorch.

    3. We have improved performance across the library.
       a. Switching to native PyTorch data loading, which is not only much faster but also allows the three main parts of the library (data, model, and training) to interoperate with any native PyTorch code.
       b. Enabled support for 16-bit floating point through Apex.
       c. Multi-GPU training now utilizes a separate Python process for each GPU. These workers communicate using PyTorch's distributed module. This is more efficient than the old system, which used a single Python process and was therefore limited by the GIL.

    4. We separated our models into a model repository (allennlp-models), so we have a lean core library with fewer dependencies.

    5. We dramatically simplified how AllenNLP code corresponds to AllenNLP configuration files, which also makes the library easy to use from raw Python.

    But changes are not limited to these. Some other highlights are that we have:

    1. Support for gradient accumulation.
    2. Improved configurability of the trainer so you can inject your own call on each batch.
    3. Seamless support for using word-piece tokenization on pre-tokenized text.
    4. A sampler that creates batches with roughly equal numbers of tokens.
    5. Unified support for Hugging Face's transformers library.
    6. Support for token type IDs throughout the library.
    7. Nightly releases of the library to pip.
    8. BLEU and ROUGE metrics.

    Updates since v1.0.0rc6

    Fixed

    • Lazy dataset readers now work correctly with multi-process data loading.
    • Fixed race conditions that could occur when using a dataset cache.
    • Fixed a bug where all datasets would be loaded for vocab creation even if they were not needed.

    Added

    • A parameter to the DatasetReader class: manual_multi_process_sharding. This is similar to the manual_distributed_sharding parameter, but applies when using a multi-process DataLoader.

    Commits

    29f3b6c3 Prepare for release v1.0.0 a8b840df fix some formatting issues in README (#4365) d3ed6197 fix Makefile c5549105 quick doc fixes (#4364) b764bef5 simplify dataset classes, fix multi-process lazy loading (#4344) 884a6149 Bump mkdocs-material from 5.2.3 to 5.3.0 (#4359) 6a124d80 ensure 'from_files' vocab doesn't load instances (#4356) 87c23e4a Fix handling of "datasets_for_vocab_creation" param (#4350) c3755d16 update CHANGELOG

    Upgrade guide from v0.9.0

    There are too many changes to be exhaustive, but here is a list of the most common issues:

    • You can continue to use the allennlp command line, but if you want to invoke it through Python, use python -m allennlp <command> instead of python -m allennlp.run <command>.
    • "bert_adam" is now "adamw".
    • We no longer support the "gradient_accumulation_batch_size" parameter to the trainer. Use "num_gradient_accumulation_steps" instead.
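
    For example, a trainer fragment using the renamed keys might look like the following. It is shown as a Python dict, which mirrors the JSON config; the values are placeholders:

    trainer_fragment = {
        "optimizer": {"type": "adamw", "lr": 1e-5},  # "bert_adam" is now "adamw"
        "num_gradient_accumulation_steps": 4,        # replaces "gradient_accumulation_batch_size"
    }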

    Using the transformers library

    AllenNLP 1.0 replaces the mish-mash of transformer libraries and dependencies that we had in v0.9.0 with a single implementation that uses https://github.com/huggingface/transformers under the hood. For cases where you can work directly with the word pieces that are used by the transformers, use "pretrained_transformer" for tokenizers, indexers, and embedders. If you want to use tokens from pre-tokenized text, use "pretrained_transformer_mismatched". The latter turns the text into word pieces, embeds them with the transformer, and then combines word pieces to produce an embedding for the original tokens.

    The parameters requires_grad and top_layer_only are no longer supported. If you are converting an old model that used to use "bert-pretrained", this is important! requires_grad used to be False by default, so it would not train the transformer itself. This saves memory and time at the cost of performance. The new code does not support this setting, and will always train the transformer. You can prevent this by setting requires_grad to False in a parameter group when setting up the optimizer.
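
    A sketch of freezing the transformer through a parameter group, shown as a Python dict that mirrors the JSON config. The regex and the optimizer settings are placeholders, not something prescribed by these notes:

    optimizer_fragment = {
        "type": "huggingface_adamw",
        "lr": 1e-5,
        "parameter_groups": [
            # Parameters whose names match any of the regexes get these overrides.
            [["transformer_model"], {"requires_grad": False}],
        ],
    }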

    You no longer need to specify do_lowercase, as this is handled automatically now.

    Config file changes

    In 1.0, we simplified how FromParams works. As a result, some things in the config files need to change to work with 1.0 (a consolidated sketch follows this list):

    • The way Vocabulary options are specified in config files has changed. See #3550. If you want to load a vocabulary from files, you should specify "type": "from_files", and use the key "directory" instead of "directory_path".
    • When instantiating a BasicTextFieldEmbedder from_params, you used to be able to have embedder names be top-level keys in the config file (e.g., "embedder": {"elmo": ELMO_PARAMS, "tokens": TOKEN_PARAMS}). We changed this a long time ago to prefer wrapping them in a "token_embedders" key, and this is now required (e.g., "embedder": {"token_embedders": {"elmo": ELMO_PARAMS, "tokens": TOKEN_PARAMS}}).
    • The TokenCharactersEncoder now requires you to specify the vocab_namespace for the underlying embedder. It used to default to "token_characters", matching the TokenCharactersIndexer default, but making that work required some custom magic that wasn't worth the complexity. So instead of "token_characters": {"type": "character_encoding", "embedding": {"embedding_dim": 25}, "encoder": {...}}, you need to change this to: "token_characters": {"type": "character_encoding", "embedding": {"embedding_dim": 25, "vocab_namespace": "token_characters"}, "encoder": {...}}
    • Regularization now needs another key in a config file. Instead of specifying regularization as "regularizer": [[regex1, regularizer_params], [regex2, regularizer_params]], it now must be specified as "regularizer": {"regexes": [[regex1, regularizer_params], [regex2, regularizer_params]]}.
    • We changed initialization in a similar way to regularization. Instead of specifying initialization as "initializer": [[regex1, initializer_params], [regex2, initializer_params]], it now must be specified as "initializer": {"regexes": [[regex1, initializer_params], [regex2, initializer_params]]}. Also, you used to be able to have initializer_params be "prevent", to prevent initialization of matching parameters. This is now done with a separate key passed to the initializer: "initializer": {"regexes": [...], "prevent_regexes": [regex1, regex2]}.
    • num_serialized_models_to_keep and keep_serialized_model_every_num_seconds used to be able to be passed as top-level parameters to the trainer, but now they must always be passed to the checkpointer instead. For example, if you had "trainer": {"num_serialized_models_to_keep": 1}, it now needs to be "trainer": {"checkpointer": {"num_serialized_models_to_keep": 1}}. Also, the default for that setting is now 2, so AllenNLP will no longer fill up your hard drive!
    • Tokenizer specification changed because of #3361. Instead of something like "tokenizer": {"word_splitter": {"type": "spacy"}}, you now just do "tokenizer": {"type": "spacy"} (more technically: the WordTokenizer has now been removed, with the things we used to call WordSplitters now just moved up to be top-level Tokenizers themselves).
    • The namespace_to_cache argument to ElmoTokenEmbedder has been removed as a config file option. You can still pass vocab_to_cache to the constructor of this class, but this functionality is no longer available from a config file. If you used this and are really put out by this change, let us know, and we'll see what we can do.
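
    Putting a few of these together, a 1.0-style fragment might look like the following, shown as a Python dict that mirrors the JSON config. The paths, regexes, and parameter values are placeholders:

    config_fragment = {
        "vocabulary": {
            "type": "from_files",
            "directory": "/path/to/vocabulary",  # was "directory_path" before 1.0
        },
        "model": {
            "regularizer": {"regexes": [["weight$", {"type": "l2", "alpha": 0.01}]]},
            "initializer": {
                "regexes": [["weight$", {"type": "xavier_uniform"}]],
                "prevent_regexes": ["bias$"],    # replaces the old "prevent" value
            },
        },
    }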

    Iterators ➔ DataLoaders

    AllenNLP now uses PyTorch's API for data iteration, rather than our own custom one. This means that the train_data, validation_data, iterator, and validation_iterator arguments to the Trainer have been removed and replaced with data_loader and validation_data_loader.

    Previous config files which looked like:

    {
      "iterator": {
        "type": "bucket",
        "sorting_keys": [["tokens"], ["num_tokens"]],
        "padding_noise": 0.1
        ...
      }
    }
    

    Now become:

    {
      "data_loader": {
        "batch_sampler" {
          "type": "bucket",
          // sorting keys are no longer required! They can be inferred automatically.
          "padding_noise": 0.1
          ...
        }
      }
    }
    

    Multi-GPU

    AllenNLP now uses DistributedDataParallel for parallel training, rather than DataParallel. With DistributedDataParallel, each worker (GPU) runs in its own process. As such, each process also has its own Trainer, which now takes only a single GPU ID.

    Previous config files which looked like:

    {
      "trainer": {
        "cuda_device": [0, 1, 2, 3],
        "num_epochs": 20,
        ...
      }
    }
    

    Now become:

    {
      "distributed": {
        "cuda_devices": [0, 1, 2, 3],
      },
      "trainer": {
        "num_epochs": 20,
        ...
      }
    }
    
    Source code(tar.gz)
    Source code(zip)
  • v1.0.0rc6(Jun 11, 2020)

    Fixed

    • A bug where TextFields could not be duplicated since some tokenizers cannot be deep-copied. See https://github.com/allenai/allennlp/issues/4270.
    • Our caching mechanism had the potential to introduce race conditions if multiple processes were attempting to cache the same file at once. This was fixed by using a lock file tied to each cached file.
    • get_text_field_mask() now supports padding indices that are not 0.
    • A bug where predictor.get_gradients() would return an empty dictionary if an embedding layer had trainable set to False.
    • Fixes PretrainedTransformerMismatchedIndexer in the case where a token consists of zero word pieces.
    • Fixes a bug when using a lazy dataset reader that results in a UserWarning from PyTorch being printed at every iteration during training.
    • Predictor names were inconsistently switching between dashes and underscores. Now they all use underscores.
    • Predictor.from_path now automatically loads plugins (unless you specify load_plugins=False) so that you don't have to manually import a bunch of modules when instantiating predictors from an archive path (see the sketch after this list).
    • allennlp-server is automatically found as a plugin once again.
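
    A quick sketch of the plugin-aware loading mentioned above (the archive path is a placeholder, and the "sentence" input key assumes a predictor that accepts it):

    from allennlp.predictors import Predictor

    # Plugins are discovered automatically here unless you pass load_plugins=False.
    predictor = Predictor.from_path("model.tar.gz")
    outputs = predictor.predict_json({"sentence": "AllenNLP predictors load plugins automatically now."})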

    Added

    • A duplicate() method on Instances and Fields, to be used instead of copy.deepcopy()
    • A batch sampler that makes sure each batch contains approximately the same number of tokens (MaxTokensBatchSampler)
    • Functions to turn a sequence of token indices back into tokens
    • The ability to use Huggingface encoder/decoder models as token embedders
    • Improvements to beam search
    • ROUGE metric
    • Polynomial decay learning rate scheduler
    • A BatchCallback for logging CPU and GPU memory usage to tensorboard. This is mainly for debugging because using it can cause a significant slowdown in training.
    • Ability to run pretrained transformers as an embedder without training the weights

    Changed

    • Similar to our caching mechanism, we introduced a lock file to the vocab to avoid race conditions when saving/loading the vocab from/to the same serialization directory in different processes.
    • Changed the Token, Instance, and Batch classes along with all Field classes to "slots" classes. This dramatically reduces the size in memory of instances.
    • SimpleTagger will no longer calculate span-based F1 metric when calculate_span_f1 is False.
    • CPU memory for every worker is now reported in the logs and the metrics. Previously this was only reporting the CPU memory of the master process, and so it was only correct in the non-distributed setting.
    • To be consistent with PyTorch IterableDataset, AllennlpLazyDataset no longer implements __len__(). Previously it would always return 1.
    • Removed old tutorials, in favor of the new AllenNLP Guide
    • Changed vocabulary loading to handle newline conventions across Windows, Linux, and Mac.

    Commits

    d98d13b5 add 'allennlp_server' to default plugins (#4348) 33d0cd8c fix file utils test (#4349) f4d330a2 Update vocabulary load to a system-agnostic newline (#4342) 2012fea9 remove links to tutorials in API docs (#4346) 3d8ce442 Fixes spelling in changelog 73289bc8 Consistently use underscores in Predictor names (#4340) 2d03c413 Allow using pretrained transformers without fine-tuning them (#4338) 8f68d69b load plugins from Predictor.from_path (#4333) 5c6cc3a2 Bump mkdocs-material from 5.2.2 to 5.2.3 (#4341) 7ab7551b Removing old tutorials, pointing to the new guide in the README (#4334) 902d36a5 Fix bug with lazy data loading, un-implement len on AllennlpLazyDataset (#4328) 11b57996 log metrics in alphabetical order (#4327) 7d66b3e7 report CPU memory usage for each worker (#4323) 06bac68b make Instance, Batch, and all field classes "slots" classes (#4313) 2b2d1413 Bump mypy from 0.770 to 0.780 (#4316) a038c01a Update transformers requirement from <2.11,>=2.9 to >=2.9,<2.12 (#4315) 345459e9 Stop calculating span-based F1 metric when calculate_span_f1 is False. (#4302) fc47bf6a Deals with the case where a word doesn't have any word pieces assigned (#4301) 11a08ae7 Making Token class a "slots" class (#4312) 32bccfbd Fix a bug where predictor.get_gradients() would return an empty... (#4305) 33a49454 ensure CUDA available in GPU checks workflow (#4310) d51ffa11 Update transformers requirement from <2.10,>=2.9 to >=2.9,<2.11 (#4282) 75c07ab5 Merge branch 'master' of github.com:allenai/allennlp 8c9421da fix Makefile 77b432f6 Update README.md (#4309) 720ad434 A few small fixes in the README.md (#4307) a7265c04 move tensorboard memory logging to BatchCallback (#4306) 91d0fa1a remove setup.cfg (#4300) 5ad7a33a Support for bart in allennlp-models (#4169) 25134f2b add lock file within caching and vocab saving/loading mechanisms (#4299) 58dc84ea add 'Feature request' label to template 9526f007 Update issue templates (#4293) 79999ec0 Adds a "duplicate()" method on instances and fields (#4294) 8ff47d34 Set version to rc6

    Source code(tar.gz)
    Source code(zip)