An open-source NLP research library, built on PyTorch.

Overview

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.


Getting Started Using the Library

If you're interested in using AllenNLP for model development, we recommend you check out the AllenNLP Guide. When you're ready to start your project, we've created a couple of template repositories that you can use as a starting place:

  • If you want to use allennlp train and config files to specify experiments, use this template. We recommend this approach.
  • If you'd prefer to use Python code to configure your experiments and run your training loop, use this template. A few things are currently a little harder in this setup (loading a saved model, and using distributed training), but otherwise it's functionally equivalent to the config-files setup.

In addition, there are external tutorials and other posts on the AI2 AllenNLP blog.

Plugins

AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional allennlp subcommands.
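
A minimal sketch of what such a plugin module might look like (the module and reader names here are hypothetical): a class registered through AllenNLP's Registrable machinery becomes referenceable by name from config files once the plugin module is imported.

# my_project_plugin.py: a hypothetical plugin module.
# Registering the class makes "my-custom-reader" usable as a "type" in
# `allennlp train` config files once this module is loaded as a plugin.
from allennlp.data import DatasetReader

@DatasetReader.register("my-custom-reader")
class MyCustomReader(DatasetReader):
    def _read(self, file_path: str):
        # Yield one Instance per record of the input file (details omitted).
        ...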

There is an ecosystem of open-source plugins, some of which are maintained by the AllenNLP team here at AI2, and some of which are maintained by the broader community.

Plugin              Maintainer        CLI  Description
allennlp-models     AI2               No   A collection of state-of-the-art models
allennlp-semparse   AI2               No   A framework for building semantic parsers
allennlp-server     AI2               Yes  A simple demo server for serving models
allennlp-optuna     Makoto Hiramatsu  Yes  Optuna integration for hyperparameter optimization

AllenNLP will automatically find any official AI2-maintained plugins that you have installed, but for AllenNLP to find personal or third-party plugins you've installed, you also have to create either a local plugins file named .allennlp_plugins in the directory where you run the allennlp command, or a global plugins file at ~/.allennlp/plugins. The file should list the plugin modules that you want to be loaded, one per line.
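
For example, a .allennlp_plugins file that loads the hypothetical my_project_plugin module from the sketch above would contain just:

my_project_plugin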

To test that your plugins can be found and imported by AllenNLP, you can run the allennlp test-install command. Each discovered plugin will be logged to the terminal.

For more information about plugins, see the plugins API docs. And for information on how to create a custom subcommand to distribute as a plugin, see the subcommand API docs.

Package Overview

allennlp           An open-source NLP research library, built on PyTorch
allennlp.commands  Functionality for the CLI
allennlp.common    Utility modules that are used across the library
allennlp.data      A data processing module for loading datasets and encoding strings as integers for representation in matrices
allennlp.modules   A collection of PyTorch modules for use with text
allennlp.nn        Tensor utility functions, such as initializers and activation functions
allennlp.training  Functionality for training models
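
As a rough orientation, here is a sketch of representative imports from the packages above; these names exist in recent AllenNLP releases, but verify them against the API docs for your installed version.

# Representative imports from the packages listed above (verify against the
# API documentation for the AllenNLP version you have installed).
from allennlp.common import Params                    # configuration objects
from allennlp.data import DatasetReader, Vocabulary   # datasets and vocabularies
from allennlp.modules import Seq2VecEncoder, TextFieldEmbedder
from allennlp.nn import util as nn_util               # tensor utilities
from allennlp.training import GradientDescentTrainer  # the default trainer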

Installation

AllenNLP requires Python 3.6.1 or later and PyTorch. It's recommended that you install the PyTorch ecosystem before installing AllenNLP by following the instructions on pytorch.org.

The preferred way to install AllenNLP is via pip. Just run pip install allennlp.

⚠️ If you're using Python 3.7 or greater, you should ensure that you don't have the PyPI version of dataclasses installed after running the above command, as this could cause issues on certain platforms. You can quickly check this by running pip freeze | grep dataclasses. If you see something like dataclasses==0.6 in the output, then just run pip uninstall -y dataclasses.

If you need pointers on setting up an appropriate Python environment or would like to install AllenNLP using a different method, see below.

We support AllenNLP on Mac and Linux environments. We presently do not support Windows but are open to contributions.

Installing via pip

Setting up a virtual environment

Conda can be used to set up a virtual environment with the version of Python required for AllenNLP. If you already have a Python 3 environment you want to use, you can skip to the 'installing via pip' section.

  1. Download and install Conda.

  2. Create a Conda environment with Python 3.7 (3.6 or 3.8 would work as well):

    conda create -n allennlp python=3.7
    
  3. Activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to use AllenNLP:

    conda activate allennlp
    

Installing the library and dependencies

Installing the library and dependencies is simple using pip.

pip install allennlp

Looking for bleeding-edge features? You can install nightly releases directly from PyPI.

AllenNLP installs a script when you install the Python package, so you can run allennlp commands just by typing allennlp into a terminal. For example, you can now test your installation with allennlp test-install.

You may also want to install allennlp-models, which contains the NLP constructs to train and run our officially supported models, many of which are hosted at https://demo.allennlp.org.

pip install allennlp-models
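
As a hedged sketch of what loading and querying one of those models can look like (the archive path below is a placeholder, and the JSON keys a predictor expects depend on the specific model):

from allennlp.predictors import Predictor

# "model.tar.gz" stands in for any trained AllenNLP archive: a local path or a
# URL to one of the official allennlp-models archives.
predictor = Predictor.from_path("model.tar.gz")

# Many sentence-level models accept a single "sentence" field; check the
# predictor for the model you load.
print(predictor.predict_json({"sentence": "AllenNLP is built on PyTorch."}))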

Installing using Docker

Docker provides a virtual machine with everything set up to run AllenNLP, whether you will leverage a GPU or just run on a CPU. Docker provides more isolation and consistency, and also makes it easy to distribute your environment to a compute cluster.

AllenNLP provides official Docker images with the library and all of its dependencies installed.

Once you have installed Docker, you should also install the NVIDIA Container Toolkit if you have GPUs available.

Then run the following command to get an environment that will run on GPU:

mkdir -p $HOME/.allennlp/
docker run --rm --gpus all -v $HOME/.allennlp:/root/.allennlp allennlp/allennlp:latest

You can test the Docker environment with

docker run --rm --gpus all -v $HOME/.allennlp:/root/.allennlp allennlp/allennlp:latest test-install 

If you don't have GPUs available, just omit the --gpus all flag.

Building your own Docker image

For various reasons you may need to create your own AllenNLP Docker image, such as if you need a different version of PyTorch. To do so, just run make docker-image from the root of your local clone of AllenNLP.

By default this builds an image with the tag allennlp/allennlp, but you can change this to anything you want by setting the DOCKER_TAG flag when you call make. For example, make docker-image DOCKER_TAG=my-allennlp.

If you want to use a different version of PyTorch, set the flag DOCKER_TORCH_VERSION to something like torch==1.7.0 or torch==1.7.0+cu110 -f https://download.pytorch.org/whl/torch_stable.html. The value of this flag will be passed directly to pip install.
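
For example, to build an image against CUDA 11.0 wheels you might run something like:

make docker-image DOCKER_TAG=my-allennlp DOCKER_TORCH_VERSION='torch==1.7.0+cu110 -f https://download.pytorch.org/whl/torch_stable.html'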

After building the image you should be able to see it listed by running docker images allennlp.

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
allennlp/allennlp   latest              b66aee6cb593        5 minutes ago       2.38GB

Installing from source

You can also install AllenNLP by cloning our git repository:

git clone https://github.com/allenai/allennlp.git
cd allennlp

Create a Python 3.7 or 3.8 virtual environment, and install AllenNLP in editable mode by running:

pip install --editable .
pip install -r dev-requirements.txt

This will make allennlp available on your system but it will use the sources from the local clone you made of the source repository.

You can test your installation with allennlp test-install. See https://github.com/allenai/allennlp-models for instructions on installing allennlp-models from source.

Running AllenNLP

Once you've installed AllenNLP, you can run the command-line interface with the allennlp command (whether you installed from pip or from source). allennlp has various subcommands such as train, evaluate, and predict. To see the full usage information, run allennlp --help.

You can test your installation by running allennlp test-install.
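
For example, a typical training run looks something like this (the config path and output directory are placeholders):

allennlp train my_experiment.jsonnet --serialization-dir /tmp/my_model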

Issues

Everyone is welcome to file issues with either feature requests, bug reports, or general questions. As a small team with our own internal goals, we may ask for contributions if a prompt fix doesn't fit into our roadmap. To keep things tidy we will often close issues we think are answered, but don't hesitate to follow up if further discussion is needed.

Contributions

The AllenNLP team at AI2 (@allenai) welcomes contributions from the greater AllenNLP community, and, if you would like to get a change into the library, this is likely the fastest approach. If you would like to contribute a larger feature, we recommend first creating an issue with a proposed design for discussion. This will prevent you from spending significant time on an implementation which has a technical limitation someone could have pointed out early on. Small contributions can be made directly in a pull request.

Pull requests (PRs) must have one approving review and no requested changes before they are merged. As AllenNLP is primarily driven by AI2 (@allenai) we reserve the right to reject or revert contributions that we don't think are good additions.

Citing

If you use AllenNLP in your research, please cite AllenNLP: A Deep Semantic Natural Language Processing Platform.

@inproceedings{Gardner2017AllenNLP,
  title={AllenNLP: A Deep Semantic Natural Language Processing Platform},
  author={Matt Gardner and Joel Grus and Mark Neumann and Oyvind Tafjord
    and Pradeep Dasigi and Nelson F. Liu and Matthew Peters and
    Michael Schmitz and Luke S. Zettlemoyer},
  year={2017},
  Eprint = {arXiv:1803.07640},
}

Team

AllenNLP is an open-source project backed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering. To learn more about who specifically contributed to this codebase, see our contributors page.

Issues
  • Add support for pretrained embedding extension in fine-tuning.

    @matt-gardner, I am working on the final piece (follow-up to #2387): if a pretrained file was used in the Embedding construction during training, use the same file for extension during fine-tuning. This is still rough and seems kind of hacky, but if the high-level approach seems reasonable, I can refactor things.

    Notes:

    For this to work, the Embedding used during fine-tuning needs access to the pretrained embedding file that was used by the Embedding during training.

    • First, for embedding params that have a pretrained_file, I add it to the files_to_archive dict by default. This is necessary for the files to be available at fine-tuning time.
    • Second, during training, I store the pretrained_file that was used as an attribute on the instance (similar to what we did with vocab_namespace in the previous PR).

    However, this isn't enough, because the file named by _pretrained_file won't be available at the same location at fine-tuning time; instead it will be in the serialization dir's fta as saved by the archive_model function. To fix this, I make a mapping from the original filename (used during training) to the replacement filename (available in the serialization fta during fine-tuning) and allow it to be passed in extend_vocab calls.

    opened by HarshTrivedi 53
  • Consider removing CallbackTrainer

    The callback trainer hasn't really worked, because callbacks that we've tried to add have required setting state on the CallbackTrainer itself, which makes them hard to add. Given this, we are just maintaining 2 Trainers unnecessarily, which slows us down.

    Happy to hear reasons why we should keep the callback trainer, or whether people have found it particularly useful!

    opened by DeNeutoy 44
  • add BERT token embedder

    This is ready for review. In addition to the included unit tests, I trained two NER models using these embeddings (unfortunately, I realized this morning that I used the uncased BERT model, which seems like a bad idea for NER):

    (1) only BERT embeddings: https://beaker-internal.allenai.org/ex/ex_rnk3mcplnpjz/tasks (2) BERT embeddings + character embeddings: https://beaker-internal.allenai.org/ex/ex_nrq8d5vw5cb2/tasks

    (apologies to non-AI2 people for the beaker-internal links)

    As discussed offline, because of the positional encodings the BERT embedding has a max sequence length and will crash if you feed it longer sequences. This implementation simply truncates longer sequences and logs a warning. I left a TODO to come up with something better.

    opened by joelgrus 39
  • Shuffling + bucketing are incompatible with lazy dataset reading

    System (please complete the following information):

    • OS: OS X
    • Python version: 3.6.10
    • AllenNLP version: hash 4749fc3
    • PyTorch version (if you installed it yourself): 1.4.0

    Question: Is it possible to shuffle a lazily-read dataset with the new dataloaders?

    Hi!

    I'm working with a dataset that won't fit in main memory, so I'd like to lazily read it. However, it seems like the PyTorch DataLoader (understandably) gets mad at me when I set "lazy": true while still providing a BatchSampler:

    ValueError: DataLoader with IterableDataset: expected unspecified batch_sampler option, but got batch_sampler=<allennlp.data.samplers.bucket_batch_sampler.BucketBatchSampler object at 0x7f70493ba820>
    

    So I removed the BatchSampler and set "shuffle": true in the data loader, but it also complains that:

    ValueError: DataLoader with IterableDataset: expected unspecified shuffle option, but got shuffle=True
    

    I guess this makes sense, since the loader doesn't get random access to the dataset. Is there any way to lazily read and still shuffle data? I'm not sure if this was always the behavior, or if this is new with the new dataloaders...can someone remind me?

    opened by nelson-liu 36
  • Seq2Seq model decomposition

    Hi, this is a work-in-progress pull request for my attempt to decompose the monolithic seq2seq model and enable a generic decoder module. The initial issue discussion can be found at https://github.com/allenai/allennlp/issues/2097, and there are work notes at https://github.com/epwalsh/allennlp/pull/3. I am looking for some feedback on my changes. One of the questions is how we should support backward compatibility, since module parameters have changed.

    opened by generall 36
  • Filter Warnings when pytest

    fix #1672

    By default pytest will display some warnings from user code and third-party libraries, as recommended by PEP-0506. This helps users keep their code modern and avoid breakages when deprecated warnings are effectively removed.

    (quoted from the pytest documentation)

    There still remain some warnings:

    /usr/local/lib/python3.6/site-packages/nbconvert/exporters/exporter_locator.py:28
    [02:56:20][Step 3/10]   /usr/local/lib/python3.6/site-packages/nbconvert/exporters/exporter_locator.py:28: DeprecationWarning: `nbconvert.exporters.exporter_locator` is deprecated in favor of `nbconvert.exporters.base` since nbconvert 5.0.
    [02:56:20][Step 3/10]     DeprecationWarning)
    [02:56:20][Step 3/10] 
    [02:56:20][Step 3/10] /usr/local/lib/python3.6/site-packages/tornado/web.py:1747
    [02:56:20][Step 3/10]   /usr/local/lib/python3.6/site-packages/tornado/web.py:1747: DeprecationWarning: @asynchronous is deprecated, use coroutines instead
    [02:56:20][Step 3/10]     DeprecationWarning)
    [02:56:20][Step 3/10] 
    [02:56:20][Step 3/10] allennlp/data/token_indexers/openai_transformer_byte_pair_indexer.py:25
    [02:56:20][Step 3/10]   /local/deploy/agent3/work/98197cf33cb401e5/allennlp/data/token_indexers/openai_transformer_byte_pair_indexer.py:25: DeprecationWarning: invalid escape sequence \?
    [02:56:20][Step 3/10]     text = re.sub('''(-+|~+|!+|"+|;+|\?+|\++|,+|\)+|\(+|\\+|\/+|\*+|\[+|\]+|}+|{+|\|+|_+)''', r' \1 ', text)
    ...
    [02:56:20][Step 3/10] allennlp/tests/predictors/srl_test.py::TestSrlPredictor::test_uses_named_inputs
    [02:56:20][Step 3/10]   /usr/local/bin/pytest:11: DeprecationWarning: [W002] Tokenizer.from_list is now deprecated. Create a new Doc object instead and pass in the strings as the `words` keyword argument, for example:
    [02:56:20][Step 3/10]   from spacy.tokens import Doc
    [02:56:20][Step 3/10]   doc = Doc(nlp.vocab, words=[...])
    [02:56:20][Step 3/10]     sys.exit(main())
    ...
    [02:56:20][Step 3/10] allennlp/tests/semparse/worlds/text2sql_world_test.py::TestText2SqlWorld::test_variable_free_world_cannot_parse_as_statements
    [02:56:20][Step 3/10]   <unknown>:1: DeprecationWarning: invalid escape sequence \s
    [02:56:20][Step 3/10]   <unknown>:1: DeprecationWarning: invalid escape sequence \s
    [02:56:20][Step 3/10]   <unknown>:1: DeprecationWarning: invalid escape sequence \d
    

    which is possibly related to

    I think it is better to fix these (using new methods instead of deprecated ones), but I have no idea what to do about DeprecationWarning: nbconvert.exporters.exporter_locator is deprecated in favor of... or /usr/local/lib/python3.6/site-packages/tornado/web.py:1747: DeprecationWarning: @asynchronous is deprecated, use coroutines instead.

    opened by WrRan 34
  • Train a model with transformer embeddings and additional_special_tokens

    Checklist

    • [x] I have verified that the issue exists against the master branch of AllenNLP.
    • [x] I have read the relevant section in the contribution guide on reporting bugs.
    • [x] I have checked the issues list for similar or identical bug reports.
    • [x] I have checked the pull requests list for existing proposed fixes.
    • [x] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the master branch.
    • [x] I have included in the "Description" section below a traceback from any exceptions related to this bug.
    • [x] I have included in the "Related issues or possible duplicates" section beloew all related issues and possible duplicate issues (If there are none, check this box anyway).
    • [x] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
    • [x] I have included in the "Environment" section below the output of pip freeze.
    • [x] I have included in the "Steps to reproduce" section below a minimally reproducible example.

    Description

    Hi there! I'm trying to train a transformer-based text classifier model in AllenNLP, but I need to add 5 additional special tokens in a way that is compatible with the tokenizers lib. I tried adding them to the jsonnet AllenNLP config file and then to the transformer's model path, but neither worked; each approach had a different problem, as described below.

    Python traceback:

    2020-09-30 23:56:17,398 - INFO - allennlp.training.trainer - Epoch 0/9
    2020-09-30 23:56:17,398 - INFO - allennlp.training.trainer - Worker 0 memory usage MB: 10065.304
    2020-09-30 23:56:17,484 - WARNING - allennlp.common.util - unable to check gpu_memory_mb() due to occasional failure, continuing
    Traceback (most recent call last):
      File "/media/discoD/repositorios/allennlp/allennlp/common/util.py", line 415, in gpu_memory_mb
        encoding="utf-8",
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 411, in check_output
        **kwargs).stdout
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 488, in run
        with Popen(*popenargs, **kwargs) as process:
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 800, in __init__
        restore_signals, start_new_session)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/subprocess.py", line 1482, in _execute_child
        restore_signals, start_new_session, preexec_fn)
      File "/media/discoD/pycharm-community-2019.2/plugins/python-ce/helpers/pydev/_pydev_bundle/pydev_monkey.py", line 526, in new_fork_exec
        return getattr(_posixsubprocess, original_name)(args, *patch_fork_exec_executable_list(args, other_args))
    OSError: [Errno 12] Cannot allocate memory
    2020-09-30 23:56:17,489 - INFO - allennlp.training.trainer - Training
      0%|          | 0/11817 [00:00<?, ?it/s]/pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [69,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    ...
    ...
    ...
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [102,0,0], thread: [30,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
    /pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [102,0,0], thread: [31,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
      0%|          | 0/11817 [00:00<?, ?it/s]
    Traceback (most recent call last):
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 443, in _train_worker
        metrics = train_loop.run()
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 505, in run
        return self.trainer.train()
      File "/media/discoD/repositorios/allennlp/allennlp/training/trainer.py", line 872, in train
        train_metrics = self._train_epoch(epoch)
      File "/media/discoD/repositorios/allennlp/allennlp/training/trainer.py", line 594, in _train_epoch
        batch_outputs = self.batch_outputs(batch, for_training=True)
      File "/media/discoD/repositorios/allennlp/allennlp/training/trainer.py", line 479, in batch_outputs
        output_dict = self._pytorch_model(**batch)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/repositorios/allennlp/allennlp/models/basic_classifier.py", line 121, in forward
        embedded_text = self._text_field_embedder(tokens)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/repositorios/allennlp/allennlp/modules/text_field_embedders/basic_text_field_embedder.py", line 88, in forward
        token_vectors = embedder(**tensors, **forward_params_values)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/repositorios/allennlp/allennlp/modules/token_embedders/pretrained_transformer_embedder.py", line 184, in forward
        transformer_output = self.transformer_model(**parameters)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 762, in forward
        output_hidden_states=output_hidden_states,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 439, in forward
        output_attentions,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 371, in forward
        hidden_states, attention_mask, head_mask, output_attentions=output_attentions,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 315, in forward
        hidden_states, attention_mask, head_mask, encoder_hidden_states, encoder_attention_mask, output_attentions,
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/transformers/modeling_bert.py", line 221, in forward
        mixed_query_layer = self.query(hidden_states)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 91, in forward
        return F.linear(input, self.weight, self.bias)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/site-packages/torch/nn/functional.py", line 1676, in linear
        output = input.matmul(weight.t())
    RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
    python-BaseException
    THCudaCheck FAIL file=/pytorch/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=710 : device-side assert triggered
    

    Related issues or possible duplicates

    • None

    Environment

    OS: Linux

    Python version: 3.7.7

    Output of pip freeze:

    allennlp==1.1.0
    allennlp-models==1.1.0
    -e [email protected]:allenai/[email protected]#egg=allennlp_server
    attrs==19.3.0
    backcall==0.2.0
    bleach==3.1.5
    blis==0.4.1
    boto3==1.14.31
    botocore==1.17.31
    cachetools==4.1.1
    catalogue==1.0.0
    certifi==2020.6.20
    chardet==3.0.4
    click==7.1.2
    conllu==4.1
    cycler==0.10.0
    cymem==2.0.3
    cytoolz==0.10.1
    decorator==4.4.2
    defusedxml==0.6.0
    docutils==0.15.2
    eland==7.7.0a1
    elasticsearch-dsl==7.2.1
    en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz
    entrypoints==0.3
    filelock==3.0.12
    fire==0.3.1
    Flask==1.1.2
    Flask-Cors==3.0.8
    ftfy==5.8
    future==0.18.2
    gevent==20.6.2
    greenlet==0.4.16
    h5py==2.10.0
    idna==2.10
    importlib-metadata==1.7.0
    iniconfig==1.0.1
    ipykernel==5.3.4
    ipython==7.16.1
    ipython-genutils==0.2.0
    ipywidgets==7.5.1
    itsdangerous==1.1.0
    jedi==0.17.2
    jellyfish==0.8.2
    Jinja2==2.11.2
    jmespath==0.10.0
    joblib==0.16.0
    jsonnet==0.16.0
    jsonpickle==1.4.1
    jsonschema==3.2.0
    jupyter-client==6.1.6
    jupyter-core==4.6.3
    Keras==2.4.3
    kiwisolver==1.2.0
    MarkupSafe==1.1.1
    matplotlib==3.3.0
    mistune==0.8.4
    mkl-fft==1.1.0
    mkl-random==1.1.1
    mkl-service==2.3.0
    more-itertools==8.4.0
    murmurhash==1.0.2
    nbconvert==5.6.1
    nbformat==5.0.7
    networkx==2.4
    nltk==3.5
    notebook==6.0.3
    numpy==1.18.5
    olefile==0.46
    overrides==3.1.0
    packaging==20.4
    pandas==1.1.0
    pandocfilters==1.4.2
    parso==0.7.1
    pexpect==4.8.0
    pickleshare==0.7.5
    Pillow==7.2.0
    plac==1.1.3
    pluggy==0.13.1
    preshed==3.0.2
    prometheus-client==0.8.0
    prompt-toolkit==3.0.5
    protobuf==3.12.4
    ptyprocess==0.6.0
    py==1.9.0
    py-rouge==1.1
    pydot==1.4.1
    pyemd==0.5.1
    Pygments==2.6.1
    pyparsing==2.4.7
    Pyphen==0.9.5
    pyrsistent==0.16.0
    pytest==6.0.1
    python-dateutil==2.8.1
    pytz==2020.1
    PyYAML==5.3.1
    pyzmq==19.0.1
    regex==2020.7.14
    requests==2.24.0
    s3transfer==0.3.3
    sacremoses==0.0.43
    scikit-learn==0.23.1
    scipy==1.5.2
    seaborn==0.11.0
    Send2Trash==1.5.0
    sentencepiece==0.1.91
    seqeval==0.0.12
    six==1.15.0
    spacy==2.3.2
    srsly==1.0.2
    tensorboardX==2.1
    termcolor==1.1.0
    terminado==0.8.3
    testpath==0.4.4
    thinc==7.4.1
    threadpoolctl==2.1.0
    tokenizers==0.8.1rc1
    toml==0.10.1
    toolz==0.10.0
    torch==1.6.0+cu101
    torchvision==0.7.0+cu101
    tornado==6.0.4
    tqdm==4.48.0
    traitlets==4.3.3
    transformers==3.0.2
    urllib3==1.25.10
    visualise-spacy-tree==0.0.6
    wasabi==0.7.1
    wcwidth==0.2.5
    webencodings==0.5.1
    Werkzeug==1.0.1
    widgetsnbextension==3.5.1
    word2number==1.1
    zipp==3.1.0
    zope.event==4.4
    zope.interface==5.1.0
    

    Steps to reproduce

    First I tried adding the 5 additional special tokens directly in the jsonnet model config, like this:

        "token_indexers": {
                "tokens": {
                    "type": "pretrained_transformer",
                    "model_name": transformer_model,
                    "max_length": transformer_dim,
                    "tokenizer_kwargs": {"additional_special_tokens": [['<REL_SEP>'], ['[['], [']]'], ['<<'], ['>>']], "max_len": transformer_dim}
                }
         },
    

    But I ran into a problem at allennlp.common.cached_transformers.get_tokenizer, because cache_key = (model_name, frozenset(kwargs.items())) tries to use the "tokenizer_kwargs" value as part of the cache key, but it can't hash the additional_special_tokens list, throwing the following exception:

    TypeError: unhashable type: 'list'

    Traceback (most recent call last):
      File "/media/discoD/pycharm-community-2019.2/plugins/python-ce/helpers/pydev/pydevd.py", line 1465, in _exec
        runpy._run_module_as_main(module_name, alter_argv=False)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/media/discoD/anaconda3/envs/allennlp/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/media/discoD/repositorios/allennlp/allennlp/__main__.py", line 38, in <module>
        run()
      File "/media/discoD/repositorios/allennlp/allennlp/__main__.py", line 34, in run
        main(prog="allennlp")
      File "/media/discoD/repositorios/allennlp/allennlp/commands/__init__.py", line 94, in main
        args.func(args)
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 118, in train_model_from_args
        file_friendly_logging=args.file_friendly_logging,
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 177, in train_model_from_file
        file_friendly_logging=file_friendly_logging,
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 238, in train_model
        file_friendly_logging=file_friendly_logging,
      File "/media/discoD/repositorios/allennlp/allennlp/commands/train.py", line 433, in _train_worker
        local_rank=process_rank,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 599, in from_params
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 626, in from_params
        kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 197, in create_kwargs
        cls.__name__, param_name, annotation, param.default, params, **extras
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 306, in pop_and_construct_arg
        return construct_arg(class_name, name, popped_params, annotation, default, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 340, in construct_arg
        return annotation.from_params(params=popped_params, **subextras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 599, in from_params
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 626, in from_params
        kwargs = create_kwargs(constructor_to_inspect, cls, params, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 197, in create_kwargs
        cls.__name__, param_name, annotation, param.default, params, **extras
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 306, in pop_and_construct_arg
        return construct_arg(class_name, name, popped_params, annotation, default, **extras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 387, in construct_arg
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 340, in construct_arg
        return annotation.from_params(params=popped_params, **subextras)
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 599, in from_params
        **extras,
      File "/media/discoD/repositorios/allennlp/allennlp/common/from_params.py", line 628, in from_params
        return constructor_to_call(**kwargs)  # type: ignore
      File "/media/discoD/repositorios/allennlp/allennlp/data/token_indexers/pretrained_transformer_indexer.py", line 58, in __init__
        model_name, tokenizer_kwargs=tokenizer_kwargs
      File "/media/discoD/repositorios/allennlp/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py", line 71, in __init__
        model_name, add_special_tokens=False, **tokenizer_kwargs
      File "/media/discoD/repositorios/allennlp/allennlp/common/cached_transformers.py", line 101, in get_tokenizer
        cache_key = (model_name, frozenset(kwargs.items()))
    TypeError: unhashable type: 'list'
    

    I couldn't find a way to make passing the tokens this way work, so I ended up downloading the BERT model to my local disk and adding the tokenizer config files to the same path (the vocab size of my BERT model is 29794, so the last index is 29793). The contents of the files I changed are in the "Example source" section below.

    After debugging, it looks like this config was at least enough to get the BERT tokenizer to recognize the 5 tokens and tokenize the training data accordingly, but then I ran into another issue once training actually began (the one pasted in the "Python traceback" section of this issue).

    It looks like this error is due to the fact that the transformer model's embedding layer wasn't resized to the new vocabulary size, which would be accomplished with code like model.resize_token_embeddings(len(tokenizer)). I didn't find any code in the AllenNLP lib that does something like this, so I'm thinking this is the cause of the issue.

    Is there another way to accomplish this using AllenNLP that I'm not aware of? Looks like both ways to expand the vocab size should be possible.

    Example source:

    added_tokens.json:

    {"<REL_SEP>": 29794, "[[": 29795, "]]": 29796, "<<": 29797, ">>": 29798}

    special_tokens_map.json:

    {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "additional_special_tokens": ["<REL_SEP>", "[[", "]]", "<<", ">>"]}

    tokenizer_config.json:

    {"do_lower_case": false, "additional_special_tokens": ["<REL_SEP>", "[[", "]]", "<<", ">>"]}

    Thanks!

    bug Contributions welcome 
    opened by pvcastro 32
  • [Contribution] DeepSpeed Integration

    DeepSpeed background

    DeepSpeed is a distributed training engine for PyTorch, primarily for training very large language models with significantly less memory. For example, the 17.7 billion parameter Turing-NLG was trained with DeepSpeed's ZeRO optimizer.

    Proposal

    It seems like a natural fit to have a way to use this with AllenNLP for large, distributed experiments, and it shouldn't require any major changes to integrate. Their training loop looks like:

    # https://www.deepspeed.ai/getting-started/#training
    model_engine, optimizer, _, _ = deepspeed.initialize(args=cmd_args,
                                                         model=model,
                                                         model_parameters=params)
    for step, batch in enumerate(data_loader):
        #forward() method
        loss = model_engine(batch)
    
        #runs backpropagation
        model_engine.backward(loss)
    
        #weight update
        model_engine.step()
    

    In terms of where it would fit into the library, I think a standalone DeepSpeedTrainer(Trainer) subclass would make sense. It should be fairly similar to GradientDescentTrainer (minus stuff that DeepSpeed handles itself, like gradient accumulation). It could then be initialized from a config file by the user as per usual.

    I know not having dependencies on other libraries is a point of emphasis. It should be possible to include this without adding deepspeed as a dependency, letting the user install it independently, by doing something like:

    # allennlp.training.__init__.py
    # ...
    try:
      from allennlp.training.deepspeed_trainer import DeepSpeedTrainer
    except ImportError:
      pass # maybe a warning here or something
    

    Initial results

    I was able to get a prototype up and running pretty easily. I didn't subclass GradientDescentTrainer (I had a lot of trouble doing that, for whatever reason), but I just copied and pasted the code and started ripping stuff out as I went.

    I set up a training experiment for a basic classifier on the first 10k instances of SST using RoBERTa-base across two GPUs. The GradientDescentTrainer completed an epoch in 20.40s, using 8936MB / 10202MB of GPU memory. The DeepSpeedTrainer prototype completed an epoch in 46.91s, using just 4184MB / 4348MB of GPU memory (less than half!). I don't know why it took so much longer, but I strongly suspect it's something I implemented wrong myself.

    The repo for this prototype is here.

    Potential obstacles

    • I have plenty of time to implement this if it would be a useful addition, but I have little to no idea what I'm doing; I'm not particularly experienced with heavy distributed training.
    • With all due respect, their library could maybe be a bit better documented and is seriously challenging to install and get everything compiled just right.
      • That said, a lot of the latter point might be a product of my setup. I'm working on SLURM instead of a personal VM which makes using their Docker image or installing from source harder than it really is.

    Next steps

    I think this could be a useful addition if (1) it's really halving GPU memory for transformer models and (2) it can be implemented non-intrusively. If you guys agree, I can move my prototype code from my repository into an actual PR.

    Feature request 
    opened by jacobdanovitch 32
  • Installing allennlp through pip using conda virtual environment fails

    Describe the bug

    I am trying to install allennlp through pip under a conda virtual environment; however, it fails and leaves error messages like this:

    Building wheels for collected packages: overrides, jsonnet, nltk, parsimonious, numpydoc, msgpack, regex, ujson, dill, jsondiff, PyYAML, wrapt, cytoolz, future, toolz
      Running setup.py bdist_wheel for overrides ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f7/27/b8/b4f46c59426a11e7f2d4e472b870ec14c21b4beab2e1afa725
      Running setup.py bdist_wheel for jsonnet ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/jsonnet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-t66ososi --python-tag cp37:
      running bdist_wheel
      running build
      running build_ext
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/desugarer.cpp -o core/desugarer.o
      core/desugarer.cpp: In member function ‘void Desugarer::desugar(AST*&, unsigned int)’:
      core/desugarer.cpp:612:51: warning: this statement may fall through [-Wimplicit-fallthrough=]
                       case BOP_MANIFEST_UNEQUAL: invert = true;
                                                  ~~~~~~~^~~~~~
      core/desugarer.cpp:613:17: note: here
                       case BOP_MANIFEST_EQUAL: {
                       ^~~~
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/formatter.cpp -o core/formatter.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/libjsonnet.cpp -o core/libjsonnet.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/lexer.cpp -o core/lexer.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/parser.cpp -o core/parser.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/pass.cpp -o core/pass.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/static_analysis.cpp -o core/static_analysis.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/string_utils.cpp -o core/string_utils.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 core/vm.cpp -o core/vm.o
      g++ -c -g -O3 -Wall -Wextra -Woverloaded-virtual -pedantic -std=c++0x -fPIC -Iinclude -Ithird_party/md5 third_party/md5/md5.cpp -o third_party/md5/md5.o
      building '_jsonnet' extension
      creating build
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/python
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Iinclude -Ithird_party/md5 -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c python/_jsonnet.c -o build/temp.linux-x86_64-3.7/python/_jsonnet.o
      python/_jsonnet.c: In function ‘cpython_native_callback’:
      python/_jsonnet.c:147:19: warning: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘const long unsigned int’} [-Wsign-compare]
           for (i = 0; i < ctx->argc; ++i) {
                         ^
      creating build/lib.linux-x86_64-3.7
      g++ -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/python/_jsonnet.o core/desugarer.o core/formatter.o core/libjsonnet.o core/lexer.o core/parser.o core/pass.o core/static_analysis.o core/string_utils.o core/vm.o third_party/md5/md5.o -o build/lib.linux-x86_64-3.7/_jsonnet.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/python/_jsonnet.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'g++' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for jsonnet
      Running setup.py clean for jsonnet
      Running setup.py bdist_wheel for nltk ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f1/98/72/c2ba4734bc46df30b9c3bd3eb037c52ab8ae0110f8fa15200a
      Running setup.py bdist_wheel for parsimonious ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f1/a4/4b/7cac60fa74b7c16017cd9c67ab65736d3d9318064ae65e0ee0
      Running setup.py bdist_wheel for numpydoc ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/11/76/d4/16c19c2378616c3389916bc6d7b1134b72bfe6f7abd9f80243
      Running setup.py bdist_wheel for msgpack ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/3f/78/5a/92a8797deabe61189baf597a855e9529f6b20a391d9924d968
      Running setup.py bdist_wheel for regex ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/regex/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-6v75m_hn --python-tag cp37:
      /home/ichn/anaconda3/envs/torch/lib/python3.7/site-packages/setuptools/dist.py:470: UserWarning: Normalizing '2018.01.10' to '2018.1.10'
        normalized_version,
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-3.7
      copying regex_3/regex.py -> build/lib.linux-x86_64-3.7
      copying regex_3/_regex_core.py -> build/lib.linux-x86_64-3.7
      copying regex_3/test_regex.py -> build/lib.linux-x86_64-3.7
      running build_ext
      building '_regex' extension
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/regex_3
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c regex_3/_regex.c -o build/temp.linux-x86_64-3.7/regex_3/_regex.o
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c regex_3/_regex_unicode.c -o build/temp.linux-x86_64-3.7/regex_3/_regex_unicode.o
      gcc -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/regex_3/_regex.o build/temp.linux-x86_64-3.7/regex_3/_regex_unicode.o -o build/lib.linux-x86_64-3.7/_regex.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/regex_3/_regex.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/regex_3/_regex.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'gcc' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for regex
      Running setup.py clean for regex
      Running setup.py bdist_wheel for ujson ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/ujson/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-97buh6vk --python-tag cp37:
      Warning: 'classifiers' should be a list, got type 'filter'
      running bdist_wheel
      running build
      running build_ext
      building 'ujson' extension
      creating build
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/python
      creating build/temp.linux-x86_64-3.7/lib
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./python/ujson.c -o build/temp.linux-x86_64-3.7/./python/ujson.o -D_GNU_SOURCE
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./python/objToJSON.c -o build/temp.linux-x86_64-3.7/./python/objToJSON.o -D_GNU_SOURCE
      ./python/objToJSON.c: In function ‘PyUnicodeToUTF8’:
      ./python/objToJSON.c:154:18: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
           char *data = PyUnicode_AsUTF8AndSize(obj, &len);
                        ^~~~~~~~~~~~~~~~~~~~~~~
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./python/JSONtoObj.c -o build/temp.linux-x86_64-3.7/./python/JSONtoObj.o -D_GNU_SOURCE
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./lib/ultrajsonenc.c -o build/temp.linux-x86_64-3.7/./lib/ultrajsonenc.o -D_GNU_SOURCE
      ./lib/ultrajsonenc.c:156:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0x000f) >> 0];
                             ^~~~~~~~~~
      ./lib/ultrajsonenc.c:155:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0x00f0) >> 4];
                             ^~~~~~~~~~
      ./lib/ultrajsonenc.c:154:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0x0f00) >> 8];
                             ^~~~~~~~~~
      ./lib/ultrajsonenc.c:153:23: warning: ‘g_hexChars’ is static but used in inline function ‘Buffer_AppendShortHexUnchecked’ which is not static
         *(outputOffset++) = g_hexChars[(value & 0xf000) >> 12];
                             ^~~~~~~~~~
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I./python -I./lib -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c ./lib/ultrajsondec.c -o build/temp.linux-x86_64-3.7/./lib/ultrajsondec.o -D_GNU_SOURCE
      creating build/lib.linux-x86_64-3.7
      gcc -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/./python/ujson.o build/temp.linux-x86_64-3.7/./python/objToJSON.o build/temp.linux-x86_64-3.7/./python/JSONtoObj.o build/temp.linux-x86_64-3.7/./lib/ultrajsonenc.o build/temp.linux-x86_64-3.7/./lib/ultrajsondec.o -o build/lib.linux-x86_64-3.7/ujson.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/./python/ujson.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/./python/ujson.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'gcc' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for ujson
      Running setup.py clean for ujson
      Running setup.py bdist_wheel for dill ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/f6/d1/a7/c90dbb9c5613295c70d96d60e78c3e2b283143fddbbd57e14d
      Running setup.py bdist_wheel for jsondiff ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/8f/c9/36/f9e8aea16af567ce91abbe6b8b6b650877b9e17ce8aa97fb42
      Running setup.py bdist_wheel for PyYAML ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/11/c5/f8/4e054145468ca00fd2ab4a6c20bf7e09ec57b879572c865ee6
      Running setup.py bdist_wheel for wrapt ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/10/a6/59/eab55ff1e60d10ca0404baf6e7b8baf52908091133608bf289
      Running setup.py bdist_wheel for cytoolz ... error
      Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/cytoolz/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-ip_zgny_ --python-tag cp37:
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-3.7
      creating build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/_signatures.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/__init__.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/utils_test.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/_version.py -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/compatibility.py -> build/lib.linux-x86_64-3.7/cytoolz
      creating build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/curried/operator.py -> build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/curried/__init__.py -> build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/curried/exceptions.py -> build/lib.linux-x86_64-3.7/cytoolz/curried
      copying cytoolz/dicttoolz.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/itertoolz.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/utils.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/recipes.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/functoolz.pyx -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/dicttoolz.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/__init__.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/recipes.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/utils.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/functoolz.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/itertoolz.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      copying cytoolz/cpython.pxd -> build/lib.linux-x86_64-3.7/cytoolz
      creating build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_none_safe.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_recipes.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_curried.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_tlz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_itertoolz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_functoolz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/dev_skip_test.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_embedded_sigs.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_utils.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_docstrings.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_inspect_args.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_doctests.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_curried_toolzlike.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_serialization.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_compatibility.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_signatures.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_dev_skip_test.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      copying cytoolz/tests/test_dicttoolz.py -> build/lib.linux-x86_64-3.7/cytoolz/tests
      running build_ext
      building 'cytoolz.dicttoolz' extension
      creating build/temp.linux-x86_64-3.7
      creating build/temp.linux-x86_64-3.7/cytoolz
      gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c cytoolz/dicttoolz.c -o build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o
      gcc -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o -o build/lib.linux-x86_64-3.7/cytoolz/dicttoolz.cpython-37m-x86_64-linux-gnu.so
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: unable to initialize decompress status for section .debug_info
      build/temp.linux-x86_64-3.7/cytoolz/dicttoolz.o: file not recognized: file format not recognized
      collect2: error: ld returned 1 exit status
      error: command 'gcc' failed with exit status 1
      
      ----------------------------------------
      Failed building wheel for cytoolz
      Running setup.py clean for cytoolz
      Running setup.py bdist_wheel for future ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/3f/66/fe/9c4fd5c707a9f26993ba157f0752d84d5c7e26aedbefb84f76
      Running setup.py bdist_wheel for toolz ... done
      Stored in directory: /home/ichn/.cache/pip/wheels/73/ad/e1/f8fe78eeb9e2b31ea8396419d92adc107c553ff7eb47ad12d9
    Successfully built overrides nltk parsimonious numpydoc msgpack dill jsondiff PyYAML wrapt future toolz
    Failed to build jsonnet regex ujson cytoolz
    Installing collected packages: overrides, jsonnet, wcwidth, ftfy, singledispatch, nltk, regex, murmurhash, ujson, cymem, dill, idna, chardet, urllib3, requests, msgpack, msgpack-numpy, tqdm, wrapt, preshed, toolz, cytoolz, plac, thinc, spacy, sqlparse, itsdangerous, click, werkzeug, MarkupSafe, Jinja2, flask, flask-cors, editdistance, flaky, cycler, pytz, kiwisolver, python-dateutil, pyparsing, matplotlib, greenlet, gevent, atomicwrites, py, pluggy, attrs, more-itertools, pytest, responses, jsonpickle, aws-xray-sdk, cookies, xmltodict, pbr, mock, jsondiff, jmespath, docutils, botocore, s3transfer, boto3, PyYAML, pyaml, boto, websocket-client, docker-pycreds, docker, asn1crypto, cryptography, future, ecdsa, pycryptodome, python-jose, moto, parsimonious, Pygments, alabaster, sphinxcontrib-websupport, imagesize, snowballstemmer, babel, packaging, sphinx, numpydoc, protobuf, tensorboardX, h5py, conllu, scipy, scikit-learn, unidecode, pytorch-pretrained-bert, colorama, pyasn1, rsa, awscli, allennlp
      Running setup.py install for jsonnet ... error
        Complete output from command /home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/jsonnet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-3ccckdjt/install-record.txt --single-version-externally-managed --compile:
        running install
        running build
        running build_ext
        make: 'core/desugarer.o' is up to date.
        make: 'core/formatter.o' is up to date.
        make: 'core/libjsonnet.o' is up to date.
        make: 'core/lexer.o' is up to date.
        make: 'core/parser.o' is up to date.
        make: 'core/pass.o' is up to date.
        make: 'core/static_analysis.o' is up to date.
        make: 'core/string_utils.o' is up to date.
        make: 'core/vm.o' is up to date.
        make: 'third_party/md5/md5.o' is up to date.
        building '_jsonnet' extension
        creating build
        creating build/temp.linux-x86_64-3.7
        creating build/temp.linux-x86_64-3.7/python
        gcc -pthread -B /home/ichn/anaconda3/envs/torch/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Iinclude -Ithird_party/md5 -I/home/ichn/anaconda3/envs/torch/include/python3.7m -c python/_jsonnet.c -o build/temp.linux-x86_64-3.7/python/_jsonnet.o
        python/_jsonnet.c: In function ‘cpython_native_callback’:
        python/_jsonnet.c:147:19: warning: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘const long unsigned int’} [-Wsign-compare]
             for (i = 0; i < ctx->argc; ++i) {
                           ^
        creating build/lib.linux-x86_64-3.7
        g++ -pthread -shared -B /home/ichn/anaconda3/envs/torch/compiler_compat -L/home/ichn/anaconda3/envs/torch/lib -Wl,-rpath=/home/ichn/anaconda3/envs/torch/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/python/_jsonnet.o core/desugarer.o core/formatter.o core/libjsonnet.o core/lexer.o core/parser.o core/pass.o core/static_analysis.o core/string_utils.o core/vm.o third_party/md5/md5.o -o build/lib.linux-x86_64-3.7/_jsonnet.cpython-37m-x86_64-linux-gnu.so
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        /home/ichn/anaconda3/envs/torch/compiler_compat/ld: build/temp.linux-x86_64-3.7/python/_jsonnet.o: unable to initialize decompress status for section .debug_info
        build/temp.linux-x86_64-3.7/python/_jsonnet.o: file not recognized: file format not recognized
        collect2: error: ld returned 1 exit status
        error: command 'g++' failed with exit status 1
        
        ----------------------------------------
    Command "/home/ichn/anaconda3/envs/torch/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-grk1qblh/jsonnet/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-3ccckdjt/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-grk1qblh/jsonnet/
    

    To Reproduce

    I am using conda 5.3.1 under Arch Linux with pytorch==1.0.0 preinstalled as part of the environment, and the following gcc version:

    (torch) ➜  ~ g++ --version
    g++ (GCC) 8.2.1 20181127
    Copyright (C) 2018 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    

    I also tried installing gcc through conda and then re-running

    pip install allennlp

    but it fails as well.

    Expected behavior

    allennlp is installed successfully

    System (please complete the following information):

    • Linux
    • Python version: 3.7.2
    • AllenNLP version: I installed from master
    • PyTorch version: 1.0.0
    opened by ichn-hu 32
  • Multi-GPU training hangs

    Multi-GPU training hangs

    Checklist

    • [x] I have verified that the issue exists against the main branch of AllenNLP.
    • [x] I have read the relevant section in the contribution guide on reporting bugs.
    • [x] I have checked the issues list for similar or identical bug reports.
    • [x] I have checked the pull requests list for existing proposed fixes.
    • [x] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the main branch.
    • [x] I have included in the "Description" section below a traceback from any exceptions related to this bug.
    • [x] I have included in the "Related issues or possible duplicates" section beloew all related issues and possible duplicate issues (If there are none, check this box anyway).
    • [x] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
    • [x] I have included in the "Environment" section below the output of pip freeze.
    • [x] I have included in the "Steps to reproduce" section below a minimally reproducible example.

    Description

    I am trying to run multi-GPU training (using 4 GPUs), but it hangs after a few iterations (roughly 15). This happens both with my custom model and with models in allennlp-models (I tried roberta-large).

    Related issues or possible duplicates

    • None

    Environment

    OS: Deep Learning AMI (Ubuntu 18.04) Version 42.1 -- AWS EC2 p3.8xlarge

    Python version: Python 3.8 installed via Anaconda

    Steps to reproduce

    I have installed allennlp-models and changed the configuration file reported above as follows:

    local transformer_model = "roberta-base";
    local transformer_dim = 768;
    
    {
      "dataset_reader":{
        "type": "boolq",
        "token_indexers": {
          "tokens": {
            "type": "pretrained_transformer",
            "model_name": transformer_model,
          }
        },
        "tokenizer": {
          "type": "pretrained_transformer",
          "model_name": transformer_model,
        }
      },
      "train_data_path": "https://storage.googleapis.com/allennlp-public-data/BoolQ.zip!BoolQ/train.jsonl",
      "validation_data_path": "https://storage.googleapis.com/allennlp-public-data/BoolQ.zip!BoolQ/val.jsonl",
      "test_data_path": "https://storage.googleapis.com/allennlp-public-data/BoolQ.zip!BoolQ/test.jsonl",
      "model": {
        "type": "basic_classifier",
        "text_field_embedder": {
          "token_embedders": {
            "tokens": {
              "type": "pretrained_transformer",
              "model_name": transformer_model,
            }
          }
        },
        "seq2vec_encoder": {
           "type": "bert_pooler",
           "pretrained_model": transformer_model,
           "dropout": 0.1,
        },
        "namespace": "tags",
        "num_labels": 2,
      },
      "data_loader": {
        "batch_sampler": {
          "type": "bucket",
          "sorting_keys": ["tokens"],
          "batch_size" : 4
        }
      },
      "distributed": {
          "cuda_devices": [0,1,2,3]
      },
      "trainer": {
        "num_epochs": 10,
        "num_gradient_accumulation_steps": 2,
        "validation_metric": "+accuracy",
        "learning_rate_scheduler": {
          "type": "slanted_triangular",
          "num_epochs": 10,
          "num_steps_per_epoch": 3088,
          "cut_frac": 0.06
        },
        "optimizer": {
          "type": "huggingface_adamw",
          "lr": 1e-5,
          "weight_decay": 0.1,
        }
      },
    }
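
    Before digging into the hang itself, a quick environment sanity check can rule out the most common multi-GPU setup problems. This is only a diagnostic sketch, not a fix; it just confirms that the devices listed under "cuda_devices" are actually visible and that the NCCL backend is available.

    # Diagnostic sketch only: verify the devices in "cuda_devices": [0, 1, 2, 3]
    # are visible and that NCCL (used for multi-GPU training) is available.
    import torch
    import torch.distributed as dist

    print("torch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    print("visible GPUs:", torch.cuda.device_count())  # should be >= 4 for cuda_devices [0, 1, 2, 3]
    if dist.is_available():
        print("NCCL available:", dist.is_nccl_available())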
    

    @epwalsh Any ideas?

    bug 
    opened by aleSuglia 31
  • [Suggestion] Callbacks in Trainer

    [Suggestion] Callbacks in Trainer

    Hello,

    First off, thank you so much for creating and maintaining this amazing library! I'm really loving allennlp so far, but there's one aspect that I think could use some improvement: the Trainer. I think the adoption of Callbacks would make the code much more readable, maintainable, and extendable.

    Is your feature request related to a problem? Please describe.

    • The Trainer code feels bloated and is hard to navigate due to the sheer amount of bookkeeping that is taking place regarding tensorboard, checkpointing, etc.
    • Adding extra behavior to the trainer (e.g. custom logging, adding semi-supervised steps, performing custom sanity checks during training) requires modifying the training code. It would be better if the user could simply inject this behavior using an external class.

    Describe the solution you'd like

    Frameworks like Keras and fast.ai have adopted the Callback pattern to deal with this problem. In fact, checkpointing and tensorboard logging are both implemented as callbacks in Keras. Refactoring the trainer to route much of this bookkeeping through a callback system would make custom training behavior easier to implement.
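
    To make the proposal concrete, here is a minimal sketch of the kind of hook-based interface being suggested. All names here (Callback, on_epoch_end, MetricsLogger, etc.) are hypothetical illustrations, not an existing AllenNLP API.

    # Hypothetical sketch of a callback interface; not AllenNLP's actual API.
    from typing import Any, Dict


    class Callback:
        """Every hook is a no-op by default; subclasses override what they need."""

        def on_epoch_start(self, trainer: Any) -> None:
            pass

        def on_batch_end(self, trainer: Any, batch_loss: float) -> None:
            pass

        def on_epoch_end(self, trainer: Any, metrics: Dict[str, float]) -> None:
            pass


    class MetricsLogger(Callback):
        """Custom behavior injected without modifying the trainer itself."""

        def on_epoch_end(self, trainer: Any, metrics: Dict[str, float]) -> None:
            print({name: round(value, 4) for name, value in metrics.items()})


    # Inside the trainer, bookkeeping (tensorboard, checkpointing, ...) would then
    # reduce to dispatch, e.g.: for cb in self._callbacks: cb.on_epoch_end(self, metrics)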

    Describe alternatives you've considered

    I can't think of any better alternatives at the moment: the fact that both Keras and fast.ai, which put heavy emphasis on the end-user experience, have adopted this pattern seems to be a strong indication of its merits. That being said, I imagine the allennlp team has considered callbacks at some point, so if there is a reason you are avoiding this pattern, I would love to know why!

    If Callbacks seem like a good addition, I'd love to file a PR on this issue, but since this would be a major refactoring I won't be able to finish it any time soon. Thanks!

    opened by keitakurita 31
  • Unable to `pip install allennlp-models`. Torch version and blis compile issues.

    Unable to `pip install allennlp-models`. Torch version and blis compile issues.

    Checklist

    • [X] I have verified that the issue exists against the main branch of AllenNLP.
    • [X] I have read the relevant section in the contribution guide on reporting bugs.
    • [X] I have checked the issues list for similar or identical bug reports.
    • [X] I have checked the pull requests list for existing proposed fixes.
    • [X] I have checked the CHANGELOG and the commit log to find out if the bug was already fixed in the main branch.
    • [X] I have included in the "Description" section below a traceback from any exceptions related to this bug.
    • [X] I have included in the "Related issues or possible duplicates" section beloew all related issues and possible duplicate issues (If there are none, check this box anyway).
    • [X] I have included in the "Environment" section below the name of the operating system and Python version that I was using when I discovered this bug.
    • [X] I have included in the "Environment" section below the output of pip freeze.
    • [X] I have included in the "Steps to reproduce" section below a minimally reproducible example.

    Description

    After installing PyTorch (1.11) and AllenNLP (2.9.2) via pip in a conda env, I am unable to pip install allennlp-models. I get one of two errors, detailed below.

    Python traceback:

    ## Error 1
    
    ERROR: Cannot install allennlp-models==1.2.1, allennlp-models==1.2.2, allennlp-models==1.3.0, allennlp-models==1.4.0, allennlp-models==1.4.1, allennlp-models==1.5.0, allennlp-models==2.0.0, allennlp-models==2.0.1, allennlp-models==2.1.0, allennlp-models==2.2.0, allennlp-models==2.3.0, allennlp-models==2.4.0, allennlp-models==2.5.0, allennlp-models==2.6.0, allennlp-models==2.7.0, allennlp-models==2.8.0 and allennlp-models==2.9.0 because these package versions have conflicting dependencies.
    
    The conflict is caused by:
        allennlp-models 2.9.0 depends on torch<1.11.0 and >=1.7.0
        allennlp-models 2.8.0 depends on torch<1.11.0 and >=1.7.0
        allennlp-models 2.7.0 depends on torch<1.10.0 and >=1.7.0
        allennlp-models 2.6.0 depends on torch<1.10.0 and >=1.7.0
        allennlp-models 2.5.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.4.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.3.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.2.0 depends on torch<1.9.0 and >=1.7.0
        allennlp-models 2.1.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 2.0.1 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 2.0.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.5.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.4.1 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.4.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.3.0 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.2.2 depends on torch<1.8.0 and >=1.7.0
        allennlp-models 1.2.1 depends on torch<1.8.0 and >=1.7.0
    
    To fix this you could try to:
    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict
    
    ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
    
    ## Error 2
    
    Compiler gcc
                building 'blis.cy' extension
                creating build/temp.linux-x86_64-3.10
                creating build/temp.linux-x86_64-3.10/blis
                gcc -pthread -B /home/brochstilley/miniforge3/envs/allennlp-api/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/brochstilley/miniforge3/envs/allennlp-api/include -fPIC -O2 -isystem /home/brochstilley/miniforge3/envs/allennlp-api/include -fPIC -I/tmp/pip-install-_etaqpbt/blis_322118b0f1e54b949f5124bb925aef68/include -I/tmp/pip-install-_etaqpbt/blis_322118b0f1e54b949f5124bb925aef68/blis/_src/include/linux-x86_64 -I/home/brochstilley/miniforge3/envs/allennlp-api/include/python3.10 -c blis/cy.c -o build/temp.linux-x86_64-3.10/blis/cy.o -std=c99
                gcc: error: blis/cy.c: No such file or directory
                gcc: fatal error: no input files
                compilation terminated.
                error: command '/usr/bin/gcc' failed with exit code 1
                [end of output]
          
            note: This error originates from a subprocess, and is likely not a problem with pip.
          error: legacy-install-failure
          
          × Encountered error while trying to install package.
          ╰─> blis
          
          note: This is an issue with the package mentioned above, not pip.
          hint: See above for output from the failure.
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: subprocess-exited-with-error
    
    × pip subprocess to install build dependencies did not run successfully.
    │ exit code: 1
    ╰─> See above for output.
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
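
    Error 1 above is a plain resolver conflict: every allennlp-models release listed pins torch below 1.11, while the environment (see pip freeze below) has torch==1.11.0. A small illustrative check of that constraint, using the packaging library:

    # Illustrative only: reproduce the resolver's complaint from Error 1.
    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    installed = Version("1.11.0")                # torch==1.11.0 from `pip freeze`
    required = SpecifierSet(">=1.7.0,<1.11.0")   # allennlp-models 2.9.0's torch pin

    print(installed in required)  # False -> pip reports ResolutionImpossible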
    

    Related issues or possible duplicates

    • None

    Environment

    OS: Linux (KDE Neon (ubuntu))

    Python version: 3.10.4

    Output of pip freeze:

    allennlp==2.9.2
    argon2-cffi==21.3.0
    argon2-cffi-bindings==21.2.0
    asttokens==2.0.5
    attrs==21.4.0
    backcall==0.2.0
    backports.csv==1.0.7
    base58==2.1.1
    beautifulsoup4==4.10.0
    bleach==4.1.0
    blis==0.7.7
    boto3==1.21.33
    botocore==1.24.33
    cached-path==1.1.1
    cachetools==5.0.0
    catalogue==2.0.7
    certifi==2021.10.8
    cffi==1.15.0
    chardet==4.0.0
    charset-normalizer==2.0.12
    checklist==0.0.11
    cheroot==8.6.0
    CherryPy==18.6.1
    click==8.0.4
    cryptography==36.0.2
    cymem==2.0.6
    debugpy==1.6.0
    decorator==5.1.1
    defusedxml==0.7.1
    dill==0.3.4
    docker-pycreds==0.4.0
    entrypoints==0.4
    executing==0.8.3
    fairscale==0.4.6
    fastjsonschema==2.15.3
    feedparser==6.0.8
    filelock==3.6.0
    Flask==2.1.1
    future==0.18.2
    gitdb==4.0.9
    GitPython==3.1.27
    google-api-core==2.7.1
    google-auth==2.6.2
    google-cloud-core==2.2.3
    google-cloud-storage==2.2.1
    google-crc32c==1.3.0
    google-resumable-media==2.3.2
    googleapis-common-protos==1.56.0
    h5py==3.6.0
    huggingface-hub==0.4.0
    idna==3.3
    iniconfig==1.1.1
    ipykernel==6.12.1
    ipython==8.2.0
    ipython-genutils==0.2.0
    ipywidgets==7.7.0
    iso-639==0.4.5
    itsdangerous==2.1.2
    jaraco.classes==3.2.1
    jaraco.collections==3.5.1
    jaraco.context==4.1.1
    jaraco.functools==3.5.0
    jaraco.text==3.7.0
    jedi==0.18.1
    Jinja2==3.1.1
    jmespath==1.0.0
    joblib==1.1.0
    jsonnet==0.18.0
    jsonschema==4.4.0
    jupyter==1.0.0
    jupyter-client==7.2.1
    jupyter-console==6.4.3
    jupyter-core==4.9.2
    jupyterlab-pygments==0.1.2
    jupyterlab-widgets==1.1.0
    langcodes==3.3.0
    lmdb==1.3.0
    lxml==4.8.0
    MarkupSafe==2.1.1
    matplotlib-inline==0.1.3
    mistune==0.8.4
    more-itertools==8.12.0
    munch==2.5.0
    murmurhash==1.0.6
    nbclient==0.5.13
    nbconvert==6.4.5
    nbformat==5.3.0
    nest-asyncio==1.5.5
    nltk==3.7
    notebook==6.4.10
    numpy @ file:///home/conda/feedstock_root/build_artifacts/numpy_1649059883087/work
    packaging==21.3
    pandocfilters==1.5.0
    parso==0.8.3
    pathtools==0.1.2
    pathy==0.6.1
    patternfork-nosql==3.6
    pdfminer.six==20220319
    pexpect==4.8.0
    pickleshare==0.7.5
    Pillow @ file:///home/conda/feedstock_root/build_artifacts/pillow_1648857107578/work
    pluggy==1.0.0
    portend==3.1.0
    preshed==3.0.6
    prometheus-client==0.13.1
    promise==2.3
    prompt-toolkit==3.0.29
    protobuf==3.20.0
    psutil==5.9.0
    ptyprocess==0.7.0
    pure-eval==0.2.2
    py==1.11.0
    pyasn1==0.4.8
    pyasn1-modules==0.2.8
    pycparser==2.21
    pydantic==1.8.2
    Pygments==2.11.2
    pyparsing==3.0.7
    pyrsistent==0.18.1
    pytest==7.1.1
    python-dateutil==2.8.2
    python-docx==0.8.11
    pytz==2022.1
    PyYAML==6.0
    pyzmq==22.3.0
    qtconsole==5.3.0
    QtPy==2.0.1
    regex==2022.3.15
    requests==2.27.1
    rsa==4.8
    s3transfer==0.5.2
    sacremoses==0.0.49
    scikit-learn==1.0.2
    scipy==1.8.0
    Send2Trash==1.8.0
    sentencepiece==0.1.96
    sentry-sdk==1.5.8
    setproctitle==1.2.2
    sgmllib3k==1.0.0
    shortuuid==1.0.8
    six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
    smart-open==5.2.1
    smmap==5.0.0
    soupsieve==2.3.1
    spacy==3.2.4
    spacy-legacy==3.0.9
    spacy-loggers==1.0.2
    srsly==2.4.2
    stack-data==0.2.0
    tempora==5.0.1
    tensorboardX==2.5
    termcolor==1.1.0
    terminado==0.13.3
    testpath==0.6.0
    thinc==8.0.15
    threadpoolctl==3.1.0
    tokenizers==0.11.6
    tomli==2.0.1
    torch==1.11.0
    torchaudio==0.11.0
    torchvision==0.12.0
    tornado==6.1
    tqdm==4.63.2
    traitlets==5.1.1
    transformers==4.17.0
    typer==0.4.1
    typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1644850595256/work
    urllib3==1.26.9
    wandb==0.12.11
    wasabi==0.9.1
    wcwidth==0.2.5
    webencodings==0.5.1
    Werkzeug==2.1.1
    widgetsnbextension==3.6.0
    yaspin==2.1.0
    zc.lockfile==2.0
    
    

    Steps to reproduce

    Example source:

    conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
    pip install allennlp
    pip install allennlp[all]
    pip install allennlp-models
    

    bug 
    opened by brochington 4
  • FSDP Accelerator auto_wrap ?

    FSDP Accelerator auto_wrap ?

    I am currently looking at the discussion here (https://github.com/allenai/allennlp/discussions/5433) and the code at https://github.com/allenai/allennlp/blob/1caf0dafa3bc8d0bb309a46e2ccb12f714923260/allennlp/nn/parallel/fairscale_fsdp_accelerator.py#L126-L127

    It seems like you have to manually wrap each individual unit of partition.

    Looking at the FairScale tutorial (https://fairscale.readthedocs.io/en/latest/tutorials/oss.html), there is an auto_wrap function that automatically wraps each submodule for you. This is incredibly convenient if you just want to wrap a huge pretrained transformer embedder yourself.

    Is there a possibility of providing an option to auto_wrap modules?
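
    For context, the convenience being asked for is roughly the following: walk the module tree and wrap every sufficiently large child automatically. This is only a conceptual sketch of the idea, not FairScale's implementation or its exact API; the wrapper argument stands in for something like FullyShardedDataParallel.

    # Conceptual sketch of "auto-wrapping": wrap every large submodule automatically.
    import torch.nn as nn


    def auto_wrap_sketch(module: nn.Module, wrapper, min_params: int = 100_000_000) -> nn.Module:
        for name, child in list(module.named_children()):
            child = auto_wrap_sketch(child, wrapper, min_params)
            if sum(p.numel() for p in child.parameters()) >= min_params:
                child = wrapper(child)  # e.g. wrapper=FullyShardedDataParallel
            setattr(module, name, child)
        return module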

    Feature request 
    opened by vikigenius 3
  • Don't cache reinit_modules

    Don't cache reinit_modules

    Fixes https://github.com/allenai/allennlp/pull/5505#issuecomment-1007540627

    Changes proposed in this pull request:

    • Don't cache transformers when reinit_modules is provided (a rough sketch of this caching behavior follows this list).
    • Removes reinit_modules from the transformer spec
    • Always load a new model when reinit_modules is not None
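
    Conceptually, this is the usual "bypass the cache when an argument makes the result non-reusable" pattern. A rough, hypothetical sketch of that behavior (not the actual cached_transformers code):

    # Hypothetical sketch: models with re-initialized weights are never cached,
    # because the cache key (the model name) no longer identifies the weights.
    from typing import Optional, Tuple

    _CACHE: dict = {}


    def _load(model_name: str, reinit_modules: Tuple[int, ...] = ()) -> dict:
        # Stand-in for actually loading (and optionally re-initializing) weights.
        return {"name": model_name, "reinitialized": reinit_modules}


    def get(model_name: str, reinit_modules: Optional[Tuple[int, ...]] = None) -> dict:
        if reinit_modules is not None:
            return _load(model_name, tuple(reinit_modules))  # always load fresh
        if model_name not in _CACHE:
            _CACHE[model_name] = _load(model_name)
        return _CACHE[model_name]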

    Before submitting

    • [X] I've read and followed all steps in the Making a pull request section of the CONTRIBUTING docs.
    • [X] I've updated or added any relevant docstrings following the syntax described in the Writing docstrings section of the CONTRIBUTING docs.
    • [ ] If this PR fixes a bug, I've added a test that will fail without my fix.
    • [ ] If this PR adds a new feature, I've added tests that sufficiently cover my new functionality.

    After submitting

    • [ ] All GitHub Actions jobs for my pull request have passed.
    • [ ] codecov/patch reports high test coverage (at least 90%). You can find this under the "Actions" tab of the pull request once the other checks have finished.
    opened by JohnGiorgi 4
  • Why not add transformer tokens to vocabulary in the init phase

    Why not add transformer tokens to vocabulary in the init phase

    I ran into a problem yesterday when I wanted to get the vocab_size of the vocabulary namespace transformer_tags, which I specify in PretrainedTransformerIndexer. I found that this namespace isn't defined yet when I want to use it in the Model init phase. To figure out the problem, I read this part of the source code. I found that the operation _add_encoding_to_vocabulary_if_needed is called in tokens_to_indices, which isn't called until _train_epoch starts.

    ...
    class PretrainedTransformerIndexer
    ...
       def _add_encoding_to_vocabulary_if_needed(self, vocab: Vocabulary) -> None:
            """
            Copies tokens from ```transformers``` model's vocab to the specified namespace.
            """
            if self._added_to_vocabulary:
                return
    
            vocab.add_transformer_vocab(self._tokenizer, self._namespace)
    
            self._added_to_vocabulary = True
    
        @overrides
        def count_vocab_items(self, token: Token, counter: Dict[str, Dict[str, int]]):
            # If we only use pretrained models, we don't need to do anything here.
            pass
    
        @overrides
        def tokens_to_indices(self, tokens: List[Token], vocabulary: Vocabulary) -> IndexedTokenList:
            self._add_encoding_to_vocabulary_if_needed(vocabulary)
    ...
    

    This means the transformer_tags namespace I defined in the vocabulary can only be used in the forward pass and afterwards. This causes a problem: I can't define a linear layer that transforms the model output to vocab_size logits. Of course, I can use the from_pretrained_transformer constructor to get the same namespace so that it is usable in the init phase. If this behavior is intentional, I wonder what the purpose of _add_encoding_to_vocabulary_if_needed in PretrainedTransformerIndexer is. Why not call _add_encoding_to_vocabulary_if_needed in the __init__ method of PretrainedTransformerIndexer so that the specified namespace can be used from the Model init phase?
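
    As noted above, one workaround is to build the vocabulary from the pretrained transformer up front, so the namespace already exists when the Model's __init__ runs. A rough sketch, assuming the from_pretrained_transformer constructor mentioned above accepts the model name and a target namespace:

    # Sketch of the workaround mentioned above (argument names are assumptions).
    from allennlp.data import Vocabulary

    vocab = Vocabulary.from_pretrained_transformer(
        model_name="bert-base-uncased",   # illustrative model name
        namespace="transformer_tags",
    )
    # The namespace now exists before Model.__init__, so e.g. an output projection
    # can be sized with the vocabulary size of that namespace.
    print(vocab.get_vocab_size("transformer_tags"))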

    Looking forward to your kind reply. Thanks!

    Contributions welcome question 
    opened by Zessay 2
  • Add a command to init AllenNLP project like spring-boot-starter

    Add a command to init AllenNLP project like spring-boot-starter

    Is your feature request related to a problem? Please describe.

    I'm a newbie to AllenNLP, but I've fallen in love with it for handling my own NLP tasks. However, each time I start a new project I have to copy code from a previous AllenNLP project, since I can't remember the file architecture. Then there is a lot of old, task-specific code that is redundant for the new project and has to be deleted. So I need a tool, or just a command (e.g. allennlp init), to initialize a clean template AllenNLP project that contains the basic modules needed to run AllenNLP code: a model, configuration, a data processor, a demo data folder, etc. I think this feature would make it much more pleasant for researchers to work with AllenNLP.

    Describe the solution you'd like

    Add a command to initialize a minimal AllenNLP project, just like spring-boot-starter for Java web developers.

    Describe alternatives you've considered

    Alternatively, provide a demo AllenNLP code project that I can easily access.

    Additional context

    None

    Contributions welcome Feature request 
    opened by unikcc 3
  • Add support for transformers LayoutLMv2.

    Add support for transformers LayoutLMv2.

    Is your feature request related to a problem? Please describe.

    On the current version 2.7.0 of allennlp and version 4.11.3 of transformers, layoutlmv2 is not supported:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/root/allennlp/allennlp/modules/token_embedders/pretrained_transformer_mismatched_embedder.py", line 80, in __init__
        self._matched_embedder = PretrainedTransformerEmbedder(
      File "/root/allennlp/allennlp/modules/token_embedders/pretrained_transformer_embedder.py", line 123, in __init__
        tokenizer = PretrainedTransformerTokenizer(
      File "/root/allennlp/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py", line 79, in __init__
        self._reverse_engineer_special_tokens("a", "b", model_name, tokenizer_kwargs)
      File "/root/allennlp/allennlp/data/tokenizers/pretrained_transformer_tokenizer.py", line 112, in _reverse_engineer_special_tokens
        dummy_output = tokenizer_with_special_tokens.encode_plus(
      File "/root/anaconda3/envs/alenlayout/lib/python3.8/site-packages/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py", line 430, in encode_plus
        return self._encode_plus(
      File "/root/anaconda3/envs/alenlayout/lib/python3.8/site-packages/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py", line 639, in _encode_plus
        batched_output = self._batch_encode_plus(
      File "/root/anaconda3/envs/alenlayout/lib/python3.8/site-packages/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py", line 493, in _batch_encode_plus
        encodings = self._tokenizer.encode_batch(
    TypeError: PreTokenizedInputSequence must be Union[List[str], Tuple[str]]
    

    The error occurs because they added boxes as the second argument of the fast layoutlm_v2 tokenizer, which breaks the reverse engineering of the special tokens in allennlp's pretrained_transformer_tokenizer.

    Describe the solution you'd like

    Ideally, naming the arguments in the tokenizer_with_special_tokens.encode_plus call of pretrained_transformer_tokenizer should do the trick, but I'm worried about repercussions for other tokenizers that have different argument names (those not based on BERT, maybe?). Moreover, since layoutlm_v2 added a few inputs to the model (images and boxes), modifications would also be needed in _unfold_long_sequences, _fold_long_sequences, and forward of the pretrained_transformer_embedder and pretrained_transformer_mismatched_embedder to account for the additional inputs.
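
    To illustrate the keyword-argument part of the proposal with a standard Hugging Face tokenizer (this is just an illustration of the calling convention, not the proposed patch itself):

    # Illustration only: keyword arguments are robust to tokenizers (like LayoutLMv2's)
    # that insert extra positional parameters such as `boxes` in the second slot.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    positional = tokenizer.encode_plus("a", "b", add_special_tokens=True)
    keyword = tokenizer.encode_plus(text="a", text_pair="b", add_special_tokens=True)

    assert positional["input_ids"] == keyword["input_ids"]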

    If it's okay with you, I'd like to work on it.

    Contributions welcome Feature request 
    opened by HOZHENWAI 1
Releases(v2.9.2)
  • v2.9.2(Mar 21, 2022)

    What's new

    Fixed ✅

    • Removed unnecessary dependencies
    • Restored functionality of the CLI in the absence of the now-optional checklist package

    Commits

    f6866f95 Fix CLI and install instructions in case optional checklists is not present (#5589) e1c6935c Update torch requirement from <1.11.0,>=1.6.0 to >=1.6.0,<1.12.0 (#5595) 5f5f8c30 Updated the docs for PytorchSeq2VecWrapper to specify that mask is required (#5386) 2426ce3d Dependencies (#5593) 2d9fe79f Bump fairscale from 0.4.5 to 0.4.6 (#5590) ab37da7b Update transformers requirement from <4.17,>=4.1 to >=4.1,<4.18 (#5583)

    Source code(tar.gz)
    Source code(zip)
  • v2.9.1(Mar 9, 2022)

    What's new

    Fixed ✅

    • Updated dependencies, especially around doc creation.
    • Running the test suite out-of-tree (e.g. after installation) is now possible by pointing the environment variable ALLENNLP_SRC_DIR to the sources.
    • Silenced a warning that happens when you inappropriately clone a tensor.
    • Added more clarification to the Vocabulary documentation around min_pretrained_embeddings and only_include_pretrained_words.
    • Fixed bug with type mismatch caused by latest release of cached-path that now returns a Path instead of a str.

    Added 🎉

    • We can now transparently read compressed input files during prediction.
    • LZMA compression is now supported.
    • Added a way to give JSON blobs as input to dataset readers in the evaluate command.
    • Added the argument sub_module in PretrainedTransformerMismatchedEmbedder

    Changed ⚠️

    • You can automatically include all words from a pretrained file when building a vocabulary by setting the value in min_pretrained_embeddings to -1 for that particular namespace.

    Commits

    3547bfb8 pin cached-path tighter, make sure our cached-path wrapper still returns str (#5587) 99c93439 Clarify Vocabulary documentation, add -1 option for min_pretrained_embeddings (#5581) 3fa51933 Makes the evaluate command work for the multitask case (Second Edition) (#5579) 9f03803b Add "sub_module" argument in PretrainedTransformerMismatchedEmbedder (#5580) 92e54cce Open Compressed (#5578) 5b3352ce Clone warns (#5575) 9da4b0fe Add Wassterstein Distance calculation option for fairness metrics (#5546) b8f92f03 Update mkdocs-material requirement from <8.2.0,>=5.5.0 to >=5.5.0,<8.3.0 (#5572) a21c0b4c Update filelock requirement from <3.5,>=3.3 to >=3.3,<3.7 (#5571) 6614077b Make tests runnable out-of-tree for help with conda-packaging (#5560) e6792133 Fix CITATION.cff and add automatic validation of your citation metadata (#5561) efa9f1d0 try to unpin nltk (#5563) d01179b7 Small typo fix (#5555) 3c2299aa tighten test_sampled_equals_unsampled_when_biased_against_non_sampled_positions bound (#5549) e463084b Bump black from 21.12b0 to 22.1.0 (#5554) 8226e87d Making checklist optional (#5507) a76bf1e3 Update transformers requirement from <4.16,>=4.1 to >=4.1,<4.17 (#5553)

    Source code(tar.gz)
    Source code(zip)
  • v2.9.0(Jan 27, 2022)

    What's new

    Added 🎉

    • Added an Evaluator class to make comparing source, target, and predictions easier.
    • Added a way to resize the vocabulary in the T5 module
    • Added an argument reinit_modules to cached_transformers.get() that allows you to re-initialize the pretrained weights of a transformer model, using layer indices or regex strings.
    • Added attribute _should_validate_this_epoch to GradientDescentTrainer that controls whether validation is run at the end of each epoch.
    • Added ShouldValidateCallback that can be used to configure the frequency of validation during training.
    • Added a MaxPoolingSpanExtractor. This SpanExtractor represents each span by a component-wise max-pooling operation.

    Fixed ✅

    • Fixed the docstring information for the FBetaMultiLabelMeasure metric.
    • Various fixes for Python 3.9
    • Fixed the name that the push-to-hf command uses to store weights.
    • FBetaMultiLabelMeasure now works with multiple dimensions
    • Support for inferior operating systems when making hardlinks
    • Use , as a separator for filenames in the evaluate command, thus allowing for URLs (eg. gs://...) as input files.
    • Removed a spurious error message "'torch.cuda' has no attribute '_check_driver'" that would appear in the logs when a ConfigurationError for a missing GPU was raised.
    • Load model on CPU post training to save GPU memory.
    • Fixed a bug in ShouldValidateCallback that leads to validation occurring after the first epoch regardless of validation_start value.
    • Fixed a bug in ShouldValidateCallback that leads to validation occurring every validation_interval + 1 epochs, instead of every validation_interval epochs.
    • Fixed a bug in ShouldValidateCallback that leads to validation never occurring at the end of training.

    Removed 👋

    • Removed Tango components, since they now live at https://github.com/allenai/tango.
    • Removed dependency on the overrides package

    Commits

    dd5a010e Evaluator (#5445) 0b54fb0d Bump fairscale from 0.4.4 to 0.4.5 (#5545) 2deacfe5 Fix should validate callback train end (#5542) 2cdb8742 Bump mypy from 0.910 to 0.931 (#5538) a91946ae Keep NLTK down. They broke the download of omw. (#5540) 73a5cfc1 Removes stuff that now lives in the tango repo (#5482) 1278f16d Move changes from #5534 to correct place. (#5535) a7117035 Fix ShouldValidateCallback (#5536) b0b3ad4b Update mkdocs-material requirement from <8.1.0,>=5.5.0 to >=5.5.0,<8.2.0 (#5503) a3d71254 Max out span extractor (#5520) 515fe9b7 Configure validation frequency (#5534) d7e0c877 Update transformers requirement from <4.15,>=4.1 to >=4.1,<4.16 (#5528) 42332476 Bump fairscale from 0.4.3 to 0.4.4 (#5525) 71f2d797 fix 'check_for_gpu' (#5522) 06ec7f9a Reinit layers of pretrained transformer in cached_transformers.get() (#5505) ec1fb69f add missing nltk download in CI (#5529) ab4f7b5c Fix model loading on GPU post training (#5518) 3552842f Fix moving average args not rendering properly in docs (#5516) 87ad0061 Update transformers requirement from <4.13,>=4.1 to >=4.1,<4.15 (#5515) 39f4f4c1 tick version for nightly releases 38436d89 Use comma as filename separator (#5506) e0ee7f43 Dimensions in FBetaMultiLabelMeasure (#5501) d77ba3d6 Hardlink or copy (#5502) dbcbcf10 Add installation instructions through conda-forge (#5498) ebad9eeb Bump black from 21.11b1 to 21.12b0 (#5496) 82b1f4f8 Use the correct filename when uploading models to the HF Hub (#5499) 19f6c8f9 Resize T5 Vocab (#5497) c557d512 enforce reading in utf-8 encoding (#5476) 1caf0daf Removes dependency on the overrides package (#5490) b99376fe Python 3.9 (#5489) 666eaa56 Update mkdocs-material requirement from <7.4.0,>=5.5.0 to >=5.5.0,<8.1.0 (#5486) 64b2c078 Bump fairscale from 0.4.2 to 0.4.3 (#5474) 0a794c6b Fix metric docstring (#5475) f86ff9f4 Bump black from 21.10b0 to 21.11b1 (#5473) a7f6cdf1 update cached-path (#5477) 844acfa9 Update filelock requirement from <3.4,>=3.3 to >=3.3,<3.5 (#5469) 05fc7f62 Bump fairscale from 0.4.0 to 0.4.2 (#5461) 923dbde0 Bump black from 21.9b0 to 21.10b0 (#5453) 09e22aa6 Update spacy requirement from <3.2,>=2.1.0 to >=2.1.0,<3.3 (#5460) 54b92ae7 HF now raises ValueError (#5464)

    Source code(tar.gz)
    Source code(zip)
  • v2.8.0(Nov 1, 2021)

    What's new

    Added 🎉

    • Added support to push models directly to the Hugging Face Hub with the command allennlp push-to-hf.
    • More default tests for the TextualEntailmentSuite.

    Changed ⚠️

    • The behavior of --overrides has changed. Previously the final configuration params were simply taken as the union over the original params and the --overrides params. But now you can use --overrides to completely replace any part of the original config. For example, passing --overrides '{"model":{"type":"foo"}}' will completely replace the "model" part of the original config. However, when you just want to change a single field in the JSON structure without removing / replacing adjacent fields, you can still use the "dot" syntax. For example, --overrides '{"model.num_layers":3}' will only change the num_layers parameter of the "model" part of the config, leaving everything else unchanged. (A small sketch after this list illustrates the two forms.)
    • Integrated cached_path library to replace existing functionality in common.file_utils. This introduces some improvements without any breaking changes.
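
    A small, pure-Python sketch of the two --overrides forms described above (this is only an illustration of the semantics, not AllenNLP's actual override handling):

    import copy


    def apply_overrides(params: dict, overrides: dict) -> dict:
        result = copy.deepcopy(params)
        for key, value in overrides.items():
            if "." in key:
                # "Dot" syntax: change a single field, leave siblings untouched.
                *path, leaf = key.split(".")
                node = result
                for part in path:
                    node = node[part]
                node[leaf] = value
            else:
                # Plain key: completely replace that part of the config.
                result[key] = value
        return result


    original = {"model": {"type": "bar", "num_layers": 2}}
    print(apply_overrides(original, {"model": {"type": "foo"}}))   # {'model': {'type': 'foo'}}
    print(apply_overrides(original, {"model.num_layers": 3}))      # {'model': {'type': 'bar', 'num_layers': 3}}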

    Fixed ✅

    • Fixed the implementation of PairedPCABiasDirection in allennlp.fairness.bias_direction, where the difference vectors should not be centered when performing the PCA.

    Commits

    7213d520 Update transformers requirement from <4.12,>=4.1 to >=4.1,<4.13 (#5452) 1b022270 bug fix (#5447) 0d8c0fc5 Update torch requirement from <1.10.0,>=1.6.0 to >=1.6.0,<1.11.0 (#5442) 0c79807c Checklist update (#5438) ebd6b5ba integrate cached_path (#5418) dcd8d9e9 Update mkdocs-material requirement from <7.3.0,>=5.5.0 to >=5.5.0,<7.4.0 (#5419) 362349b5 Registrable _to_params default functionality (#5403) 17ef1aa2 fix a bug when using fp16 training & gradient clipping (#5426) a63e28c2 Update transformers requirement from <4.11,>=4.1 to >=4.1,<4.12 (#5422) 603552fc Add utility function and command to push models to 🤗 Hub (#5370) e5d332a5 Update filelock requirement from <3.1,>=3.0 to >=3.0,<3.2 (#5421) 44155ac6 Make --overrides more flexible (#5399) 43fd9825 Fix PairedPCABiasDirection (#5396) 7785068a Bump black from 21.7b0 to 21.9b0 (#5408) a09d057c Update transformers requirement from <4.10,>=4.1 to >=4.1,<4.11 (#5393) 527e43d9 require Python>=3.7 (#5400) 5338bd8b Add scaling to tqdm bar when downloading files (#5397)

    Source code(tar.gz)
    Source code(zip)
  • v2.7.0(Sep 1, 2021)

    What's new

    Added 🎉

    • Added support to evaluate multiple datasets and produce corresponding output files in the evaluate command.
    • Added more documentation to the learning rate schedulers to include a sample config object for how to use it.
    • Moved the pytorch learning rate schedulers wrappers to their own file called pytorch_lr_schedulers.py so that they will have their own documentation page.
    • Added a module allennlp.nn.parallel with a new base class, DdpAccelerator, which generalizes PyTorch's DistributedDataParallel wrapper to support other implementations. Two implementations of this class are provided. The default is TorchDdpAccelerator (registered at "torch"), which is just a thin wrapper around DistributedDataParallel. The other is FairScaleFsdpAccelerator, which wraps FairScale's FullyShardedDataParallel. You can specify the DdpAccelerator in the "distributed" section of a configuration file under the key "ddp_accelerator".
    • Added a module allennlp.nn.checkpoint with a new base class, CheckpointWrapper, for implementations of activation/gradient checkpointing. Two implementations are provided. The default implementation is TorchCheckpointWrapper (registered as "torch"), which exposes PyTorch's checkpoint functionality. The other is FairScaleCheckpointWrapper which exposes the more flexible checkpointing functionality from FairScale.
    • The Model base class now takes a ddp_accelerator parameter (an instance of DdpAccelerator) which will be available as self.ddp_accelerator during distributed training. This is useful when, for example, instantiating submodules in your model's __init__() method by wrapping them with self.ddp_accelerator.wrap_module(). See allennlp.modules.transformer.t5 for an example; a rough sketch also follows this list.
    • We now log batch metrics to tensorboard and wandb.
    • Added Tango components, to be explored in detail in a later post
    • Added ScaledDotProductMatrixAttention, and converted the transformer toolkit to use it
    • Added tests to ensure that all Attention and MatrixAttention implementations are interchangeable
    • Added a way for AllenNLP Tango to read and write datasets lazily.
    • Added a way to remix datasets flexibly
    • Added from_pretrained_transformer_and_instances constructor to Vocabulary
    • TransformerTextField now supports __len__.
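
    A rough sketch of the self.ddp_accelerator.wrap_module() pattern described above, under the assumption that wrap_module returns the wrapped submodule; the model class, layer sizes, and everything else in this snippet are purely illustrative.

    # Rough sketch only; assumes wrap_module returns the wrapped submodule.
    import torch
    from allennlp.data import Vocabulary
    from allennlp.models import Model


    class SketchModel(Model):
        def __init__(self, vocab: Vocabulary, ddp_accelerator=None) -> None:
            super().__init__(vocab, ddp_accelerator=ddp_accelerator)
            encoder = torch.nn.Linear(1024, 1024)  # stand-in for a large submodule
            # During distributed training the configured DdpAccelerator (e.g. the
            # FairScale FSDP one) wraps the submodule; otherwise use it as-is.
            self.encoder = (
                self.ddp_accelerator.wrap_module(encoder) if self.ddp_accelerator else encoder
            )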

    Fixed ✅

    • Fixed a bug in ConditionalRandomField: transitions and tag_sequence tensors were not initialized on the desired device causing high CPU usage (see https://github.com/allenai/allennlp/issues/2884)
    • Fixed a misspelling: the parameter contructor_extras in Lazy() is now correctly called constructor_extras.
    • Fixed broken links in allennlp.nn.initializers docs.
    • Fixed bug in BeamSearch where last_backpointers was not being passed to any Constraints.
    • TransformerTextField can now take tensors of shape (1, n) like the tensors produced from a HuggingFace tokenizer.
    • tqdm lock is now set inside MultiProcessDataLoading when new workers are spawned to avoid contention when writing output.
    • ConfigurationError is now pickleable.
    • Checkpointer cleaning was fixed to work on Windows Paths
    • Multitask models now support TextFieldTensor in heads, not just in the backbone.
    • Fixed the signature of ScaledDotProductAttention to match the other Attention classes
    • allennlp commands will now catch SIGTERM signals and handle them similar to SIGINT (keyboard interrupt).
    • The MultiProcessDataLoader will properly shutdown its workers when a SIGTERM is received.
    • Fixed the way names are applied to Tango Step instances.
    • Fixed a bug in calculating loss in the distributed setting.
    • Fixed a bug when extending a sparse sequence by 0 items.

    Changed ⚠️

    • The type of the grad_norm parameter of GradientDescentTrainer is now Union[float, bool], with a default value of False. False means gradients are not rescaled and the gradient norm is never even calculated. True means the gradients are still not rescaled but the gradient norm is calculated and passed on to callbacks. A float value means gradients are rescaled.
    • TensorCache now supports more concurrent readers and writers.
    • We no longer log parameter statistics to tensorboard or wandb by default.

    Commits

    48af9d34 Multiple datasets and output files support for the evaluate command (#5340) 60213cd7 Tiny tango tweaks (#5383) 28950215 improve signal handling and worker cleanup (#5378) b41cb3eb Fix distributed loss (#5381) 6355f073 Fix Checkpointer cleaner regex on Windows (#5361) 27da04cf Dataset remix (#5372) 75af38e0 Create Vocabulary from both pretrained transformers and instances (#5368) 5dc80a65 Adds a dataset that can be read and written lazily (#5344) 01e8a35a Improved Documentation For Learning Rate Schedulers (#5365) 8370cfa3 skip loading t5-base in CI (#5371) 13de38d1 Log batch metrics (#5362) 1f5c6e5b Use our own base images to build allennlp Docker images (#5366) bffdbfd1 Bugfix: initializing all tensors and parameters of the ConditionalRandomField model on the proper device (#5335) d45a2dab Make sure that all attention works the same (#5360) c1edaef8 Update google-cloud-storage requirement (#5357) 524244b6 Update wandb requirement from <0.12.0,>=0.10.0 to >=0.10.0,<0.13.0 (#5356) 90bf33b8 small fixes for tango (#5350) 2e11a15e tick version for nightly releases 311f1104 Tango (#5162) 1df2e517 Bump fairscale from 0.3.8 to 0.3.9 (#5337) b72bbfc9 fix constraint bug in beam search, clean up tests (#5328) ec3e2943 Create CITATION.cff (#5336) 8714aa0b This is a desperate attempt to make TensorCache a little more stable (#5334) fd429b2b Update transformers requirement from <4.9,>=4.1 to >=4.1,<4.10 (#5326) 1b5ef3a0 Update spacy requirement from <3.1,>=2.1.0 to >=2.1.0,<3.2 (#5305) 1f20513d TextFieldTensor in multitask models (#5331) 76f2487b set tqdm lock when new workers are spawned (#5330) 67add9d9 Fix ConfigurationError deserialization (#5319) 42d85298 allow TransformerTextField to take input directly from HF tokenizer (#5329) 64043ac6 Bump black from 21.6b0 to 21.7b0 (#5320) 32750550 Update mkdocs-material requirement from <7.2.0,>=5.5.0 to >=5.5.0,<7.3.0 (#5327) 5b1da908 Update links in initializers documentation (#5317) ca656fc6 FairScale integration (#5242)

    Source code(tar.gz)
    Source code(zip)
  • v2.6.0(Jul 19, 2021)

    What's new

    Added 🎉

    • Added on_backward training callback which allows for control over backpropagation and gradient manipulation.
    • Added AdversarialBiasMitigator, a Model wrapper to adversarially mitigate biases in predictions produced by a pretrained model for a downstream task.
    • Added which_loss parameter to ensure_model_can_train_save_and_load in ModelTestCase to specify which loss to test.
    • Added **kwargs to Predictor.from_path(). These key-word argument will be passed on to the Predictor's constructor.
    • The activation layer in the transformer toolkit now can be queried for its output dimension.
    • TransformerEmbeddings now takes, but ignores, a parameter for the attention mask. This is needed for compatibility with some other modules that get called the same way and use the mask.
    • TransformerPooler can now be instantiated from a pretrained transformer module, just like the other modules in the transformer toolkit.
    • TransformerTextField, for cases where you don't care about AllenNLP's advanced text handling capabilities.
    • Added TransformerModule._post_load_pretrained_state_dict_hook() method. Can be used to modify missing_keys and unexpected_keys after loading a pretrained state dictionary. This is useful when tying weights, for example.
    • Added an end-to-end test for the Transformer Toolkit.
    • Added vocab argument to BeamSearch, which is passed to each constraint in constraints (if provided).

    Fixed ✅

    • Fixed missing device mapping in the allennlp.modules.conditional_random_field.py file.
    • Fixed Broken link in allennlp.fairness.fairness_metrics.Separation docs
    • Ensured all allennlp submodules are imported with allennlp.common.plugins.import_plugins().
    • Fixed IndexOutOfBoundsException in MultiOptimizer when checking if optimizer received any parameters.
    • Removed confusing zero mask from VilBERT.
    • Ensured ensure_model_can_train_save_and_load is consistently random.
    • Fixed weight tying logic in T5 transformer module. Previously input/output embeddings were always tied. Now this is optional, and the default behavior is taken from the config.tie_word_embeddings value when instantiating from_pretrained_module().
    • Implemented slightly faster label smoothing.
    • Fixed the docs for PytorchTransformerWrapper
    • Fixed recovering training jobs with models that expect get_metrics() to not be called until they have seen at least one batch.
    • Made the Transformer Toolkit compatible with transformers that don't start their positional embeddings at 0.
    • Weights & Biases training callback ("wandb") now works when resuming training jobs.

    Changed ⚠️

    • Changed behavior of MultiOptimizer so that while a default optimizer is still required, an error is not thrown if the default optimizer receives no parameters.
    • Made the epsilon parameter for the layer normalization in token embeddings configurable.

    Removed 👋

    • Removed TransformerModule._tied_weights. Weights should now just be tied directly in the __init__() method. You can also override TransformerModule._post_load_pretrained_state_dict_hook() to remove keys associated with tied weights from missing_keys after loading a pretrained state dictionary.

    Commits

    ef5400d5 make W&B callback resumable (#5312) 96293407 Update google-cloud-storage requirement (#5309) f8fad9fc Provide vocab as param to constraints (#5321) 56e1f49d Fix training Conditional Random Fields on GPU (#5313) (#5315) 3c1ac032 Update wandb requirement from <0.11.0,>=0.10.0 to >=0.10.0,<0.12.0 (#5316) 7d4a6726 Transformer Toolkit fixes (#5303) aaa816f7 Faster label smoothing (#5294) 436c52d5 Docs update for PytorchTransformerWrapper (#5295) 3d92ac43 Update google-cloud-storage requirement (#5296) 5378533f Fixes recovering when the model expects metrics to be ready (#5293) 7428155a ensure torch always up-to-date in CI (#5286) 3f307ee3 Update README.md (#5288) 672485fb only run CHANGELOG check when source files are modified (#5287) c6865d79 use smaller snapshot for HFHub integration test ad54d48f Bump mypy from 0.812 to 0.910 (#5283) 42d96dfa typo: missing "if" in drop_last doc (#5284) a246e277 TransformerTextField (#5280) 82053a98 Improve weight tying logic in transformer module (#5282) c936da9f Update transformers requirement from <4.8,>=4.1 to >=4.1,<4.9 (#5281) e8f816dd Update google-cloud-storage requirement (#5277) 86504e6b Making model test case consistently random (#5278) 5a7844b5 add kwargs to Predictor.from_path() (#5275) 8ad562e4 Update transformers requirement from <4.7,>=4.1 to >=4.1,<4.8 (#5273) c8b8ed36 Transformer toolkit updates (#5270) 6af9069d update Python environment setup in GitHub Actions (#5272) f1f51fc9 Adversarial bias mitigation (#5269) af101d67 Removes confusing zero mask from VilBERT (#5264) a1d36e67 Update torchvision requirement from <0.10.0,>=0.8.1 to >=0.8.1,<0.11.0 (#5266) e5468d96 Bump black from 21.5b2 to 21.6b0 (#5255) b37686f6 Update torch requirement from <1.9.0,>=1.6.0 to >=1.6.0,<1.10.0 (#5267) 5da5b5ba Upload code coverage reports from different jobs, other CI improvements (#5257) a6cfb122 added on_backward trainer callback (#5249) 8db45e87 Ensure all relevant allennlp submodules are imported with import_plugins() (#5246) 57df0e37 [Docs] Fixes broken link in Fairness_Metrics (#5245) 154f75d7 Bump black from 21.5b1 to 21.5b2 (#5236) 7a5106d5 tick version for nightly release

    Source code(tar.gz)
    Source code(zip)
  • v2.5.0(Jun 3, 2021)

    🆕 AllenNLP v2.5.0 comes with a few big new features and improvements 🆕

    There is a whole new module allennlp.fairness that contains implementations of fairness metrics, bias metrics, and bias mitigation tools for your models thanks to @ArjunSubramonian. For a great introduction, check out the corresponding chapter of the guide: https://guide.allennlp.org/fairness.

    Another major addition is the allennlp.confidence_checks.task_checklists submodule, thanks to @AkshitaB, which provides an automated way to run behavioral tests of your models using the checklist library.

    BeamSearch also has several important new features, including an easy way to add arbitrary constraints, thanks to @danieldeutsch.

    See below for a comprehensive list of updates 👇

    What's new

    Added 🎉

    • Added TaskSuite base class and command line functionality for running checklist test suites, along with implementations for SentimentAnalysisSuite, QuestionAnsweringSuite, and TextualEntailmentSuite. These can be found in the allennlp.confidence_checks.task_checklists module.
    • Added BiasMitigatorApplicator, which wraps any Model and mitigates biases by finetuning on a downstream task.
    • Added allennlp diff command to compute a diff on model checkpoints, analogous to what git diff does on two files.
    • Meta data defined by the class allennlp.common.meta.Meta is now saved in the serialization directory and archive file when training models from the command line. This is also now part of the Archive named tuple that's returned from load_archive().
    • Added nn.util.distributed_device() helper function.
    • Added allennlp.nn.util.load_state_dict helper function.
    • Added a way to avoid downloading and loading pretrained weights in modules that wrap transformers such as the PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder. You can do this by setting the parameter load_weights to False. See PR #5172 for more details.
    • Added SpanExtractorWithSpanWidthEmbedding, putting specific span embedding computations into the _embed_spans method and leaving the common code in SpanExtractorWithSpanWidthEmbedding to unify the arguments, and modified BidirectionalEndpointSpanExtractor, EndpointSpanExtractor and SelfAttentiveSpanExtractor accordingly. Now, SelfAttentiveSpanExtractor can also embed span widths.
    • Added a min_steps parameter to BeamSearch to set a minimum length for the predicted sequences.
    • Added the FinalSequenceScorer abstraction to calculate the final scores of the generated sequences in BeamSearch.
    • Added shuffle argument to BucketBatchSampler which allows for disabling shuffling.
    • Added allennlp.modules.transformer.attention_module which contains a generalized AttentionModule. SelfAttention and T5Attention both inherit from this.
    • Added a Constraint abstract class to BeamSearch, which allows for incorporating constraints on the predictions found by BeamSearch, along with a RepeatedNGramBlockingConstraint constraint implementation, which allows for preventing repeated n-grams in the output from BeamSearch.
    • Added DataCollator for performing dynamic operations on each batch.
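
    The new min_steps option is easiest to see on a toy example. The sketch below is hypothetical: the random step function and the five-token vocabulary are made up, and it only assumes the usual BeamSearch.search(start_predictions, start_state, step) calling convention.

        import torch
        from allennlp.nn.beam_search import BeamSearch

        num_classes, end_index = 5, 4  # toy vocabulary; id 4 plays the role of the end symbol

        def step(last_predictions, state, timestep):
            # A real model would condition on `state` and `last_predictions`;
            # here we just return random log-probabilities of the right shape.
            log_probs = torch.randn(last_predictions.size(0), num_classes).log_softmax(dim=-1)
            return log_probs, state

        beam_search = BeamSearch(
            end_index=end_index,
            max_steps=20,
            beam_size=3,
            min_steps=5,  # the end symbol cannot be predicted before step 5
        )

        start_predictions = torch.zeros(2, dtype=torch.long)  # a batch of two sequences
        top_k_predictions, log_probabilities = beam_search.search(start_predictions, {}, step)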

    Changed ⚠️

    • Use dist_reduce_sum in distributed metrics.
    • Allow Google Cloud Storage paths in cached_path ("gs://...").
    • Renamed nn.util.load_state_dict() to read_state_dict to avoid confusion with torch.nn.Module.load_state_dict().
    • TransformerModule.from_pretrained_module now only accepts a pretrained model ID (e.g. "bert-base-cased") instead of an actual torch.nn.Module. Other parameters to this method have changed as well.
    • Print the first batch to the console by default.
    • Renamed sanity_checks to confidence_checks (sanity_checks is deprecated and will be removed in AllenNLP 3.0).
    • Trainer callbacks can now store and restore state in case a training run gets interrupted.
    • VilBERT backbone now rolls and unrolls extra dimensions to handle input with > 3 dimensions.
    • BeamSearch is now a Registrable class.

    Fixed ✅

    • When PretrainedTransformerIndexer folds long sequences, it no longer loses the information from token type ids.
    • Fixed documentation for GradientDescentTrainer.cuda_device.
    • Re-starting a training run from a checkpoint in the middle of an epoch now works correctly.
    • When using the "moving average" weights smoothing feature of the trainer, training checkpoints would also get smoothed, with strange results for resuming a training job. This has been fixed.
    • When re-starting an interrupted training job, the trainer will now read out the data loader even for epochs and batches that can be skipped. We do this to try to get any random number generators used by the reader or data loader into the same state as they were the first time the training job ran.
    • Fixed the potential for a race condition with cached_path() when extracting archives, though the race condition is still possible when used with force_extract=True.
    • Fixed wandb callback to work in distributed training.
    • Fixed tqdm logging into multiple files with allennlp-optuna.

    Commits

    b92fd9a7 Contextualized bias mitigation (#5176) aa52a9a0 Checklist fixes (#5239) 62067973 Fix tqdm logging into multiple files with allennlp-optuna (#5235) b0aa1d45 Generalize T5 modules (#5166) 5b111d08 tick version for nightly release 39d7e5ae Make BeamSearch Registrable (#5231) c0142320 Add constraints to beam search (#5216) 98dae7f4 Emergency fix. I forgot to take this out. c5bff8ba Fixes Checkpointing (#5220) 3d5799d8 Roll backbone (#5229) babc450d Added DataCollator for dynamic operations for each batch. (#5221) d97ed401 Bump checklist from 0.0.10 to 0.0.11 (#5222) 12155c40 fix race condition when extracting files with cached_path (#5227) d6629772 cancel redundant GH Actions workflows (#5226) 2d8f3904 Fix W&B callback for distributed training (#5223) 59df2ad3 Update nr-interface requirement from <0.0.4 to <0.0.6 (#5213) 3e1b553b Bump black from 20.8b1 to 21.5b1 (#5195) d2840cba save meta data with model archives (#5209) bd941c6f added shuffle disable option in BucketBatchSampler (#5212) 3585c9fe Implementing abstraction to score final sequences in BeamSearch (#5208) 79d16af1 Add a min_steps parameter to BeamSearch (#5207) cf113d70 Changes and improvements to how we initialize transformer modules from pretrained models (#5200) cccb35de Rename sanity_checks to confidence_checks (#5201) db8ff675 Update transformers requirement from <4.6,>=4.1 to >=4.1,<4.7 (#5199) fd5c9e4c Bias Metrics (#5139) d9b19b69 Bias Mitigation and Direction Methods (#5130) 74737373 add diff command (#5109) d85c5c3a Explicitly pass serialization directory and local rank to trainer in train command (#5180) 96c3caf9 fix nltk downloads in install (#5189) b1b455a2 improve contributing guide / PR template (#5185) 7a260da9 fix cuda_device docs (#5188) 0bf590df Update Makefile (#5183) 3335700c Default print first batch (#5175) b533733a Refactor span extractors and unify forward. (#5160) 01b232fb Allow google cloud storage locations for cached_path (#5173) eb2ae30e Update README.md (#5165) 55efa683 fix dataclasses import (#5169) a463e0e7 Add way of skipping pretrained weights download (#5172) c71bb460 improve err msg for PolynomialDecay LR scheduler (#5143) 530dae43 Simplify metrics (#5154) 12f5b0f5 Run some slow tests on the self-hosted runner (#5161) 90915800 Fixes token type ids for folded sequences (#5149) 10400e02 Run checklist suites in AllenNLP (#5065) d11359ed make dist_reduce_sum work for tensors (#5147) 9184fbcb Fixes Backbone / Model MRO inconsistency (#5148)

    Source code(tar.gz)
    Source code(zip)
  • v2.4.0(Apr 23, 2021)

    What's new

    Added 🎉

    • Added a T5 implementation to modules.transformer.

    Changed ⚠️

    • Weights & Biases callback can now work in anonymous mode (i.e. without the WANDB_API_KEY environment variable).

    Fixed ✅

    • The GradientDescentTrainer no longer leaves stray model checkpoints around when it runs out of patience.
    • Fixed cached_path() for "hf://" files.

    Commits

    7c5cc98a Don't orphan checkpoints when we run out of patience (#5142) 6ec64596 allow W&B anon mode (#5110) 4e862a54 T5 (#4969) 7fc5a91f fix cached_path for hub downloads (#5141) f877fdc3 Fairness Metrics (#5093)

    Source code(tar.gz)
    Source code(zip)
  • v2.3.1(Apr 20, 2021)

    What's new

    Added 🎉

    • Added support for the HuggingFace Hub as an alternative way to handle loading files through cached_path(). Hub downloads should be made through the hf:// URL scheme (see the sketch after this list).
    • Add new dimension to the interpret module: influence functions via the InfluenceInterpreter base class, along with a concrete implementation: SimpleInfluence.
    • Added a quiet parameter to the MultiProcessDataLoader that disables tqdm progress bars.
    • The test for distributed metrics now takes a parameter specifying how often you want to run it.
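
    As a rough illustration of the new hf:// scheme, the snippet below resolves a Hub-hosted file to a local cached copy. The repository and filename are placeholders, not real artifacts.

        from allennlp.common.file_utils import cached_path

        # Download (and cache) a file hosted on the HuggingFace Hub.
        # "someuser/some-model" and "config.json" are hypothetical placeholders.
        local_path = cached_path("hf://someuser/some-model/config.json")
        print(local_path)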

    Changed ⚠️

    • Updated CONTRIBUTING.md to remind readers to upgrade pip and setuptools to avoid spaCy installation issues.

    Fixed ✅

    • Fixed a bug with the ShardedDatasetReader when used with multi-process data loading (https://github.com/allenai/allennlp/issues/5132).

    Commits

    a84b9b1a Add cached_path support for HF hub (#5052) 24ec7db4 fix #5132 (#5134) 2526674f Update CONTRIBUTING.md (#5133) c2ffb101 Add influence functions to interpret module (#4988) 0c7d60bc Take the number of runs in the test for distributed metrics (#5127) 8be3828f fix docs CI

    Source code(tar.gz)
    Source code(zip)
  • v2.3.0(Apr 14, 2021)

    What's new

    Added 🎉

    • Ported the following Huggingface LambdaLR-based schedulers: ConstantLearningRateScheduler, ConstantWithWarmupLearningRateScheduler, CosineWithWarmupLearningRateScheduler, CosineHardRestartsWithWarmupLearningRateScheduler.
    • Added a new sub_token_mode parameter to the pretrained_transformer_mismatched_embedder class to support first sub-token embedding.
    • Added a way to run a multi task model with a dataset reader as part of allennlp predict.
    • Added a new eval_mode parameter to PretrainedTransformerEmbedder. If it is set to True, the transformer is always run in evaluation mode, which, e.g., disables dropout and does not update batch normalization statistics (see the sketch after this list).
    • Added additional parameters to the W&B callback: entity, group, name, notes, and wandb_kwargs.
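
    A minimal sketch of the new eval_mode parameter (the model name is just an example):

        from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

        # With eval_mode=True the wrapped transformer stays in evaluation mode even
        # while the surrounding model trains: dropout is disabled and batch-norm
        # statistics are frozen.
        embedder = PretrainedTransformerEmbedder("bert-base-uncased", eval_mode=True)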

    Changed ⚠️

    • Sanity checks in the GradientDescentTrainer can now be turned off by setting the run_sanity_checks parameter to False.
    • Allow the order of examples in the task cards to be specified explicitly.
    • The histogram_interval parameter is now deprecated in TensorboardWriter; please use distribution_interval instead.
    • Memory usage is no longer logged to TensorBoard during training; use ConsoleLoggerCallback instead.
    • If you use the min_count parameter of the Vocabulary, but you specify a namespace that does not exist, the vocabulary creation will raise a ConfigurationError.
    • Documentation updates made to SoftmaxLoss regarding padding and the expected shapes of the input and output tensors of forward.
    • Moved the data preparation script for coref into allennlp-models.
    • If a transformer is not in cache but has override weights, the transformer's pretrained weights are no longer downloaded, that is, only its config.json file is downloaded.
    • SanityChecksCallback now raises SanityCheckError instead of AssertionError when a check fails.
    • jsonpickle removed from dependencies.
    • Improved the error message from Registrable.by_name() when the name passed does not match any registered subclasses. The error message will include a suggestion if there is a close match between the name passed and a registered name.

    Fixed ✅

    • Fixed a bug where some Activation implementations could not be pickled due to involving a lambda function.
    • Fixed __str__() method on ModelCardInfo class.
    • Fixed a stall when using distributed training and gradient accumulation at the same time
    • Fixed an issue where using the from_pretrained_transformer Vocabulary constructor in distributed training via the allennlp train command would result in the data being iterated through unnecessarily.
    • Fixed a bug regarding token indexers with the InterleavingDatasetReader when used with multi-process data loading.
    • Fixed a warning from transformers when using max_length in the PretrainedTransformerTokenizer.

    Removed 👋

    • Removed the stride parameter to PretrainedTransformerTokenizer. This parameter had no effect.

    Commits

    c80e1751 improve error message from Registrable class (#5125) aca16237 Update docstring for basic_classifier (#5124) 059a64fc remove jsonpickle from dependencies (#5121) 5fdce9ad fix bug with interleaving dataset reader (#5122) 6e1f34cb Predicting with a dataset reader on a multitask model (#5115) b34df73e specify 'truncation' to avoid transformers warning (#5120) 0ddd3d35 Add eval_mode argument to pretrained transformer embedder (#5111) 99415e36 additional W&B params (#5114) 6ee12123 Adding a metadata field to the basic classifier (#5104) 2e8c3e2f Add link to gallery and demo in README (#5103) de611008 Distributed training with gradient accumulation (#5100) fe2d6e5a vocab fix (#5099) d906175d Update transformers requirement from <4.5,>=4.1 to >=4.1,<4.6 (#5102) 99da3156 fix str method of ModelCardInfo (#5096) 29f00ee2 Added new parameter 'sub_token_mode' to 'pretrained_transformer_mismatched_embedder' class to support first sub-token embedding (#4363) (#5087) 6021f7d4 Avoid from_pretrained download of model weights (#5085) c3fb97eb add SanityCheckError class (#5092) decb875b Bring back run_sanity_checks parameter (#5091) 913fb8a4 Update mkdocs-material requirement from <7.1.0,>=5.5.0 to >=5.5.0,<7.2.0 (#5074) f82d3f11 remove lambdas from activations (#5083) bb703494 Replace master references with main in issue template (#5084) 87504c42 Ported Huggingface LambdaLR-based schedulers (#5082) 63a3b489 set transformer to evaluation mode (#5073) 542ce5d9 Move coref prep script (#5078) bf8e71e9 compare namespace in counter and min_count (#3644) 4baf19ab Arjuns/softmax loss documentation update (#5075) 59b92106 Allow example categories to be ordered (#5059) 3daa0baf tick version for nightly bb77bd10 fix date in CHANGELOG

    Source code(tar.gz)
    Source code(zip)
  • v2.2.0(Mar 26, 2021)

    What's new

    Added 🎉

    • Added WandBCallback class for Weights & Biases integration, registered as a callback under the name "wandb".
    • Added TensorBoardCallback to replace the TensorBoardWriter. Registered as a callback under the name "tensorboard".
    • Added NormalizationBiasVerification and SanityChecksCallback for model sanity checks.
    • SanityChecksCallback runs by default from the allennlp train command. It can be turned off by setting trainer.enable_default_callbacks to false in your config.
    • Added new method on Field class: .human_readable_repr() -> Any, and new method on Instance class: .human_readable_dict() -> JsonDict (@leo-liuzy).
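
    A small, hedged sketch of the new human-readable helpers; the exact output shown in the comments is indicative, not guaranteed.

        from allennlp.data import Instance
        from allennlp.data.fields import LabelField

        # A tiny instance with a single label field.
        instance = Instance({"label": LabelField("positive")})
        print(instance.human_readable_dict())           # e.g. {"label": "positive"}
        print(instance["label"].human_readable_repr())  # e.g. "positive"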

    Removed 👋

    • Removed TensorBoardWriter. Please use the TensorBoardCallback instead.

    Changed ⚠️

    • Use attributes of ModelOutputs object in PretrainedTransformerEmbedder instead of indexing (@JohnGiorgi).
    • Added support for PyTorch version 1.8 and torchvision version 0.9 (@nelson-liu).
    • Model.get_parameters_for_histogram_tensorboard_logging is deprecated in favor of Model.get_parameters_for_histogram_logging.

    Fixed ✅

    • Makes sure tensors that are stored in TensorCache always live on CPUs.
    • Fixed a bug where FromParams objects wrapped in Lazy() couldn't be pickled.
    • Fixed a bug where the ROUGE metric couldn't be pickled.
    • Fixed a bug reported in https://github.com/allenai/allennlp/issues/5036: we now keep our spaCy POS tagger on (@leo-liuzy).

    Commits

    c5c9df58 refactor LogWriter, add W&B integration (#5061) 385124ad Keep Spacy PoS tagger on by default (#5066) 15b532fb Update transformers requirement from <4.4,>=4.1 to >=4.1,<4.5 (#5057) 3aafb927 clarify how predictions_to_labeled_instances work for targeted or non-targeted hotflip attack (#4957) b897e57c ensure ROUGE metric can be pickled (#5051) 91e4af94 fix pickle bug for Lazy FromParams (#5049) 5b57be29 Adding normalization bias verification (#4990) ce71901a Update torchvision requirement from <0.9.0,>=0.8.1 to >=0.8.1,<0.10.0 (#5041) 7f609901 Update torch requirement from <1.8.0,>=1.6.0 to >=1.6.0,<1.9.0 (#5037) 96415b2b Use HF Transformers output types (#5035) 0c36019c clean up (#5034) d2bf35d1 Add methods for human readable representation of fields and instances (#4986) a8b80069 Makes sure serialized tensors live on CPUs (#5026) a0edfae9 Add options to log inputs in trainer (#4970)


    Thanks to @nelson-liu for making sure we stay on top of releases! 😜

    Source code(tar.gz)
    Source code(zip)
  • v1.5.0(Mar 1, 2021)

    What's new

    Added 🎉

    • Added a way to specify extra parameters to the predictor in an allennlp predict call.
    • Added a way to initialize a Vocabulary from transformers models.
    • Support spaCy v3

    Changed ⚠️

    • Updated Paper and Dataset classes in ModelCard.

    Commits

    55ac96a0 re-write docs commit history on releases (#4968) c61178fa Update spaCy to 3.0 (#4953) be595dfd Ensure mean absolute error metric returns a float (#4983) 25562234 raise on HTTP errors in cached_path (#4984) e1839cfe Inputs to the FBetaMultiLabel metric were copied and pasted wrong (#4975) b5b72a06 Add method to vocab to instantiate from a pretrained transformer (#4958) 025a0b28 Allows specifying extra arguments for predictors (#4947) 24c9c995 adding ModelUsage, rearranging fields (#4952)

    Source code(tar.gz)
    Source code(zip)
  • v2.1.0(Feb 24, 2021)

    What's new

    Changed ⚠️

    • The coding_scheme parameter is now deprecated in Conll2003DatasetReader; please use convert_to_coding_scheme instead.
    • Support spaCy v3

    Added 🎉

    • Added ModelUsage to ModelCard class.
    • Added a way to specify extra parameters to the predictor in an allennlp predict call.
    • Added a way to initialize a Vocabulary from transformers models (see the sketch after this list).
    • Added the ability to use Predictors with multitask models through the new MultiTaskPredictor.
    • Added an example for fields of type ListField[TextField] to apply_token_indexers API docs.
    • Added text_key and label_key parameters to TextClassificationJsonReader class.
    • Added MultiOptimizer, which allows you to use different optimizers for different parts of your model.
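
    A minimal sketch of building a Vocabulary from a pretrained transformer; the model name is an example and the "tokens" namespace is assumed to be the default.

        from allennlp.data import Vocabulary

        # The resulting vocabulary mirrors the transformer tokenizer's ids.
        vocab = Vocabulary.from_pretrained_transformer("bert-base-uncased")
        print(vocab.get_vocab_size("tokens"))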

    Fixed ✅

    • @Registrable.register(...) decorator no longer masks the decorated class's annotations
    • Ensured that MeanAbsoluteError always returns a float metric value instead of a Tensor.
    • Learning rate schedulers that rely on metrics from the validation set were broken in v2.0.0. This brings that functionality back.
    • Fixed a bug where the MultiProcessDataLoader would crash when num_workers > 0, start_method = "spawn", max_instances_in_memory not None, and batches_per_epoch not None.
    • Fixed documentation and validation checks for FBetaMultiLabelMetric.
    • Fixed handling of HTTP errors when fetching remote resources with cached_path(). Previously the content would be cached even when certain errors - like 404s - occurred. Now an HTTPError will be raised whenever the HTTP response is not OK.
    • Fixed a bug where the MultiTaskDataLoader would crash when num_workers > 0
    • Fixed an import error that happens when PyTorch's distributed framework is unavailable on the system.

    Commits

    7c6adeff Fix worker_info bug when num_workers > 0 (#5013) 9d88f8c5 Fixes predictors in the multitask case (#4991) 678518a0 Less opaque registrable annotations (#5010) 4b5fad46 Regex optimizer (#4981) f091cb9c fix error when torch.distributed not available (#5011) 5974f54e Revert "drop support for Python 3.6 (#5012)" (#5016) bdb0e20a Update mkdocs-material requirement from <6.3.0,>=5.5.0 to >=5.5.0,<7.1.0 (#5015) d535de67 Bump mypy from 0.800 to 0.812 (#5007) 099786cf Update responses requirement, remove pin on urllib3 (#4783) b8cfb95c re-write docs commit history on releases (#4968) c5c9edf0 Add text_key and label_key to TextClassificationJsonReader (#5005) a02f67da drop support for Python 3.6 (#5012) 0078c595 Update spaCy to 3.0 (#4953) be9537f6 Update CHANGELOG.md 828ee101 Update CHANGELOG.md 1cff6ad9 update README (#4993) f8b38075 Add ListField example to apply token indexers (#4987) 7961b8b7 Ensure mean absolute error metric returns a float (#4983) da4dba15 raise on HTTP errors in cached_path (#4984) d4926f5e Inputs to the FBetaMultiLabel metric were copied and pasted wrong (#4975) d2ae540d Update transformers requirement from <4.3,>=4.1 to >=4.1,<4.4 (#4967) bf8eeafe Add method to vocab to instantiate from a pretrained transformer (#4958) 9267ce7c Resize transformers word embeddings layer for additional_special_tokens (#4946) 52c23dd2 Introduce convert_to_coding_scheme and make coding_scheme deprecated in CoNLL2003DatasetReader (#4960) c418f84b Fixes recording validation metrics for learning rate schedulers that rely on it (#4959) 4535f5c8 adding ModelUsage, rearranging fields (#4952) 1ace4bbb fix bug with MultiProcessDataLoader (#4956) 6f222919 Allows specifying extra arguments for predictors (#4947) 2731db12 tick version for nightly release

    Source code(tar.gz)
    Source code(zip)
  • v2.0.1(Jan 29, 2021)

    What's new

    A couple of minor fixes and additions since the 2.0 release.

    Added 🎉

    • Added tokenizer_kwargs and transformer_kwargs arguments to PretrainedTransformerBackbone

    Changed ⚠️

    • GradientDescentTrainer now creates the serialization_dir when it's instantiated, if it doesn't already exist.

    Fixed ✅

    • common.util.sanitize now handles sets.

    Commits

    caa497f3 Update GradientDescentTrainer to automatically create directory for serialization_dir (#4940) cd96d953 Sanitize set (#4945) f0ae9f3c Adding tokenizer_kwargs argument to PretrainedTransformerBackbone constructor. (#4944) 501b0ab4 Fixing papers and datasets (#4919) fa625ec0 Adding missing transformer_kwargs arg that was recently added to PretrainedTransformerEmbedder (#4941) 96ea4839 Add missing "Unreleased" section to CHANGELOG

    Source code(tar.gz)
    Source code(zip)
  • v1.4.1(Jan 29, 2021)

    What's new

    Note: This release is mainly for the AllenNLP demo.

    Changed ⚠️

    • Updated Paper and Dataset classes in ModelCard.

    Commits

    14b717c8 Update GradientDescentTrainer to automatically create directory for serialization_dir (#4940) e262352f Fixing papers and datasets (#4919)

    Source code(tar.gz)
    Source code(zip)
  • v2.0.0(Jan 27, 2021)

    AllenNLP v2.0.0 Release Notes

    The 2.0 release of AllenNLP represents a major engineering effort that brings several exciting new features to the library, as well as a focus on performance.

    If you're upgrading from AllenNLP 1.x, we encourage you to read our comprehensive upgrade guide.

    Main new features

    AllenNLP gets eyes 👀

    One of the most exciting areas in ML research is multimodal learning, and AllenNLP is now taking its first steps in this direction with support for 2 tasks and 3 datasets in the vision + text domain. Check out our ViLBERT for VQA and Visual Entailment models, along with the VQAv2, Visual Entailment, and GQA dataset readers in allennlp-models.

    Transformer toolkit

    The transformer toolkit offers a collection of modules for experimenting with various transformer architectures, such as SelfAttention, TransformerEmbeddings, TransformerLayer, etc. It also simplifies taking apart the pretrained weights of an existing transformer and recombining them in different ways. For instance, one can pull out the first 8 layers of bert-base-uncased to separately encode two text inputs, combine the representations in some way, and then use the last 4 layers on the combined representation (more examples can be found in allennlp.modules.transformer).

    The toolkit also contains modules for bimodal architectures such as ViLBERT. Modules include BiModalEncoder, which encodes two modalities separately, and performs bi-directional attention (BiModalAttention) using a connection layer (BiModalConnectionLayer). The VisionTextModel class is an example of a model that uses these bimodal layers.

    Multi-task learning

    2.0 adds support for multi-task learning throughout the AllenNLP system. In multi-task learning, the model consists of a backbone that is shared across tasks, and tends to be the larger part of the model, plus multiple task-specific heads that use the output of the backbone to make predictions for their task. This way, the backbone sees many more training examples than you might have available for a single task, and can thus produce better representations, which benefits all tasks. The canonical example is BERT, where the backbone is the transformer stack and there are multiple model heads that do classification, tagging, masked-token prediction, etc. AllenNLP 2.0 helps you build such models by giving you these abstractions: the MultiTaskDatasetReader can read datasets for multiple tasks at once, the MultiTaskDataLoader loads the instances from the reader and makes batches, and the trainer feeds these batches to a MultiTaskModel, which consists of a Backbone and multiple Heads. If you want to look at the details of how this works, we have an example config available at https://github.com/allenai/allennlp-models/blob/main/training_config/vision/vilbert_multitask.jsonnet.
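
    To make the moving pieces concrete, here is a schematic of how such an experiment config fits together, written as a Python dict rather than Jsonnet. The reader, backbone, and head type names are hypothetical placeholders, and the "homogeneous_roundrobin" scheduler name is an assumption; consult the linked vilbert_multitask.jsonnet for a real, working example.

        # Skeleton of a multi-task experiment config (placeholder component names).
        multitask_config = {
            "dataset_reader": {
                "type": "multitask",
                "readers": {
                    "task_a": {"type": "task_a_reader"},  # hypothetical reader
                    "task_b": {"type": "task_b_reader"},  # hypothetical reader
                },
            },
            "train_data_path": {
                "task_a": "/path/to/task_a/train.json",   # placeholder paths
                "task_b": "/path/to/task_b/train.json",
            },
            "data_loader": {
                "type": "multitask",
                "scheduler": {"type": "homogeneous_roundrobin", "batch_size": 16},
            },
            "model": {
                "type": "multitask",
                "backbone": {"type": "shared_backbone"},  # hypothetical backbone
                "heads": {
                    "task_a": {"type": "task_a_head"},    # hypothetical head
                    "task_b": {"type": "task_b_head"},    # hypothetical head
                },
            },
            "trainer": {"optimizer": {"type": "adam", "lr": 1e-5}, "num_epochs": 3},
        }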

    Changes since v2.0.0rc1

    Added 🎉

    • The TrainerCallback constructor accepts serialization_dir provided by Trainer. This can be useful for logger callbacks that need to store files in the run directory.
    • TrainerCallback.on_start() is fired at the start of training.
    • The TrainerCallback event methods now accept **kwargs. This makes it easier to maintain backwards compatibility of callbacks in the future; e.g., we may decide to pass the exception/traceback object to on_end() in case of failure, and older callbacks can simply ignore the argument instead of raising a TypeError.
    • Added a TensorBoardCallback which wraps the TensorBoardWriter.

    Changed ⚠️

    • TrainerCallback.on_epoch() does not fire with epoch=-1 at the start of training. Instead, TrainerCallback.on_start() should be used for these cases.
    • TensorBoardBatchMemoryUsage is converted from BatchCallback into TrainerCallback.
    • TrackEpochCallback is converted from EpochCallback into TrainerCallback.
    • Trainer can accept callbacks simply with name callbacks instead of trainer_callbacks.
    • TensorboardWriter renamed to TensorBoardWriter, and removed as an argument to the GradientDescentTrainer. In order to enable TensorBoard logging during training, you should utilize the TensorBoardCallback instead.

    Removed 👋

    • Removed EpochCallback, BatchCallback in favour of TrainerCallback. The metaclass-wrapping implementation is removed as well.
    • Removed the tensorboard_writer parameter to GradientDescentTrainer. You should use the TensorBoardCallback now instead.

    Fixed ✅

    • Now Trainer always fires TrainerCallback.on_end() so all the resources can be cleaned up properly.
    • Fixed the misspelling, changed TensoboardBatchMemoryUsage to TensorBoardBatchMemoryUsage.
    • We now set a value for epoch so that the variable is bound when TrainerCallback.on_end() fires. Previously this could have led to an error when trying to recover a run after it had finished training.

    Commits since v2.0.0rc1

    15300823 Log to TensorBoard through a TrainerCallback in GradientDescentTrainer (#4913) 8b95316b ci quick fix fa1dc7b8 Add link to upgrade guide to README (#4934) 7364da03 Fix parameter name in the documentation 00e3ff27 tick version for nightly release 67fa291c Merging vision into main (#4800) 65e50b30 Bump mypy from 0.790 to 0.800 (#4927) a7445357 fix mkdocs config (#4923) ed322eba A helper for distributed reductions (#4920) 9ab2bf03 add CUDA 10.1 Docker image (#4921) d82287e5 Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872) 4183a49c Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880) 54e85eee disable codecov annotations (#4902) 2623c4bf Making TrackEpochCallback an EpochCallback (#4893) 1d21c759 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867) ec197c3b Create pull_request_template.md (#4891) 9cf41b2f fix navbar link 9635af82 rename 'master' -> 'main' (#4887) d0a07fb3 docs: fix simple typo, multplication -> multiplication (#4883) d1f032d8 Moving modelcard and taskcard abstractions to main repo (#4881) 1fff7cae Update docker torch version (#4873) d2aea979 Fix typo in str (#4874) 6a8d425f add CombinedLearningRateScheduler (#4871) a3732d00 Fix cache volume (#4869) 832901e8 Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)

    Source code(tar.gz)
    Source code(zip)
  • v1.4.0(Jan 27, 2021)

    What's new

    Added 🎉

    • Added a FileLock class to common.file_utils. This is just like the FileLock from the filelock library, except that it adds an optional flag read_only_ok: bool, which, when set to True, changes the behavior so that a warning is emitted instead of an exception when write permissions are lacking on an existing file lock. This makes it possible to use the FileLock class on a read-only file system (see the sketch after this list).
    • Added a new learning rate scheduler: CombinedLearningRateScheduler. This can be used to combine different LR schedulers, using one after the other.
    • Added an official CUDA 10.1 Docker image.
    • Moving ModelCard and TaskCard abstractions into the main repository.
    • Added a util function allennlp.nn.util.dist_reduce(...) for handling distributed reductions. This is especially useful when implementing a distributed Metric.
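
    A minimal sketch of the read_only_ok flag (the lock path is a placeholder):

        from allennlp.common.file_utils import FileLock

        # On a read-only file system, acquiring this lock emits a warning instead
        # of raising, because read_only_ok=True.
        with FileLock("/read-only/mount/some_resource.lock", read_only_ok=True):
            pass  # read the protected resource here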

    Changed ⚠️

    • 'master' branch renamed to 'main'
    • Torch version bumped to 1.7.1 in Docker images.

    Fixed ✅

    • Fixed typo with LabelField string representation: removed trailing apostrophe.
    • Vocabulary.from_files and cached_path will issue a warning, instead of failing, when a lock on an existing resource can't be acquired because the file system is read-only.
    • TrackEpochCallback is now an EpochCallback.

    Commits

    4de78ac0 Make CI run properly on the 1.x branch 65e50b30 Bump mypy from 0.790 to 0.800 (#4927) a7445357 fix mkdocs config (#4923) ed322eba A helper for distributed reductions (#4920) 9ab2bf03 add CUDA 10.1 Docker image (#4921) d82287e5 Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872) 4183a49c Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880) 54e85eee disable codecov annotations (#4902) 2623c4bf Making TrackEpochCallback an EpochCallback (#4893) 1d21c759 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867) ec197c3b Create pull_request_template.md (#4891) 9cf41b2f fix navbar link 9635af82 rename 'master' -> 'main' (#4887) d0a07fb3 docs: fix simple typo, multplication -> multiplication (#4883) d1f032d8 Moving modelcard and taskcard abstractions to main repo (#4881) 1fff7cae Update docker torch version (#4873) d2aea979 Fix typo in str (#4874) 6a8d425f add CombinedLearningRateScheduler (#4871) a3732d00 Fix cache volume (#4869) 832901e8 Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854)

    Source code(tar.gz)
    Source code(zip)
  • v2.0.0rc1(Jan 22, 2021)

    This is the first (and hopefully only) release candidate for AllenNLP 2.0. Please note that this is a release candidate, and the APIs are still subject to change until the final 2.0 release. We'll provide a detailed writeup with the final 2.0 release, including a migration guide. In the meantime, here are the headline features of AllenNLP 2.0:

    • Support for models that combine language and vision features
    • Transformer Toolkit, a suite of classes and components that make it easy to experiment with transformer architectures
    • A framework for multitask training
    • Revamped data loading, for improved performance and flexibility

    What's new

    Added 🎉

    • Added TensorCache class for caching tensors on disk
    • Added abstraction and concrete implementation for image loading
    • Added abstraction and concrete implementation for GridEmbedder
    • Added abstraction and demo implementation for an image augmentation module.
    • Added abstraction and concrete implementation for region detectors.
    • A new high-performance default DataLoader: MultiProcessDataLoader.
    • A MultiTaskModel and abstractions to use with it, including Backbone and Head. The MultiTaskModel first runs its inputs through the Backbone, then passes the result (and whatever other relevant inputs it got) to each Head that's in use.
    • A MultiTaskDataLoader, with a corresponding MultiTaskDatasetReader, and a couple of new configuration objects: MultiTaskEpochSampler (for deciding what proportion to sample from each dataset at every epoch) and a MultiTaskScheduler (for ordering the instances within an epoch).
    • Transformer toolkit to plug and play with modular components of transformer architectures.
    • Added a command to count the number of instances we're going to be training with
    • Added a FileLock class to common.file_utils. This is just like the FileLock from the filelock library, except that it adds an optional flag read_only_ok: bool, which when set to True changes the behavior so that a warning will be emitted instead of an exception when lacking write permissions on an existing file lock. This makes it possible to use the FileLock class on a read-only file system.
    • Added a new learning rate scheduler: CombinedLearningRateScheduler. This can be used to combine different LR schedulers, using one after the other.
    • Added an official CUDA 10.1 Docker image.
    • Moving ModelCard and TaskCard abstractions into the main repository.
    • Added a util function allennlp.nn.util.dist_reduce(...) for handling distributed reductions. This is especially useful when implementing a distributed Metric.

    Changed ⚠️

    • DatasetReaders are now always lazy. This means there is no lazy parameter in the base class, and the _read() method should always be a generator (a minimal sketch follows this list).
    • The DataLoader now decides whether to load instances lazily or not. With the PyTorchDataLoader this is controlled with the lazy parameter, but with the MultiProcessDataLoader this is controlled by the max_instances_in_memory setting.
    • ArrayField is now called TensorField, and implemented in terms of torch tensors, not numpy.
    • Improved nn.util.move_to_device function by avoiding an unnecessary recursive check for tensors and adding a non_blocking optional argument, which is the same argument as in torch.Tensor.to().
    • If you are trying to create a heterogeneous batch, you now get a better error message.
    • Readers using the new vision features now explicitly log how they are featurizing images.
    • master_addr and master_port renamed to primary_addr and primary_port, respectively.
    • is_master parameter for training callbacks renamed to is_primary.
    • master branch renamed to main
    • Torch version bumped to 1.7.1 in Docker images.
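
    Since _read() must now be a generator, a minimal reader looks roughly like the sketch below. The registered name and the one-sentence-per-line format are made up for illustration.

        from typing import Iterable

        from allennlp.data import DatasetReader, Instance
        from allennlp.data.fields import TextField
        from allennlp.data.token_indexers import SingleIdTokenIndexer
        from allennlp.data.tokenizers import WhitespaceTokenizer


        @DatasetReader.register("one_sentence_per_line")  # hypothetical name
        class OneSentencePerLineReader(DatasetReader):
            def __init__(self, **kwargs) -> None:
                super().__init__(**kwargs)
                self.tokenizer = WhitespaceTokenizer()
                self.token_indexers = {"tokens": SingleIdTokenIndexer()}

            def _read(self, file_path: str) -> Iterable[Instance]:
                # _read is a generator: instances are yielded lazily, one per line,
                # and the DataLoader decides how many to keep in memory.
                with open(file_path) as data_file:
                    for line in data_file:
                        tokens = self.tokenizer.tokenize(line.strip())
                        yield Instance({"text": TextField(tokens, self.token_indexers)})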

    Removed 👋

    • Removed nn.util.has_tensor.

    Fixed ✅

    • The build-vocab command no longer crashes when the resulting vocab file is in the current working directory.
    • Fixed typo with LabelField string representation: removed trailing apostrophe.
    • Vocabulary.from_files and cached_path will issue a warning, instead of failing, when a lock on an existing resource can't be acquired because the file system is read-only.
    • TrackEpochCallback is now an EpochCallback.

    Commits

    9a4a424d Moves vision models to allennlp-models (#4918) 412896bc fix merge conflicts ed322eba A helper for distributed reductions (#4920) 9ab2bf03 add CUDA 10.1 Docker image (#4921) d82287e5 Update transformers requirement from <4.1,>=4.0 to >=4.0,<4.2 (#4872) 54973947 Multitask example (#4898) 0f00d4d4 resolve _read type (#4916) 5229da83 Toolkit decoder (#4914) 4183a49c Update mkdocs-material requirement from <6.2.0,>=5.5.0 to >=5.5.0,<6.3.0 (#4880) d7c9eab3 improve worker error handling in MultiProcessDataLoader (#4912) 94dd9cc7 rename 'master' -> 'primary' for distributed training (#4910) c9585afd fix imports in file_utils 03c7ffb5 Merge branch 'main' into vision effcc4e5 improve data loading docs (#4909) 2f545701 remove PyTorchDataLoader, add SimpleDataLoader for testing (#4907) 31ec6a59 MultiProcessDataLoader takes PathLike data_path (#4908) 5e3757b4 rename 'multi_process_*' -> 'multiprocess' for consistency (#4906) df36636e Data loading cuda device (#4879) aedd3be1 Toolkit: Cleaning up TransformerEmbeddings (#4900) 54e85eee disable codecov annotations (#4902) 2623c4bf Making TrackEpochCallback an EpochCallback (#4893) 1d21c759 issue warning instead of failing when lock can't be acquired on a resource that exists in a read-only file system (#4867) ec197c3b Create pull_request_template.md (#4891) 15d32da1 Make GQA work (#4884) fbab0bd9 import MultiTaskDataLoader to data_loaders/init.py (#4885) d1cc1469 Merge branch 'main' into vision abacc01b Adding f1 score (#4890) 9cf41b2f fix navbar link 9635af82 rename 'master' -> 'main' (#4887) d0a07fb3 docs: fix simple typo, multplication -> multiplication (#4883) d1f032d8 Moving modelcard and taskcard abstractions to main repo (#4881) f62b819f Make images easier to find for Visual Entailment (#4878) 1fff7cae Update docker torch version (#4873) 7a7c7ea8 Only cache, no featurizing (#4870) d2aea979 Fix typo in str (#4874) 1c72a302 Merge branch 'master' into vision 6a8d425f add CombinedLearningRateScheduler (#4871) 85d38ff6 doc fixes c4e3f77f Switch to torchvision for vision components 👀, simplify and improve MultiProcessDataLoader (#4821) 3da8e622 Merge branch 'master' into vision a3732d00 Fix cache volume (#4869) 832901e8 Turn superfluous warning to info when extending the vocab in the embedding matrix (#4854) 147fefe6 Merge branch 'master' into vision 87e35360 Make tests work again (#4865) d16a5c78 Merge remote-tracking branch 'origin/master' into vision 457e56ef Merge branch 'master' into vision c8521d80 Toolkit: Adding documentation and small changes for BiModalAttention (#4859) ddbc7404 gqa reader fixes during vilbert training (#4851) 50e50df6 Generalizing transformer layers (#4776) 52fdd755 adding multilabel option (#4843) 78871195 Other VQA datasets (#4834) e729e9a4 Added GQA reader (#4832) 52e9dd92 Visual entailment model code (#4822) 01f3a2db Merge remote-tracking branch 'origin/master' into vision 3be6c975 SNLI_VE dataset reader (#4799) b659e665 VQAv2 (#4639) c787230c Merge remote-tracking branch 'origin/master' into vision db2d1d38 Merge branch 'master' into vision 6bf19246 Merge branch 'master' into vision 167bcaae remove vision push trigger 75914650 Merge remote-tracking branch 'origin/master' into vision 22d4633c improve independence of vision components (#4793) 98018cca fix merge conflicts c7803150 fix merge conflicts 5d22ce69 Merge remote-tracking branch 'origin/master' into vision 602399c0 update with master ffafaf64 Multitask data loading and scheduling (#4625) 7c47c3a5 Merge branch 'master' into vision 12c8d1bf Generalizing 
self attention (#4756) 63f61f0c Merge remote-tracking branch 'origin/master' into vision b48347be Merge remote-tracking branch 'origin/master' into vision 81892db4 fix failing tests 98edd253 update torch requirement 8da35081 update with master cc53afec separating TransformerPooler as a new module (#4730) 4ccfa885 Transformer toolkit: BiModalEncoder now has separate num_attention_heads for both modalities (#4728) 91631ef9 Transformer toolkit (#4577) 677a9cec Merge remote-tracking branch 'origin/master' into vision 2985236f This should have been part of the previously merged PR c5d264ae Detectron NLVR2 (#4481) e39a5f62 Merge remote-tracking branch 'origin/master' into vision f1e46fdc Add MultiTaskModel (#4601) fa22f731 Merge remote-tracking branch 'origin/master' into vision 41872ae4 Merge remote-tracking branch 'origin/master' into vision f886fd06 Merge remote-tracking branch 'origin/master' into vision 191b641e make existing readers work with multi-process loading (#4597) d7124d4b fix len calculation for new data loader (#4618) 87463612 Merge branch 'master' into vision 319794a1 remove duplicate padding calculations in collate fn (#4617) de9165e1 rename 'node_rank' to 'global_rank' in dataset reader 'DistributedInfo' (#4608) 3d114197 Formatting updates for new version of black (#4607) cde06e62 Changelog 1b08fd62 ensure models check runs on right branch 44c8791c ensure vision CI runs on each commit (#4582) 95e82532 Merge branch 'master' into vision e74a7365 new data loading (#4497) 6f820050 Merge remote-tracking branch 'origin/master' into vision a7d45de1 Initializing a VilBERT model from a pre-trained transformer (#4495) 3833f7a5 Merge branch 'master' into vision 71d7cb4e Merge branch 'master' into vision 31379611 Merge remote-tracking branch 'origin/master' into vision 6cc508d7 Merge branch 'master' into vision f87df839 Merge remote-tracking branch 'origin/master' into vision 0bbe84b4 An initial VilBERT model for NLVR2 (#4423)

    Source code(tar.gz)
    Source code(zip)
  • v1.3.0(Dec 15, 2020)

    What's new

    Added 🎉

    • Added links to source code in docs.
    • Added get_embedding_layer and get_text_field_embedder to the Predictor class, to allow specifying embedding layers for non-AllenNLP models.
    • Added Gaussian Error Linear Unit (GELU) as an Activation.

    Changed ⚠️

    • Renamed module allennlp.data.tokenizers.token to allennlp.data.tokenizers.token_class to avoid this bug.
    • transformers dependency updated to version 4.0.1.

    Fixed ✅

    • Fixed a lot of instances where tensors were first created and then sent to a device with .to(device). Instead, these tensors are now created directly on the target device.
    • Fixed issue with GradientDescentTrainer when constructed with validation_data_loader=None and learning_rate_scheduler!=None.
    • Fixed a bug when removing all handlers in root logger.
    • ShardedDatasetReader now inherits parameters from base_reader when required.
    • Fixed an issue in FromParams where parameters in the params object used to a construct a class were not passed to the constructor if the value of the parameter was equal to the default value. This caused bugs in some edge cases where a subclass that takes **kwargs needs to inspect kwargs before passing them to its superclass.
    • Improved the band-aid solution for segmentation faults and the "ImportError: dlopen: cannot load any more object with static TLS" by adding a transformers import.
    • Added safety checks for extracting tar files

    Commits

    d408f416 log import errors for default plugins (#4866) f2a53310 Adds a safety check for tar files (#4858) 84a36a06 Update transformers requirement from <3.6,>=3.4 to >=4.0,<4.1 (#4831) fdad31aa Add ability to specify the embedding layer if the model does not use TextFieldEmbedder (#4836) 41c52245 Improve the band-aid solution for seg faults and the static TLS error (#4846) 63b6d163 fix FromParams bug (#4841) 6c3238ec rename token.py -> token_class.py (#4842) cec92098 Several micro optimizations (#4833) 48a48652 Add GELU activation (#4828) 3e623658 Bugfix for attribute inheritance in ShardedDatasetReader (#4830) 458c4c2b fix the way handlers are removed from the root logger (#4829) 5b306585 Fix bug in GradientDescentTrainer when validation data is absent (#4811) f353c6ce add link to source code in docs (#4807) 0a832713 No Docker auth on PRs (#4802) ad8e8a09 no ssh setup on PRs (#4801)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.2(Nov 17, 2020)

    What's new

    Added 🎉

    • Added Docker builds for other torch-supported versions of CUDA.
    • Added allennlp-semparse as an official, default plugin.

    Fixed ✅

    • GumbelSampler now sorts the beams by their true log prob.

    Commits

    023d9bcc Prepare for release v1.2.2 7b0826c1 push commit images for both CUDA versions 3cad5b41 fix AUC test (#4795) efde092d upgrade ssh-agent action (#4797) ec37dd46 Docker builds for other CUDA versions, improve CI (#4796) 0d8873cf doc link quickfix e4cc95ce improve plugin section in README (#4789) d99f7f8a ensure Gumbel sorts beams by true log prob (#4786) 9fe8d900 Makes the transformer cache work with custom kwargs (#4781) 1e7492d7 Update transformers requirement from <3.5,>=3.4 to >=3.4,<3.6 (#4784) f27ef38b Fixes pretrained embeddings for transformers that don't have end tokens (#4732)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.1(Nov 11, 2020)

    What's new

    Added 🎉

    • Added an optional seed parameter to ModelTestCase.set_up_model which sets the random seed for random, numpy, and torch.
    • Added support for a global plugins file at ~/.allennlp/plugins.
    • Added more documentation about plugins.
    • Added sampler class and parameter in beam search for non-deterministic search, with several implementations, including MultinomialSampler, TopKSampler, TopPSampler, and GumbelMaxSampler. Utilizing GumbelMaxSampler will give Stochastic Beam Search.
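
    A small sketch of plugging a sampler into beam search; it assumes the sampler classes live alongside BeamSearch in allennlp.nn.beam_search, and the end-symbol id is a toy value.

        from allennlp.nn.beam_search import BeamSearch, TopPSampler

        # Sample from the top-p (nucleus) of each next-token distribution
        # instead of always expanding the arg-max candidates.
        beam_search = BeamSearch(
            end_index=4,  # toy end-symbol id
            beam_size=5,
            sampler=TopPSampler(p=0.9),
        )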

    Changed ⚠️

    • Pass batch metrics to BatchCallback.

    Fixed ✅

    • Fixed a bug where forward hooks were not cleaned up with saliency interpreters if there was an exception.
    • Fixed the computation of saliency maps in the Interpret code when using mismatched indexing. Previously, we would compute gradients from the top of the transformer, after aggregation from wordpieces to tokens, which gives results that are not very informative. Now, we compute gradients with respect to the embedding layer, and aggregate wordpieces to tokens separately.
    • Fixed the heuristics for finding embedding layers in the case of RoBERTa. An update in the transformers library broke our old heuristic.
    • Fixed typo with registered name of ROUGE metric. Previously was rogue, fixed to rouge.
    • Fixed default masks that were erroneously created on the CPU even when a GPU is available.

    Commits

    04247faa support global plugins file, improve plugins docs (#4779) 9f7cc248 Add sampling strategies to beam search (#4768) f6fe8c6d pin urllib3 in dev reqs for responses (#4780) 764bbe2e Pass batch metrics to BatchCallback (#4764) dc3a4f67 clean up forward hooks on exception (#4778) fcc3a70b Fix: typo in metric, rogue -> rouge (#4777) b89320cd Set the device for an auto-created mask (#4774) 92a844a7 RoBERTa embeddings are no longer a type of BERT embeddings (#4771) 23f0a8a6 Ensure cnn_encoder respects masking (#4746) b4f1a7ab add seed option to ModelTestCase.set_up_model (#4769) b7cec515 Made Interpret code handle mismatched cases better (#4733) 9759b15f allow TextFieldEmbedder to have EmptyEmbedder that may not be in input (#4761)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.0(Oct 29, 2020)

    What's new

    Changed ⚠️

    • Enforced stricter typing requirements around the use of Optional[T] types.
    • Changed the behavior of Lazy types in from_params methods. Previously, if you defined a Lazy parameter like foo: Lazy[Foo] = None in a custom from_params classmethod, then foo would actually never be None. This behavior is now different. If no params were given for foo, it will be None. You can also now set default values for foo like foo: Lazy[Foo] = Lazy(Foo). Or, if you want a default value but also want to allow for None values, you can write it like this: foo: Optional[Lazy[Foo]] = Lazy(Foo). (See the sketch after this list.)
    • Added support for PyTorch version 1.7.
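
    A minimal sketch of the new default behavior for Lazy parameters:

        from allennlp.common import FromParams, Lazy, Params


        class Foo(FromParams):
            def __init__(self, a: int = 1) -> None:
                self.a = a


        class Bar(FromParams):
            # Because the default is Lazy(Foo), omitting "foo" from the params
            # yields a usable Lazy[Foo] rather than None.
            def __init__(self, foo: Lazy[Foo] = Lazy(Foo)) -> None:
                self.foo = foo.construct()


        bar = Bar.from_params(Params({}))
        print(bar.foo.a)  # 1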

    Fixed ✅

    • Made it possible to instantiate TrainerCallback from config files.
    • Fixed the remaining broken internal links in the API docs.
    • Fixed a bug where Hotflip would crash with a model that had multiple TokenIndexers and the input used rare vocabulary items.
    • Fixed a bug where BeamSearch would fail if max_steps was equal to 1.

    Commits

    7f85c74e fix docker build (#4762) cc9ac0f2 ensure dataclasses not installed in CI (#4754) 812ac570 Fix hotflip bug where vocab items were not re-encoded correctly (#4759) aeb6d362 revert samplers and fix bug when max_steps=1 (#4760) baca7545 Make returning token type id default in transformers intra word tokenization. (#4758) 5d6670ce Update torch requirement from <1.7.0,>=1.6.0 to >=1.6.0,<1.8.0 (#4753) 0ad228d4 a few small doc fixes (#4752) 71a98c2a stricter typing for Optional[T] types, improve handling of Lazy params (#4743) 27edfbf8 Add end+trainer callbacks to Trainer.from_partial_objects (#4751) b792c834 Fix device mismatch bug for categorical accuracy metric in distributed training (#4744)

    Source code(tar.gz)
    Source code(zip)
  • v1.2.0rc1(Oct 22, 2020)

    What's new

    Added 🎉

    • Added a warning when batches_per_epoch for the validation data loader is inherited from the train data loader.
    • Added a build-vocab subcommand that can be used to build a vocabulary from a training config file.
    • Added tokenizer_kwargs argument to PretrainedTransformerMismatchedIndexer.
    • Added tokenizer_kwargs and transformer_kwargs arguments to PretrainedTransformerMismatchedEmbedder.
    • Added official support for Python 3.8.
    • Added a script: scripts/release_notes.py, which automatically prepares markdown release notes from the CHANGELOG and commit history.
    • Added a flag --predictions-output-file to the evaluate command, which tells AllenNLP to write the predictions from the given dataset to the file as JSON lines.
    • Added the ability to ignore certain missing keys when loading a model from an archive. This is done by adding a class-level variable called authorized_missing_keys to any PyTorch module that a Model uses. If defined, authorized_missing_keys should be a list of regex string patterns.
    • Added FBetaMultiLabelMeasure, a multi-label Fbeta metric. This is a subclass of the existing FBetaMeasure.
    • Added the ability to pass additional keyword arguments to cached_transformers.get(), which will be passed on to AutoModel.from_pretrained().
    • Added an overrides argument to Predictor.from_path() (see the sketch after this list).
    • Added a cached-path command.
    • Added a function inspect_cache to common.file_utils that prints useful information about the cache. This can also be used from the cached-path command with allennlp cached-path --inspect.
    • Added a function remove_cache_entries to common.file_utils that removes any cache entries matching the given glob patterns. This can be used from the cached-path command with allennlp cached-path --remove some-files-*.
    • Added logging for the main process when running in distributed mode.
    • Added a TrainerCallback object to support state sharing between batch and epoch-level training callbacks.
    • Added support for .tar.gz in PretrainedModelInitializer.
    • Added classes: nn/samplers/samplers.py with MultinomialSampler, TopKSampler, and TopPSampler for sampling indices from log probabilities
    • Made BeamSearch registrable.
    • Added top_k_sampling and top_p_sampling BeamSearch implementations.
    • Pass serialization_dir to Model and DatasetReader.
    • Added an optional include_in_archive parameter to the top-level of configuration files. When specified, include_in_archive should be a list of paths relative to the serialization directory which will be bundled up with the final archived model from a training run.
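
    A hedged sketch of the new overrides argument to Predictor.from_path(); the archive path and the overridden key are placeholders.

        from allennlp.predictors import Predictor

        # Load an archived model but override part of its config at load time.
        predictor = Predictor.from_path(
            "/path/to/model.tar.gz",
            overrides={"dataset_reader": {"max_instances": 100}},
        )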

    Changed ⚠️

    • Subcommands that don't require plugins will no longer cause plugins to be loaded or have an --include-package flag.
    • Allow overrides to be JSON string or dict.
    • transformers dependency updated to version 3.1.0.
    • When cached_path is called on a local archive with extract_archive=True, the archive is now extracted into a unique subdirectory of the cache root instead of a subdirectory of the archive's directory. The extraction directory is also unique to the modification time of the archive, so if the file changes, subsequent calls to cached_path will know to re-extract the archive.
    • Removed the truncation_strategy parameter to PretrainedTransformerTokenizer. The way we're calling the tokenizer, the truncation strategy takes no effect anyways.
    • Don't use initializers when loading a model, as it is not needed.
    • Distributed training will now automatically search for a local open port if the master_port parameter is not provided.
    • In training, save model weights before evaluation.
    • allennlp.common.util.peak_memory_mb renamed to peak_cpu_memory, and allennlp.common.util.gpu_memory_mb renamed to peak_gpu_memory, and they both now return the results in bytes as integers. Also, the peak_gpu_memory function now utilizes PyTorch functions to find the memory usage instead of shelling out to the nvidia-smi command. This is more efficient and also more accurate because it only takes into account the tensor allocations of the current PyTorch process.
    • Make sure weights are first loaded to the cpu when using PretrainedModelInitializer, preventing wasted GPU memory.
    • Load dataset readers in load_archive.
    • Updated AllenNlpTestCase docstring to remove reference to unittest.TestCase

    Removed 👋

    • Removed common.util.is_master function.

    Fixed ✅

    • Fixed a bug where the reported batch_loss metric was incorrect when training with gradient accumulation.
    • Class decorators now displayed in API docs.
    • Fixed up the documentation for the allennlp.nn.beam_search module.
    • Ignore *args when constructing classes with FromParams.
    • Ensured some consistency in the types of the values that metrics return.
    • Fix a PyTorch warning by explicitly providing the as_tuple argument (leaving it as its default value of False) to Tensor.nonzero().
    • Remove temporary directory when extracting model archive in load_archive at end of function rather than via atexit.
    • Fixed a bug where using cached_path() offline could return a cached resource's lock file instead of the cache file.
    • Fixed a bug where cached_path() would fail if passed a cache_dir with the user home shortcut ~/.
    • Fixed a bug in our doc building script where markdown links did not render properly if the "href" part of the link (the part inside the ()) was on a new line.
    • Changed how gradients are zeroed out with an optimization. See this video from NVIDIA at around the 9 minute mark.
    • Fixed a bug where parameters to a FromParams class that are dictionaries wouldn't get logged when an instance is instantiated from_params.
    • Fixed a bug in distributed training where the vocab would be saved from every worker, when it should have been saved by only the local master process.
    • Fixed a bug in the calculation of rouge metrics during distributed training where the total sequence count was not being aggregated across GPUs.
    • Fixed allennlp.nn.util.add_sentence_boundary_token_ids() to use device parameter of input tensor.
    • Be sure to close the TensorBoard writer even when training doesn't finish.
    • Fixed the docstring for PyTorchSeq2VecWrapper.

    Commits

    01644caf Pass serialization_dir to Model, DatasetReader, and support include_in_archive (#4713) 1f29f352 Update transformers requirement from <3.4,>=3.1 to >=3.1,<3.5 (#4741) 6bb9ce9a warn about batches_per_epoch with validation loader (#4735) 00bb6c59 Be sure to close the TensorBoard writer (#4731) 3f23938b Update mkdocs-material requirement from <6.1.0,>=5.5.0 to >=5.5.0,<6.2.0 (#4738) 10c11cea Fix typo in PretrainedTransformerMismatchedEmbedder docstring (#4737) 0e64b4d3 fix docstring for PyTorchSeq2VecWrapper (#4734) 006bab48 Don't use PretrainedModelInitializer when loading a model (#4711) ce14bdc0 Allow usage of .tar.gz with PretrainedModelInitializer (#4709) c14a056d avoid defaulting to CPU device in add_sentence_boundary_token_ids() (#4727) 24519fd9 fix typehint on checkpointer method (#4726) d3c69f75 Bump mypy from 0.782 to 0.790 (#4723) cccad29a Updated AllenNlpTestCase docstring (#4722) 3a85e359 add reasonable timeout to gpu checks job (#4719) 1ff0658c Added logging for the main process when running in distributed mode (#4710) b099b69c Add top_k and top_p sampling to BeamSearch (#4695) bc6f15ac Fixes rouge metric calculation corrected for distributed training (#4717) ae7cf85b automatically find local open port in distributed training (#4696) 321d4f48 TrainerCallback with batch/epoch/end hooks (#4708) 001e1f76 new way of setting env variables in GH Actions (#4700) c14ea40e Save checkpoint before running evaluation (#4704) 40bb47ad Load weights to cpu with PretrainedModelInitializer (#4712) 327188b8 improve memory helper functions (#4699) 90f00379 fix reported batch_loss (#4706) 39ddb523 CLI improvements (#4692) edcb6d34 Fix a bug in saving vocab during distributed training (#4705) 3506e3fd ensure parameters that are actual dictionaries get logged (#4697) eb7f2568 Add StackOverflow link to README (#4694) 17c3b84b Fix small typo (#4686) e0b2e265 display class decorators in API docs (#4685) b9a92842 Update transformers requirement from <3.3,>=3.1 to >=3.1,<3.4 (#4684) d9bdaa95 add build-vocab command (#4655) ce604f1f Update mkdocs-material requirement from <5.6.0,>=5.5.0 to >=5.5.0,<6.1.0 (#4679) c3b5ed74 zero grad optimization (#4673) 9dabf3fa Add missing tokenizer/transformer kwargs (#4682) 9ac6c76c Allow overrides to be JSON string or dict (#4680) 55cfb47b The truncation setting doesn't do anything anymore (#4672) 990c9c17 clarify conda Python version in README.md 97db5387 official support for Python 3.8 🐍 (#4671) 1e381bb0 Clean up the documentation for beam search (#4664) 11def8ea Update bug_report.md 97fe88d2 Cached path command (#4652) c9f376bf Update transformers requirement from <3.2,>=3.1 to >=3.1,<3.3 (#4663) e5e3d020 tick version for nightly releases b833f905 fix multi-line links in docs (#4660) d7c06fe7 Expose from_pretrained keyword arguments (#4651) 175c76be fix confusing distributed logging info (#4654) fbd2ccca fix numbering in RELEASE_GUIDE 2d5f24bd improve how cached_path extracts archives (#4645) 824f97d4 smooth out release process (#4648) c7b7c008 Feature/prevent temp directory retention (#4643) de5d68bc Fix tensor.nonzero() function overload warning (#4644) e8e89d5a add flag for saving predictions to 'evaluate' command (#4637) e4fd5a0c Multi-label F-beta metric (#4562) f0e7a78c Create Dependabot config file (#4635) 0e33b0ba Return consistent types from metrics (#4632) 2df364ff Update transformers requirement from <3.1,>=3.0 to >=3.0,<3.2 (#4621) 6d480aae Improve handling of **kwargs in FromParams (#4629) bf3206a2 Workaround for Python not finding imports in 
spawned processes (#4630)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0(Sep 8, 2020)

    Highlights

    Version 1.1 was mainly focused on bug fixes, but there are a few important new features, such as gradient checkpointing with pretrained transformer embedders and official support for automatic mixed precision (AMP) training through PyTorch's native torch.cuda.amp module.

    Details

    Added

    • Predictor.capture_model_internals() now accepts a regex specifying which modules to capture.
    • Added the option to specify requires_grad: false within an optimizer's parameter groups.
    • Added the file-friendly-logging flag back to the train command. Also added this flag to the predict, evaluate, and find-learning-rate commands.
    • Added an EpochCallback to track current epoch as a model class member.
    • Added the option to enable or disable gradient checkpointing for transformer token embedders via boolean parameter gradient_checkpointing.
    • Added a method to ModelTestCase for running basic model tests when you aren't using config files.
    • Added some convenience methods for reading files.
    • cached_path() can now automatically extract and read files inside of archives.
    • Added the ability to pass an archive file instead of a local directory to Vocab.from_files.
    • Added the ability to pass an archive file instead of a glob to ShardedDatasetReader.
    • Added a new "linear_with_warmup" learning rate scheduler.
    • Added a check in ShardedDatasetReader that ensures the base reader doesn't implement manual distributed sharding itself.
    • Added an option to PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set last_layer_only to False (see the sketch after this list).
    • Training metrics now include batch_loss and batch_reg_loss in addition to the loss aggregated over all batches.
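
    As a quick illustration of two of the options above (the scalar mix via last_layer_only, and gradient checkpointing), here is a minimal sketch. The constructor argument names are taken from these notes and should be checked against your installed version:

    from allennlp.modules.token_embedders import PretrainedTransformerEmbedder

    # Sketch only: both keyword arguments are described in the "Added" list above.
    embedder = PretrainedTransformerEmbedder(
        "bert-base-uncased",
        last_layer_only=False,        # use a scalar mix of all hidden layers
        gradient_checkpointing=True,  # trade extra compute for lower memory use
    )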

    Changed

    • Upgraded PyTorch requirement to 1.6.
    • Beam search now supports multi-layer decoders.
    • Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (GradientDescentTrainer) now takes a use_amp: bool parameter instead of the old opt_level: str parameter.
    • Not specifying a cuda_device now automatically determines whether to use a GPU or not.
    • Discovered plugins are logged so you can see what was loaded.
    • allennlp.data.DataLoader is now an abstract registrable class. The default implementation remains the same, but was renamed to allennlp.data.PyTorchDataLoader.
    • BertPooler can now unwrap and re-wrap extra dimensions if necessary.

    Removed

    • Removed the opt_level parameter to Model.load and load_archive. In order to use AMP with a loaded model now, just run the model's forward pass within torch's autocast context.
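
    For example, here is a minimal sketch of running inference under AMP with a loaded model (the archive path is a placeholder, and torch >= 1.6 is assumed):

    import torch
    from allennlp.models.archival import load_archive

    archive = load_archive("model.tar.gz", cuda_device=0)  # placeholder path
    model = archive.model.eval()

    # Instead of the removed opt_level, wrap the forward pass in torch's autocast.
    # `batch` stands in for a dict of input tensors you have already prepared.
    with torch.no_grad(), torch.cuda.amp.autocast():
        outputs = model(**batch)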

    Fixed

    • Fixed handling of some edge cases when constructing classes with FromParams where the class accepts **kwargs.
    • Fixed division by zero error when there are zero-length spans in the input to a PretrainedTransformerMismatchedIndexer.
    • Improved robustness of cached_path when extracting archives so that the cache won't be corrupted if a failure occurs during extraction.
    • Fixed a bug with the average and evalb_bracketing_score metrics in distributed training.
    • Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.
    • Fixed how truncation was handled with PretrainedTransformerTokenizer. Previously, if max_length was set to None, the tokenizer would still do truncation if the transformer model had a default max length in its config. Also, when max_length was set to a non-None value, several warnings would appear for certain transformer models around the use of the truncation parameter.
    • Fixed evaluation of all metrics when using distributed training.
    • Added a py.typed marker. Fixed type annotations in allennlp.training.util.
    • Fixed problem with automatically detecting whether tokenization is necessary. This affected primarily the Roberta SST model.
    • Improved help text for using the --overrides command line flag.
    • Removed unnecessary warning about deadlocks in DataLoader.
    • Fixed testing models that only return a loss when they are in training mode.
    • Fixed a bug in FromParams that caused silent failure in case of the parameter type being Optional[Union[...]].
    • Fixed a bug where the program crashes if evaluation_data_loader is a AllennlpLazyDataset.
    • Reduced the amount of log messages produced by allennlp.common.file_utils.
    • Fixed a bug where PretrainedTransformerEmbedder parameters appeared to be trainable in the log output even when train_parameters was set to False.
    • Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances in distributed training.
    • Fixed checking equality of ArrayFields.
    • Fixed a bug where NamespaceSwappingField did not work correctly with .empty_field().
    • Put more sensible defaults on the huggingface_adamw optimizer.
    • Simplified logging so that all logging output always goes to one file.
    • Fixed interaction with the python command line debugger.
    • Log the grad norm properly even when we're not clipping it.
    • Fixed a bug where PretrainedModelInitializer failed to initialize a model with a 0-dim tensor.
    • Fixed a bug with the layer unfreezing schedule of the SlantedTriangular learning rate scheduler.
    • Fixed a regression with logging in the distributed setting. Only the main worker should write log output to the terminal.
    • Pinned the version of boto3 for package managers (e.g. poetry).
    • Fixed issue #4330 by updating the tokenizers dependency.
    • Fixed a bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader in case it does not have a tokenizer.
    • reg_loss is now only returned for models that have some regularization penalty configured.
    • Fixed a bug that prevented cached_path from downloading assets from GitHub releases.
    • Fixed a bug that erroneously increased last label's false positive count in calculating fbeta metrics.
    • Tqdm output now looks much better when the output is being piped or redirected.
    • Small improvements to how the API documentation is rendered.
    • Only show validation progress bar from main process in distributed training.

    Commits

    dcc9cdc7 Prepare for release v1.1.0 aa750bec fix Average metric (#4624) e1aa57cf improve robustness of cached_path when extracting archives (#4622) 711afaa7 Fix division by zero when there are zero-length spans in MismatchedEmbedder. (#4615) be97943a Improve handling of **kwargs in FromParams (#4616) 187b24e5 add more tutorial links to README (#4613) e840a589 s/logging/logger/ (#4609) dbc3c3ff Added batched versions of scatter and fill to util.py (#4598) 2c54cf8b reformat for new version of black (#4605) 2dd335e4 batched_span_select now guarantees element order in each span (#4511) 62f554ff specify module names by a regex in predictor.capture_model_internals() (#4585) f464aa38 Bump markdown-include from 0.5.1 to 0.6.0 (#4586) d01cdff9 Update RELEASE_PROCESS.md to include allennlp-models (#4587) 3aedac97 Prepare for release v1.1.0rc4 87a61ad9 Bug fix in distributed metrics (#4570) 71a9a90d upgrade actions to [email protected] (#4573) bd9ee6a4 Give better usage info for overrides parameter (#4575) 0a456a75 Fix boolean and categorical accuracy for distributed (#4568) 85112746 add actions workflow for closing stale issues (#4561) de413065 Static type checking fixes (#4545) 5a07009b Fix RoBERTa SST (#4548) 351941f3 Only pin mkdocs-material to minor version, ignore specific patch version (#4556) 0ac13a4f fix CHANGELOG 3b86f588 Prepare for release v1.1.0rc3 44d28476 Metrics in distributed setting (#4525) 1d619659 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547) 5b977809 tick version for nightly releases b32608e3 add gradient checkpointing for transformer token embedders (#4544) f639336a Fix logger being created twice (#4538) 660fdaf2 Fix handling of max length with transformer tokenizers (#4534) 15e288f5 EpochCallBack for tracking epoch (#4540) 9209bc91 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533) bfecdc3e Ensure len(self.evaluation_data_loader) is not called (#4531) 5bc3b732 Fix typo in warning in file_utils (#4527) e80d7687 pin torch >= 1.6 73220d71 Prepare for release v1.1.0rc2 9415350d Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519) 146bd9ee Remove link to self-attention modules. (#4512) 24012823 add back file-friendly-logging flag (#4509) 54e5c83e closes #4494 (#4508) fa39d498 ensure call methods are rendered in docs (#4522) e53d1858 Bug fix for case when param type is Optional[Union...] 
(#4510) 14f63b77 Make sure we have a bool tensor where we expect one (#4505) 18a4eb34 add a requires_grad option to param groups (#4502) 6c848dfb Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507) d73f8a91 More BART changes (#4500) 1cab3bfe Update beam_search.py (#4462) 478bf46c remove deadlock warning in DataLoader (#4487) 714334ad Fix reported loss: Bug fix in batch_loss (#4485) db20b1fb use longer tqdm intervals when output being redirected (#4488) 53eeec10 tick version for nightly releases d693cf1c PathLike (#4479) 2f878322 only show validation progress bar from main process (#4476) 9144918d Fix reported loss (#4477) 5c970833 fix release link in CHANGELOG and formatting in README 4eb97953 Prepare for release v1.1.0rc1 f195440b update 'Models' links in README (#4475) 9c801a3c add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472) 69d2f03d Clean up Tqdm bars when output is being piped or redirected (#4470) 7b188c93 fixed bug that erronously increased last label's false positive count (#4473) 64db027d Skip ETag check if OSError (#4469) b9d011ef More BART changes (#4468) 7a563a8f add option to use scalar mix of all transformer layers (#4460) d00ad668 Minor tqdm and logging clean up (#4448) 6acf2058 Fix regloss logging (#4449) 8c32ddfd Fixing bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader (#4456) b9a91646 Update transformers requirement from <2.12,>=2.10 to >=2.10,<3.1 (#4446) 181ef5d2 pin boto3 to resolve some dependency issues (#4453) c75a1ebd ensure base reader of ShardedDatasetReader doesn't implement sharding itself (#4454) 8a05ad43 Update CONTRIBUTING.md (#4447) 5b988d63 ensure only rank 0 worker writes to terminal (#4445) 8482f022 fix bug with SlantedTriangular LR scheduler (#4443) e46a578e Update transformers requirement from <2.11,>=2.10 to >=2.10,<2.12 (#4411) 8229aca3 Fix pretrained model initialization (#4439) 60deece9 Fix type hint in text_field.py (#4434) 23e549e4 More multiple-choice changes (#4415) 6d0a4fd2 generalize DataLoader (#4416) acd99952 Automatic file-friendly logging (#4383) 637dbb15 fix README, pin mkdocs, update mkdocs-material (#4412) 9c4dfa54 small fix to pretrained transformer tokenizer (#4417) 84988b81 Log plugins discovered and filter out transformers "PyTorch version ... 
available" log message (#4414) 54c41fcc Adds the ability to automatically detect whether we have a GPU (#4400) 96ff5851 Changes from my multiple-choice work (#4368) eee15ca8 Assign an empty mapping array to empty fields of NamespaceSwappingField (#4403) aa2943e5 Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398) 7fa7531c fix eq method of ArrayField (#4401) e104e441 Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394) b6fd6978 fix sharded dataset reader (#4396) 30e5dbfc Bump mypy from 0.781 to 0.782 (#4395) b0ba2d4c update version 1d07cc75 Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389) ffc51843 ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371) 20afe6ce Add Optuna integrated badge to README.md (#4361) ba79f146 Bump mypy from 0.780 to 0.781 (#4390) 85e531c2 Update README.md (#4385) c2ecb7a2 Add a method to ModelTestCase for use without config files (#4381) 6852deff pin some doc building requirements (#4386) bf422d56 Add github template for using your own python run script (#4380) ebde6e85 Bump overrides from 3.0.0 to 3.1.0 (#4375) e52b7518 ensure transformer params are frozen at initialization when train_parameters is false (#4377) 3e8a9ef6 Add link to new template repo for config file development (#4372) 4f70bc93 tick version for nightly releases 63a5e158 Update spacy requirement from <2.3,>=2.1.0 to >=2.1.0,<2.4 (#4370) ef7c75b8 reduce amount of log messages produced by file_utils (#4366)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc4(Aug 20, 2020)

    Changes since v1.1.0rc3

    Added

    • Added a workflow to GitHub Actions that will automatically close unassigned stale issues and ping the assignees of assigned stale issues.

    Fixed

    • Fixed a bug in distributed metrics that caused nan values due to repeated addition of an accumulated variable.

    Commits

    87a61ad9 Bug fix in distributed metrics (#4570) 71a9a90d upgrade actions to [email protected] (#4573) bd9ee6a4 Give better usage info for overrides parameter (#4575) 0a456a75 Fix boolean and categorical accuracy for distributed (#4568) 85112746 add actions workflow for closing stale issues (#4561) de413065 Static type checking fixes (#4545) 5a07009b Fix RoBERTa SST (#4548) 351941f3 Only pin mkdocs-material to minor version, ignore specific patch version (#4556)

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc3(Aug 12, 2020)

    Changes since v1.1.0rc2

    Fixed

    • Fixed how truncation was handled with PretrainedTransformerTokenizer. Previously, if max_length was set to None, the tokenizer would still do truncation if the transformer model had a default max length in its config. Also, when max_length was set to a non-None value, several warnings would appear for certain transformer models around the use of the truncation parameter.
    • Fixed evaluation of all metrics when using distributed training.

    Commits

    0ac13a4f fix CHANGELOG 3b86f588 Prepare for release v1.1.0rc3 44d28476 Metrics in distributed setting (#4525) 1d619659 Bump mkdocs-material from 5.5.3 to 5.5.5 (#4547) 5b977809 tick version for nightly releases b32608e3 add gradient checkpointing for transformer token embedders (#4544) f639336a Fix logger being created twice (#4538) 660fdaf2 Fix handling of max length with transformer tokenizers (#4534) 15e288f5 EpochCallBack for tracking epoch (#4540) 9209bc91 Bump mkdocs-material from 5.5.0 to 5.5.3 (#4533) bfecdc3e Ensure len(self.evaluation_data_loader) is not called (#4531) 5bc3b732 Fix typo in warning in file_utils (#4527) e80d7687 pin torch >= 1.6

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc2(Jul 31, 2020)

    What's new since v1.1.0rc1

    Changed

    • Upgraded PyTorch requirement to 1.6.
    • Replaced the NVIDIA Apex AMP module with torch's native AMP module. The default trainer (GradientDescentTrainer) now takes a use_amp: bool parameter instead of the old opt_level: str parameter.

    Fixed

    • Removed unnecessary warning about deadlocks in DataLoader.
    • Fixed testing models that only return a loss when they are in training mode.
    • Fixed a bug in FromParams that caused silent failure in case of the parameter type being Optional[Union[...]].

    Added

    • Added the option to specify requires_grad: false within an optimizer's parameter groups.
    • Added the file-friendly-logging flag back to the train command. Also added this flag to the predict, evaluate, and find-learning-rate commands.

    Removed

    • Removed the opt_level parameter to Model.load and load_archive. In order to use AMP with a loaded model now, just run the model's forward pass within torch's autocast context.

    Commits

    73220d71 Prepare for release v1.1.0rc2 9415350d Update torch requirement from <1.6.0,>=1.5.0 to >=1.5.0,<1.7.0 (#4519) 146bd9ee Remove link to self-attention modules. (#4512) 24012823 add back file-friendly-logging flag (#4509) 54e5c83e closes #4494 (#4508) fa39d498 ensure call methods are rendered in docs (#4522) e53d1858 Bug fix for case when param type is Optional[Union...] (#4510) 14f63b77 Make sure we have a bool tensor where we expect one (#4505) 18a4eb34 add a requires_grad option to param groups (#4502) 6c848dfb Bump mkdocs-material from 5.4.0 to 5.5.0 (#4507) d73f8a91 More BART changes (#4500) 1cab3bfe Update beam_search.py (#4462) 478bf46c remove deadlock warning in DataLoader (#4487) 714334ad Fix reported loss: Bug fix in batch_loss (#4485) db20b1fb use longer tqdm intervals when output being redirected (#4488) 53eeec10 tick version for nightly releases d693cf1c PathLike (#4479) 2f878322 only show validation progress bar from main process (#4476) 9144918d Fix reported loss (#4477) 5c970833 fix release link in CHANGELOG and formatting in README

    Source code(tar.gz)
    Source code(zip)
  • v1.1.0rc1(Jul 14, 2020)

    This is the first pre-release candidate for version 1.1. There will probably be at least one more candidate before the final 1.1 release.

    What's new since v1.0.0

    Fixed

    • Reduced the amount of log messages produced by allennlp.common.file_utils.
    • Fixed a bug where PretrainedTransformerEmbedder parameters appeared to be trainable in the log output even when train_parameters was set to False.
    • Fixed a bug with the sharded dataset reader where it would only read a fraction of the instances in distributed training.
    • Fixed checking equality of ArrayFields.
    • Fixed a bug where NamespaceSwappingField did not work correctly with .empty_field().
    • Put more sensible defaults on the huggingface_adamw optimizer.
    • Simplified logging so that all logging output always goes to one file.
    • Fixed interaction with the python command line debugger.
    • Log the grad norm properly even when we're not clipping it.
    • Fixed a bug where PretrainedModelInitializer failed to initialize a model with a 0-dim tensor.
    • Fixed a bug with the layer unfreezing schedule of the SlantedTriangular learning rate scheduler.
    • Fixed a regression with logging in the distributed setting. Only the main worker should write log output to the terminal.
    • Pinned the version of boto3 for package managers (e.g. poetry).
    • Fixed issue #4330 by updating the tokenizers dependency.
    • Fixed a bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader in case it does not have a tokenizer.
    • reg_loss is now only returned for models that have some regularization penalty configured.
    • Fixed a bug that prevented cached_path from downloading assets from GitHub releases.
    • Fixed a bug that erroneously increased last label's false positive count in calculating fbeta metrics.
    • Tqdm output now looks much better when the output is being piped or redirected.
    • Small improvements to how the API documentation is rendered.

    Added

    • A method to ModelTestCase for running basic model tests when you aren't using config files.
    • Added some convenience methods for reading files.
    • Added an option to file_utils.cached_path to automatically extract archives.
    • Added the ability to pass an archive file instead of a local directory to Vocab.from_files.
    • Added the ability to pass an archive file instead of a glob to ShardedDatasetReader.
    • Added a new "linear_with_warmup" learning rate scheduler.
    • Added a check in ShardedDatasetReader that ensures the base reader doesn't implement manual distributed sharding itself.
    • Added an option to PretrainedTransformerEmbedder and PretrainedTransformerMismatchedEmbedder to use a scalar mix of all hidden layers from the transformer model instead of just the last layer. To utilize this, just set last_layer_only to False.
    • cached_path() can now read files inside of archives.
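
    As an illustration only: the URL below is a placeholder, and both the extract_archive flag and the "archive!member" form are assumptions based on the bullets above, so check the file_utils API docs for your version:

    from allennlp.common.file_utils import cached_path

    # Download (or reuse the cache), extract the archive, and get the local directory.
    data_dir = cached_path("https://example.com/data.tar.gz", extract_archive=True)

    # Resolve a single file inside the archive.
    train_file = cached_path("https://example.com/data.tar.gz!train.jsonl", extract_archive=True)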

    Changed

    • Not specifying a cuda_device now automatically determines whether to use a GPU or not.
    • Discovered plugins are logged so you can see what was loaded.
    • allennlp.data.DataLoader is now an abstract registrable class. The default implementation remains the same, but was renamed to allennlp.data.PyTorchDataLoader.
    • BertPooler can now unwrap and re-wrap extra dimensions if necessary.
    • Switched to a new transformers dependency. Only version >=3.0 is now supported.

    Commits

    4eb97953 Prepare for release v1.1.0rc1 f195440b update 'Models' links in README (#4475) 9c801a3c add CHANGELOG to API docs, point to license on GitHub, improve API doc formatting (#4472) 69d2f03d Clean up Tqdm bars when output is being piped or redirected (#4470) 7b188c93 fixed bug that erronously increased last label's false positive count (#4473) 64db027d Skip ETag check if OSError (#4469) b9d011ef More BART changes (#4468) 7a563a8f add option to use scalar mix of all transformer layers (#4460) d00ad668 Minor tqdm and logging clean up (#4448) 6acf2058 Fix regloss logging (#4449) 8c32ddfd Fixing bug in TextClassificationPredictor so that it passes tokenized inputs to the DatasetReader (#4456) b9a91646 Update transformers requirement from <2.12,>=2.10 to >=2.10,<3.1 (#4446) 181ef5d2 pin boto3 to resolve some dependency issues (#4453) c75a1ebd ensure base reader of ShardedDatasetReader doesn't implement sharding itself (#4454) 8a05ad43 Update CONTRIBUTING.md (#4447) 5b988d63 ensure only rank 0 worker writes to terminal (#4445) 8482f022 fix bug with SlantedTriangular LR scheduler (#4443) e46a578e Update transformers requirement from <2.11,>=2.10 to >=2.10,<2.12 (#4411) 8229aca3 Fix pretrained model initialization (#4439) 60deece9 Fix type hint in text_field.py (#4434) 23e549e4 More multiple-choice changes (#4415) 6d0a4fd2 generalize DataLoader (#4416) acd99952 Automatic file-friendly logging (#4383) 637dbb15 fix README, pin mkdocs, update mkdocs-material (#4412) 9c4dfa54 small fix to pretrained transformer tokenizer (#4417) 84988b81 Log plugins discovered and filter out transformers "PyTorch version ... available" log message (#4414) 54c41fcc Adds the ability to automatically detect whether we have a GPU (#4400) 96ff5851 Changes from my multiple-choice work (#4368) eee15ca8 Assign an empty mapping array to empty fields of NamespaceSwappingField (#4403) aa2943e5 Bump mkdocs-material from 5.3.2 to 5.3.3 (#4398) 7fa7531c fix eq method of ArrayField (#4401) e104e441 Add test to ensure data loader yields all instances when batches_per_epoch is set (#4394) b6fd6978 fix sharded dataset reader (#4396) 30e5dbfc Bump mypy from 0.781 to 0.782 (#4395) b0ba2d4c update version 1d07cc75 Bump mkdocs-material from 5.3.0 to 5.3.2 (#4389) ffc51843 ensure Vocab.from_files and ShardedDatasetReader can handle archives (#4371) 20afe6ce Add Optuna integrated badge to README.md (#4361) ba79f146 Bump mypy from 0.780 to 0.781 (#4390) 85e531c2 Update README.md (#4385) c2ecb7a2 Add a method to ModelTestCase for use without config files (#4381) 6852deff pin some doc building requirements (#4386) bf422d56 Add github template for using your own python run script (#4380) ebde6e85 Bump overrides from 3.0.0 to 3.1.0 (#4375) e52b7518 ensure transformer params are frozen at initialization when train_parameters is false (#4377) 3e8a9ef6 Add link to new template repo for config file development (#4372) 4f70bc93 tick version for nightly releases 63a5e158 Update spacy requirement from <2.3,>=2.1.0 to >=2.1.0,<2.4 (#4370) ef7c75b8 reduce amount of log messages produced by file_utils (#4366)

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Jun 16, 2020)

    The 1.0 version of AllenNLP is the culmination of more than 500 commits over the course of several months of work from our engineering team. The AllenNLP library has had wide-reaching appeal so far in its lifetime, and this 1.0 release represents an important maturity milestone. While we will continue to move fast to keep up with the ever-changing state of the art, we will be increasingly conscious of the effect future API changes have on our existing user base.

    This release touches almost every aspect of the library, ranging from improving documentation to adding new natural-language processing components, to adjusting our APIs so they serve the community for the long haul. While we cannot summarize everything in these release notes, here are some of the main milestones for the 1.0 release.

    1. We are releasing several new models, such as:
       a. TransformerQA, a reading comprehension model (paper, demo)
       b. An improved coreference model, with a 17% absolute improvement (architecture paper/embedder paper, demo)
       c. The NMN reading comprehension model (paper, demo)
       d. The RoBERTa models for textual entailment, or NLI (paper, demo)

    2. We have new introductory material in the form of an interactive guide, showing how to use library components and our experiment framework. The guide's goal is to provide a comprehensive introduction to AllenNLP for people with a good understanding of machine learning, Python, and some PyTorch.

    3. We have improved performance across the library.
       a. Switching to native PyTorch data loading, which is not only much faster but also allows the three main parts of the library (data, model, and training) to interoperate with any native PyTorch code.
       b. Enabled support for 16-bit floating point through Apex.
       c. Multi-GPU training now utilizes a separate Python process for each GPU. These workers communicate using PyTorch's distributed module. This is more efficient than the old system, which used a single Python process and was therefore limited by the GIL.

    4. We separated our models into a model repository (allennlp-models), so we have a lean core library with fewer dependencies.

    5. We dramatically simplified how AllenNLP code corresponds to AllenNLP configuration files, which also makes the library easy to use from raw Python.

    But changes are not limited to these. Some other highlights are that we have:

    1. Support for gradient accumulation.
    2. Improved configurability of the trainer so you can inject your own call on each batch.
    3. Seamless support for using word-piece tokenization on pre-tokenized text.
    4. A sampler that creates batches with roughly equal numbers of tokens.
    5. Unified support for Hugging Face's transformers library.
    6. Support for token type IDs throughout the library.
    7. Nightly releases of the library to pip.
    8. BLEU and ROUGE metrics.

    Updates since v1.0.0rc6

    Fixed

    • Lazy dataset readers now work correctly with multi-process data loading.
    • Fixed race conditions that could occur when using a dataset cache.
    • Fixed a bug where all datasets would be loaded for vocab creation even if they were not needed.

    Added

    • A parameter to the DatasetReader class: manual_multi_process_sharding. This is similar to the manual_distributed_sharding parameter, but applies when using a multi-process DataLoader.

    Commits

    29f3b6c3 Prepare for release v1.0.0 a8b840df fix some formatting issues in README (#4365) d3ed6197 fix Makefile c5549105 quick doc fixes (#4364) b764bef5 simplify dataset classes, fix multi-process lazy loading (#4344) 884a6149 Bump mkdocs-material from 5.2.3 to 5.3.0 (#4359) 6a124d80 ensure 'from_files' vocab doesn't load instances (#4356) 87c23e4a Fix handling of "datasets_for_vocab_creation" param (#4350) c3755d16 update CHANGELOG

    Upgrade guide from v0.9.0

    There are too many changes to be exhaustive, but here is a list of the most common issues:

    • You can continue to use the allennlp command line, but if you want to invoke it through Python, use python -m allennlp <command> instead of python -m allennlp.run <command>.
    • "bert_adam" is now "adamw".
    • We no longer support the "gradient_accumulation_batch_size" parameter to the trainer. Use "num_gradient_accumulation_steps" instead.
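
    For example, a trainer fragment using the renamed keys might look like the following. It is shown as a Python dict, which mirrors the JSON config; the values are placeholders:

    trainer_fragment = {
        "optimizer": {"type": "adamw", "lr": 1e-5},  # "bert_adam" is now "adamw"
        "num_gradient_accumulation_steps": 4,        # replaces "gradient_accumulation_batch_size"
    }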

    Using the transformers library

    AllenNLP 1.0 replaces the mish-mash of transformer libraries and dependencies that we had in v0.9.0 with a single implementation that uses https://github.com/huggingface/transformers under the hood. For cases where you can work directly with the word pieces that are used by the transformers, use "pretrained_transformer" for tokenizers, indexers, and embedders. If you want to use tokens from pre-tokenized text, use "pretrained_transformer_mismatched". The latter turns the text into word pieces, embeds them with the transformer, and then combines word pieces to produce an embedding for the original tokens.

    The parameters requires_grad and top_layer_only are no longer supported. If you are converting an old model that used to use "bert-pretrained", this is important! requires_grad used to be False by default, so it would not train the transformer itself. This saves memory and time at the cost of performance. The new code does not support this setting, and will always train the transformer. You can prevent this by setting requires_grad to False in a parameter group when setting up the optimizer.
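
    A sketch of freezing the transformer through a parameter group, shown as a Python dict that mirrors the JSON config. The regex and the optimizer settings are placeholders, not something prescribed by these notes:

    optimizer_fragment = {
        "type": "huggingface_adamw",
        "lr": 1e-5,
        "parameter_groups": [
            # Parameters whose names match any of the regexes get these overrides.
            [["transformer_model"], {"requires_grad": False}],
        ],
    }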

    You no longer need to specify do_lowercase, as this is handled automatically now.

    Config file changes

    In 1.0, we simplified how FromParams works. As a result, some things in the config files need to change to work with 1.0 (a consolidated sketch follows this list):

    • The way Vocabulary options are specified in config files has changed. See #3550. If you want to load a vocabulary from files, you should specify "type": "from_files", and use the key "directory" instead of "directory_path".
    • When instantiating a BasicTextFieldEmbedder from_params, you used to be able to have embedder names be top-level keys in the config file (e.g., "embedder": {"elmo": ELMO_PARAMS, "tokens": TOKEN_PARAMS}). We changed this a long time ago to prefer wrapping them in a "token_embedders" key, and this is now required (e.g., "embedder": {"token_embedders": {"elmo": ELMO_PARAMS, "tokens": TOKEN_PARAMS}}).
    • The TokenCharactersEncoder now requires you to specify the vocab_namespace for the underlying embedder. It used to default to "token_characters", matching the TokenCharactersIndexer default, but making that work required some custom magic that wasn't worth the complexity. So instead of "token_characters": {"type": "character_encoding", "embedding": {"embedding_dim": 25}, "encoder": {...}}, you need to change this to: "token_characters": {"type": "character_encoding", "embedding": {"embedding_dim": 25, "vocab_namespace": "token_characters"}, "encoder": {...}}
    • Regularization now needs another key in a config file. Instead of specifying regularization as "regularizer": [[regex1, regularizer_params], [regex2, regularizer_params]], it now must be specified as "regularizer": {"regexes": [[regex1, regularizer_params], [regex2, regularizer_params]]}.
    • We changed initialization in a similar way to regularization. Instead of specifying initialization as "initializer": [[regex1, initializer_params], [regex2, initializer_params]], it now must be specified as "initializer": {"regexes": [[regex1, initializer_params], [regex2, initializer_params]]}. Also, you used to be able to have initializer_params be "prevent", to prevent initialization of matching parameters. This is now done with a separate key passed to the initializer: "initializer": {"regexes": [...], "prevent_regexes": [regex1, regex2]}.
    • num_serialized_models_to_keep and keep_serialized_model_every_num_seconds used to be able to be passed as top-level parameters to the trainer, but now they must always be passed to the checkpointer instead. For example, if you had "trainer": {"num_serialized_models_to_keep": 1}, it now needs to be "trainer": {"checkpointer": {"num_serialized_models_to_keep": 1}}. Also, the default for that setting is now 2, so AllenNLP will no longer fill up your hard drive!
    • Tokenizer specification changed because of #3361. Instead of something like "tokenizer": {"word_splitter": {"type": "spacy"}}, you now just do "tokenizer": {"type": "spacy"} (more technically: the WordTokenizer has now been removed, with the things we used to call WordSplitters now just moved up to be top-level Tokenizers themselves).
    • The namespace_to_cache argument to ElmoTokenEmbedder has been removed as a config file option. You can still pass vocab_to_cache to the constructor of this class, but this functionality is no longer available from a config file. If you used this and are really put out by this change, let us know, and we'll see what we can do.
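
    Putting a few of these together, a 1.0-style fragment might look like the following, shown as a Python dict that mirrors the JSON config. The paths, regexes, and parameter values are placeholders:

    config_fragment = {
        "vocabulary": {
            "type": "from_files",
            "directory": "/path/to/vocabulary",  # was "directory_path" before 1.0
        },
        "model": {
            "regularizer": {"regexes": [["weight$", {"type": "l2", "alpha": 0.01}]]},
            "initializer": {
                "regexes": [["weight$", {"type": "xavier_uniform"}]],
                "prevent_regexes": ["bias$"],    # replaces the old "prevent" value
            },
        },
    }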

    Iterators ➔ DataLoaders

    AllenNLP now uses PyTorch's API for data iteration, rather than our own custom one. This means that the train_data, validation_data, iterator, and validation_iterator arguments to the Trainer have been removed and replaced with data_loader and validation_data_loader.

    Previous config files which looked like:

    {
      "iterator": {
        "type": "bucket",
        "sorting_keys": [["tokens"], ["num_tokens"]],
        "padding_noise": 0.1
        ...
      }
    }
    

    Now become:

    {
      "data_loader": {
        "batch_sampler" {
          "type": "bucket",
          // sorting keys are no longer required! They can be inferred automatically.
          "padding_noise": 0.1
          ...
        }
      }
    }
    

    Multi-GPU

    AllenNLP now uses DistributedDataParallel for parallel training, rather than DataParallel. With DistributedDataParallel, each worker (GPU) runs in its own process. As such, each process also has its own Trainer, which now takes only a single GPU ID.

    Previous config files which looked like:

    {
      "trainer": {
        "cuda_device": [0, 1, 2, 3],
        "num_epochs": 20,
        ...
      }
    }
    

    Now become:

    {
      "distributed": {
        "cuda_devices": [0, 1, 2, 3],
      },
      "trainer": {
        "num_epochs": 20,
        ...
      }
    }
    
    Source code(tar.gz)
    Source code(zip)
  • v1.0.0rc6(Jun 11, 2020)

    Fixed

    • A bug where TextFields could not be duplicated since some tokenizers cannot be deep-copied. See https://github.com/allenai/allennlp/issues/4270.
    • Our caching mechanism had the potential to introduce race conditions if multiple processes were attempting to cache the same file at once. This was fixed by using a lock file tied to each cached file.
    • get_text_field_mask() now supports padding indices that are not 0.
    • A bug where predictor.get_gradients() would return an empty dictionary if an embedding layer had trainable set to False.
    • Fixes PretrainedTransformerMismatchedIndexer in the case where a token consists of zero word pieces.
    • Fixes a bug when using a lazy dataset reader that results in a UserWarning from PyTorch being printed at every iteration during training.
    • Predictor names were inconsistently switching between dashes and underscores. Now they all use underscores.
    • Predictor.from_path now automatically loads plugins (unless you specify load_plugins=False) so that you don't have to manually import a bunch of modules when instantiating predictors from an archive path (see the sketch after this list).
    • allennlp-server is automatically found as a plugin once again.
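
    A quick sketch of the plugin-aware loading mentioned above (the archive path is a placeholder, and the "sentence" input key assumes a predictor that accepts it):

    from allennlp.predictors import Predictor

    # Plugins are discovered automatically here unless you pass load_plugins=False.
    predictor = Predictor.from_path("model.tar.gz")
    outputs = predictor.predict_json({"sentence": "AllenNLP predictors load plugins automatically now."})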

    Added

    • A duplicate() method on Instances and Fields, to be used instead of copy.deepcopy()
    • A batch sampler that makes sure each batch contains approximately the same number of tokens (MaxTokensBatchSampler)
    • Functions to turn a sequence of token indices back into tokens
    • The ability to use Huggingface encoder/decoder models as token embedders
    • Improvements to beam search
    • ROUGE metric
    • Polynomial decay learning rate scheduler
    • A BatchCallback for logging CPU and GPU memory usage to tensorboard. This is mainly for debugging because using it can cause a significant slowdown in training.
    • Ability to run pretrained transformers as an embedder without training the weights

    Changed

    • Similar to our caching mechanism, we introduced a lock file to the vocab to avoid race conditions when saving/loading the vocab from/to the same serialization directory in different processes.
    • Changed the Token, Instance, and Batch classes along with all Field classes to "slots" classes. This dramatically reduces the size in memory of instances.
    • SimpleTagger will no longer calculate span-based F1 metric when calculate_span_f1 is False.
    • CPU memory for every worker is now reported in the logs and the metrics. Previously this was only reporting the CPU memory of the master process, and so it was only correct in the non-distributed setting.
    • To be consistent with PyTorch IterableDataset, AllennlpLazyDataset no longer implements __len__(). Previously it would always return 1.
    • Removed old tutorials, in favor of the new AllenNLP Guide
    • Changed vocabulary loading to handle newline conventions across Windows, Linux, and Mac.

    Commits

    d98d13b5 add 'allennlp_server' to default plugins (#4348) 33d0cd8c fix file utils test (#4349) f4d330a2 Update vocabulary load to a system-agnostic newline (#4342) 2012fea9 remove links to tutorials in API docs (#4346) 3d8ce442 Fixes spelling in changelog 73289bc8 Consistently use underscores in Predictor names (#4340) 2d03c413 Allow using pretrained transformers without fine-tuning them (#4338) 8f68d69b load plugins from Predictor.from_path (#4333) 5c6cc3a2 Bump mkdocs-material from 5.2.2 to 5.2.3 (#4341) 7ab7551b Removing old tutorials, pointing to the new guide in the README (#4334) 902d36a5 Fix bug with lazy data loading, un-implement len on AllennlpLazyDataset (#4328) 11b57996 log metrics in alphabetical order (#4327) 7d66b3e7 report CPU memory usage for each worker (#4323) 06bac68b make Instance, Batch, and all field classes "slots" classes (#4313) 2b2d1413 Bump mypy from 0.770 to 0.780 (#4316) a038c01a Update transformers requirement from <2.11,>=2.9 to >=2.9,<2.12 (#4315) 345459e9 Stop calculating span-based F1 metric when calculate_span_f1 is False. (#4302) fc47bf6a Deals with the case where a word doesn't have any word pieces assigned (#4301) 11a08ae7 Making Token class a "slots" class (#4312) 32bccfbd Fix a bug where predictor.get_gradients() would return an empty... (#4305) 33a49454 ensure CUDA available in GPU checks workflow (#4310) d51ffa11 Update transformers requirement from <2.10,>=2.9 to >=2.9,<2.11 (#4282) 75c07ab5 Merge branch 'master' of github.com:allenai/allennlp 8c9421da fix Makefile 77b432f6 Update README.md (#4309) 720ad434 A few small fixes in the README.md (#4307) a7265c04 move tensorboard memory logging to BatchCallback (#4306) 91d0fa1a remove setup.cfg (#4300) 5ad7a33a Support for bart in allennlp-models (#4169) 25134f2b add lock file within caching and vocab saving/loading mechanisms (#4299) 58dc84ea add 'Feature request' label to template 9526f007 Update issue templates (#4293) 79999ec0 Adds a "duplicate()" method on instances and fields (#4294) 8ff47d34 Set version to rc6

    Source code(tar.gz)
    Source code(zip)