HuSpaCy: industrial-strength Hungarian natural language processing

Overview

HuSpaCy: Industrial-strength Hungarian NLP

HuSpaCy is a spaCy model and a library providing industrial-strength Hungarian language processing facilities. The released pipeline consists of a tokenizer, sentence splitter, lemmatizer, tagger (which also predicts morphological features), dependency parser and a named entity recognition module. Word and phrase embeddings are also available through spaCy's API. All models have high throughput, decent memory usage and close to state-of-the-art accuracy. A live demo is available here; model releases are published to the Hugging Face Hub.

This repository contains material to build HuSpaCy's models from the ground up.

Installation

To get started using the tool, you first need to download a model. The easiest way to do this is to install the huspacy package from PyPI and fetch the model through it:

pip install huspacy

This utility package exposes convenience methods for downloading and using the latest model:

import huspacy

# Download the latest model
huspacy.download()

# Download a specific model version
huspacy.download(version="v0.4.2")

# Load the previously downloaded model (hu_core_news_lg)
nlp = huspacy.load()

Alternatively, one can install the latest model from Hugging Face Hub directly:

pip install https://huggingface.co/huspacy/hu_core_news_lg/resolve/main/hu_core_news_lg-any-py3-none-any.whl

To speed up inference using GPUs, CUDA support can be installed as described in https://spacy.io/usage.
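
For instance, a minimal sketch of GPU usage (relying on spaCy's standard helpers; the exact CUDA extra to install, e.g. spacy[cuda11x], depends on your environment):

import spacy
import huspacy

# Use the GPU if one is available; otherwise spaCy falls back to the CPU.
# This requires spaCy to be installed with the matching CUDA extra.
spacy.prefer_gpu()

nlp = huspacy.load()
doc = nlp("Ez egy példa mondat.")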

Usage

HuSpaCy is fully compatible with spaCy's API, so newcomers can easily get started with the spaCy 101 guide.

Although HuSpaCy models can be loaded with spacy.load(), the tool also provides convenience methods for easily accessing downloaded models.

# Load the model using huspacy
import huspacy
nlp = huspacy.load()

# Load the model using spacy.load()
import spacy
nlp = spacy.load("hu_core_news_lg")

# Load the model directly as a module
import hu_core_news_lg
nlp = hu_core_news_lg.load()

# Either way you get the same model and can start processing texts.
doc = nlp("Csiribiri csiribiri zabszalma - négy csillag közt alszom ma.")

Available Models

Currently, we provide a single large model, which achieves a good balance between accuracy and processing speed. A demo of this model is available at Hugging Face Spaces. The default model (hu_core_news_lg) provides tokenization, sentence splitting, part-of-speech tagging (UD labels with detailed morphosyntactic features), lemmatization, dependency parsing and named entity recognition, and it ships with pretrained word vectors.
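
For instance, sentence boundaries, detailed morphological features and the bundled word vectors are all exposed through the usual spaCy attributes (a sketch using standard spaCy API calls):

doc = nlp("Az alma nem esik messze a fájától. Ez egy második mondat.")

# Sentence splitting
for sent in doc.sents:
    print(sent.text)

# CoNLL-U style morphological features
print(doc[1].text, doc[1].morph)

# Pretrained word vectors
print(doc[1].text, doc[1].has_vector, doc[1].vector.shape)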

Changes to the models are recorded in the changelog.

Development

Installing requirements

  • poetry install will install all the dependencies
  • For better performance you might need to reinstall spacy with GPU support, e.g. poetry add spacy[cuda92] will add support for CUDA 9.2

Repository structure

├── .github            -- Github configuration files
├── data               -- Data files
│   ├── external       -- External models required to train models (e.g. word vectors)
│   ├── processed      -- Processed data ready to feed spacy
│   └── raw            -- Raw data, mostly corpora as they are obtained from the web
├── hu_core_news_lg    -- Spacy 3.x project files for building a model for news texts
│   ├── configs        -- Spacy pipeline configuration files
│   ├── project.lock   -- Auto-generated project script
│   ├── project.yml    -- Spacy3 Project file describing steps needed to build the model
│   └── README.md      -- Instructions on building a model from scratch
├── huspacy            -- subproject for the PyPI distributable package
├── tools              -- Source package for tools
│   └── cli            -- Command line scripts (Python)
├── models             -- Trained models and their metadata
├── resources          -- Resource files
├── scripts            -- Bash scripts
├── tests              -- Test files 
├── CHANGELOG.md       -- Keeps the changelog
├── LICENSE            -- License file
├── poetry.lock        -- Locked poetry dependencies files
├── poetry.toml        -- Poetry configurations
├── pyproject.toml     -- Python project configuration, including dependencies managed with Poetry 
└── README.md          -- This file

Citing

If you use the models or this library in your research, please cite this paper.
Additionally, please indicate the version of the model you used so that your research can be reproduced.

@misc{HuSpaCy:2021,
  title = {{HuSpaCy: an industrial-strength Hungarian natural language processing toolkit}},
  booktitle = {{XVIII. Magyar Sz{\'a}m{\'\i}t{\'o}g{\'e}pes Nyelv{\'e}szeti Konferencia}},
  author = {Orosz, Gy{\"o}rgy and Sz{\'a}nt{\'o}, Zsolt and Berkecz, P{\'e}ter and Szab{\'o}, Gerg{\H{o}} and Farkas, Rich{\'a}rd},
  location = {{Szeged}},
  year = {in press 2021},
}

License

This library is released under the Apache 2.0 License.

The trained models have their own license (CC BY-SA 4.0) as described on the models page.

Contact

For feature requests and bug reports, please use the GitHub Issue Tracker. Otherwise, please use the Discussion Forums.

Authors

HuSpaCy is developed by the SzegedAI team, coordinated by György Orosz, as part of the Hungarian AI National Laboratory's MILAB program.

Comments
  • Transformer v1

    Transformer v1

    Changes

    • Added configs for training a transformer model.
    • Tagger model changes relative to the "lg" tagger: the tagger now also trains the lemmatizer, specifically the edit tree lemmatizer.
    • Parser model changes relative to the "lg" parser: the parser no longer trains other components (as it did in the "lg" model), only the sentencizer. The parser component itself has also changed: it now learns dependencies with a biaffine parser.
    • NER model changes relative to the "lg" NER: the ner component barely changed; only the factory variable (in the components.ner config) is different: instead of plain "ner", "beam_ner" is now used in the configuration (see the sketch below).
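
    For illustration, the factory switch corresponds roughly to the following in spaCy terms (a sketch using spaCy's built-in "beam_ner" factory, not the project's actual training configuration):

    import spacy

    nlp = spacy.blank("hu")

    # "lg" pipelines use the greedy transition-based NER:
    # nlp.add_pipe("ner")

    # The transformer pipeline sets factory = "beam_ner" in [components.ner],
    # i.e. beam-search NER instead of the greedy variant:
    nlp.add_pipe("beam_ner")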

    spacy eval scores

    | Components | Dev    | Test   |
    |------------|--------|--------|
    | TOK        | 100.00 | 100.00 |
    | TAG        | 97.78  | 97.36  |
    | POS        | 97.79  | 97.39  |
    | MORPH      | 93.29  | 93.63  |
    | LEMMA      | 97.67  | 97.66  |
    | UAS        | 91.68  | 91.01  |
    | LAS        | 87.29  | 87.20  |
    | NER P      | 91.97  | 91.37  |
    | NER R      | 92.25  | 91.42  |
    | NER F      | 92.11  | 91.40  |
    | SENT P     | 97.53  | 98.23  |
    | SENT R     | 98.41  | 98.89  |
    | SENT F     | 97.97  | 98.56  |
    | SPEED      | 3184   | 3262   |

    Remaining tasks

    • Add Zsolti's eval script to the pipeline
    • Add Zsolti's and Peti's multiple root removal script to the pipeline

    opened by SzaboGergo01 3
  • zipfile.BadZipFile: Bad CRC-32 for file

    zipfile.BadZipFile: Bad CRC-32 for file

    Hi,

    I tried to install this tool with python3 and got the following error. Can you help me solve this issue, please?

    Best regards, László

    $ pip3 install https://github.com/oroszgy/spacy-hungarian-models/releases/download/hu_core_ud_lg-0.2.0/hu_core_ud_lg-0.2.0-py3-none-any.whl
    Downloading https://github.com/oroszgy/spacy-hungarian-models/releases/download/hu_core_ud_lg-0.2.0/hu_core_ud_lg-0.2.0-py3-none-any.whl (1362.0MB)
    Exception:
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
        status = self.run(options, args)
      File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 353, in run
        wb.build(autobuilding=True)
      File "/usr/lib/python3/dist-packages/pip/wheel.py", line 749, in build
        self.requirement_set.prepare_files(self.finder)
      File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 380, in prepare_files
        ignore_dependencies=self.ignore_dependencies))
      File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 620, in _prepare_file
        session=self.session, hashes=hashes)
      File "/usr/lib/python3/dist-packages/pip/download.py", line 821, in unpack_url
        hashes=hashes
      File "/usr/lib/python3/dist-packages/pip/download.py", line 663, in unpack_http_url
        unpack_file(from_path, location, content_type, link)
      File "/usr/lib/python3/dist-packages/pip/utils/__init__.py", line 617, in unpack_file
        flatten=not filename.endswith('.whl')
      File "/usr/lib/python3/dist-packages/pip/utils/__init__.py", line 506, in unzip_file
        data = zip.read(name)
      File "/usr/lib/python3.6/zipfile.py", line 1338, in read
        return fp.read()
      File "/usr/lib/python3.6/zipfile.py", line 858, in read
        buf += self._read1(self.MAX_N)
      File "/usr/lib/python3.6/zipfile.py", line 962, in _read1
        self._update_crc(data)
      File "/usr/lib/python3.6/zipfile.py", line 890, in _update_crc
        raise BadZipFile("Bad CRC-32 for file %r" % self.name)
    zipfile.BadZipFile: Bad CRC-32 for file 'hu_core_ud_lg/hu_core_ud_lg-0.2.0/tagger/model'

    bug 
    opened by laklaja 3
  • Does this model support fine-grained UD features?

    Does this model support fine-grained UD features?

    In spaCy, one can access the fine-grained tags via the tag_ attribute (see the documentation). In this model, it only repeats the value of pos_.

    Is there any chance to get the CoNLL-U style FEATS from the tagged data for Hungarian?
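
    For reference, spaCy 3.x pipelines expose CoNLL-U style FEATS through the token.morph attribute (this is the standard spaCy API; whether a given HuSpaCy release populates it depends on the model version):

    # Sketch: querying morphological features on a spaCy 3.x pipeline
    for token in nlp("Láttam két kutyát."):
        print(token.text, token.pos_, token.morph)  # e.g. Case=Acc|Number=Sing
        print(token.morph.get("Case"))              # query a single feature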

    enhancement help wanted 
    opened by dlazesz 3
  • Spacy lemmatizer does not work with numbers as expected

    Spacy lemmatizer does not work with numbers as expected

    Dear György, I noticed that spaCy does not always lemmatize (spelled-out) numbers correctly. Here is an example:

    import spacy
    import hu_core_ud_lg
    import pandas as pd
    
    nlp = hu_core_ud_lg.load() # takes 2-3 minutes to load
    
    a = "nyolcvanöt"
    b = "nyolcvanhat"
    c = "nyolcvanhét" 
    d = [a, b, c] 
      
    df = pd.DataFrame(d, columns = ['datum']) 
    
    output_lemma = []
    
    for i in df.datum:
        mondat = ""
        doc = nlp(i)
        newtext = [(tok.lemma_, tok.is_title) for tok in doc]
        mondat = ' '.join([tok[0].title() if tok[1] == 1 else tok[0] for tok in newtext])
        output_lemma.append(mondat)
    
    output_lemma 
    ['nyolcvan', 'nyolcvanh', 'nyolcvanhét']
    

    I am new to GitHub, but I would be very happy to help develop the package. Could you please tell me whether this would be a realistically difficult project for a beginner, or should I rather look for a simpler task first? Thank you very much in advance for your reply!

    enhancement 
    opened by gaborstats 2
  • Error during download UD_Hungarian-Szeged

    Error during download UD_Hungarian-Szeged

    Error during make install. Have the permissions of this dependency changed?

    mkdir -p ./data/raw/UD_Hungarian-Szeged
    git clone git@github.com:UniversalDependencies/UD_Hungarian-Szeged.git ./data/raw/UD_Hungarian-Szeged

    Cloning into './data/raw/UD_Hungarian-Szeged'...
    Host key verification failed.
    fatal: Could not read from remote repository.

    Please make sure you have the correct access rights and the repository exists.
    make: *** [data/raw/UD_Hungarian-Szeged] Error 128

    opened by laklaja 2
  • BadZipFile error

    BadZipFile error

    I get an error when trying to install through pip.

    zipfile.BadZipFile: Bad CRC-32 for file 'hu_core_ud_lg/hu_core_ud_lg-0.2.0/tagger/model'

    any ideas?

    opened by begdaniel 2
  • Incompatibility

    Incompatibility

    Unfortunately, the code (huspacy) and the large model (hu_core_news_lg) require different spaCy versions, and their requirements have no common subset: huspacy requires an older spaCy than the model does. I could not find a way to resolve the incompatibility. I would appreciate help with this.

    Thanks! Attila

    bug 
    opened by VamperAta 1
  • Lookup lemmatizer

    Lookup lemmatizer

    Added the lookup lemmatizer and its usage in the hu_core_news_lg model; given a token and its POS tag, it returns a lemma. I also added the lemma smoother to the hu_core_news_lg model, and it now reaches a lemmatization accuracy of 97.36%.
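
    To illustrate the idea (a hypothetical sketch, not the PR's actual implementation), a lookup lemmatizer boils down to a table keyed by the token form and its POS tag:

    # Hypothetical (form, POS) -> lemma table; the real component is built from
    # resources shipped with hu_core_news_lg.
    LOOKUP = {
        ("almát", "NOUN"): "alma",
        ("evett", "VERB"): "eszik",
    }

    def lookup_lemma(form: str, pos: str) -> str:
        # Fall back to the surface form when the pair is not in the table.
        return LOOKUP.get((form, pos), form)

    print(lookup_lemma("almát", "NOUN"))  # alma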

    enhancement lemmatizer 
    opened by qeterme 1
  • Fix CLI tool paths for hu_core_news_lg

    Fix CLI tool paths for hu_core_news_lg

    Changes proposed in this pull request:

    While training hu_core_news_lg, some CLI tools are referenced as executables on the PATH, causing errors during training. This PR replaces them with the appropriate tools from "tools/cli".

    After submitting

    • [x] All GitHub Actions jobs for my pull request have passed.
    opened by dvarnai 1
  • Bump pyyaml from 5.2 to 5.4

    Bump pyyaml from 5.2 to 5.4

    Bumps pyyaml from 5.2 to 5.4.

    Changelog

    Sourced from pyyaml's changelog.

    5.4 (2021-01-19)

    5.3.1 (2020-03-18)

    • yaml/pyyaml#386 -- Prevents arbitrary code execution during python/object/new constructor

    5.3 (2020-01-06)

    Commits
    • 58d0cb7 5.4 release
    • a60f7a1 Fix compatibility with Jython
    • ee98abd Run CI on PR base branch changes
    • ddf2033 constructor.timezone: _copy & deepcopy
    • fc914d5 Avoid repeatedly appending to yaml_implicit_resolvers
    • a001f27 Fix for CVE-2020-14343
    • fe15062 Add 3.9 to appveyor file for completeness sake
    • 1e1c7fb Add a newline character to end of pyproject.toml
    • 0b6b7d6 Start sentences and phrases for capital letters
    • c976915 Shell code improvements
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot[bot] 1
  • Bump pygments from 2.5.2 to 2.7.4

    Bump pygments from 2.5.2 to 2.7.4

    Bumps pygments from 2.5.2 to 2.7.4.

    Release notes

    Sourced from pygments's releases.

    2.7.4

    • Updated lexers:

      • Apache configurations: Improve handling of malformed tags (#1656)

      • CSS: Add support for variables (#1633, #1666)

      • Crystal (#1650, #1670)

      • Coq (#1648)

      • Fortran: Add missing keywords (#1635, #1665)

      • Ini (#1624)

      • JavaScript and variants (#1647 -- missing regex flags, #1651)

      • Markdown (#1623, #1617)

      • Shell

        • Lex trailing whitespace as part of the prompt (#1645)
        • Add missing in keyword (#1652)
      • SQL - Fix keywords (#1668)

      • Typescript: Fix incorrect punctuation handling (#1510, #1511)

    • Fix infinite loop in SML lexer (#1625)

    • Fix backtracking string regexes in JavaScript/TypeScript, Modula2 and many other lexers (#1637)

    • Limit recursion with nesting Ruby heredocs (#1638)

    • Fix a few inefficient regexes for guessing lexers

    • Fix the raw token lexer handling of Unicode (#1616)

    • Revert a private API change in the HTML formatter (#1655) -- please note that private APIs remain subject to change!

    • Fix several exponential/cubic-complexity regexes found by Ben Caller/Doyensec (#1675)

    • Fix incorrect MATLAB example (#1582)

    Thanks to Google's OSS-Fuzz project for finding many of these bugs.

    2.7.3

    ... (truncated)

    Commits
    • 4d555d0 Bump version to 2.7.4.
    • fc3b05d Update CHANGES.
    • ad21935 Revert "Added dracula theme style (#1636)"
    • e411506 Prepare for 2.7.4 release.
    • 275e34d doc: remove Perl 6 ref
    • 2e7e8c4 Fix several exponential/cubic complexity regexes found by Ben Caller/Doyensec
    • eb39c43 xquery: fix pop from empty stack
    • 2738778 fix coding style in test_analyzer_lexer
    • 02e0f09 Added 'ERROR STOP' to fortran.py keywords. (#1665)
    • c83fe48 support added for css variables (#1633)
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot[bot] 1
  • token.children is broken in hu_core_news_trf

    token.children is broken in hu_core_news_trf

    Describe the bug: token.children always returns an empty generator in the hu_core_news_trf model.

    To reproduce: the following code demonstrates the issue (in a Google Colab environment):

    import spacy
    from spacy import displacy

    nlp = spacy.load("hu_core_news_trf")  # model from the report

    doc = nlp('Peti evett egy almát.')
    displacy.render(doc, style="dep", jupyter=True)

    # Print each token with its head and its (supposed) children
    for token in doc:
        print(token.text, token.head, [child for child in token.children])


    Based on the displacy output, the model parses the sentence correctly, and the printout confirms that token.head is correct (digging into the displacy code revealed that it also uses token.head). Yet reading the elements of token.children still yields an empty list.

    Peti evett []
    evett evett []
    egy almát []
    almát evett []
    . evett []
    

    Expected behavior: token.children should return the children of the given token.

    Additional context: running the above code with hu_core_news_lg produces the correct output.

    Peti evett []
    evett evett [Peti, almát, .]
    egy almát []
    almát evett [egy]
    . evett []
    

    I originally noticed the bug while using the DependencyMatcher, and from there I managed to trace its source back to this point.

    bug parser 
    opened by boapps 2
  • Tokenization bug with !.

    Tokenization bug with !.

    Describe the bug
    When tokenizing text, for example [token for token in nlp("A kutya evett egy csontot!.")], the expression !. is treated as a single token and is also merged with the preceding word's token. The problem also occurs with multiple exclamation marks, e.g. !!. and !!!!!!., but not with multiple periods: !.. , !!.. and !!... work properly. It also does not occur when the sequence is not directly preceded by a word (e.g. with a space in between, as in csontot !.). A chain such as !.!.!.!.! becomes a single token as well; for example, kutya!.!.!.!. is tokenized simply as kutya!.!.!.!.

    Expected behavior
    The exclamation mark and the period should be separate tokens, i.e. kutya!. should yield the tokens kutya, ! and . (note that question marks, for example, do behave this way; this bug only happens with exclamation marks, as far as I noticed).

    bug tokenizer 
    opened by speter00 1