for a paper about leveraging discourse markers for training new models

Last update: Nov 02, 2022

Overview

TSLM-DISCOURSE-MARKERS

Scope

This repository contains:

(1) Code to extract discourse markers from Wikipedia (TSA).

(2) Code to extract significant discourse markers from predictions over a sample
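To illustrate what extracting discourse markers from text can look like, here is a minimal sketch. It is not the repository's implementation. It assumes that a discourse marker is a short sentence-initial phrase followed by a comma, as in the title of the paper cited in this README. The function names `extract_leading_marker` and `count_markers`, and the two-word limit, are hypothetical choices made for this example.

```python
import re
from collections import Counter

# Assumption: a candidate marker is one or two words at the very start of a
# sentence (first word capitalized), immediately followed by a comma.
_MARKER_RE = re.compile(r"^([A-Z][a-z]+(?: [a-z]+)?),\s+\S")


def extract_leading_marker(sentence):
    """Return the lowercased sentence-initial marker, or None if absent."""
    match = _MARKER_RE.match(sentence.strip())
    return match.group(1).lower() if match else None


def count_markers(sentences):
    """Count how often each candidate marker opens a sentence."""
    markers = (extract_leading_marker(s) for s in sentences)
    return Counter(m for m in markers if m is not None)


sentences = [
    "Fortunately, the experiment worked.",
    "In contrast, the baseline failed.",
    "The model was trained on Wikipedia.",
    "Fortunately, nobody was hurt.",
]
marker_counts = count_markers(sentences)
```

A frequency table like `marker_counts` is only a starting point. Deciding which markers are significant needs an additional statistical criterion that this sketch does not attempt.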

Usage

Evaluation code:

Installation

Using pip:

pip install git+ssh://git@github.com/IBM/tslm-discourse-markers.git#egg=tslm-discourse-markers

Alternatively, you can first clone the code, and install the requirements:

1. git clone git@github.com:IBM/tslm-discourse-markers.git
2. cd tslm-discourse-markers
3. pip install -r requirements.txt

You also need to download the fastText language identification model:

curl https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin -o ~/Downloads/lid.176.bin

and the spaCy English model:

python -m spacy download en_core_web_sm
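A quick way to confirm that both downloads work is to load each model once. The sketch below assumes the fastText model was saved to `~/Downloads/lid.176.bin`, as in the curl command, and that the `fasttext` and `spacy` packages are installed. The helper names `label_to_language`, `detect_language` and `split_sentences` are hypothetical and not part of this repository.

```python
import os

# Path taken from the curl command in the installation instructions.
FASTTEXT_MODEL_PATH = os.path.expanduser("~/Downloads/lid.176.bin")


def label_to_language(label):
    """Convert a fastText label such as '__label__en' to 'en'."""
    prefix = "__label__"
    return label[len(prefix):] if label.startswith(prefix) else label


def detect_language(text, model_path=FASTTEXT_MODEL_PATH):
    """Return the most likely language code for a piece of text."""
    import fasttext  # imported lazily so label_to_language works without it

    model = fasttext.load_model(model_path)
    # fastText predicts one line at a time, so newlines are replaced.
    labels, _probabilities = model.predict(text.replace("\n", " "), k=1)
    return label_to_language(labels[0])


def split_sentences(text):
    """Split text into sentences with the spaCy English pipeline."""
    import spacy

    nlp = spacy.load("en_core_web_sm")
    return [sentence.text for sentence in nlp(text).sents]
```

If both downloads succeeded, calling `detect_language` on an English sentence would be expected to return an English language code, and `split_sentences` should return a list of sentence strings. A load error from either call usually points to a missing or misplaced model file.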

Running

Citing tslm-discourse-markers

If you are using tslm-discourse-markers in a publication, please cite the following paper:

Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim. 2022. Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis. AAAI-2022.

Model

The SenDM model can be found at: https://huggingface.co/ibm/tslm-discourse-markers

Loading dataset

import datasets

# Each split lives in its own folder of gzipped CSV shards.
directory = 'dataset/WIKI_ENGLISH'
data_files = {folder: [f'{directory}/{folder}/{folder}_*.csv.gz'] for folder in ['train', 'dev', 'test']}
dataset = datasets.load_dataset('csv', data_files=data_files)
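The loading call maps each split name to a glob pattern over that split's folder. As a sketch of the same mapping in reusable form, it can be built and inspected before anything is loaded. The helper names `build_data_files` and `load_splits` are hypothetical and not part of this repository.

```python
def build_data_files(directory, splits=("train", "dev", "test")):
    """Map each split name to the glob pattern of its gzipped CSV shards."""
    return {split: [f"{directory}/{split}/{split}_*.csv.gz"] for split in splits}


# Inspect the patterns first; this step needs no third-party package.
data_files = build_data_files("dataset/WIKI_ENGLISH")
for split, patterns in data_files.items():
    print(split, patterns)


def load_splits(directory):
    """Load all splits as one DatasetDict (requires the `datasets` package)."""
    import datasets

    return datasets.load_dataset("csv", data_files=build_data_files(directory))
```

Printing the patterns first makes a wrong directory easier to spot than a failed load.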

Contributing

This project welcomes external contributions. If you would like to contribute, please see the further instructions here.

Pull requests are very welcome! Make sure your patches are well tested. Ideally create a topic branch for every separate change you make. For example:

Fork the repo
Create your feature branch (git checkout -b my-new-feature)
Commit your changes (git commit -am 'Added some feature')
Push to the branch (git push origin my-new-feature)
Create new Pull Request

Changelog

Major changes are documented here.

Notes

If you have any questions or issues you can create a new issue here.

License

This code is distributed under Apache License 2.0. If you would like to see the detailed LICENSE click here.

Authors

The YASO dataset was collected by Liat Ein-Dor, Ilya Shnayderman, Artem Spector, Lena Dankin, Ranit Aharonov and Noam Slonim.

The code was written by Ilya Shnayderman.


Owner

International Business Machines
