Multiple implementations for abstractive text summurization , using google colab

Overview

Text Summarization models

if you are able to endorse me on Arxiv, i would be more than glad https://arxiv.org/auth/endorse?x=FRBB89 thanks This repo is built to collect multiple implementations for abstractive approaches to address text summarization , for different languages (Hindi, Amharic, English, and soon isA Arabic)

If you found this project helpful please consider citing our work, it would truly mean so much for me

@INPROCEEDINGS{9068171,
  author={A. M. {Zaki} and M. I. {Khalil} and H. M. {Abbas}},
  booktitle={2019 14th International Conference on Computer Engineering and Systems (ICCES)}, 
  title={Deep Architectures for Abstractive Text Summarization in Multiple Languages}, 
  year={2019},
  volume={},
  number={},
  pages={22-27},}
@misc{zaki2020amharic,
    title={Amharic Abstractive Text Summarization},
    author={Amr M. Zaki and Mahmoud I. Khalil and Hazem M. Abbas},
    year={2020},
    eprint={2003.13721},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

it is built to simply run on google colab , in one notebook so you would only need an internet connection to run these examples without the need to have a powerful machine , so all the code examples would be in a jupiter format , and you don't have to download data to your device as we connect these jupiter notebooks to google drive

  • Arabic Summarization Model using the corner stone implemtnation (seq2seq using Bidirecional LSTM Encoder and attention in the decoder) for summarizing Arabic news
  • implementation A Corner stone seq2seq with attention (using bidirectional ltsm ) , three different models for this implemntation
  • implementation B seq2seq with pointer genrator model
  • implementation C seq2seq with reinforcement learning

Blogs

This repo has been explained in a series of Blogs


Try out this text summarization through this website (eazymind) , eazymind which enables you to summarize your text through

  • curl call
curl -X POST 
http://eazymind.herokuapp.com/arabic_sum/eazysum
-H 'cache-control: no-cache' 
-H 'content-type: application/x-www-form-urlencoded' 
-d "eazykey={eazymind api key}&sentence={your sentence to be summarized}"
from eazymind.nlp.eazysum import Summarizer

#---key from eazymind website---
key = "xxxxxxxxxxxxxxxxxxxxx"

#---sentence to be summarized---
sentence = """(CNN)The White House has instructed former
    White House Counsel Don McGahn not to comply with a subpoena
    for documents from House Judiciary Chairman Jerry Nadler, 
    teeing up the latest in a series of escalating oversight 
    showdowns between the Trump administration and congressional Democrats."""
    
summarizer = Summarizer(key)
print(summarizer.run(sentence))

Implementation A (seq2seq with attention and feature rich representation)

contains 3 different models that implements the concept of hving a seq2seq network with attention also adding concepts like having a feature rich word representation This work is a continuation of these amazing repos

Model 1

is a modification on of David Currie's https://github.com/Currie32/Text-Summarization-with-Amazon-Reviews seq2seq

Model 2

1- Model_2/Model_2.ipynb

a modification to https://github.com/dongjun-Lee/text-summarization-tensorflow

2- Model_2/Model 2 features(tf-idf , pos tags).ipynb

a modification to Model 2.ipynb by using concepts from http://www.aclweb.org/anthology/K16-1028

Results

A folder contains the results of both the 2 models , from validation text samples in a zaksum format , which is combining all of

  • bleu
  • rouge_1
  • rouge_2
  • rouge_L
  • rouge_be for each sentence , and average of all of them

Model 3

a modification to https://github.com/thomasschmied/Text_Summarization_with_Tensorflow/blob/master/summarizer_amazon_reviews.ipynb


Implementation B (Pointer Generator seq2seq network)

it is a continuation of the amazing work of https://github.com/abisee/pointer-generator https://arxiv.org/abs/1704.04368 this implementation uses the concept of having a pointer generator network to diminish some problems that appears with the normal seq2seq network

Model_4_generator_.ipynb

uses a pointer generator with seq2seq with attention it is built using python2.7

zaksum_eval.ipynb

built by python3 for evaluation

Results/Pointer Generator

  • output from generator (article / reference / summary) used as input to the zaksum_eval.ipynb
  • result from zaksum_eval

i will still work on their implementation of coverage mechanism , so much work is yet to come if God wills it isA


Implementation C (Reinforcement Learning For Sequence to Sequence )

this implementation is a continuation of the amazing work done by https://github.com/yaserkl/RLSeq2Seq https://arxiv.org/abs/1805.09461

@article{keneshloo2018deep,
 title={Deep Reinforcement Learning For Sequence to Sequence Models},
 author={Keneshloo, Yaser and Shi, Tian and Ramakrishnan, Naren and Reddy, Chandan K.},
 journal={arXiv preprint arXiv:1805.09461},
 year={2018}
}

Model 5 RL

this is a library for building multiple approaches using Reinforcement Learning with seq2seq , i have gathered their code to run in a jupiter notebook , and to access google drive built for python 2.7

zaksum_eval.ipynb

built by python3 for evaluation

Results/Reinforcement Learning

  • output from Model 5 RL used as input to the zaksum_eval.ipynb
Toy example of an applied ML pipeline for me to experiment with MLOps tools.

Toy Machine Learning Pipeline Table of Contents About Getting Started ML task description and evaluation procedure Dataset description Repository stru

Shreya Shankar 190 Dec 21, 2022
Auto_code_complete is a auto word-completetion program which allows you to customize it on your needs

auto_code_complete is a auto word-completetion program which allows you to customize it on your needs. the model for this program is one of the deep-learning NLP(Natural Language Process) model struc

RUO 2 Feb 22, 2022
Continuously update some NLP practice based on different tasks.

NLP_practice We will continuously update some NLP practice based on different tasks. prerequisites Software pytorch = 1.10 torchtext = 0.11.0 sklear

0 Jan 05, 2022
Code for "Parallel Instance Query Network for Named Entity Recognition", accepted at ACL 2022.

README Code for Two-stage Identifier: "Parallel Instance Query Network for Named Entity Recognition", accepted at ACL 2022. For details of the model a

Yongliang Shen 45 Nov 29, 2022
State of the art faster Natural Language Processing in Tensorflow 2.0 .

tf-transformers: faster and easier state-of-the-art NLP in TensorFlow 2.0 ****************************************************************************

74 Dec 05, 2022
Summarization module based on KoBART

KoBART-summarization Install KoBART pip install git+https://github.com/SKT-AI/KoBART#egg=kobart Requirements pytorch==1.7.0 transformers==4.0.0 pytor

seujung hwan, Jung 148 Dec 28, 2022
NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles

NewsMTSC: (Multi-)Target-dependent Sentiment Classification in News Articles NewsMTSC is a dataset for target-dependent sentiment classification (TSC)

Felix Hamborg 79 Dec 30, 2022
A high-level Python library for Quantum Natural Language Processing

lambeq About lambeq is a toolkit for quantum natural language processing (QNLP). Documentation: https://cqcl.github.io/lambeq/ Getting started Prerequ

Cambridge Quantum 315 Jan 01, 2023
A Chinese to English Neural Model Translation Project

ZH-EN NMT Chinese to English Neural Machine Translation This project is inspired by Stanford's CS224N NMT Project Dataset used in this project: News C

Zhenbang Feng 29 Nov 26, 2022
Binaural Speech Synthesis

Binaural Speech Synthesis This repository contains code to train a mono-to-binaural neural sound renderer. If you use this code or the provided datase

Facebook Research 135 Dec 18, 2022
👄 The most accurate natural language detection library for Python, suitable for long and short text alike

1. What does this library do? Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a prepr

Peter M. Stahl 334 Dec 30, 2022
The projects lets you extract glossary words and their definitions from a given piece of text automatically using NLP techniques

Unsupervised technique to Glossary and Definition Extraction Code Files GPT2-DefinitionModel.ipynb - GPT-2 model for definition generation. Data_Gener

Prakhar Mishra 28 May 25, 2021
Code for Emergent Translation in Multi-Agent Communication

Emergent Translation in Multi-Agent Communication PyTorch implementation of the models described in the paper Emergent Translation in Multi-Agent Comm

Facebook Research 75 Jul 15, 2022
A PyTorch implementation of VIOLET

VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling A PyTorch implementation of VIOLET Overview VIOLET is an implementati

Tsu-Jui Fu 119 Dec 30, 2022
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

CRNN paper:An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition 1. create your ow

Tsukinousag1 3 Apr 02, 2022
原神抽卡记录数据集-Genshin Impact gacha data

提要 持续收集原神抽卡记录中 可以使用抽卡记录导出工具导出抽卡记录的json,将json文件发送至[email protected],我会在清除个人信息后

117 Dec 27, 2022
Pangu-Alpha for Transformers

Pangu-Alpha for Transformers Usage Download MindSpore FP32 weights for GPU from here to data/Pangu-alpha_2.6B.ckpt Activate MindSpore environment and

One 5 Oct 01, 2022
Topic Inference with Zeroshot models

zeroshot_topics Table of Contents Installation Usage License Installation zeroshot_topics is distributed on PyPI as a universal wheel and is available

Rita Anjana 55 Nov 28, 2022
A PyTorch implementation of the Transformer model in "Attention is All You Need".

Attention is all you need: A Pytorch Implementation This is a PyTorch implementation of the Transformer model in "Attention is All You Need" (Ashish V

Yu-Hsiang Huang 7.1k Jan 05, 2023
Code to reprudece NeurIPS paper: Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

Accelerated Sparse Neural Training: A Provable and Efficient Method to FindN:M Transposable Masks Recently, researchers proposed pruning deep neural n

itay hubara 4 Feb 23, 2022