This is a text summarizing tool written in Python

Overview

Summarize

Written by: Ling Li Ya

User Guide

Some things to note:

  • The application is accessible here.
  • However, because free-tier server resources are limited, the application may crash, so we advise running this project locally.
  • You might not be able to run the abstractive models after reaching the character limit of the Hugging Face Accelerated Inference API. We therefore advise using the notebooks to replicate the results in our documentation.
  • You might not be able to run Pegasus in the notebooks because of the resources it requires, so we advise running the Pegasus model only through the application interface.

To run the project locally, please refer to the guide below.

Setup Tutorial Video (Windows)

SummarizeLocalSetup.mp4

For the detailed steps in writing, refer to the sections below.

1. Downloading the project

Download the .zip file either from Google Classroom or from our GitHub. image

Then unzip the .zip file. You will see the folder summarize-main. image

2. Install prerequisites

You need Python and Node.js installed. Open Command Prompt (cmd) and type the commands below.

To check whether Python is installed:

$ python

You will see this if it is installed. Note that your version might be different.
image

Type exit() to exit the Python shell if it is installed.

To check whether Node.js is installed:

$ node

You will see this if it is installed. Note that your version might be different.
image

Type .exit to exit the Node.js shell if it is installed.

Otherwise, download Python and/or Node.js here. Run the installer and follow its instructions. Verify your installation.
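Once Python is available, the same check can be scripted. The sketch below is not part of the project; the helper name check_tool is hypothetical. It looks each tool up on PATH and asks it for its version, assuming both tools accept a --version flag.

```python
import shutil
import subprocess


def check_tool(name):
    """Return the output of `name --version`, or None if name is not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    for tool in ("python", "node"):
        version = check_tool(tool)
        print(f"{tool}: {version if version else 'not found on PATH'}")
```

A tool reported as not found is either not installed or not on PATH; in both cases, running the installer again is the simplest fix.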

3. Install project Python dependencies

Double click on summarize-main. Single click on the summarize folder, hold down your shift key, and right click on the folder. Select Open PowerShell window here. image

A PowerShell window will pop up. Then right click on the Makefile in the file explorer and open it with Notepad. image

Something like this will pop up: image

These are the commands to install all the project Python dependencies. Simply copy the commands and paste them into the PowerShell window. If you encounter this warning message: image

Simply retype the command with an additional flag: pip install -r requirements.txt --use-feature=in-tree-build. Then let it run. image
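If several Python versions are installed, pip may install into a different interpreter than the one you run the project with. One way to avoid this, shown here only as a sketch, is to call pip through the current interpreter. The helper names are hypothetical and this is not the content of the project's Makefile; only the requirements.txt file name comes from the step above.

```python
import subprocess
import sys


def pip_install_command(requirements="requirements.txt", extra_flags=()):
    """Build a pip command that runs under the interpreter executing this script."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements, *extra_flags]


def install_requirements(requirements="requirements.txt", extra_flags=()):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.run(pip_install_command(requirements, extra_flags), check=True)


if __name__ == "__main__":
    # Only print the command here; call install_requirements() to actually run it.
    print(" ".join(pip_install_command()))
```

Any extra flag, such as the one from the warning message, can be passed through extra_flags.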

4. Install our summarize library

We have made our application into a Python library and you need to install it with the command below: image

5. Run the backend server

Be sure that you select the command under the server-dev instead of server-prod. image

6. Prepare the frontend client

Open another PowerShell window, this time by holding Shift and right-clicking the server folder.

After you have installed Node.js, run the following command to install pnpm.

$ npm install -g pnpm

After installing pnpm, type cd client to go into the client folder in the new PowerShell window.

Then return to your Notepad and run the command pnpm i in the PowerShell window. The installation takes about 10 to 20 seconds. image

7. Run the frontend client

Run this command in the PowerShell window to launch the application on localhost:3333 image

You will see this: image
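If the page does not load, it helps to know whether anything is listening on the port at all. The sketch below is not part of the project; the helper name is_port_open is hypothetical. It simply tries to open a TCP connection to localhost:3333.

```python
import socket


def is_port_open(host="localhost", port=3333, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable"
    print(f"localhost:3333 is {state}")
```

A "not reachable" result usually means the client is not running yet or was started on a different port.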

8. Adding API token

To use BART, T5 and Pegasus, you need an API token. We will send you an API token by private message because it is not supposed to be public.


At the summarize-main project root, right-click on an empty space and create a new text file named .env. image

Click on yes for this warning: image

Open the .env file in Notepad. Type in HUGGING_FACE_API_TOKEN_={your_api_token}. It will look something like this: image

Save the file then refresh the Summarize web application page. image

You will be able to use the models now.
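To illustrate what the token is for, here is a minimal sketch of reading a .env file and attaching the token to a request for the Hugging Face Inference API. It is not the project's own code: the helper names, the endpoint URL pattern and the model id facebook/bart-large-cnn are assumptions, and the server may well load the file differently (for example with a dotenv library).

```python
import json
import urllib.request
from pathlib import Path


def read_env(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_request(token, text, model="facebook/bart-large-cnn"):
    """Build (but do not send) a summarization request. URL and model are assumed."""
    url = f"https://api-inference.huggingface.co/models/{model}"
    body = json.dumps({"inputs": text}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To try it, look up the key you wrote in the .env file in the dictionary returned by read_env, pass its value to build_request, and send the result with urllib.request.urlopen. Each call counts towards the character limit mentioned at the top of this guide.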

Code folders

  • summarize - The Python library containing all the algorithms
  • server - The backend server using FastAPI
  • client - The frontend app using Vue3

Misc folders

  • notebooks - A folder to keep all our Jupyter notebooks, used as a testing ground
  • data - A folder to keep all datasets needed to train or test the algorithms
  • docs - A folder to keep our documentation files

Owner

Marcus Lee