practNLPTools-lite

This project is a fork of biplab-iitb

Warning

The CLI is intended only as an example; do not use it for long-running jobs.

The very old code is available in devbranch, and the prior stable version in oldVersion.

- This build might take you to practNLPTools, which is the testing ground for this repository, so don't worry.

Practical Natural Language Processing Tools for Humans. practNLPTools is a pythonic library over SENNA and Stanford Dependency Extractor.

name	status
PyPi	pypi status
travis	travis status
Documentation	doc status
dependency	dep status
blocker Pyupbot	blocker status
FOSSA	FOSSA status

Documentation: docs1 and docs2

Note

From version 0.3.0 onward, pntl can store results in a database for later use, provided the dependency below is installed.

pip install git+https://github.com/jawahar273/snowbase.git

QuickStart

Downloading Stanford Parser JAR

To download the stanford-parser from GitHub automatically and place it inside the install directory:

pntl -I true
# downloads the required files from GitHub.

Running Predefined Example Sentences

To run the predefined examples in batch mode (which processes a list of more than one example):

pntl -SE home/user/senna -B true

Example

Batch mode processes a list of sentences.

# Example structure of the predefined
# sentences in the code.

sentences = [
    "This is line 1",
    "This is line 2",
]

To run the predefined example in non-batch mode:

pntl -SE home/user/senna

Running a user-given sentence

To run a user-given sentence, use -S:

pntl -SE home/user/senna -S 'I am gonna make him an offer he can not refuse.'
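The three invocations above differ only in which flags are passed. As a minimal sketch, the helper below assembles those command lines from Python. The function name `build_pntl_command` is hypothetical and not part of pntl; only the `-SE`, `-S` and `-B` flags shown in this README are used, and whether `-S` and `-B` may be combined is not stated here.

```python
import shlex
from typing import List, Optional


def build_pntl_command(senna_dir: str,
                       sentence: Optional[str] = None,
                       batch: bool = False) -> List[str]:
    """Assemble a pntl argument list from the flags shown in this README.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    cmd = ["pntl", "-SE", senna_dir]
    if sentence is not None:
        # -S passes a single user-given sentence.
        cmd += ["-S", sentence]
    if batch:
        # -B true selects the predefined batch examples.
        cmd += ["-B", "true"]
    return cmd


cmd = build_pntl_command(
    "home/user/senna",
    sentence="I am gonna make him an offer he can not refuse.",
)
# shlex.join (Python 3.8+) quotes the sentence so it can be pasted into a shell.
print(shlex.join(cmd))

# To actually run it (requires pntl and SENNA to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with sentences that contain apostrophes or other special characters.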

Functionality

Semantic Role Labeling.
Syntactic Parsing.
Part of Speech Tagging (POS Tagging).
Named Entity Recognition (NER).
Dependency Parsing.
Shallow Chunking.
Skip-gram (in-case).
Finds the SENNA path if SENNA is installed on the system.
Places the Stanford parser and depParser files into the install directory.

Future work

tag2file (new)
Creating a depParser for the corresponding OS environment.
Custom input format for the Stanford parser instead of tree format.

Features

Fast: SENNA is written in C, so it is fast.
We use only the dependency extractor component of the Stanford Parser, which takes the syntactic parse from SENNA and applies dependency extraction. There is therefore no need to load the Stanford Parser's parsing models, which takes time.
Easy to use.
Platforms supported: Windows, Linux and Mac.
Automatically finds the Stanford parser JAR if it is present in the install path [pntl].
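The automatic JAR lookup mentioned in the last item could look roughly like the sketch below. This is not pntl's actual implementation: the function name `find_parser_jar` and the file name pattern `stanford-parser*.jar` are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def find_parser_jar(install_dir: str,
                    pattern: str = "stanford-parser*.jar") -> Optional[Path]:
    """Return the first JAR in install_dir matching pattern, or None.

    Hypothetical helper; the default pattern is an assumed file name.
    """
    # Sorting makes the choice deterministic when several JARs match.
    matches = sorted(Path(install_dir).glob(pattern))
    return matches[0] if matches else None


jar = find_parser_jar(".")
print(jar if jar is not None else "no Stanford parser JAR found")
```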

Note

The SENNA pipeline has a fixed maximum sentence size that it can read. By default it is 1024 tokens per sentence. If you have larger sentences, consider changing the MAX_SENTENCE_SIZE value in SENNA_main.c and rebuilding the binary for your system. Otherwise, misalignment errors could be introduced.
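One way to avoid hitting this limit is to screen sentences before they reach SENNA. The sketch below is an illustration only: the function name `split_by_token_limit` is hypothetical, and tokens are approximated by whitespace splitting, which may not match SENNA's own tokenizer. It is deliberately conservative and treats a sentence with exactly 1024 whitespace tokens as too long.

```python
MAX_SENTENCE_SIZE = 1024  # SENNA's default limit, per the note above


def split_by_token_limit(sentences, limit=MAX_SENTENCE_SIZE):
    """Separate sentences that fit under the limit from those that do not.

    Assumption: whitespace-separated words approximate SENNA tokens.
    """
    ok, too_long = [], []
    for sentence in sentences:
        # Conservative check: only strictly shorter sentences are accepted.
        if len(sentence.split()) < limit:
            ok.append(sentence)
        else:
            too_long.append(sentence)
    return ok, too_long


ok, too_long = split_by_token_limit([
    "This is line 1",
    " ".join(["word"] * 2000),  # artificially long sentence
])
print(len(ok), "sentence(s) fit;", len(too_long), "exceed the limit")
```

Sentences in the second list can then be shortened, split, or processed with a rebuilt SENNA binary.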

Installation

Requires:

A computer with 500 MB of memory, a Java Runtime Environment (preferably 1.7; 1.6 should also work but is untested) and Python installed.
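Before installing, the Java and Python prerequisites can be checked with a short script. This is a hedged sketch using only the standard library; `check_prerequisites` is a hypothetical name, and the script only checks that a `java` executable is on the PATH, not which version it is.

```python
import shutil
import sys


def check_prerequisites():
    """Report the running Python version and whether java is on the PATH."""
    return {
        "python": sys.version.split()[0],
        # shutil.which returns None when no java executable is found.
        "java_on_path": shutil.which("java") is not None,
    }


report = check_prerequisites()
print("Python version:", report["python"])
print("Java found on PATH:", report["java_on_path"])
```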

Linux:

run:
sudo python setup.py install

Windows:

run this command as administrator:
python setup.py install

Benchmark comparison

Timings were taken with the time command on Ubuntu, running testsrl.py from this link alongside tools.py from pntl.

run	metric	pntl	NLTK-senna
first	real	0m1.674s	0m2.484s
first	user	0m1.564s	0m1.868s
first	sys	0m0.228s	0m0.524s
second	real	0m1.245s	0m3.359s
second	user	0m1.560s	0m2.016s
second	sys	0m0.152s	0m1.168s

Note

This benchmark may differ from system to system. The results here were produced on Ubuntu with 4 GB of RAM and an i3 processor.
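To repeat the comparison on another machine without the shell's time command, a similar measurement can be sketched in Python. The helper name `time_command` is hypothetical, and the command timed here is a placeholder rather than the scripts used for the table. The user and sys figures come from `os.times()` child counters, which are only meaningful on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd once and return its real, user and sys times in seconds."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    real = time.perf_counter() - start
    after = os.times()
    return {
        "real": real,
        # CPU time consumed by child processes that have been waited for.
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
    }


# Placeholder command; replace it with the script being benchmarked.
timings = time_command([sys.executable, "-c", "pass"])
print(timings)
```

Running the helper several times and comparing the runs gives a rough view of run-to-run variation like the one visible between the first and second runs in the table.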

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
