Python library for parsing resumes using natural language processing and machine learning

Last update: Jul 29, 2021

Overview

CVParser

Python library for parsing resumes using natural language processing and machine learning.

Setup

Installation on Linux and Mac OS

Clone or fork the repository.

Then create a virtualenv (named myvenv in this example), activate it, and install the dependencies:

$ virtualenv --python=python3 myvenv

$ source myvenv/bin/activate

(myvenv) $ pip install -r requirements.txt

Usage

from cvparser.parser import CVParser

CVParser.download_nlk_data()


# Supported file types: pdf, doc, docx, png, jpeg.
parser = CVParser(file_path="path/to/file.pdf")
parser.parse()
print(parser.json())
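
To parse a whole folder of resumes, the same three calls can be wrapped in a loop. The following is a minimal sketch, not part of the library: `parse_folder`, `find_resumes`, and `SUPPORTED_SUFFIXES` are hypothetical names, the suffix list is taken from the placeholder path above, and the parser class is passed in as an argument so the helper assumes only the `CVParser(file_path=...)`, `parse()`, and `json()` calls already shown.

```python
from pathlib import Path

# Assumption: these suffixes mirror the file types named in the usage example.
SUPPORTED_SUFFIXES = {".pdf", ".doc", ".docx", ".png", ".jpeg"}


def find_resumes(folder):
    """Return the supported files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


def parse_folder(folder, parser_cls):
    """Parse every supported file in `folder` and map file name to parser output.

    `parser_cls` is expected to behave like CVParser: it is constructed with
    file_path=..., then parse() is called, then json() returns the result.
    """
    results = {}
    for path in find_resumes(folder):
        parser = parser_cls(file_path=str(path))
        parser.parse()
        results[path.name] = parser.json()
    return results
```

With the library installed, this would be called as `parse_folder("path/to/resumes", CVParser)` after importing `CVParser` as above.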

Re-training the Model

cd into the train folder.
Delete the model folder and the train.json file.
Copy your new training data into the train folder. The training data must be in JSON format and can be generated with the DataTurks data annotation tool. The file containing the training data must be named train.json.
Start re-training the model by executing the Python script manual_training.py in the train folder.
Test your new model by following the steps in the Usage section.
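
The re-training steps above can also be scripted. This is a rough sketch under stated assumptions rather than a tool shipped with the repository: `prepare_training` and `run_training` are hypothetical helper names, and only the `train` folder layout (`model`, `train.json`, `manual_training.py`) comes from the steps above.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def prepare_training(train_dir, new_data):
    """Reset the train folder and install `new_data` as train.json."""
    train_dir = Path(train_dir)
    new_data = Path(new_data)
    target = train_dir / "train.json"
    if new_data.resolve() == target.resolve():
        # Deleting the old train.json would also delete the new data.
        raise ValueError("new_data must not be the existing train.json")

    # Delete the old model folder and the old train.json, if present.
    shutil.rmtree(train_dir / "model", ignore_errors=True)
    target.unlink(missing_ok=True)  # missing_ok needs Python 3.8+

    # Copy the new training data in under the required file name.
    shutil.copyfile(new_data, target)
    return target


def run_training(train_dir):
    """Run manual_training.py from inside the train folder."""
    subprocess.run(
        [sys.executable, "manual_training.py"], cwd=train_dir, check=True
    )
```

A typical call would be `prepare_training("train", "my_export.json")` followed by `run_training("train")`, where `my_export.json` is a placeholder for your annotated data file.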

Owner

nafiu
