Semantic similarity computation with different state-of-the-art metrics

Last update: Jun 22, 2022

Description • Installation • Usage • License

Description

TaxoSS is a Python library for semantic similarity. It implements several state-of-the-art metrics, including Resnik, Jiang-Conrath (JCN), and HSS.

Requirements

Python 3.6 or later
NLTK
NumPy
Pandas

Installation

TaxoSS can be installed through pip (the Python package manager) in the following way:

pip install taxoss

Usage

Semantic similarity functions

Compute the semantic similarity between two words as follows:

from TaxoSS.functions import semantic_similarity
semantic_similarity('brother', 'sister', 'hss')

3.353513521371089

The function semantic_similarity(word1, word2, kind, ic) has these options for the argument kind:

hss -> HSS (default)
wup -> WUP
lcs -> LC
path_sim -> Shortest Path
resnik -> Resnik
jcn -> Jiang-Conrath
lin -> Lin
seco -> Seco

For the argument ic, see the following section.
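To compare the metrics listed above on the same word pair, a small wrapper can loop over the kind values. This is a minimal sketch: compare_metrics is a hypothetical helper, not part of TaxoSS, and it takes the similarity function as a parameter so it can be tried with a stub before TaxoSS and its data are installed.

```python
from typing import Callable, Dict, Iterable

# The `kind` values listed in this README.
KINDS = ("hss", "wup", "lcs", "path_sim", "resnik", "jcn", "lin", "seco")


def compare_metrics(
    word1: str,
    word2: str,
    similarity_fn: Callable[[str, str, str], float],
    kinds: Iterable[str] = KINDS,
) -> Dict[str, float]:
    """Score one word pair with every requested metric.

    Hypothetical helper: `similarity_fn` is expected to accept
    (word1, word2, kind), as semantic_similarity does in the examples above.
    """
    return {kind: similarity_fn(word1, word2, kind) for kind in kinds}


# With TaxoSS installed, the real function can be passed in directly:
#   from TaxoSS.functions import semantic_similarity
#   scores = compare_metrics("brother", "sister", semantic_similarity)
#   for kind, score in scores.items():
#       print(kind, score)
```

The scores are on different scales depending on the metric, so they are best compared by how they rank word pairs rather than by their raw values.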

Information Content

Using a Wikipedia corpus to calculate the Information Content (the default for the argument ic):

from TaxoSS.functions import semantic_similarity
semantic_similarity('cat', 'dog', 'resnik')

6.169410755220327
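To make the role of Information Content concrete, the sketch below applies the textbook definitions of the Resnik, Lin, and Jiang-Conrath measures to a toy taxonomy. The taxonomy and the probabilities are made up for illustration, and the code is not taken from TaxoSS, whose implementation may differ in detail.

```python
import math

# Toy is-a taxonomy (child -> parent). Hypothetical, not WordNet.
PARENT = {
    "cat": "feline",
    "feline": "carnivore",
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "animal",
    "animal": None,
}

# Hypothetical probability of meeting each concept (or one of its
# descendants) in a corpus. A parent is at least as likely as its children.
PROB = {
    "animal": 1.0,
    "carnivore": 0.5,
    "feline": 0.2,
    "canine": 0.25,
    "cat": 0.1,
    "dog": 0.2,
}


def ic(concept):
    """Information Content: IC(c) = -log p(c)."""
    return -math.log(PROB[concept])


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def lowest_common_subsumer(a, b):
    """Most specific concept that subsumes both a and b (tree taxonomy)."""
    seen = set(ancestors(b))
    return next(c for c in ancestors(a) if c in seen)


def resnik(a, b):
    """Resnik similarity: IC of the lowest common subsumer."""
    return ic(lowest_common_subsumer(a, b))


def lin(a, b):
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b))."""
    return 2 * resnik(a, b) / (ic(a) + ic(b))


def jcn_distance(a, b):
    """Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(lcs)."""
    return ic(a) + ic(b) - 2 * resnik(a, b)


print(resnik("cat", "dog"), lin("cat", "dog"), jcn_distance("cat", "dog"))
```

Because all three measures depend on the probabilities, changing the corpus used to estimate them changes the scores, which is why the ic argument matters for these metrics.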

Calculating Information Content from a given corpus:

from TaxoSS.calculate_IC import calculate_IC
from TaxoSS.functions import semantic_similarity

calculate_IC(path_to_corpus, path_to_save_IC_file)
semantic_similarity('cat', 'dog', 'resnik', path_to_save_IC_file)

Here path_to_save_IC_file is a path inside the TaxoSS package directory of your virtual environment, e.g. venv/lib/python3.6/site-packages/TaxoSS/data/prova_IC.csv.
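How a corpus turns into Information Content values can be sketched as follows. This README does not show the internals of calculate_IC or the layout of the file it writes, so the code below only illustrates the general idea under stated assumptions: a toy taxonomy, whitespace tokenization, and a hypothetical function information_content that counts each word for itself and for all of its ancestors.

```python
import math
from collections import Counter

# Toy is-a taxonomy (child -> parent). Hypothetical, not WordNet.
PARENT = {
    "cat": "feline",
    "feline": "carnivore",
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "animal",
    "animal": None,
}


def information_content(tokens, parent):
    """Estimate IC(c) = -log(count(c) / total) from corpus tokens.

    Every occurrence of a word also counts for each of its ancestors,
    so general concepts get high counts and low IC.
    """
    counts = Counter()
    total = 0
    for token in tokens:
        if token not in parent:
            continue  # word is not covered by the taxonomy
        total += 1
        concept = token
        while concept is not None:
            counts[concept] += 1
            concept = parent[concept]
    return {concept: -math.log(n / total) for concept, n in counts.items()}


corpus = "the cat chased the dog and another dog"
for concept, value in information_content(corpus.split(), PARENT).items():
    print(concept, round(value, 3))
```

In this toy corpus "dog" is more frequent than "cat", so it carries less information, and the root concept carries none.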

Benchmark

	HSS (ours)	HSS (ours)	WUP	WUP	LC	LC	Shortest Path	Shortest Path	Resnik	Resnik	Jiang-Conrath	Jiang-Conrath	Lin	Lin	Seco	Seco
	Pearson	Spearman	Pearson	Spearman	Pearson	Spearman	Pearson	Spearman	Pearson	Spearman	Pearson	Spearman	Pearson	Spearman	Pearson	Spearman
MEN	0.41	0.33	0.36	0.33	0.14	0.05	0.07	0.03	0.05	0.03	-0.05	-0.04	0.05	0.04	-0.01	0.03
MC30	0.74	0.69	0.74	0.73	0.33	0.21	0.22	0.3	0.13	0.03	-0.06	-0.01	0.05	0.01	0.13	-0.09
WSS	0.68	0.65	0.58	0.59	0.36	0.23	0.16	0.1	0.02	-0.03	0.04	0.06	0.03	0.06	-0.01	-0.04
Simlex999	0.4	0.38	0.45	0.43	0.26	0.15	0.2	0.16	-0.04	-0.04	0.12	0.14	0.12	0.14	-0.02	-0.08
MT287	0.46	0.31	0.4	0.28	0.26	0.12	0.11	0.11	0.03	0.04	0.18	0.16	0.22	0.17	0	-0.06
MT771	0.44	0.4	0.43	0.49	0.06	0.02	0.1	0.13	0	-0.01	0	0	0	0	-0.05	-0.03
Time per pair (s)	0.0007	0.0007	0.008	0.008	0.0055	0.0055	0.0064	0.0064	0.5586	0.5586	0.551	0.551	0.5866	0.5866	0.0013	0.0013
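The table does not state how the scores were obtained. Assuming they are Pearson and Spearman correlations between each metric's scores and human similarity ratings on the listed datasets, the computation can be sketched in plain Python as below. The ratings and scores in the example are made up and are not benchmark data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up human ratings and metric scores for four word pairs.
human_ratings = [1.0, 2.0, 3.0, 4.0]
metric_scores = [1.0, 4.0, 9.0, 16.0]
print(pearson(human_ratings, metric_scores))
print(spearman(human_ratings, metric_scores))
```

Spearman only looks at the ordering of the pairs, so a metric that ranks pairs like humans do scores well even when its values are on a very different scale.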
