A fast hierarchical dimensionality reduction algorithm.

Last update: Dec 12, 2022

h-NNE: Hierarchical Nearest Neighbor Embedding

h-NNE is a general-purpose dimensionality reduction algorithm in the same family as t-SNE and UMAP. It stands out for its speed, its simplicity, and the fact that it produces a hierarchy of clusterings as part of its projection process. The algorithm is inspired by the FINCH clustering algorithm. For more information on the structure of the algorithm, please see our corresponding paper on arXiv:

M. Saquib Sarfraz*, Marios Koulakis*, Constantin Seibold, Rainer Stiefelhagen. Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction. CVPR 2022.

More details are available in the project documentation.
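
To give a rough feel for where the hierarchy of clusterings comes from, here is a minimal sketch of the first-neighbor idea used by FINCH: every point is linked to its nearest neighbor, connected groups become clusters, and the step is repeated on the cluster means. This is not the h-NNE implementation. The function names `first_neighbor_labels` and `cluster_hierarchy` are made up for illustration, and the brute-force distance computation is only suitable for small inputs.

```python
import numpy as np


def first_neighbor_labels(points: np.ndarray) -> np.ndarray:
    """Cluster points by linking each one to its nearest neighbor.

    Illustrative helper (not part of hnne): the connected components of
    the resulting "first neighbor" graph are returned as labels 0..k-1.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Brute-force pairwise squared distances: O(n^2) memory, sketch only.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    nearest = dists.argmin(axis=1)

    # Union-find over the edges (i, nearest[i]).
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in enumerate(nearest):
        root_i, root_j = find(i), find(int(j))
        if root_i != root_j:
            parent[root_j] = root_i

    roots = np.array([find(i) for i in range(n)])
    # Map root indices to consecutive cluster ids.
    _, labels = np.unique(roots, return_inverse=True)
    return labels


def cluster_hierarchy(points: np.ndarray) -> list:
    """Repeat the first-neighbor step on cluster means until one cluster remains.

    Returns one label array per level, from finest to coarsest.
    """
    points = np.asarray(points, dtype=float)
    labels = np.arange(len(points))  # start with every point on its own
    centers = points
    levels = []
    while len(centers) > 1:
        step = first_neighbor_labels(centers)
        labels = step[labels]  # propagate the merge down to the points
        levels.append(labels)
        n_clusters = int(step.max()) + 1
        centers = np.stack(
            [points[labels == c].mean(axis=0) for c in range(n_clusters)]
        )
    return levels


# Two tight pairs of points that are far apart from each other.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
levels = cluster_hierarchy(points)
print(levels[0].tolist())  # [0, 0, 1, 1]
print(levels[1].tolist())  # [0, 0, 0, 0]
```

Each level at least halves the number of clusters, because every point is joined with at least one other point, so the loop always terminates.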

Installation

The project is available on PyPI. To install it, run:

pip install hnne

How to use h-NNE

The HNNE class implements the common methods of the scikit-learn interface, such as fit_transform and transform.

Simple projection example

import numpy as np
from hnne import HNNE

# 1000 random samples with 256 features each
data = np.random.random(size=(1000, 256))

# Project the data down to 2 dimensions
hnne = HNNE(dim=2)
projection = hnne.fit_transform(data)

Projecting on new points

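hnne = HNNE()
projection = hnne.fit_transform(data)

# new_data must have the same number of features as data (256 in the example above)
new_data = np.random.random(size=(100, 256))
new_data_projection = hnne.transform(new_data)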

Demos

The following demo notebooks are available:

Citation

If you make use of this project in your work, please cite the h-NNE paper:

@inproceedings{hnne,
  title     = {Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction},
  author    = {M. Saquib Sarfraz and Marios Koulakis and Constantin Seibold and Rainer Stiefelhagen},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2022}
}

If you make use of the clustering properties of the algorithm, please also cite:

@inproceedings{finch,
  author    = {M. Saquib Sarfraz and Vivek Sharma and Rainer Stiefelhagen},
  title     = {Efficient Parameter-free Clustering Using First Neighbor Relations},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages     = {8934--8943},
  year      = {2019}
}

Owner

Marios Koulakis
