Relevance Vector Machine implementation using the scikit-learn API.

Last update: Nov 18, 2022

scikit-rvm

scikit-rvm is a Python module implementing the Relevance Vector Machine (RVM) machine learning technique using the scikit-learn API.

Quickstart

With NumPy, SciPy and scikit-learn available in your environment, install with:

pip install https://github.com/JamesRitchie/scikit-rvm/archive/master.zip

Regression is done with the RVR class:

>>> from skrvm import RVR
>>> X = [[0, 0], [2, 2]]
>>> y = [0.5, 2.5]
>>> clf = RVR(kernel='linear')
>>> clf.fit(X, y)
RVR(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='linear', n_iter=3000,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.49995187])

Classification is done with the RVC class:

>>> from skrvm import RVC
>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> clf = RVC()
>>> clf.fit(iris.data, iris.target)
RVC(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='rbf', n_iter=3000, n_iter_posterior=50,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.score(iris.data, iris.target)
0.97999999999999998

Theory

The RVM is a sparse Bayesian analogue to the Support Vector Machine, with a number of advantages:

It provides probabilistic predictions, as opposed to the SVM's point estimates.
It typically yields a sparser solution than the SVM, whose number of support vectors tends to grow linearly with the size of the training set.
It does not need a complexity parameter to be selected in order to avoid overfitting.

However, it is more expensive to train than the SVM, although prediction is typically faster and no cross-validation runs are needed to select a complexity parameter.
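
To make the first advantage concrete for regression: in the formulation given in Section 7.2 of Bishop's Pattern Recognition and Machine Learning (the notation below follows that text and is not defined in this README), the prediction for a new input x is a Gaussian distribution rather than a single value:

```latex
p(t \mid \mathbf{x}) = \mathcal{N}\!\left(t \,\middle|\, \mathbf{m}^{\mathsf{T}} \boldsymbol{\phi}(\mathbf{x}),\; \sigma^{2}(\mathbf{x})\right),
\qquad
\sigma^{2}(\mathbf{x}) = \beta^{-1} + \boldsymbol{\phi}(\mathbf{x})^{\mathsf{T}} \boldsymbol{\Sigma}\, \boldsymbol{\phi}(\mathbf{x})
```

Here m and Σ are the posterior mean and covariance of the weights on the retained basis functions, φ(x) is the vector of kernel evaluations between x and the relevance vectors, and β is the estimated noise precision. The second term of the variance is what a point-estimate model does not provide.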

The RVM's original creator, Mike Tipping, provides a selection of papers offering detailed insight into the formulation of the RVM (and sparse Bayesian learning in general) on a dedicated page, along with a MATLAB implementation.

Most of this implementation was written working from Section 7.2 of Christopher M. Bishop's Pattern Recognition and Machine Learning.
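
For readers who want to see the shape of the algorithm in that section, the following is a minimal NumPy sketch of the iterative re-estimation loop for regression. It is an illustration only, not the code in scikit-rvm: the names fit_sparse_bayes and predict_sparse_bayes and the returned dictionary are hypothetical, the starting values and thresholds are borrowed from the defaults visible in the RVR repr in the Quickstart, and n_iter=500 is an arbitrary choice.

```python
import numpy as np


def fit_sparse_bayes(phi, t, n_iter=500, alpha_max=1e9, tol=1e-3):
    """Sparse Bayesian linear regression via iterative re-estimation.

    phi is the (n_samples, n_basis) design matrix. For a kernel RVM it would
    be the kernel matrix, optionally with a bias column appended.
    """
    phi = np.asarray(phi, dtype=float)
    t = np.asarray(t, dtype=float)
    n_samples, n_basis = phi.shape
    keep = np.arange(n_basis)        # indices of basis functions still in the model
    alpha = np.full(n_basis, 1e-6)   # one prior precision per weight
    beta = 1e-6                      # noise precision

    def posterior(phi_k, alpha, beta):
        # Gaussian posterior over the weights for fixed alpha and beta.
        sigma = np.linalg.inv(np.diag(alpha) + beta * phi_k.T @ phi_k)
        mean = beta * sigma @ phi_k.T @ t
        return mean, sigma

    for _ in range(n_iter):
        phi_k = phi[:, keep]
        mean, sigma = posterior(phi_k, alpha, beta)

        # gamma_i measures how well weight i is determined by the data.
        gamma = 1.0 - alpha * np.diag(sigma)
        with np.errstate(divide="ignore", invalid="ignore"):
            alpha_new = gamma / mean ** 2
        resid = t - phi_k @ mean
        beta = (n_samples - gamma.sum()) / (resid @ resid)

        # Prune basis functions whose precision has effectively diverged.
        mask = alpha_new < alpha_max
        if not mask.any():
            break  # assumption of this sketch: never prune every basis function
        delta = np.max(np.abs(alpha_new[mask] - alpha[mask]))
        keep, alpha = keep[mask], alpha_new[mask]
        if delta < tol:
            break

    # Recompute the posterior for the final set of basis functions.
    mean, sigma = posterior(phi[:, keep], alpha, beta)
    return {"keep": keep, "mean": mean, "sigma": sigma, "beta": beta}


def predict_sparse_bayes(model, phi_new):
    """Return the predictive mean and variance for rows of phi_new."""
    phi_k = np.asarray(phi_new, dtype=float)[:, model["keep"]]
    mu = phi_k @ model["mean"]
    var = 1.0 / model["beta"] + np.sum((phi_k @ model["sigma"]) * phi_k, axis=1)
    return mu, var
```

One way to try the sketch is to generate targets from a design matrix in which only a few columns carry signal, add noise, and check that those columns appear in model["keep"] while some of the others have been pruned. The sketch inverts a dense matrix on every iteration and makes no attempt to handle ill-conditioning.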

Future Improvements

Implement the fast Sequential Sparse Bayesian Learning Algorithm outlined in Section 7.2.3 of Pattern Recognition and Machine Learning.
Handle ill-conditioning errors more gracefully.
Implement more kernel choices.
Create more detailed examples with IPython notebooks.

Owner

James Ritchie