Microsoft Distributed Machine Learning Toolkit

Overview

DMTK

Distributed Machine Learning Toolkit: https://www.dmtk.io. Please open issues in the relevant project below. For technical support, email [email protected].

DMTK includes the following projects:

  • DMTK framework (Multiverso): The parameter server framework for distributed machine learning.
  • LightLDA: A scalable, fast, and lightweight system for large-scale topic modeling.
  • LightGBM: A fast, distributed, high-performance gradient boosting (GBDT, GBRT, GBM, or MART) framework based on decision tree algorithms, used for ranking, classification, and many other machine learning tasks.
  • Distributed word embedding: A distributed algorithm for word embedding, implemented on Multiverso.
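
To make the parameter server idea concrete, here is a minimal, single-process sketch in pure Python. It is not the Multiverso API: the `ParameterServer` class, its `pull`/`push` methods, and the `train` helper are hypothetical names chosen for illustration, and threads stand in for distributed workers. Each worker pulls the current weights, computes a gradient on its own data shard, and pushes the update back without waiting for the other workers.

```python
import threading
from typing import List, Tuple


class ParameterServer:
    """Toy in-process parameter server (illustrative, not Multiverso)."""

    def __init__(self, dim: int, lr: float) -> None:
        self._weights = [0.0] * dim
        self._lr = lr
        self._lock = threading.Lock()

    def pull(self) -> List[float]:
        # Workers fetch a snapshot of the current shared weights.
        with self._lock:
            return list(self._weights)

    def push(self, grad: List[float]) -> None:
        # Workers send a gradient; the server applies one SGD step.
        with self._lock:
            for i, g in enumerate(grad):
                self._weights[i] -= self._lr * g


def worker(server: ParameterServer,
           shard: List[Tuple[float, float]],
           epochs: int) -> None:
    # Fit y = w0 * x + w1 by least squares on this worker's shard only.
    for _ in range(epochs):
        for x, y in shard:
            w = server.pull()
            err = w[0] * x + w[1] - y
            server.push([err * x, err])


def train(data: List[Tuple[float, float]],
          num_workers: int = 2,
          epochs: int = 200) -> List[float]:
    server = ParameterServer(dim=2, lr=0.05)
    # Round-robin split of the data across workers.
    shards = [data[i::num_workers] for i in range(num_workers)]
    threads = [
        threading.Thread(target=worker, args=(server, shard, epochs))
        for shard in shards
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull()


if __name__ == "__main__":
    # Noise-free samples of y = 2x + 1 for x in [-1, 1].
    samples = [(x / 4, 2 * (x / 4) + 1) for x in range(-4, 5)]
    print(train(samples))
```

On this noise-free data the learned weights should end up close to the slope 2 and intercept 1. A real parameter server would additionally shard the weights across server nodes and move the pull and push calls over the network.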

Updates

2017-02-04

  • A tutorial on the latest updates in distributed machine learning was presented at AAAI 2017. You can download the slides here.

2016-11-21

  • Multiverso has been officially used in Microsoft CNTK to power its ASGD parallel training.

2016-10-17

  • LightGBM has been released. It is a fast, distributed, high-performance gradient boosting (GBDT, GBRT, GBM, or MART) framework based on decision tree algorithms, used for ranking, classification, and many other machine learning tasks.

2016-09-12

  • A talk on the latest updates to DMTK was presented at GTC China. We also described the latest research work from our team, including lightRNN (to appear at NIPS 2016) and DC-ASGD.

2016-07-05

  • Multiverso has been upgraded to a new API.
  • Support for deep learning frameworks (Torch/Theano) has been added.
  • Python/Lua bindings are now supported, so you can use Multiverso with Python or Lua.

Microsoft Open Source Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Issues
  • Support for Docker

    Please add a Dockerfile to support installation with Docker.

    opened by loretoparisi 4
  • ML functions

    Hi, Thanks for the great open source library. I was wondering if there are popular machine learning methods in DMTK that I can try such as random forest, SVM, KNN, PLS, NN, Tree Bagging, etc.?

    opened by jaelim 3
  • The code of lightRNN?

    Hi, when are you going to publish the lightRNN code? And do you plan to release it with the other framework?

    opened by ayumiymk 2
  • Update README.md

    Made explicit https calls

    opened by alanyee 1
  • FIX: tookit -> toolkit

    Page title also should be corrected.

    opened by akadig 0
  • lightlda zeromq msg type error

    Hi, I installed lightlda on CentOS 6. When I run ./example/nytimes.sh, it reports:

        [INFO] [2015-11-26 18:26:59] INFO: block = 0, the number of slice = 1
        [INFO] [2015-11-26 18:26:59] Server 0 starts: num_workers=1 endpoint=inproc://server
        [INFO] [2015-11-26 18:26:59] Server 0: Worker registratrion completed: workers=1 trainers=1 servers=1
        [INFO] [2015-11-26 18:26:59] Rank 0/1: Multiverso initialized successfully.
        [INFO] [2015-11-26 18:27:00] Rank 0/1: Begin of configuration and initialization.
        Assertion failed: check () (src/msg.cpp:248)
        Aborted

    I checked, and the error seems to come from multiverso/third_party/zeromq-4.1.3/src/msg.cpp. Hoping for help.

    opened by psy2013GitHub 0
  • Revert "Update README.md"

    Reverts Microsoft/DMTK#6

    opened by ghost 0
  • dmtk.io HTTPS not forced, certificate expired

    https://www.dmtk.io/

    Websites prove their identity via certificates, which are valid for a set time period. The certificate for www.dmtk.io expired on 10/19/2017.
     
    Error code: SEC_ERROR_EXPIRED_CERTIFICATE
    

    http://www.dmtk.io/ does not redirect to HTTPS or enforce it.

    opened by wesinator 0
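
The expiry check described in the last issue can be reproduced with Python's standard library. The sketch below only compares a certificate's `notAfter` string against a point in time. The helper name `cert_is_expired` and the exact timestamp in the usage example are assumptions for illustration, since the issue gives only the date 10/19/2017.

```python
import ssl
import time
from typing import Optional


def cert_is_expired(not_after: str, now: Optional[float] = None) -> bool:
    """Return True if a certificate's notAfter time is in the past.

    `not_after` uses the format found in the dict returned by
    ssl.SSLSocket.getpeercert(), for example "Oct 19 12:00:00 2017 GMT".
    `now` is seconds since the epoch and defaults to the current time.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return current >= expires_at


if __name__ == "__main__":
    # Hypothetical notAfter value; only the date comes from the issue.
    print(cert_is_expired("Oct 19 12:00:00 2017 GMT"))
```

Note that a client using default verification settings rejects an expired certificate during the TLS handshake, so the `notAfter` value usually has to be obtained some other way, such as from the browser's certificate viewer.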
Owner
Microsoft
Open source projects and samples from Microsoft
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』 PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)

Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open

null 17.8k Mar 11, 2022
Distributed machine learning platform

Veles Distributed platform for rapid Deep learning application development Consists of: Platform - https://github.com/Samsung/veles Znicz Plugin - Neu

Samsung 897 Mar 6, 2022
Framework and Library for Distributed Online Machine Learning

Jubatus The Jubatus library is an online machine learning framework which runs in distributed environment. See http://jubat.us/ for details. Quick Sta

Jubatus 703 Jan 4, 2022
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

Ray provides a simple, universal API for building distributed applications. Ray is packaged with the following libraries for accelerating machine lear

null 19.4k Mar 9, 2022
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Horovod Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. The goal of Horovod is to make dis

Horovod 12.2k Mar 4, 2022
Ray provides a simple, universal API for building distributed applications.

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

null 19.4k Mar 4, 2022
Distributed Synchronization for Python

Distributed Synchronization for Python Tutti is a nearly drop-in replacement for python's built-in synchronization primitives that lets you fearlessly

Hamilton Kibbe 3 Oct 22, 2021
A lightweight python module for building event driven distributed systems

Eventify A lightweight python module for building event driven distributed systems. Installation pip install eventify Problem Developers need a easy a

Eventify 14 Jul 10, 2021
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.1k Mar 9, 2022
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17k Feb 11, 2021
Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft 298 Feb 26, 2022
XGBoost-Ray is a distributed backend for XGBoost, built on top of distributed computing framework Ray.

XGBoost-Ray is a distributed backend for XGBoost, built on top of distributed computing framework Ray.

null 76 Feb 11, 2022
Distributed-systems-algos - Distributed Systems Algorithms For Python

Distributed Systems Algorithms ISIS algorithm In an asynchronous system that kee

Tony Joo 3 Jan 8, 2022
Microsoft Machine Learning for Apache Spark

Microsoft Machine Learning for Apache Spark MMLSpark is an ecosystem of tools aimed towards expanding the distributed computing framework Apache Spark

Microsoft Azure 3.2k Mar 4, 2022
Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about Machine Learning

Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about Machine Learning

Microsoft 29.3k Mar 6, 2022
Causal Inference and Machine Learning in Practice with EconML and CausalML: Industrial Use Cases at Microsoft, TripAdvisor, Uber

Causal Inference and Machine Learning in Practice with EconML and CausalML: Industrial Use Cases at Microsoft, TripAdvisor, Uber

EconML/CausalML KDD 2021 Tutorial 78 Mar 4, 2022
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』 PaddlePaddle core framework: high-performance single-machine and distributed training, and cross-platform deployment, for deep learning and machine learning)

Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open

null 17.7k Mar 3, 2022
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 13.5k Mar 2, 2022
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 13.5k Mar 11, 2022
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Microsoft 13.5k Mar 8, 2022
Uber Open Source 1.4k Mar 1, 2022
🎛 Distributed machine learning made simple.

🎛 lazycluster Distributed machine learning made simple. Use your preferred distributed ML framework like a lazy engineer. Getting Started • Highlight

Machine Learning Tooling 43 Nov 23, 2021
Management of exclusive GPU access for distributed machine learning workloads

TensorHive is an open source tool for managing computing resources used by multiple users across distributed hosts. It focuses on granting

Paweł Rościszewski 117 Jan 31, 2022
Bark Toolkit is a toolkit which provides denial-of-service attacks, SMS attacks, and more.

Bark Toolkit About Bark Toolkit Bark Toolkit is a set of tools that provides denial of service attacks. Bark Toolkit includes SMS attack tool, HTTP

null 4 Feb 20, 2022
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Distributed (Deep) Machine Learning Community 22.3k Mar 10, 2022