Galileo library for large scale graph training by JD

Last update: Nov 29, 2022

Overview

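In recent years, graph computing has delivered significant gains in search, recommendation, and risk-control scenarios. It also faces hard problems, such as training on ultra-large-scale heterogeneous graphs and integrating with existing deep learning frameworks like TensorFlow and PyTorch.

Galileo (伽利略) is a graph deep learning framework that offers ultra-large scale, ease of use, extensibility, high performance, and dual-backend support. It aims to solve the difficulty of deploying ultra-large-scale graph algorithms in industrial scenarios, and provides training, evaluation, and prediction for models such as graph neural networks and graph embeddings.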

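Architecture

Galileo overall architecture

The Galileo graph deep learning framework follows a layered design with three main layers: the distributed graph engine, the multi-backend graph framework, and the graph models.

Distributed high-performance graph engine: represents graph data with a compact, efficient in-memory structure, so it can support ultra-large-scale heterogeneous graphs with very low memory usage; uses a ZeroCopy mechanism along the entire call path to deliver high-performance graph queries and graph sampling.
Multi-backend graph framework: supports both TensorFlow and PyTorch backends, configuration-driven single-machine and distributed training, and Keras and Estimator training; provides unified graph query and graph sampling interfaces and is easy to extend.
Graph models: decouple data from models to improve code reuse; use a component-based design to lower the difficulty of implementing models; support writing graph models in the Message Passing paradigm as well as direct Python access to the training backend interfaces, making them easy to use and highly flexible.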
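
The ideas named in the three layers above (compact graph storage, neighbor sampling, and message passing) can be illustrated with a small, framework-independent sketch. This is not Galileo's API or its internal memory layout: the function names `build_csr`, `sample_neighbors`, and `mean_aggregate`, the toy graph, and the choice of a CSR layout with mean aggregation are all assumptions made for illustration, using only the Python standard library.

```python
import random

def build_csr(num_nodes, edges):
    """Pack out-neighbors into two flat lists (CSR): offsets and targets."""
    degree = [0] * num_nodes
    for src, _ in edges:
        degree[src] += 1
    offsets = [0] * (num_nodes + 1)
    for v in range(num_nodes):
        offsets[v + 1] = offsets[v] + degree[v]
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot for each source node
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets

def sample_neighbors(offsets, targets, node, k, rng):
    """Sample k out-neighbors with replacement; empty list if the node has none."""
    nbrs = targets[offsets[node]:offsets[node + 1]]
    if not nbrs:
        return []
    return [rng.choice(nbrs) for _ in range(k)]

def mean_aggregate(features, node, sampled):
    """One message-passing step: average the node's own feature vector
    with the feature vectors of its sampled neighbors."""
    msgs = [features[node]] + [features[n] for n in sampled]
    dim = len(features[node])
    return [sum(m[i] for m in msgs) / len(msgs) for i in range(dim)]

# Hypothetical toy graph and features, for illustration only.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

offsets, targets = build_csr(4, edges)
rng = random.Random(0)
sampled = sample_neighbors(offsets, targets, 0, 2, rng)
updated = mean_aggregate(features, 0, sampled)
print(offsets, targets, sampled, updated)
```

A flat offsets/targets layout avoids per-node Python objects, which is one common way to keep graph storage compact. A real engine would additionally handle node and edge types, sharding across machines, and batched sampling.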

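Getting Started

We provide pip and conda packages for Galileo. We recommend using Galileo in the docker image, which spares you the trouble of installing dependencies. You can also build and install Galileo from source.

Read the introductory tutorial to start using Galileo.

If the graph models currently implemented in Galileo do not meet your needs, you can write custom graph models.

To use your own graph data, see the graph data preparation guide.

If your graph data is large, see the distributed training guide.

To learn more about Galileo's interfaces, see the API documentation.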

Contact Us

You are welcome to contact us through issues and the mailing list ([email protected]).

LICENSE

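The Galileo graph deep learning framework is licensed under the Apache License 2.0.

Acknowledgments

The Galileo graph deep learning framework is produced by the Technology and Data Center of JD Retail, JD Group. We thank the JD Retail algorithm channel for its strong support, and we thank sibling departments such as the Business Growth Division and the Search and Recommendation Platform Department for their valuable feedback during development and use.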

Comments

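How to use galileo_convertor

Question: I installed the Galileo CPU docker image, and running python3 puts me inside the environment. The demo with the cora dataset also runs. However, we need to run experiments on a large dataset. According to the instructions in the GitHub project, we have to convert the data first, which requires the conversion tool galileo_convertor, but I don't know how to run it. Any help would be greatly appreciated!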

opened by jieheroli 6

Releases(v1.0.0)

v1.0.0(Sep 9, 2021)

The first Galileo release, 1.0.0. See the Galileo documentation for more information.

Owner

JD Galileo Team
