Paddle Graph Learning (PGL) is an efficient and flexible graph learning framework based on PaddlePaddle

Overview

The logo of Paddle Graph Learning (PGL)


DOC | Quick Start | 中文

Breaking News!!

🔥 🔥 🔥 OGB-LSC KDD CUP 2021 winners announced!! (2021.06.17)

We are super excited to announce that our PGL team won TWO FIRST places and ONE SECOND place across the three tracks of OGB-LSC KDD CUP 2021. Leaderboards can be found here.

  • First place in MAG240M-LSC track: Code and Technical Report can be found here.

  • First place in WikiKG90M-LSC track: Code and Technical Report can be found here.

  • Second place in PCQM4M-LSC track: Code and Technical Report can be found here.

Two papers using PGL have been accepted: (2021.06.17)

  • Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification, to appear in IJCAI2021.
  • HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps, to appear in KDD2021.

PGL Distributed Graph Engine API released!!

  • Our Distributed Graph Engine API has been released, along with a tutorial showing how to launch a graph engine and a demo for training models with it.

PGL v2.1 2021.02.02

  • We now support the dynamic graph (dygraph) mode of PaddlePaddle 2.0, and have released PGL v2.1.
  • You can find the stable static-graph version of PGL in the branch "static_stable".

PGL v1.2 2020.11.20

  • The PGL team proposed a new Unified Message Passing Model (UniMP) and achieved state-of-the-art results on three tasks on the OGB leaderboards. You can find the code here.

  • The PGL team proposed a two-stage recall-and-ranking model based on ERNIESage, and won first place in the TextGraphs-2020 competition co-organized by COLING.

  • The PGL team developed an open course on Graph Neural Networks (GNNs) that will help you get started with GNNs in seven days. Details can be found in the course.

PGL v1.1 2020.4.29

  • You can find ERNIESage, a novel model for modeling text and graph structures, and its introduction here.

  • PGL for Open Graph Benchmark examples can be found here.

  • We added new graph-level operators such as GraphPooling and GraphNormalization for graph-level predictions.

  • We released the PGL-KE toolkit here, including classical knowledge graph embedding algorithms such as TransE, TransR, and RotatE.


Paddle Graph Learning (PGL) is an efficient and flexible graph learning framework based on PaddlePaddle.

The Framework of Paddle Graph Learning (PGL)

The newly released PGL supports heterogeneous graph learning in both the walk-based and the message-passing paradigms by providing MetaPath sampling and a Message Passing mechanism on heterogeneous graphs. Furthermore, it supports distributed graph storage and distributed training algorithms such as distributed DeepWalk and distributed GraphSAGE. Combined with the PaddlePaddle deep learning framework, we are able to support both graph representation learning models and graph neural networks, giving the framework a wide range of graph-based applications.

One of the most important benefits of graph neural networks over other models is their ability to use node-to-node connectivity information, but coding the communication between nodes can be cumbersome. PGL adopts a message passing paradigm, similar to DGL, to make building custom graph neural networks easy. Users only need to write send and recv functions to implement a simple GCN. As shown in the following figure, in the first step the send function is defined on the edges of the graph, and the user can customize it to send a message from the source to the target node. In the second step, the recv function is responsible for aggregating the messages arriving from the different sources.

The basic idea of message passing paradigm

To write a sum aggregator, users only need to write the following code.

    import pgl
    import paddle
    import numpy as np

    # Build a small graph with 5 nodes, 3 edges, and random 100-dim node features.
    num_nodes = 5
    edges = [(0, 1), (1, 2), (3, 4)]
    feature = np.random.randn(5, 100).astype(np.float32)

    g = pgl.Graph(num_nodes=num_nodes,
        edges=edges,
        node_feat={
            "h": feature
        })
    # Convert the numpy arrays held by the graph into paddle tensors.
    g.tensor()

    # send: copy the source node's features onto each edge as the message.
    def send_func(src_feat, dst_feat, edge_feat):
        return src_feat

    # recv: sum the incoming messages for every destination node.
    def recv_func(msg):
        return msg.reduce_sum(msg["h"])

    msg = g.send(send_func, src_feat=g.node_feat)

    # ret[i] is the element-wise sum of the features of node i's in-neighbors.
    ret = g.recv(recv_func, msg)

Highlight: Flexibility - Natively Support Heterogeneous Graph Learning

Graphs conveniently represent relations between things in the real world, but both the things and the relations between them come in many varieties. In a heterogeneous graph, we therefore need to distinguish the node types and edge types in the graph network. PGL models heterogeneous graphs that contain multiple node types and multiple edge types, and can describe the complex connections between the different types.

Support metapath walk sampling on heterogeneous graphs

The metapath sampling in heterogeneous graph

The left side of the figure above depicts a shopping social network. Its nodes fall into two categories, users and goods, and there are relations between users and users, users and goods, and goods and goods. The right side of the figure shows a simple MetaPath sampling process: given a MetaPath such as UPU (user-product-user), sampling yields the following results.

The metapath result

On this basis, introducing word2vec-style training enables heterogeneous graph representation learning algorithms such as metapath2vec.
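
To make the sampling process concrete, here is a minimal, self-contained sketch of a metapath-guided random walk on a toy user-product graph. This is plain Python for illustration only, not PGL's actual sampling API; the node names and adjacency structure are made up for the example.

    import random

    # Toy heterogeneous graph: adjacency lists keyed by edge type (src_type, dst_type).
    # Node ids are prefixed with their type: "u" = user, "p" = product.
    adj = {
        ("u", "p"): {"u1": ["p1", "p2"], "u2": ["p2"]},
        ("p", "u"): {"p1": ["u1"], "p2": ["u1", "u2"]},
    }

    def metapath_walk(start, metapath, walk_len):
        """Random walk that follows the metapath (e.g. "upu") cyclically."""
        steps = [(metapath[i], metapath[i + 1]) for i in range(len(metapath) - 1)]
        walk = [start]
        cur = start
        for i in range(walk_len - 1):
            etype = steps[i % len(steps)]
            neighbors = adj.get(etype, {}).get(cur, [])
            if not neighbors:  # dead end: stop the walk early
                break
            cur = random.choice(neighbors)
            walk.append(cur)
        return walk

    print(metapath_walk("u1", "upu", walk_len=5))  # e.g. ['u1', 'p2', 'u2', 'p2', 'u1']

Walks generated this way are then fed to a skip-gram objective, exactly as sentences are in word2vec.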

Support Message Passing mechanism on heterogeneous graphs

The message passing mechanism on heterogeneous graph

Because a heterogeneous graph contains different node types, message passing must treat them differently. As shown on the left of the figure above, the center node has five neighbors belonging to two different node types. As shown on the right, messages from neighbors of different types are first aggregated separately, and the per-type results are then merged into the final message that updates the target node. On this basis, PGL supports message-passing-based heterogeneous graph algorithms such as GATNE.
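
The two-stage aggregation described above can be sketched in a few lines of NumPy. This is a simplified illustration rather than PGL's heterogeneous message-passing API, and it uses a fixed mean/sum combination where real models such as GATNE learn the combination (for example, with attention):

    import numpy as np

    # A target node with five neighbors of two types; each neighbor
    # carries a 4-dimensional feature vector.
    neighbor_feats = {
        "user": np.random.randn(3, 4).astype(np.float32),  # 3 user neighbors
        "item": np.random.randn(2, 4).astype(np.float32),  # 2 item neighbors
    }

    # Step 1: aggregate messages within each node type separately.
    per_type = {ntype: feats.mean(axis=0) for ntype, feats in neighbor_feats.items()}

    # Step 2: merge the per-type summaries into the final message
    # that updates the target node.
    final_message = sum(per_type.values())
    print(final_message.shape)  # (4,)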

Large-Scale: Support distributed graph storage and distributed training algorithms

In most large-scale graph learning scenarios, we need distributed graph storage and distributed training support. As shown in the following figure, PGL provides a general solution for large-scale training: we adopt PaddleFleet as our distributed parameter server, which supports large-scale distributed embeddings, together with a lightweight distributed storage engine, so a large-scale distributed training job can easily be set up on MPI clusters.

The distributed frame of PGL

Model Zoo

The following graph learning models have been implemented in the framework. You can find more examples and details here.

| Model | Feature |
| ----- | ------- |
| ERNIESage | ERNIE SAmple aggreGatE for text and graph |
| GCN | Graph convolutional neural networks |
| GAT | Graph attention network |
| GraphSage | Large-scale graph convolution network based on neighborhood sampling |
| unSup-GraphSage | Unsupervised GraphSAGE |
| LINE | Representation learning based on first-order and second-order neighbors |
| DeepWalk | Representation learning by DFS random walk |
| MetaPath2Vec | Representation learning based on metapath |
| Node2Vec | Representation learning combining DFS and BFS random walks |
| Struct2Vec | Representation learning based on structural similarity |
| SGC | Simplified graph convolution neural network |
| GES | Graph representation learning with node features |
| DGI | Unsupervised representation learning based on graph convolution network |
| GATNE | Representation learning of heterogeneous graphs based on message passing |

The above models fall into three categories: graph representation learning, graph neural networks, and heterogeneous graph learning.

System requirements

PGL requires:

  • paddlepaddle >= 2.2.0
  • cython

PGL only supports Python 3.

Installation

You can simply install it via pip.

pip install pgl
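
After installing, you can verify the setup with a quick sanity check (a minimal sketch; it assumes the installed pgl package exposes a __version__ attribute, as recent releases do):

    import pgl  # assumes pgl exposes __version__
    print(pgl.__version__)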

The Team

PGL is developed and maintained by the NLP and Paddle teams at Baidu.

E-mail: nlp-gnn[at]baidu.com

License

PGL uses Apache License 2.0.

Comments
  • How were the hyperparameters chosen for OGB?

    You write in the paper the parameters that were selected, but could you go into more detail about how the hyperparameters were chosen? Like, what ranges were you looking at, etc.?

    opened by Chillee 17
  • [Windows] pgl ImportError: DLL load failed: The specified module could not be found.

    Win10 + Python 3.6 + Paddle 1.8.0 + PGL 1.1.0. Hello: pip install pgl succeeds, but import pgl then raises an error:

     import pgl.graph_kernel as graph_kernel
    ImportError: DLL load failed: 找不到指定的模块。
    

    How can this be resolved?

    opened by yhx0105 8
  • multiprocessing context has already been set

    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    /tmp/ipykernel_161/639131080.py in <module>
    ----> 1 import pgl.utils.data

    /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/pgl/utils/data/__init__.py in <module>
          1 # -*- coding: utf-8 -*-
          2 from .dataset import Dataset, StreamDataset, HadoopDataset
    ----> 3 from .dataloader import Dataloader

    /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/pgl/utils/data/dataloader.py in <module>
         21 import paddle
         22
    ---> 23 from pgl.utils import mp_reader
         24 from pgl.utils.data.dataset import Dataset, StreamDataset
         25 from pgl.utils.data.sampler import Sampler, StreamSampler

    /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/pgl/utils/mp_reader.py in <module>
         17 log = logging.getLogger(__name__)
         18 import multiprocessing
    ---> 19 multiprocessing.set_start_method('fork')
         20 log.info("We set multiprocessing start method as 'fork' by default.")
         21 import copy

    /opt/conda/envs/python35-paddle120-env/lib/python3.7/multiprocessing/context.py in set_start_method(self, method, force)
        240 def set_start_method(self, method, force=False):
        241     if self._actual_context is not None and not force:
    --> 242         raise RuntimeError('context has already been set')
        243     if method is None and force:
        244         self._actual_context = None

    RuntimeError: context has already been set

    opened by limengyuan988 7
  • A question about a real-world application scenario

    A question, please. Background: finding fraudulent accounts in P2P lending (fraudster = 1, normal customer = 0) using a heterogeneous-graph RGCN. Nodes: credit card, user, IP, device. Relations: user-credit card, user-ID, user-device, user-user. Features: user nodes have an amount feature; the other node types have no features.

    Since I am predicting on users, the other node types have neither labels nor features. Should I fill in their features and labels with NaN or with 0?

    opened by yangnianen 5
  • RuntimeError:   [operator < uniform_random > error]

    https://pgl.readthedocs.io/en/stable/quick_start/instruction.html I'm new to Paddle, and running the PGL example above raises an error. Could someone take a look?

    Version and environment info:

    • PGL and PaddlePaddle versions: PGL 2.2.2, PaddlePaddle 2.2.2
    • System environment: Linux, Python 3.7.7

    Reproduction info:

    RuntimeError                              Traceback (most recent call last)
    <ipython-input-49-dfd13c23041d> in <module>
          3 g = g.tensor()
          4 y = paddle.to_tensor(y)
    ----> 5 gcn = GCN(16, 2)
          6 criterion = paddle.nn.loss.CrossEntropyLoss()
          7 optim = Adam(learning_rate=0.01,
    
    <ipython-input-47-cc40fc62b6b1> in __init__(self, input_size, num_class, num_layers, hidden_size, **kwargs)
         21                         self.hidden_size,
         22                         activation="relu",
    ---> 23                         norm=True))
         24             else:
         25                 self.gcns.append(
    
    /opt/conda/lib/python3.7/site-packages/pgl/nn/conv.py in __init__(self, input_size, output_size, activation, norm)
        196         self.input_size = input_size
        197         self.output_size = output_size
    --> 198         self.linear = nn.Linear(input_size, output_size, bias_attr=False)
        199         self.bias = self.create_parameter(shape=[output_size], is_bias=True)
        200         self.norm = norm
    
    /opt/conda/lib/python3.7/site-packages/paddle/nn/layer/common.py in __init__(self, in_features, out_features, weight_attr, bias_attr, name)
        160             attr=self._weight_attr,
        161             dtype=self._dtype,
    --> 162             is_bias=False)
        163         self.bias = self.create_parameter(
        164             shape=[out_features],
    
    /opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py in create_parameter(self, shape, attr, dtype, is_bias, default_initializer)
        420             temp_attr = None
        421         return self._helper.create_parameter(temp_attr, shape, dtype, is_bias,
    --> 422                                              default_initializer)
        423 
        424     @deprecated(
    
    /opt/conda/lib/python3.7/site-packages/paddle/fluid/layer_helper_base.py in create_parameter(self, attr, shape, dtype, is_bias, default_initializer, stop_gradient, type)
        376                 type=type,
        377                 stop_gradient=stop_gradient,
    --> 378                 **attr._to_kwargs(with_initializer=True))
        379         else:
        380             self.startup_program.global_block().create_parameter(
    
    /opt/conda/lib/python3.7/site-packages/paddle/fluid/framework.py in create_parameter(self, *args, **kwargs)
       3135                 pass
       3136             else:
    -> 3137                 initializer(param, self)
       3138         return param
       3139 
    
    /opt/conda/lib/python3.7/site-packages/paddle/fluid/initializer.py in __call__(self, var, block)
        566                     "seed": self._seed
        567                 },
    --> 568                 stop_gradient=True)
        569 
        570         else:
    
    /opt/conda/lib/python3.7/site-packages/paddle/fluid/framework.py in append_op(self, *args, **kwargs)
       3165                                        kwargs.get("outputs", {}), attrs
       3166                                        if attrs else {},
    -> 3167                                        kwargs.get("stop_gradient", False))
       3168         else:
       3169             from paddle.fluid.dygraph.base import param_guard
    
    /opt/conda/lib/python3.7/site-packages/paddle/fluid/dygraph/tracer.py in trace_op(self, type, inputs, outputs, attrs, stop_gradient)
         43         self.trace(type, inputs, outputs, attrs,
         44                    framework._current_expected_place(), self._has_grad and
    ---> 45                    not stop_gradient)
         46 
         47     def train_mode(self):
    
    RuntimeError:   [operator < uniform_random > error]
    
    
    opened by KC-Dumper 5
  • Results with PGL differ greatly between CPU and GPU

    PGL 2.1.5, PaddlePaddle 2.1.0. Code: https://aistudio.baidu.com/aistudio/projectdetail/2246354. On AI Studio the CPU result is around 0.72; the GPU version errors out, and on a local GPU the result is around 0.2, fluctuating between 0.15 and 0.21 (screenshots of the local GPU, AI Studio CPU, and AI Studio GPU results were attached).

    opened by zhangpu1211 5
  • Error computing adj_dst_index after calling pgl.Graph.tensor()

    Hello, after building a graph with pgl.Graph whose arrays are stored as numpy.ndarray, calling the .tensor() method to convert the ndarrays in the graph to paddle.Tensor raises an error (see screenshot).

    Further inspection shows that before the tensor conversion the Graph does contain the adj_dst_index attribute.

    After the conversion, however, computing the degree property of this attribute makes paddle.scatter() raise an error.

    I also found that if I create the Graph and call Graph.tensor() as two separate steps (for example, running them one at a time in IPython), the conversion works fine; if I run the two steps together, computing adj_dst_index fails.

    opened by lzl19971215 5
  • A problem with fleet distributed training for graphsage

    Hi, I ran the graphsage demo provided by PGL and it runs fine locally. I then converted the local program to fleet distributed training. The network structure and hyperparameters are unchanged, and I start one pserver and one worker, but the loss of the fleet distributed program does not decrease. What could be the problem? Logs of the local graphsage run and of the fleet distributed run were attached as screenshots. Below is the distributed part of the main function; main is the only function I modified.

    def main(args):
        data = load_data(args.normalize, args.symmetry)
        log.info("preprocess finish")
        log.info("Train Examples: %s" % len(data["train_index"]))
        log.info("Val Examples: %s" % len(data["val_index"]))
        log.info("Test Examples: %s" % len(data["test_index"]))
        log.info("Num nodes %s" % data["graph"].num_nodes)
        log.info("Num edges %s" % data["graph"].num_edges)
        log.info("Average Degree %s" % np.mean(data["graph"].indegree()))
    
        place = fluid.CUDAPlace(0) if args.use_cuda else fluid.CPUPlace()
        train_program = fluid.default_main_program()
        startup_program = fluid.default_startup_program()
        samples = []
        if args.samples_1 > 0:
            samples.append(args.samples_1)
        if args.samples_2 > 0:
            samples.append(args.samples_2)
    
        with fluid.program_guard(train_program, startup_program):
            feature, feature_init = paddle_helper.constant(
                "feat",
                dtype=data['feature'].dtype,
                value=data['feature'],
                hide_batch_size=False)
    
            graph_wrapper = pgl.graph_wrapper.GraphWrapper(
                "sub_graph", place, node_feat=data['graph'].node_feat_info())
            model_loss, model_acc = build_graph_model(
                graph_wrapper,
                num_class=data["num_class"],
                feature=feature,
                hidden_size=args.hidden_size,
                graphsage_type=args.graphsage_type,
                k_hop=len(samples))
    
        test_program = train_program.clone(for_test=True)
        
        trainer_id = int(os.environ["PADDLE_TRAINER_ID"])
        trainers = int(os.environ["PADDLE_TRAINERS"])
        training_role = os.environ["PADDLE_TRAINING_ROLE"]
        log.info(training_role )
        training_role = role_maker.Role.WORKER if training_role == "TRAINER" else role_maker.Role.SERVER
        ports = os.getenv("PADDLE_PSERVER_PORTS")
        pserver_ip = os.getenv("PADDLE_PSERVER_IP", "")
        pserver_endpoints = []
        for port in ports.split(","):
            pserver_endpoints.append(':'.join([pserver_ip, port]))
        role = role_maker.UserDefinedRoleMaker(current_id=trainer_id, role=training_role, worker_num=trainers, server_endpoints=pserver_endpoints)
        config = DistributeTranspilerConfig()
        config.sync_mode = True
    
        fleet.init(role)
        optimizer = fluid.optimizer.SGD(learning_rate=args.lr)
        optimizer = fleet.distributed_optimizer(optimizer, config)
        optimizer.minimize(model_loss)
    
        exe = fluid.Executor(place)
    
        if fleet.is_server():
            log.info('running server')
            fleet.init_server()
            fleet.run_server()
    
        if fleet.is_worker():
            log.info('running worker')
            fleet.init_worker()
            exe.run(fleet.startup_program)
            feature_init(place)
    
    opened by shuoyin 5
  • Running Graph4Rec on my own data fails on AI Studio

    I generated the files following the data format and started two IPs. The environment is AI Studio with 32 GB of GPU memory, pip install pgl. The configuration file is as follows:

    # configuration for multi-metapath2vec
    
    task_name: metapath2vec.0712
    
    # ------------------------Data Configuration--------------------------------------------#
    etype2files: "item2other:/home/aistudio/data/edges/item_other.txt,other2item:/home/aistudio/data/edges/other_item.txt,other2other:/home/aistudio/data/edges/other_other.txt"
    ntype2files: "item:/home/aistudio/data/nodes/item_other_types.txt,other:/home/aistudio/data/nodes/item_other_types.txt"
    symmetry: False
    shard_num: 1000
    # [ntype, name, feat_type, length]
    nfeat_info: null
    slots: []
    
    meta_path: "item2other-other2item;item2other-other2other-other2item;other2item-item2other-other2item;other2item-item2other-other2other-other2item"
    
    
    walk_len: 24
    win_size: 3
    neg_num: 10
    walk_times: 10
    
    
    # -----------------Model HyperParams Configuration--------------------------------------#
    dataset_type: WalkBasedDataset
    collatefn: CollateFn
    model_type: WalkBasedModel
    warm_start_from: null
    num_nodes: 13806619
    embed_size: 128
    hidden_size: 128
    
    # ----------------------Training Configuration------------------------------------------#
    epochs: 100
    num_workers: 1
    lr: 0.001
    lazy_mode: True
    batch_node_size: 20
    batch_pair_size: 100
    pair_stream_shuffle_size: 10000
    log_dir: /home/aistudio/logs_custom
    output_dir: /home/aistudio/outputs_custom
    save_dir: /home/aistudio/ckpt_custom
    files2saved: ["*.yaml", "*.py", "*.sh", "./models", "./datasets", "./utils"]
    log_steps: 100
    
    # -------------Distributed CPU Training Configuration-----------------------------------#
    
    # if you want to save model per epoch, then save_steps will be set by below equation
    # save_steps = num_nodes * walk_len * win_size * walk_times / batch_pair_size / worker_num
    # but the equation is not very precise since the neighbors of each node is not the same.
    save_steps: 100000
    
    

    Launch command: !python PGL-main/apps/Graph4Rec/env_run/src/train.py --config /home/aistudio/metapath2vec.yaml --ip /home/aistudio/ip_list.txt

    The error message is as follows:

    backup ./metapath2vec.yaml to /home/aistudio/logs_custom/metapath2vec.0712
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  134]:	=========================================================================
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	task_name: metapath2vec.0712
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	etype2files: item2other:/home/aistudio/data/edges/item_other.txt,other2item:/home/aistudio/data/edges/other_item.txt,other2other:/home/aistudio/data/edges/other_other.txt
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	ntype2files: item:/home/aistudio/data/nodes/item_other_types.txt,other:/home/aistudio/data/nodes/item_other_types.txt
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	symmetry: False
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	shard_num: 1000
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	nfeat_info: None
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	slots: []
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	meta_path: item2other-other2item;item2other-other2other-other2item;other2item-item2other-other2item;other2item-item2other-other2other-other2item
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	walk_len: 24
    [INFO] 2022-07-15 23:20:03,813 [    train.py:  137]:	win_size: 3
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	neg_num: 10
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	walk_times: 10
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	dataset_type: WalkBasedDataset
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	collatefn: CollateFn
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	model_type: WalkBasedModel
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	warm_start_from: None
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	num_nodes: 13806619
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	embed_size: 128
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	hidden_size: 128
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	epochs: 100
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	num_workers: 1
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	lr: 0.001
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	lazy_mode: True
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	batch_node_size: 20
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	batch_pair_size: 100
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	pair_stream_shuffle_size: 10000
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	log_dir: /home/aistudio/logs_custom/metapath2vec.0712
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	output_dir: /home/aistudio/outputs_custom/metapath2vec.0712
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	save_dir: /home/aistudio/ckpt_custom/metapath2vec.0712
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	files2saved: ['*.yaml', '*.py', '*.sh', './models', './datasets', './utils']
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	log_steps: 100
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  137]:	save_steps: 100000
    [INFO] 2022-07-15 23:20:03,814 [    train.py:  139]:	=========================================================================
    W0715 23:20:03.816249  2101 gpu_context.cc:278] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 10.1
    W0715 23:20:03.820690  2101 gpu_context.cc:306] device: 0, cuDNN Version: 7.6.
    [INFO] 2022-07-15 23:20:07,897 [    train.py:  107]:	starting training...
    [INFO] 2022-07-15 23:20:07,899 [  dataset.py:   83]:	gpu train data generator
    /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/pgl/distributed/helper.py:60: UserWarning: node_batch_stream_shuffle_size attribute is not existed, return None
      warnings.warn("%s attribute is not existed, return None" % attr)
    /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/pgl/distributed/dist_graph.py:180: UserWarning: node_batch_stream_shuffle_size is not specified, default value is 20000
      warnings.warn("node_batch_stream_shuffle_size is not specified, "
    I0715 23:20:07.900080  2140 graph_py_service.cc:102] start to build server
    I0715 23:20:07.900151  2140 graph_py_service.cc:112] build server done
    /home/aistudio/PGL-main/apps/Graph4Rec/env_run/src/utils/config.py:83: UserWarning: sample_num_list attribute is not existed, return None
      warnings.warn("%s attribute is not existed, return None" % attr)
    [INFO] 2022-07-15 23:20:07,911 [ego_graph.py:  198]:	sample_num_list is None
    /home/aistudio/PGL-main/apps/Graph4Rec/env_run/src/utils/config.py:83: UserWarning: sage_mode attribute is not existed, return None
      warnings.warn("%s attribute is not existed, return None" % attr)
    Traceback (most recent call last):
      File "PGL-main/apps/Graph4Rec/env_run/src/train.py", line 142, in <module>
        main(config, args.ip)
      File "PGL-main/apps/Graph4Rec/env_run/src/train.py", line 108, in main
        train(config, model, train_loader, optim)
      File "PGL-main/apps/Graph4Rec/env_run/src/train.py", line 68, in train
        optim.step()
      File "<decorator-gen-252>", line 2, in step
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 299, in __impl__
        return func(*args, **kwargs)
      File "<decorator-gen-250>", line 2, in step
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
        return wrapped_func(*args, **kwargs)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 434, in __impl__
        return func(*args, **kwargs)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 451, in step
        loss=None, startup_program=None, params_grads=params_grads)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 963, in _apply_optimize
        optimize_ops = self._create_optimization_pass(params_grads)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 767, in _create_optimization_pass
        param_and_grad)
      File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 351, in _append_optimize_op
        'beta2', _beta2, 'multi_precision', find_master)
    OSError: (External) CUDA error(700), an illegal memory access was encountered. 
      [Hint: 'cudaErrorIllegalAddress'. The device encountered a load or store instruction on an invalid memory address. This leaves the process in an inconsistentstate and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched. ] (at /paddle/paddle/phi/backends/gpu/gpu_context.cc:624)
      [operator < adam > error]
    
    opened by zouhan6806504 4
  • How can PGL be used in static graph mode? Several problems when running to_static in TIPC

    Background: a model in the Baidu paper reproduction challenge requires PGL. When using TIPC, the model has to be converted to a static graph, and several problems appeared (there are no problems in dynamic graph mode).

    Using:
    graphs = pgl.Graph.batch(graphs).tensor()

    Error:
    /site-packages/pgl/graph.py line 1208
       if len(edges) > 0:
    TypeError: object of type 'Variable' has no len()

    Using:
    msg = graph.send(self._send_func, ...)

    Error:
    /site-packages/pgl/graph.py line 264
       num_nodes = self.num_nodes.numpy()
    'numpy' only can be called by 'paddle.Tensor' in dynamic graph mode
    
    opened by BamLubi 4
  • The demo does not run

    Thank you for reporting a PGL issue and for your contribution to PGL! When filing your issue, please also provide the following information:

    Version and environment info:

    • PGL and PaddlePaddle versions:
    • System environment:

    linux python 3.8

    Reproduction info:

    Unsupervised GraphSAGE in PGL: python train.py --data_path ./sample.txt --num_nodes 2000 --phase train does not run; a module is missing: AttributeError: module 'pgl' has no attribute 'graph_wrapper'

    opened by yifenzhong1920 4
  • PGL Box error

    The mag240m data is very large (24 GB). The run exits right after "STAGE [GPU Load] end load edge into GPU, type[inst2author]" with no error message, and even with debug enabled there is no error log. I have already pulled the latest main branch. Is the pushed code incomplete, or is the data too large for my GPUs? My machine has 2 GPUs, and I don't see any configuration for multi-GPU training; is multi-node multi-GPU unsupported? Right now it only seems to run single-machine inside Docker. Could you later provide a YAML environment that runs on Kubernetes with multi-node multi-GPU support? I don't have a Baidu all-in-one machine with 8 GPUs, and I need multi-node multi-GPU. Also, mag240m is too large; could you provide a small dataset for quick verification, plus a copy of the data from before sharding_tool, to test the whole pipeline from sharding to running? Thanks.

    opened by xbinglzh 3
  • Graph4KG's ComplEx runs out of memory on the validation set

    Running the RotatE model on the OpenBG500 dataset with a 32 GB V100 on AI Studio: the training set works fine, but the test set OOMs no matter what batch size I use.

    My command:

    !python -u train.py --model_name RotatE \
                        --data_name  OpenBG500\
                        --data_path  /home/aistudio/data/\
                        --save_path /home/aistudio/result/Rotate --max_steps 1\
                        --batch_size 1 --log_interval 1000 --eval_interval 20000 --reg_coef 1e-7 --reg_norm 3 \
                        --neg_sample_size 256 --neg_sample_type 'chunk' --embed_dim 200 --gamma 12.0 --lr 0.018 --optimizer adagrad -adv \
                        --num_workers 2 --num_epoch 30 --print_on_screen --filter_eval --neg_deg_sample --valid
    

    Because it kept erroring out, I set max_steps to 1. Training itself runs fine, but evaluation OOMs.

    ----------------------------------------
            Device Setting        
    ----------------------------------------
     Entity   embedding place: gpu
     Relation embedding place: gpu
    ----------------------------------------
    ----------------------------------------
           Embedding Setting      
    ----------------------------------------
     Entity   embedding dimension: 400
     Relation embedding dimension: 200
    ----------------------------------------
    2022-12-03 20:54:31,717 INFO     seed                :0
    2022-12-03 20:54:31,718 INFO     data_path           :/home/aistudio/data/
    2022-12-03 20:54:31,718 INFO     save_path           :/home/aistudio/result/Rotate/rotate_OpenBG500_d_200_g_12.0_e_gpu_r_gpu_l_Logsigmoid_lr_0.018_0.1_KGE
    2022-12-03 20:54:31,718 INFO     init_from_ckpt      :None
    2022-12-03 20:54:31,718 INFO     data_name           :OpenBG500
    2022-12-03 20:54:31,718 INFO     use_dict            :False
    2022-12-03 20:54:31,718 INFO     kv_mode             :False
    2022-12-03 20:54:31,718 INFO     batch_size          :1
    2022-12-03 20:54:31,718 INFO     test_batch_size     :16
    2022-12-03 20:54:31,718 INFO     neg_sample_size     :256
    2022-12-03 20:54:31,718 INFO     filter_eval         :True
    2022-12-03 20:54:31,718 INFO     model_name          :rotate
    2022-12-03 20:54:31,718 INFO     embed_dim           :200
    2022-12-03 20:54:31,718 INFO     reg_coef            :1e-07
    2022-12-03 20:54:31,718 INFO     loss_type           :Logsigmoid
    2022-12-03 20:54:31,718 INFO     max_steps           :1
    2022-12-03 20:54:31,718 INFO     lr                  :0.018
    2022-12-03 20:54:31,718 INFO     optimizer           :adagrad
    2022-12-03 20:54:31,718 INFO     cpu_lr              :0.1
    2022-12-03 20:54:31,718 INFO     cpu_optimizer       :adagrad
    2022-12-03 20:54:31,719 INFO     mix_cpu_gpu         :False
    2022-12-03 20:54:31,719 INFO     async_update        :False
    2022-12-03 20:54:31,719 INFO     valid               :True
    2022-12-03 20:54:31,719 INFO     test                :False
    2022-12-03 20:54:31,719 INFO     task_name           :KGE
    2022-12-03 20:54:31,719 INFO     num_workers         :2
    2022-12-03 20:54:31,719 INFO     neg_sample_type     :chunk
    2022-12-03 20:54:31,719 INFO     neg_deg_sample      :True
    2022-12-03 20:54:31,719 INFO     neg_adversarial_sampling:True
    2022-12-03 20:54:31,719 INFO     adversarial_temperature:1.0
    2022-12-03 20:54:31,719 INFO     filter_sample       :False
    2022-12-03 20:54:31,719 INFO     valid_percent       :1.0
    2022-12-03 20:54:31,719 INFO     use_feature         :False
    2022-12-03 20:54:31,719 INFO     reg_type            :norm_er
    2022-12-03 20:54:31,719 INFO     reg_norm            :3
    2022-12-03 20:54:31,719 INFO     weighted_loss       :False
    2022-12-03 20:54:31,719 INFO     margin              :1.0
    2022-12-03 20:54:31,719 INFO     pairwise            :False
    2022-12-03 20:54:31,719 INFO     gamma               :12.0
    2022-12-03 20:54:31,719 INFO     ote_scale           :0
    2022-12-03 20:54:31,719 INFO     ote_size            :1
    2022-12-03 20:54:31,719 INFO     quate_lmbda1        :0.0
    2022-12-03 20:54:31,719 INFO     quate_lmbda2        :0.0
    2022-12-03 20:54:31,719 INFO     num_epoch           :30
    2022-12-03 20:54:31,719 INFO     scheduler_interval  :-1
    2022-12-03 20:54:31,720 INFO     num_process         :1
    2022-12-03 20:54:31,720 INFO     print_on_screen     :True
    2022-12-03 20:54:31,720 INFO     log_interval        :1000
    2022-12-03 20:54:31,720 INFO     save_interval       :-1
    2022-12-03 20:54:31,720 INFO     eval_interval       :20000
    2022-12-03 20:54:31,720 INFO     ent_emb_on_cpu      :False
    2022-12-03 20:54:31,720 INFO     rel_emb_on_cpu      :False
    2022-12-03 20:54:31,720 INFO     use_embedding_regularization:True
    2022-12-03 20:54:31,720 INFO     ent_dim             :400
    2022-12-03 20:54:31,720 INFO     rel_dim             :200
    2022-12-03 20:54:31,720 INFO     num_chunks          :1
    W1203 20:54:51.171296 20466 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
    W1203 20:54:51.174912 20466 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
    /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py:3983: DeprecationWarning: Op `adagrad` is executed through `append_op` under the dynamic mode, the corresponding API implementation needs to be upgraded to using `_C_ops` method.
      DeprecationWarning,
    2022-12-03 20:54:54,271 INFO     [evaluation] start...
      0%|                                                   | 0/313 [00:00<?, ?it/s]terminate called after throwing an instance of 'paddle::memory::allocation::BadAlloc'
      what():  
    
    --------------------------------------
    C++ Traceback (most recent call last):
    --------------------------------------
    0   multiply_ad_func(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&)
    1   paddle::experimental::multiply(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&)
    2   void phi::MultiplyRawKernel<float, phi::GPUContext>(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, int, phi::DenseTensor*)
    3   float* phi::DeviceContext::Alloc<float>(phi::TensorBase*, unsigned long, bool) const
    4   phi::DeviceContext::Impl::Alloc(phi::TensorBase*, phi::Place const&, paddle::experimental::DataType, unsigned long, bool) const
    5   phi::DenseTensor::AllocateFrom(phi::Allocator*, paddle::experimental::DataType, unsigned long)
    6   paddle::memory::allocation::StatAllocator::AllocateImpl(unsigned long)
    7   paddle::memory::allocation::Allocator::Allocate(unsigned long)
    8   paddle::memory::allocation::Allocator::Allocate(unsigned long)
    9   paddle::memory::allocation::Allocator::Allocate(unsigned long)
    10  paddle::memory::allocation::CUDAAllocator::AllocateImpl(unsigned long)
    11  std::string phi::enforce::GetCompleteTraceBackString<std::string >(std::string&&, char const*, int)
    12  phi::enforce::GetCurrentTraceBackString[abi:cxx11](bool)
    
    ----------------------
    Error Message Summary:
    ----------------------
    ResourceExhaustedError: 
    
    Out of memory error on GPU 0. Cannot allocate 2.977216GB memory on GPU 0, 29.256836GB memory has been allocated and available memory is only 2.491699GB.
    
    Please check whether there is any other process using GPU 0.
    1. If yes, please stop them, or start PaddlePaddle on another GPU.
    2. If no, please decrease the batch size of your model. 
    If the above ways do not solve the out of memory problem, you can try to use CUDA managed memory. The command is `export FLAGS_use_cuda_managed_memory=false`.
     (at /paddle/paddle/fluid/memory/allocation/cuda_allocator.cc:95)
    
    
    
    --------------------------------------
    C++ Traceback (most recent call last):
    --------------------------------------
    0   paddle::pybind::ThrowExceptionToPython(std::__exception_ptr::exception_ptr)
    
    ----------------------
    Error Message Summary:
    ----------------------
    FatalError: `Process abort signal` is detected by the operating system.
      [TimeInfo: *** Aborted at 1670072104 (unix time) try "date -d @1670072104" if you are using GNU date ***]
      [SignalInfo: *** SIGABRT (@0x3e800004ff2) received by PID 20466 (TID 0x7f2dee9e2700) from PID 20466 ***]
    
    opened by StuPidMRE 2
  • Bump paddlepaddle from 2.0.0rc1 to 2.4.0 in /docs

    Bumps paddlepaddle from 2.0.0rc1 to 2.4.0.

    Release notes

    Sourced from paddlepaddle's releases.

    PaddlePaddle 2.4.0 Release Note

    1. Important updates

    • The new dynamic graph architecture is now live: the new framework greatly improves scheduling performance, with over 90% of APIs speeding up scheduling by more than 50% and over 50% of model suites improving performance by more than 5%. The functional architecture is clearer, and secondary-development capability and experience are significantly enhanced.

    • Comprehensively improved dynamic-static unification: dynamic-to-static conversion now supports much richer Python syntax, reaching 90% syntax coverage, with heavily optimized transcription logic, complete control-flow support, and a smoother one-click conversion experience. With the newly upgraded static graph executor, dynamic-to-static training accelerates to near the best static-graph level on key models. Extensibility is improved, with support for merging and exporting multiple functions for inference and for secondary development and flexible deployment with the PHI operator library, effectively supporting custom decoding for the U2++ speech model.

    • New sparse computation APIs: 55 new sparse APIs under paddle.sparse.* support mainstream sparse computing scenarios. They have been applied to sparse training and inference deployment for 3D point-cloud object detection and Sparse Transformers, delivering a 105.75% speedup over DenseTensor in high-sparsity scenarios and 4.01%-58.55% over comparable sparse products. Multiple sparse tensor formats (SparseCoo, SparseCsr, etc.) are supported with minimal memory use, while keeping the API usage consistent with dense tensors.

    • Large-scale graph neural network GPU training engine: hierarchical heterogeneous storage across SSD, host memory, and GPU memory breaks through the GPU-memory bottleneck, enabling all-GPU storage and training of ultra-large graphs; an all-GPU integrated solution for walking, sampling, and training trains 10x+ faster than traditional distributed CPU solutions at the same cost.

    • Environment adaptation: added pre-built packages for CUDA 11.7 and support for running on Ubuntu 22.04 and later.

    Forward-looking notices

    • PaddlePaddle will drop Python 3.6 support in version 2.5.
    • PaddlePaddle will gradually deprecate the Python APIs under the paddle.fluid namespace; in version 2.5, some APIs in that namespace will be removed outright.

    2. Incompatible upgrades

    • Removed the pre-built package for CUDA 10.1.
    • Tensor.clear_gradient(bool set_to_zero) no longer accepts the value via kwargs; the bool set_to_zero must be passed via args.
    • To improve GPU-memory efficiency, dynamic graph mode now retains by default only the gradients of forward leaf variables (such as network parameters) and no longer retains gradients of non-leaf nodes. To retain the gradient of a specific Tensor, call Tensor.retain_grads() before the backward pass.
    • paddle.autograd.PyLayer no longer supports tuple inputs; to pass a group of Tensors, use a list of Tensors.

    3. Training framework (including distributed)

    (1) New and enhanced APIs

    • New sparse computation APIs: paddle.sparse

    • New speech APIs: paddle.audio

      • Added feature-extraction APIs such as MFCC, Spectrogram, and LogMelSpectrogram with GPU support, over 15x faster than the CPU implementation, which can greatly improve GPU utilization when training speech models. #45424
      • Added basic feature-extraction APIs such as window functions and the discrete cosine transform, making custom speech feature extraction convenient. #45424
      • Added a speech I/O module providing 2 audio I/O backends and 6 codecs for convenient loading of speech data. #45939
      • Added the TESS and ESC50 speech classification datasets for classical speech classification models. #45939
    • New graph learning APIs: paddle.geometric

      • Graph learning is becoming a key machine-learning technique; the new paddle.geometric module provides a better graph learning modeling and training experience.
        • Message passing: the message passing mechanism is the foundation of graph modeling, so 7 message-passing APIs were added to make graph modeling more convenient. Among them, 3 fused message-passing operators greatly reduce the GPU-memory footprint of graph model training; GCN-family models on dense graphs can save 50%+ memory and train 20%+ faster. #44848, #44580, #43174, #44970
        • Graph sampling: sampling is the performance bottleneck of graph model training; new high-performance, highly concurrent graph sampling operators speed up GraphSage sampling by 32x+ and model training by 12x+. #44970
    • New vision APIs

      • paddle.vision adds the object-detection operators paddle.vision.distribute_fpn_proposals (#43736), paddle.vision.generate_proposals (#43611), paddle.vision.matrix_nms (#44357), paddle.vision.prior_box, and paddle.vision.box_coder (#47282).
    • Enhanced APIs

      • Added large-batch_size computation to BatchNorm1D #43072
    • Improved collective-communication distributed training APIs

    ... (truncated)


    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.



    dependencies 
    opened by dependabot[bot] 1
  • For the paddle2.0 version of Erniesage, PaddleNLP only provides ErnieSage V2 (applied to the edges of the text graph); are 2.x versions of V1/V3/V4 open-sourced?

    For the paddle2.0 version of erniesage, only the V2 version is provided under PaddleNLP; are V1/V3/V4 available?

    https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/text_graph/erniesage

    I'm quite interested in this. I've seen the 1.8 implementations of V1/V2/V3, and the official site currently provides an illustration of higher-order neighbors.

    opened by dingidng 0
Releases (2.2.4)
  • 2.2.4(Aug 10, 2022)

  • 2.2.3post0(Apr 21, 2022)

  • v2.2.2(Dec 29, 2021)

    1. Released Graph4Rec: a universal, large-scale distributed graph representation learning toolkit for recommender systems (see details here).

    • Comprehensive: a rich set of built-in graph models, including walk-based and GNN-based models.
    • Flexible and easy to use: the configuration is the model; single-machine and multi-machine training can be launched with one command, with no code changes.
    • Large-scale: a large-scale parameter server integrates seamlessly with the graph engine, enabling industrial-scale graph representation learning in production.

    2. Released Graph4KG: a toolkit for large-scale knowledge graph representation learning (see details here).

    • Asynchronous data loading: data reading runs asynchronously with model training, saving I/O time.
    • Heterogeneous storage and compute: entity and relation parameters are stored on the CPU while the model computes on the GPU, saving GPU memory.
    • Asynchronous gradient updates: GPU-to-CPU gradient updates run asynchronously, further speeding up training.
    • OGB-LSC support: provides the WikiKG90M solution from KDD CUP 2021.

    3. Released GNNAutoScale: a toolkit for graph learning on heterogeneous devices (see details here).

    • Automatic GNN scaling: CPU caching of intermediate hidden states during sampling lets GNN models scale automatically.
    • Removes the I/O bottleneck: CUDA Stream asynchronous reads and writes parallelize computation and I/O.
    • Significant speedups: SpMM-based message passing greatly improves GPU-memory utilization and training speed.
    • Efficient data generation: a DataLoader caching strategy improves data-generation efficiency.

    4. New features

    • Added pgl.partition graph partitioning, including random_partition and metis_partition (metis_partition does not yet support Windows);
    • Added pgl.utils.transform with convenient graph transformation APIs, such as to_undirected for converting to an undirected graph and add_self_loops for adding self-loop edges;
    • Added pgl.utils.stream_pool, supporting CUDA Stream asynchronous reads and writes;
    • Added pgl.utils.shared_embedding, supporting MMAP-based shared embeddings.

    5. Bug fixes

    • Fixed a bug where DistGPUGraph failed on PaddlePaddle versions above 2.0.0;
    • Fixed a bug where Dataloader with num_workers greater than 1 failed on Python 3.8 under macOS.
  • 2.1.1(Feb 2, 2021)

    1. PGL fully supports Paddle 2.0 dynamic graph mode.
    2. Added heterogeneous graph construction in dynamic graph mode, supporting both metapath sampling and message passing on heterogeneous graphs; details here.
    3. Added the experimental DistGPUGraph, which uses NCCL communication for multi-GPU full-batch GNN training; more examples are available.
    4. Fully updated the dynamic graph documentation with cleaner, easier-to-use interfaces; detailed docs here.
  • 2.0.0a(Dec 18, 2020)

    • (Major update) Refactored PGL on Paddle 2.0rc, building the dynamic-graph version of PGL on the Segment family of ops, which greatly improves usability while preserving efficiency. Note: this is a breaking upgrade, and the static-graph version of PGL (<1.3) moves to maintenance-only status.
    • Unified the old Graph, MultiGraph, and GraphWrapper concepts into a single Graph, and added brand-new dynamic-graph PGL documentation.
    • Added a dynamic-graph PGL Model Zoo supporting 8 models (Deepwalk, GCN, GAT, SGC, APPNP, GraphSage, GIN, GCNII) with distributed capability; more models will follow in later releases.
    • Added multi-process graph sampling support in DataLoader.
    • Added course materials for the seven-day GNN camp.

  • 1.1(Apr 29, 2020)

    • Released ERNIESage, the industry's first graph neural network model combining semantic and structural information.
    • Added PGL-KE; PGL now covers 25+ graph learning models across the walk-based, message-passing, and knowledge-embedding families.
    • Added graph operators such as Graph Batch and Graph Pooling.
    • Full support for the Open Graph Benchmark test suites.
    • Model Zoo adds the ERNIESage, MetaPath2Vec++, Multi-MetaPath2Vec++, STGCN, GIN, and PinSage models.

  • 1.0(Oct 30, 2019)

    Official release of the Paddle Graph Learning framework, PGL v1.0:

    1. Usability: added the dual mechanisms of metapath sampling and message passing on heterogeneous graphs, supporting heterogeneous graph modeling with multiple types of nodes and edge features, and added heterogeneous graph algorithms such as Metapath2vec and GATNE. Documentation, API references, and tutorials were further improved.
    2. Scale: added a distributed graph engine and distributed embeddings, supporting multiple distributed training modes for giant graphs with billions of nodes and tens of billions of edges. Added two distributed examples: distributed deepwalk and distributed graphSage.
    3. Richness: added 8 new models (13 in total), covering the mainstream graph neural network and graph representation learning models. The 8 new models are LINE, struc2vec, metapath2vec, GES, GATNE, SGC, Unsup-GraphSage, and DGI.
    4. Updated the front-page README.
  • 0.1.0b0(Jun 24, 2019)

    Preview release of the Paddle Graph Learning (PGL) framework.
    • Officially released PGL, a graph learning framework based on PaddlePaddle that offers both walk-based and message-passing computation paradigms for building state-of-the-art graph learning algorithms such as graph representation learning and graph neural networks. PGL makes full use of Paddle's LoD Tensor features to greatly speed up message aggregation in the message-passing paradigm, balancing flexibility and efficiency.
    • Added GCN and GAT reproduced on PGL, reaching SOTA on multiple datasets.
    • Added the large-scale subgraph-sampling GraphSage model, supporting giant graphs with 50 million nodes and 2 billion edges on a single machine.
    • Added graph representation learning methods such as node2vec and deepwalk, reaching SOTA.
    • Added PGL documentation, API references, and tutorials.
