240527_Thuy_Labseminar[Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering].pptx

Uploaded by thanhdowork, Jun 03, 2024

About This Presentation

Self-supervised Heterogeneous Graph Pre-training
Based on Structural Clustering


Slide Content

Self-supervised Heterogeneous Graph Pre-training Based on Structural Clustering
Van Thuy Hoang
Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea
E-mail: [email protected]
2024-05-27

BACKGROUND: Graph Convolutional Networks (GCNs)
Key Idea: Each node aggregates information from its neighborhood, then applies a neural transformation, to obtain a contextualized node embedding.
Limitation: Most GNNs focus on homogeneous graphs.
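The aggregate-then-transform idea above can be sketched with a toy, hypothetical single layer: each node averages its own and its neighbors' features, then applies a (here fixed, not learned) linear transformation. Real GCNs use a normalized adjacency matrix and learned weight matrices; this is only an illustration.

```python
def gcn_layer(features, adj, weight):
    """One toy GCN-style step.

    features: {node: [f1, f2, ...]}, adj: {node: [neighbor nodes]},
    weight: a single scalar standing in for a learned weight matrix.
    """
    out = {}
    for node, neighbors in adj.items():
        group = [node] + neighbors  # include a self-loop
        dim = len(features[node])
        # mean-aggregate the neighborhood features (per dimension)
        agg = [sum(features[n][d] for n in group) / len(group) for d in range(dim)]
        # apply the (toy) neural transformation
        out[node] = [weight * x for x in agg]
    return out

# Tiny example graph: node "a" is connected to "b" and "c".
feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
emb = gcn_layer(feats, adj, weight=2.0)
# "a" contextualizes information from both neighbors; "b" and "c" from "a".
```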

Background
Heterogeneous graphs can model complex systems:
- Nodes are labeled with multiple types
- Edges between nodes have multiple relationship types
- They flexibly capture rich semantic knowledge
Examples: an academic graph (Author -Write-> Paper, Paper -Cite-> Paper) and the LinkedIn Economic Graph.

Background: Challenges in handling heterogeneous graphs
- Different types of nodes/edges have their own feature distributions
  - Nodes: papers have text features, while authors have affiliation features
  - Edges: co-authorship differs from citation
- Graphs can be dynamic and large-scale
  - KDD in 1990 was more related to databases, but has been closer to machine learning in recent years
  - The number of papers doubles every 12 years and now reaches billions

Motivation
Existing methods require high-quality positive and negative examples, limiting their flexibility and generalization ability. We propose SHGP, a flexible framework that does not need any positive or negative examples.

Overall Architecture
- Four object types: "Paper" (P), "Author" (A), "Conference" (C), and "Term" (T)
- Three relations: "Publish" between P and C, "Write" between P and A, and "Contain" between P and T
- APC is a meta-path of length two; a1p2c2 is such a path instance, meaning that author a1 has published paper p2 in conference c2
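The meta-path idea above can be made concrete with a small, hypothetical fragment of the academic graph and a routine that enumerates all instances of the length-two meta-path A-P-C (author writes paper, paper is published in a conference). The node names follow the slide's example; only the "Write" and "Publish" relations are needed here.

```python
# Author -> Papers ("Write" relation); hypothetical toy data
write = {"a1": ["p1", "p2"], "a2": ["p2", "p3"]}
# Paper -> Conference ("Publish" relation)
publish = {"p1": "c1", "p2": "c2", "p3": "c2"}

def apc_instances(write, publish):
    """Enumerate all path instances of the meta-path Author-Paper-Conference."""
    return [(a, p, publish[p])
            for a, papers in write.items()
            for p in papers
            if p in publish]

paths = apc_instances(write, publish)
# The instance ("a1", "p2", "c2") corresponds to the slide's example a1p2c2:
# author a1 published paper p2 in conference c2.
```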

Overall Architecture
The framework proceeds in four steps (illustrated on the slide).

Att-HGNN Encoder
- Feature Projection: project different types of features into a common space
- Object-level Aggregation: aggregate same-type neighbors via the adjacency matrix
- Type-level Aggregation: aggregate different types of neighbors via attention
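The three encoder steps above can be sketched for a single target node. This is a minimal, hypothetical illustration, not the paper's implementation: the projection weights and attention scores are fixed toy numbers rather than learned parameters, and object-level aggregation is shown as a plain mean instead of adjacency-matrix multiplication.

```python
import math

def project(feat, w):
    """Feature projection: map a raw feature vector into the common space
    (a scalar weight stands in for a learned per-type projection matrix)."""
    return [w * x for x in feat]

def object_level(neighbors):
    """Object-level aggregation: mean over same-type neighbors."""
    dim = len(neighbors[0])
    return [sum(n[d] for n in neighbors) / len(neighbors) for d in range(dim)]

def type_level(summaries, scores):
    """Type-level aggregation: softmax-attention-weighted sum over per-type
    neighbor summaries."""
    exp = [math.exp(s) for s in scores]
    att = [e / sum(exp) for e in exp]  # softmax over types
    dim = len(summaries[0])
    return [sum(att[t] * summaries[t][d] for t in range(len(summaries)))
            for d in range(dim)]

# Hypothetical neighbors of one "paper" node: two authors and one conference,
# each type projected with its own (toy) weight into the common space.
authors = [project(f, 0.5) for f in ([2.0, 0.0], [0.0, 2.0])]
confs = [project(f, 1.0) for f in ([1.0, 1.0],)]
per_type = [object_level(authors), object_level(confs)]
embedding = type_level(per_type, scores=[0.0, 0.0])  # equal attention here
```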

Experimental Settings
Datasets: four publicly available HIN benchmark datasets, widely used in previous related work.

Experiments Object Classification

Experiments
Object Classification (cont.)

Conclusions
- SHGP is a novel heterogeneous graph pre-training framework.
- Unlike existing SSL methods on HINs, SHGP does not require any positive or negative examples, giving it a high degree of flexibility and ease of use.
- Its two modules utilize and enhance each other, promoting the model to effectively learn informative embeddings.