240610_Thuy_Labseminar[Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules].pptx


About This Presentation

Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules


Slide Content

Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules
Van Thuy Hoang, Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea
E-mail: [email protected]
2024-06-10

BACKGROUND: Graph Convolutional Networks (GCNs)
Key Idea: Each node aggregates information from its neighborhood to obtain a contextualized node embedding.
Limitation: Most GNNs focus on homogeneous graphs.
(Figure: neural transformation and aggregation of neighbors' information.)
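As a minimal illustration of the aggregate-then-transform idea (not taken from the slides), the sketch below implements one GCN-style layer over a dense adjacency matrix; the normalization and class name are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One GCN-style layer: aggregate neighbors' information, then apply a neural transformation."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, F) node features, adj: (N, N) dense adjacency matrix
        adj_hat = adj + torch.eye(adj.size(0))                     # add self-loops
        deg_inv_sqrt = adj_hat.sum(dim=1).pow(-0.5)
        norm_adj = deg_inv_sqrt.unsqueeze(1) * adj_hat * deg_inv_sqrt.unsqueeze(0)
        return torch.relu(self.linear(norm_adj @ x))               # aggregate, transform, activate
```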

BACKGROUND: Motivation
A large part of deep learning revolves around finding rich representations of unstructured data such as images, text, and graphs; this motivates self-supervised learning (SSL) on graphs.

BACKGROUND: Motivation
The architecture of the Variational Graph AutoEncoder (VGAE).
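A minimal sketch of the VGAE idea (a GCN encoder producing a latent mean and log-variance, and an inner-product decoder reconstructing the adjacency); the class and layer names are illustrative assumptions and reuse the SimpleGCNLayer sketched above.

```python
import torch
import torch.nn as nn

class TinyVGAE(nn.Module):
    """Variational Graph AutoEncoder sketch: GCN encoder -> latent z -> inner-product decoder."""
    def __init__(self, in_dim, hid_dim, lat_dim):
        super().__init__()
        self.gcn = SimpleGCNLayer(in_dim, hid_dim)   # hypothetical GCN layer from the sketch above
        self.mu = nn.Linear(hid_dim, lat_dim)        # mean of the node latent distribution
        self.logvar = nn.Linear(hid_dim, lat_dim)    # log-variance of the node latent distribution

    def forward(self, x, adj):
        h = self.gcn(x, adj)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        adj_rec = torch.sigmoid(z @ z.t())                     # inner-product decoder
        return adj_rec, mu, logvar
```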

Graph Tokenizer
Given a graph G, the graph tokenizer employs a graph fragmentation function to break G into smaller subgraphs, such as nodes and motifs. These fragments are then mapped into fixed-length tokens that serve as the targets to be reconstructed later. The granularity of the graph tokens determines the abstraction level of the representations learned in masked modeling.
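The sketch below illustrates this tokenizer interface under the simplest choice, node-level fragments; the function and class names are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn as nn

def fragment_into_nodes(x, adj):
    """Fragmentation f(g): each node together with its 1-hop neighbors forms one fragment."""
    fragments = []
    for v in range(x.size(0)):
        neighbors = adj[v].nonzero(as_tuple=True)[0]
        fragments.append((v, neighbors))
    return fragments

class NodeTokenizer(nn.Module):
    """tok(g): map every fragment to a fixed-length token y_t in R^d (the reconstruction target)."""
    def __init__(self, in_dim, token_dim):
        super().__init__()
        self.embed = nn.Linear(in_dim, token_dim)

    def forward(self, x, adj):
        tokens = []
        for v, neighbors in fragment_into_nodes(x, adj):
            frag_feats = torch.cat([x[v:v + 1], x[neighbors]], dim=0)  # the fragment's node features
            tokens.append(self.embed(frag_feats.mean(dim=0)))          # pool, then project to R^d
        return torch.stack(tokens)                                     # (num_fragments, token_dim)
```

Swapping fragment_into_nodes for a motif-level fragmentation changes the granularity of the tokens, and with it the abstraction level of the reconstruction targets.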

Preliminary: Masked Graph Modeling
Three key steps: graph tokenizer, graph masking, and graph autoencoder.
Graph tokenizer: tok(g) = { y_t = m(t) ∈ R^d | t ∈ f(g) } generates the graph tokens used as reconstruction targets. The tokenizer tok(·) is composed of a fragmentation function f that breaks g into a set of subgraphs and a mapping m that embeds each fragment t into a d-dimensional token.
Graph masking: remask decoding masks the hidden representations of the masked nodes Vm again with a special token m1 before decoding.
Graph autoencoder: the encoder-decoder trained to reconstruct the tokens of the masked parts.
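A hedged sketch of remask decoding under these definitions: the masked nodes Vm are replaced by a mask token at the input, and their hidden representations are masked again with the special token m1 before decoding. The encoder and decoder here are placeholders.

```python
import torch

def remask_decode(x, adj, mask_idx, encoder, decoder, mask_token, remask_token):
    """Masked graph modeling forward pass with remask decoding (sketch).

    x            : (N, F) node features; mask_idx: indices of the masked nodes Vm
    mask_token   : (F,) token replacing the input features of Vm
    remask_token : (H,) special token m1 replacing the hidden states of Vm before decoding
    """
    x_masked = x.clone()
    x_masked[mask_idx] = mask_token        # input masking
    h = encoder(x_masked, adj)             # the encoder only sees the masked inputs
    h = h.clone()
    h[mask_idx] = remask_token             # remask: hide the encoder's states for Vm from the decoder
    return decoder(h, adj)                 # the decoder predicts the tokens y_t of the masked nodes
```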

Preliminary: Revisiting Molecule Tokenizers
The molecule tokenizers are summarized into four distinct categories.

(a) A pretrained GNN-based tokenizer. (b) A motif-based tokenizer that applies fragmentation functions for cycles and the remaining nodes. (c) A two-layer GIN-based tokenizer that extracts the 2-hop rooted subtree of every node in the graph.
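To make the motif-based option in (b) concrete, here is a small, hedged sketch using networkx: rings (cycles) become motifs and every node outside a ring becomes its own fragment. This is an illustration, not the paper's fragmentation code.

```python
import networkx as nx

def motif_fragments(g: nx.Graph):
    """Motif-based fragmentation: cycle (ring) motifs plus the remaining single nodes."""
    cycles = [frozenset(c) for c in nx.cycle_basis(g)]
    in_cycle = set().union(*cycles) if cycles else set()
    singles = [frozenset([v]) for v in g.nodes if v not in in_cycle]
    return cycles + singles
```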

Overview of the SimSGT framework. It applies the GTS architecture (GINE and GraphTrans) for both its encoder and decoder. SimSGT features a Simple GNN-based Tokenizer (SGT) and employs a new remask strategy to decouple the encoder and decoder of the GTS architecture.
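The sketch below shows how the pieces could fit together in one pretraining step (tokenize targets, sample Vm, encode with remask decoding, regress the targets on the masked nodes). It reuses the hypothetical NodeTokenizer and remask_decode sketches above and is not SimSGT's actual code.

```python
import torch
import torch.nn.functional as F

def pretrain_step(x, adj, tokenizer, encoder, decoder, mask_token, remask_token, mask_ratio=0.25):
    """One masked-graph-modeling step: reconstruct the tokenizer's targets for the masked nodes."""
    with torch.no_grad():
        targets = tokenizer(x, adj)                        # y_t for every node/fragment
    num_mask = max(1, int(mask_ratio * x.size(0)))
    mask_idx = torch.randperm(x.size(0))[:num_mask]        # sample the masked node set Vm
    pred = remask_decode(x, adj, mask_idx, encoder, decoder, mask_token, remask_token)
    return F.mse_loss(pred[mask_idx], targets[mask_idx])   # loss only on the masked nodes
```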

Simple GNN-based Tokenizer
SGT simplifies existing aggregation-based GNNs by removing the nonlinear update functions in the GNN layers. It is inspired by studies showing that carefully designed graph operators can generate effective node representations.
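A minimal sketch of that simplification: an aggregation-only layer with the nonlinear update (MLP/activation) removed, stacked a few times to produce the token targets. The names are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SGTLayer(nn.Module):
    """Aggregation-only GNN layer: no nonlinear update function."""
    def forward(self, x, adj):
        adj_hat = adj + torch.eye(adj.size(0))                           # add self-loops
        deg_inv = adj_hat.sum(dim=1, keepdim=True).clamp(min=1).reciprocal()
        return deg_inv * (adj_hat @ x)                                   # mean aggregation, no MLP/activation

class SimpleGNNTokenizer(nn.Module):
    """Stack a few aggregation-only layers; the outputs serve as reconstruction targets."""
    def __init__(self, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList([SGTLayer() for _ in range(num_layers)])

    def forward(self, x, adj):
        for layer in self.layers:
            x = layer(x, adj)
        return x
```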

Experiments: Molecular property prediction (transfer learning)

Experiments: Transfer learning performance for molecular property prediction (regression)

Conclusion and Future Work
The roles of the tokenizer and decoder in MGM for molecules are examined. A comprehensive range of molecule fragmentation functions is evaluated as molecule tokenizers; the results reveal that a subgraph-level tokenizer improves molecular representation learning (MRL) performance. For future work: the potential application of molecule tokenizers to joint molecule-text modeling.