241223_JW_labseminar[Neural Message Passing for Quantum Chemistry].pptx


About This Presentation

Neural Message Passing for Quantum Chemistry


Slide Content

Neural Message Passing for Quantum Chemistry
Jin-Woo Jeong, Network Science Lab, Dept. of Mathematics, The Catholic University of Korea. E-mail: [email protected]
Gilmer, Justin, et al. "Neural message passing for quantum chemistry." International Conference on Machine Learning. PMLR, 2017.

Contents: Introduction / Message Passing Neural Networks / QM9 Dataset / MPNN Variants / Input Representation / Training / Results / Conclusions / Q&A

Introduction
Predicting molecular properties plays a critical role in fields like drug discovery and materials science, and deep learning has become increasingly important for making accurate predictions in this area. However, traditional machine learning methods rely heavily on hand-crafted feature engineering. Density Functional Theory (DFT), a widely used computational method for predicting molecular properties, provides high accuracy but comes at a significant computational cost. To address these challenges, this paper proposes a framework based on Message Passing Neural Networks (MPNNs), offering a scalable and efficient alternative to traditional methods for predicting molecular properties.

Message Passing Neural Networks: A General Framework
Core contribution: establishes a unified framework for learning on molecular graphs, operating on graph-structured data with nodes (atoms) and edges (bonds).
Two phases in an MPNN:
Message passing phase: information is exchanged between nodes along edges for T steps,
  m_v^{t+1} = \sum_{w \in N(v)} M_t(h_v^t, h_w^t, e_{vw}),   h_v^{t+1} = U_t(h_v^t, m_v^{t+1}),
where M_t is the message function and U_t the vertex update function.
Readout phase: node-level information is aggregated into a graph-level prediction, \hat{y} = R({h_v^T | v \in G}), where R is the readout function.
Significance: unifies various existing graph-based neural networks under a common framework.
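
Below is a minimal, self-contained sketch of the two phases in Python with numpy. The linear/tanh maps standing in for M_t, U_t, and R, the function name mpnn_forward, and all sizes are illustrative assumptions; in the paper these are learned neural networks (e.g., a GRU update), so this is only a shape-level illustration of the framework.

```python
import numpy as np

def mpnn_forward(h, adj, edge_feats, W_msg, W_upd, w_out, T=3):
    """Toy MPNN forward pass: T message-passing steps, then a sum readout.

    h          : (n, d) initial node (atom) states
    adj        : (n, n) 0/1 adjacency matrix (bonds)
    edge_feats : (n, n, e) edge (bond) features
    W_msg      : (d + e, d) weights of a linear stand-in for the message function M_t
    W_upd      : (2 * d, d) weights of a linear stand-in for the update function U_t
    w_out      : (d,)      weights of a linear stand-in for the readout function R
    """
    for _ in range(T):
        # Message passing phase: every node collects messages from its neighbours.
        m = np.zeros_like(h)
        for v in range(h.shape[0]):
            for w in range(h.shape[0]):
                if adj[v, w]:
                    # M_t(h_v, h_w, e_vw): here a function of (h_w, e_vw) only.
                    m[v] += np.tanh(np.concatenate([h[w], edge_feats[v, w]]) @ W_msg)
        # Update phase: U_t(h_v, m_v) -> new hidden state (the paper uses a GRU here).
        h = np.tanh(np.concatenate([h, m], axis=1) @ W_upd)
    # Readout phase: aggregate node states into one graph-level prediction.
    return float(np.sum(h, axis=0) @ w_out)

# Toy 3-atom molecule with random features and weights.
rng = np.random.default_rng(0)
n, d, e = 3, 8, 4
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
y = mpnn_forward(rng.normal(size=(n, d)), adj, rng.normal(size=(n, n, e)),
                 rng.normal(size=(d + e, d)), rng.normal(size=(2 * d, d)),
                 rng.normal(size=d))
print(y)
```

Because messages are summed over neighbours and the readout sums over nodes, the prediction is invariant to the ordering of the atoms, which is the key property the framework exploits.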

Message Passing Neural Networks: Examples of Message Passing Models
Models that fit the framework include Convolutional Networks for Learning Molecular Fingerprints, Interaction Networks, Molecular Graph Convolutions, Deep Tensor Neural Networks, Laplacian-based methods (GCN), and Gated Graph Neural Networks (GG-NN), which serve as the baseline.
For the GG-NN baseline: the message function is M(h_v, h_w, e_{vw}) = A_{e_{vw}} h_w (one learned matrix per edge type), the update function is U_t = GRU(h_v^t, m_v^{t+1}) (the same update function is used at each time step t), and the readout is R = \sum_v \sigma(i(h_v^T, h_v^0)) \odot j(h_v^T), where i and j are neural networks.
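
As a concrete illustration of the GG-NN readout above, here is a toy numpy version; single linear layers stand in for the learned networks i and j, and the function and weight names are made up for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_readout(h_T, h_0, W_i, W_j):
    """GG-NN style readout: R = sum_v sigmoid(i(h_v^T, h_v^0)) * j(h_v^T).
    h_T, h_0 : (n, d) final and initial node states
    W_i      : (2d, d) linear stand-in for the gating network i
    W_j      : (d, d)  linear stand-in for the network j
    """
    gate = sigmoid(np.concatenate([h_T, h_0], axis=1) @ W_i)  # soft attention per node
    return np.sum(gate * (h_T @ W_j), axis=0)                  # gated sum over nodes

rng = np.random.default_rng(0)
n, d = 4, 8
out = ggnn_readout(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                   rng.normal(size=(2 * d, d)), rng.normal(size=(d, d)))
print(out.shape)  # (8,) graph-level vector, fed to a final network for the prediction
```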

QM9 Dataset
Composition: about 134,000 organic molecules built from Hydrogen (H), Carbon (C), Oxygen (O), Nitrogen (N), and Fluorine (F) atoms; each molecule contains up to 9 heavy (non-Hydrogen) atoms.
Data provided: a low-energy molecular geometry and 13 molecular properties per molecule (e.g., energies, vibrational frequencies, electronic properties).
Features: the molecular properties are computed with Density Functional Theory (DFT); the dataset includes both spatial and bonding information, making it suitable for evaluating chemical prediction models.
Purpose: used to benchmark machine learning and deep learning models for molecular property prediction, targeting physicochemical properties that are expensive to compute or measure.

MPNN Variants: Message Functions
Matrix Multiplication (GG-NN): M(h_v, h_w, e_{vw}) = A_{e_{vw}} h_w, using one learned matrix per discrete edge type.
Edge Network: M(h_v, h_w, e_{vw}) = A(e_{vw}) h_w, where A(e_{vw}) is a neural network that maps the edge feature vector e_{vw} to a d x d matrix, so continuous edge features can be used.
Pair Message: m_{wv} = f(h_w^t, h_v^t, e_{vw}), where f is a neural network, so the message depends on both the sending and the receiving node state.
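
A toy numpy sketch of the Edge Network and Pair Message variants follows. The two-layer map standing in for A, the single layer standing in for f, and all names and sizes are illustrative assumptions, not the paper's exact architectures.

```python
import numpy as np

def edge_network_message(h_w, e_vw, W1, W2, d):
    """Edge Network variant: M(h_v, h_w, e_vw) = A(e_vw) @ h_w, where A(e_vw)
    is a d x d matrix produced from the edge feature vector by a small network."""
    A = np.tanh(e_vw @ W1) @ W2            # (d*d,) flattened output of A(e_vw)
    return A.reshape(d, d) @ h_w           # message to node v from node w

def pair_message(h_v, h_w, e_vw, W):
    """Pair Message variant: m_wv = f(h_w, h_v, e_vw); the message sees both the
    sender and the receiver state. A single tanh layer stands in for f."""
    return np.tanh(np.concatenate([h_w, h_v, e_vw]) @ W)

# Toy usage with made-up sizes.
rng = np.random.default_rng(0)
d, e, hidden = 8, 4, 16
h_v, h_w, e_vw = rng.normal(size=d), rng.normal(size=d), rng.normal(size=e)
m1 = edge_network_message(h_w, e_vw, rng.normal(size=(e, hidden)),
                          rng.normal(size=(hidden, d * d)), d)
m2 = pair_message(h_v, h_w, e_vw, rng.normal(size=(2 * d + e, d)))
print(m1.shape, m2.shape)  # both (8,)
```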

MPNN Variants: Virtual Graph Elements
Virtual edges: to let messages travel between distant atoms, a separate virtual edge type is added between every pair of nodes that is not already connected.
Master node: a latent master node is connected to every input node in the graph with a special edge type; it can have its own node dimension as well as separate weights for the internal update function (GRU).
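
The following sketch shows one plausible way to build these augmented graph inputs; the flag-based edge encoding, the function name, and the extra feature channels are assumptions made for illustration (the master node would also need its own initial hidden state and update weights, which are omitted here).

```python
import numpy as np

def add_virtual_elements(adj, edge_feats):
    """Augment a molecular graph with (1) a virtual edge type between every pair
    of nodes that is not already bonded and (2) a master node connected to every
    atom by a special edge type. Two extra feature channels flag these edge types."""
    n, e = adj.shape[0], edge_feats.shape[-1]
    feats = np.zeros((n + 1, n + 1, e + 2))
    feats[:n, :n, :e] = edge_feats                       # keep the original bond features
    new_adj = np.ones((n + 1, n + 1)) - np.eye(n + 1)    # fully connected, no self loops
    for v in range(n):
        for w in range(n):
            if v != w and not adj[v, w]:
                feats[v, w, e] = 1.0                     # virtual-edge flag
    feats[n, :n, e + 1] = 1.0                            # master -> atom edges
    feats[:n, n, e + 1] = 1.0                            # atom -> master edges
    return new_adj, feats

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
new_adj, new_feats = add_virtual_elements(adj, rng.normal(size=(3, 3, 4)))
print(new_adj.shape, new_feats.shape)  # (4, 4) and (4, 4, 6)
```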

MPNN Variants: Readout Functions
GG-NN readout (original): R = \sum_v \sigma(i(h_v^T, h_v^0)) \odot j(h_v^T), where i and j are neural networks.
set2set: a linear projection is applied to each tuple (h_v^T, x_v), and the set of projected tuples is the input to the set2set model. After M steps of computation, set2set produces a graph-level embedding q_t^* that is invariant to the order of the nodes; this embedding is then fed through a neural network to produce the output.
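
Here is a simplified numpy sketch of the set2set-style attention/read loop. The real set2set model uses an LSTM to produce the query at each step; a linear map of the previous (query, read) pair stands in for it here, and all names and sizes are illustrative.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def set2set_readout(m, M_steps, W_q):
    """Toy set2set-style readout over projected node states m of shape (n, d).
    W_q: (2d, d) linear stand-in for the LSTM that produces the query q_t."""
    n, d = m.shape
    q_star = np.zeros(2 * d)                 # [q_t, r_t], initialised to zeros
    for _ in range(M_steps):
        q = np.tanh(q_star @ W_q)            # stand-in for q_t = LSTM(q*_{t-1})
        alpha = softmax(m @ q)               # attention weights over the node set
        r = alpha @ m                        # weighted read of the node states
        q_star = np.concatenate([q, r])      # graph-level embedding after this step
    return q_star                            # fed through a network to get the output

rng = np.random.default_rng(0)
emb = set2set_readout(rng.normal(size=(5, 8)), M_steps=3, W_q=rng.normal(size=(16, 8)))
print(emb.shape)  # (16,) order-invariant graph-level embedding
```

Because the attention and the weighted read are computed over the whole set at once, permuting the nodes leaves the embedding unchanged, which is why set2set can replace the simpler gated sum.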

MPNN Variants: Multiple Towers
One issue with MPNNs is scalability: a single step of the message passing phase for a dense graph with n nodes and node embedding dimension d requires O(n^2 d^2) floating point multiplications.
Solution: multiple towers. Break the d-dimensional node embeddings h_v^t into k separate d/k-dimensional embeddings h_v^{t,k}, run a propagation step on each of the k copies separately to get temporary embeddings \tilde{h}_v^{t+1,k}, and then mix the temporary embeddings of each node according to
  (h_v^{t+1,1}, h_v^{t+1,2}, ..., h_v^{t+1,k}) = g(\tilde{h}_v^{t+1,1}, \tilde{h}_v^{t+1,2}, ..., \tilde{h}_v^{t+1,k}),
where g is a neural network shared across all nodes. A propagation step then costs O(n^2 d^2 / k) across the k copies; for example, the towers variant ran roughly 2 times faster than the corresponding un-towered model.
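
The sketch below shows one propagation step with the towers trick in numpy. The per-tower linear propagation and the linear mixing map standing in for g are illustrative assumptions; only the split/propagate/mix structure reflects the slide.

```python
import numpy as np

def towers_step(h, adj, msg_weights, W_mix, k):
    """One message-passing step with k towers.
    h           : (n, d) node states, split into k towers of width d/k
    adj         : (n, n) adjacency matrix
    msg_weights : list of k (d/k, d/k) per-tower propagation weights
    W_mix       : (d, d) linear stand-in for the mixing network g (shared by all nodes)
    """
    towers = np.split(h, k, axis=1)                      # k arrays of shape (n, d/k)
    tmp = [np.tanh(adj @ t @ W) for t, W in zip(towers, msg_weights)]  # cheap per-tower steps
    return np.tanh(np.concatenate(tmp, axis=1) @ W_mix)  # g mixes the towers per node

rng = np.random.default_rng(0)
n, d, k = 9, 16, 4
adj = (rng.random((n, n)) < 0.3).astype(float)
weights = [rng.normal(size=(d // k, d // k)) for _ in range(k)]
h_next = towers_step(rng.normal(size=(n, d)), adj, weights, rng.normal(size=(d, d)), k)
print(h_next.shape)  # (9, 16)
```

Each tower multiplies (d/k)-dimensional vectors instead of d-dimensional ones, which is where the factor-of-k saving in the message passing cost comes from.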

Input Representation
Atomic features: each atom is described by a feature vector capturing its electronic properties and bond information. Treating Hydrogen atoms as explicit nodes was tested; it increases the graph size (up to 29 nodes) but slows training significantly (~10x).
Adjacency matrix representations:
1. Chemical graph: encodes bond types (single, double, triple, aromatic).
2. Distance bins: bins interatomic distances into 10 intervals, combining bond type and spatial information.
3. Raw distance feature: 5-dimensional edge vectors containing the Euclidean distance and a one-hot encoding of the bond type.
These approaches aim to represent molecular structure while balancing expressiveness and computation.
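
A small sketch of what edge features in the "raw distance" and "distance bins" styles could look like; the bin range, helper names, and exact encodings are assumptions for illustration, not the paper's exact preprocessing.

```python
import numpy as np

BOND_TYPES = ["single", "double", "triple", "aromatic"]

def raw_distance_feature(bond_type, distance):
    """Raw-distance-style edge feature: a one-hot bond type (4 dims, all zero if
    the atoms are not bonded) plus the Euclidean distance, i.e. a 5-dim vector."""
    one_hot = np.zeros(len(BOND_TYPES))
    if bond_type in BOND_TYPES:
        one_hot[BOND_TYPES.index(bond_type)] = 1.0
    return np.concatenate([one_hot, [distance]])

def distance_bin(distance, n_bins=10, max_dist=5.0):
    """Distance-bins-style encoding: bucket the interatomic distance into one of
    n_bins one-hot intervals (10 bins per the slide; the range here is made up)."""
    idx = min(int(distance / max_dist * n_bins), n_bins - 1)
    one_hot = np.zeros(n_bins)
    one_hot[idx] = 1.0
    return one_hot

print(raw_distance_feature("double", 1.34))   # [0. 1. 0. 0. 1.34]
print(distance_bin(1.34))                     # one-hot over 10 distance bins
```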

Training
Hyperparameter search: random search with 50 trials; the number of message passing steps T ranged from 3 to 8, and the number of set2set computations M ranged from 1 to 12.
Optimizer: Adam. Loss function: MSE. Evaluation metric: MAE.
Ensembling: ensembles of the 5 models with the lowest validation error further improved test performance.
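
A minimal sketch of one random-search draw with these ranges; the T and M ranges and the 50-trial budget are from the slide, while the learning-rate range and everything else here is an illustrative assumption.

```python
import random

def sample_hyperparameters():
    """One draw of the random hyperparameter search described above."""
    return {
        "T": random.randint(3, 8),                       # message passing steps
        "M": random.randint(1, 12),                      # set2set computation steps
        "learning_rate": 10 ** random.uniform(-4.0, -3.0),  # made-up range
    }

# 50 random trials, matching the slide's search budget.
trials = [sample_hyperparameters() for _ in range(50)]
print(trials[0])
```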

Results
(The two results slides contain figures/tables that were not captured in the extracted slide text; the headline numbers are summarized in the Conclusion below.)

Conclusion
They propose MPNNs, a unified framework for molecular property prediction that uses message, update, and readout functions to process graph-structured data effectively. Experiments show that MPNNs outperform traditional methods, achieving chemical accuracy on 11 out of 13 targets of the QM9 dataset, and demonstrate scalability improvements through architectural variants such as the towers approach.

Q & A