[20240805_LabSeminar_Huy]GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks.pptx

thanhdowork 79 views 18 slides Aug 06, 2024

About This Presentation

GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks


Slide Content

Quang-Huy Tran
Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea
E-mail: [email protected]
2024-08-05
GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks
Zhonghang Li et al.
NeurIPS 2023: 37th Conference on Neural Information Processing Systems

OUTLINE
- MOTIVATION
- METHODOLOGY
- EXPERIMENT & RESULT
- CONCLUSION

MOTIVATION: Overview and Limitation
- Intelligent Transportation Systems (ITS) is a rapidly growing research field, driven by advances in sensor and communication technology.
- Spatio-temporal (ST) prediction is a main component of decision-making and risk mitigation.
- GNNs have gained attention for spatial and temporal prediction tasks and achieve remarkable results.
Challenges:
- Lack of customized representations of specific spatio-temporal patterns, which are time-dynamic and node-specific.
- Insufficient consideration of spatial dependencies at different levels: previous works focus on pairwise associations between regions but overlook semantic relevance at different spatial levels.

INTRODUCTION: Contribution
- Presents a novel pre-training framework, Generative Pre-Training for Spatio-Temporal prediction (GPT-ST):
  - Integrates seamlessly into existing spatio-temporal neural networks, resulting in performance enhancements.
  - Combines a model parameter customization scheme with self-supervised masked autoencoding, enabling effective spatio-temporal pre-training.
- Utilizes a hierarchical hypergraph structure to capture spatial dependencies at different levels from a global perspective.
- In collaboration with an adaptive mask, the model acquires the capability to capture both intra- and inter-cluster spatial relations among regions, thereby generating robust spatio-temporal representations.

METHODOLOGY: Problem Definition
- Spatio-temporal data: a three-way tensor X of size R × T × F, where R, T, and F are the numbers of regions, time slots, and feature dimensions.
- Spatio-temporal hypergraphs:
  - Vertices: each vertex is a region r in a specific time slot t.
  - Hyperedges: each hyperedge reflects the multipartite region-wise relations among the vertices it connects.
  - Vertex-hyperedge connections: a learnable hypergraph structure.
- Spatio-temporal pre-training: developing a pre-trained ST representation method.
  - Pre-training stage: adapts the masked autoencoding (MAE) task as the training objective. The objective is built from a measure of prediction deviation, a mask tensor, a predictive linear layer, and the element-wise multiplication operation.
  - Downstream task stage: the pre-trained model generates representations from the historical data of the time slots before the current one, and these are used to predict the next time slots.
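
The masked autoencoding objective above can be sketched in NumPy as follows. This is a minimal illustration, not the paper's implementation: it assumes the deviation is a mean absolute difference measured on the masked entries only, and the names `masked_autoencoding_loss`, `toy_encoder`, `w_in`, and `w_pred` are hypothetical stand-ins.

```python
import numpy as np

def masked_autoencoding_loss(x, mask, encoder, w_pred):
    """Mean absolute deviation on the masked entries only (an assumption).

    x:       (R, T, F) spatio-temporal tensor
    mask:    (R, T, F) binary tensor, 1 = visible, 0 = masked
    encoder: callable mapping the masked (R, T, F) input to (R, T, d)
    w_pred:  (d, F) weights of the predictive linear layer
    """
    masked_input = mask * x                # element-wise multiplication
    hidden = encoder(masked_input)         # (R, T, d) representations
    reconstruction = hidden @ w_pred       # (R, T, F) predicted signals
    hidden_part = 1.0 - mask               # entries the encoder never saw
    n_masked = max(hidden_part.sum(), 1.0) # avoid dividing by zero
    return float((np.abs(reconstruction - x) * hidden_part).sum() / n_masked)

rng = np.random.default_rng(0)
R, T, F, d = 4, 6, 2, 8
x = rng.normal(size=(R, T, F))
mask = (rng.random((R, T, F)) > 0.25).astype(float)  # hide roughly 25% of entries
w_in = rng.normal(size=(F, d)) * 0.1
w_pred = rng.normal(size=(d, F)) * 0.1
toy_encoder = lambda z: np.tanh(z @ w_in)  # stand-in for the pre-trained model
loss = masked_autoencoding_loss(x, mask, toy_encoder, w_pred)
```

Scoring only the hidden entries forces the encoder to infer missing signals from the visible context instead of copying its input.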

METHODOLOGY: Main Architecture
a) Customized Temporal Pattern Encoding.
b) Hierarchical Spatial Pattern Encoding.
c) Cluster-aware Masking Mechanism.

METHODOLOGY: Customized Temporal Pattern Encoding
- Initial embedding layer: initializes the representation of the spatio-temporal data X.
  - Z-score normalization and mask operation.
  - Linear transformation: for the r-th region in the t-th time slot, the masked, normalized data is projected by learnable embedding vectors into a representation with d hidden units.
- Temporal hypergraph neural network: a hypergraph neural network that encodes temporal patterns through hyperedges.
- Customized parameter learner: characterizes the diversity of temporal patterns with time-specific parameters and region-specific hypergraph parameters.
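
As a rough sketch of the initial embedding layer (Z-score normalization, masking, then a linear transformation), assuming per-feature statistics computed over all regions and time slots. The function name `initial_embedding` and the matrix name `e0` are illustrative, and the embedding weights are random here instead of learned.

```python
import numpy as np

def initial_embedding(x, mask, e0, eps=1e-8):
    """Z-score normalize, mask, then project features to d hidden units.

    x:    (R, T, F) raw spatio-temporal tensor
    mask: (R, T, F) binary tensor, 1 = keep, 0 = masked
    e0:   (F, d) embedding matrix (learnable in practice, random here)
    """
    mean = x.mean(axis=(0, 1), keepdims=True)  # per-feature mean
    std = x.std(axis=(0, 1), keepdims=True)    # per-feature standard deviation
    x_norm = (x - mean) / (std + eps)
    return (mask * x_norm) @ e0                # (R, T, d)

rng = np.random.default_rng(0)
x = rng.normal(loc=50.0, scale=10.0, size=(4, 6, 2))
mask = np.ones_like(x)
mask[0] = 0.0                                  # hide every signal of region 0
e0 = rng.normal(size=(2, 8))
emb = initial_embedding(x, mask, e0)           # region 0 maps to all zeros
```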

METHODOLOGY: Hierarchical Spatial Pattern Encoding
- Hypergraph capsule clustering network:
  - Traffic-level data augmentation and graph topology-level structure augmentation.
  - Adaptively learns heterogeneity-aware region dependencies in terms of their traffic regularities.
- Iterative hypergraph structure learning with a dynamic routing mechanism:
  - In the j-th iteration, relation scores and hyperedge representations are mutually adjusted, to better reflect the semantic similarities between regions and the spatial cluster centroids represented by hyperedges.
  - The two sets of weights are combined (b + H) to generate the final embeddings.
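
The mutual adjustment of relation scores and hyperedge representations can be illustrated with generic capsule-style dynamic routing. This sketch is not the paper's exact update rule: the per-hyperedge transforms `w`, the `squash` non-linearity, and the iteration count are assumptions, and the final combination with the learnable hypergraph weights (b + H) is left out.

```python
import numpy as np

def softmax(a, axis):
    a = a - a.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def squash(v, eps=1e-8):
    """Capsule non-linearity: keeps direction, maps the norm into [0, 1)."""
    sq = (v ** 2).sum(axis=-1, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def dynamic_routing(e, w, n_iter=3):
    """e: (R, d) region embeddings; w: (H, d, d) one transform per hyperedge.

    Returns relation scores (R, H) and hyperedge representations (H, d).
    """
    votes = np.einsum("rd,hde->rhe", e, w)    # (R, H, d) region-to-hyperedge votes
    b = np.zeros(votes.shape[:2])             # routing logits, start uniform
    for _ in range(n_iter):
        c = softmax(b, axis=1)                # each region spreads over hyperedges
        centroids = squash((c[..., None] * votes).sum(axis=0))  # (H, d)
        b = b + (votes * centroids[None]).sum(axis=-1)          # agreement update
    return softmax(b, axis=1), centroids

rng = np.random.default_rng(0)
e = rng.normal(size=(10, 4))                  # 10 regions, 4 hidden units
w = rng.normal(size=(3, 4, 4)) * 0.5          # 3 hyperedges (cluster centroids)
scores, centroids = dynamic_routing(e, w)
```

Each pass strengthens the link between a region and the hyperedges whose centroid agrees with that region's vote, which is the sense in which scores and centroids adjust each other.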

METHODOLOGY: Hierarchical Spatial Pattern Encoding
- Cross-cluster relation learning: models the inter-cluster relations via a high-level hypergraph neural network.
  - Cluster embeddings are refined by message passing between cluster centroids and high-level hyperedges, using a reshaped embedding matrix of the cluster representations and a high-level hypergraph structure.
  - After refining, the clustered embeddings are propagated back to the regional embeddings through the low-level hypergraph structure.
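
A minimal two-level message-passing sketch of this step, under stated assumptions: the ReLU activations, the residual connection, and the names `cross_cluster_refine`, `h_high`, and `h_low` are illustrative choices, not taken from the slides.

```python
import numpy as np

def cross_cluster_refine(cluster_emb, h_high, h_low):
    """cluster_emb: (H, d) cluster (low-level hyperedge) embeddings
    h_high: (K, H) high-level hypergraph structure with K hyperedges
    h_low:  (R, H) low-level region-to-cluster weights

    Returns refined cluster embeddings (H, d) and region embeddings (R, d).
    """
    # Clusters send messages to the high-level hyperedges.
    high_edges = np.maximum(h_high @ cluster_emb, 0.0)               # (K, d)
    # High-level hyperedges send messages back; residual keeps the original signal.
    refined = cluster_emb + np.maximum(h_high.T @ high_edges, 0.0)   # (H, d)
    # Refined cluster embeddings are propagated back to the regions.
    regions = np.maximum(h_low @ refined, 0.0)                       # (R, d)
    return refined, regions

rng = np.random.default_rng(0)
cluster_emb = rng.normal(size=(3, 4))
h_high = rng.normal(size=(2, 3))
h_low = rng.random((10, 3))
refined, regions = cross_cluster_refine(cluster_emb, h_high, h_low)
```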

METHODOLOGY: Cluster-aware Masking Mechanism
- Design a cluster-aware masking mechanism to enhance intra- and inter-cluster relation learning:
  - Increase the masking proportion for certain categories, making prediction harder within clusters.
  - Completely mask the signals of some clusters, which builds the pre-trained model's ability to transfer knowledge across clusters.
- The adaptive masking strategy uses the previously learned clustering information to build an easy-to-hard masking process.
- At the beginning of training, a portion of the regions in each cluster is masked at random; the masked values can be easily predicted by referring to regions that share similar patterns.

METHODOLOGY Algorithm - Process of the adaptive mask strategy
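
One way to picture an easy-to-hard, cluster-aware schedule is sketched below. It is not the algorithm from the slide: the linear `progress` schedule, the whole-cluster budget, and the function name `cluster_aware_mask` are assumptions, and the easy part draws regions uniformly from the remaining ones instead of a fixed share per cluster.

```python
import numpy as np

def cluster_aware_mask(cluster_ids, mask_ratio, progress, rng):
    """Return a boolean array over regions, True = masked.

    cluster_ids: (R,) cluster index per region (e.g. argmax of relation scores)
    mask_ratio:  overall fraction of regions to mask
    progress:    training progress in [0, 1]; 0 = easy, 1 = hard
    """
    n_regions = len(cluster_ids)
    n_target = int(round(mask_ratio * n_regions))
    masked = np.zeros(n_regions, dtype=bool)

    # Hard part: the budget for hiding whole clusters grows with progress.
    hard_budget = int(progress * n_target)
    for c in rng.permutation(np.unique(cluster_ids)):
        members = np.flatnonzero(cluster_ids == c)
        if masked.sum() + len(members) > hard_budget:
            continue                           # this cluster does not fit
        masked[members] = True

    # Easy part: spend what is left on random regions of the other clusters.
    remaining = n_target - int(masked.sum())
    candidates = rng.permutation(np.flatnonzero(~masked))
    masked[candidates[:remaining]] = True
    return masked

rng = np.random.default_rng(0)
cluster_ids = np.repeat(np.arange(4), 3)       # 12 regions, 4 clusters of 3
early = cluster_aware_mask(cluster_ids, 0.5, progress=0.0, rng=rng)
late = cluster_aware_mask(cluster_ids, 0.5, progress=1.0, rng=rng)
```

Early in training the hidden regions mostly still have visible neighbours in their own cluster, while late in training whole clusters disappear and must be inferred from other clusters.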

METHODOLOGY Algorithm - Process during the pre-training stage

EXPERIMENT AND RESULT: Experiment Settings
- Datasets: PEMS08, METR-LA, NYC Taxi, and NYC Citi Bike.
- Baselines (STGNNs): STGCN [1], GWN [2], TGCN [3], MTGNN [4], MSDR [5], STMGCN [6], CCRNN [7], STSGCN [8], STFGNN [9], ASTGCN [10], STWA [11], and STGODE [12].
[1] Yu, B., Yin, H., & Zhu, Z. (2017). Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv preprint arXiv:1709.04875.
[2] Wu, Z., Pan, S., Long, G., Jiang, J., & Zhang, C. (2019). Graph WaveNet for deep spatial-temporal graph modeling. arXiv preprint arXiv:1906.00121.
[3] Zhao, L., Song, Y., Zhang, C., Liu, Y., Wang, P., Lin, T., ... & Li, H. (2019). T-GCN: A temporal graph convolutional network for traffic prediction. IEEE Transactions on Intelligent Transportation Systems, 21(9), 3848-3858.
[4] Wu, Z., Pan, S., Long, G., Jiang, J., Chang, X., & Zhang, C. (2020, August). Connecting the dots: Multivariate time series forecasting with graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 753-763).
[5] Liu, D., Wang, J., Shang, S., & Han, P. (2022, August). MSDR: Multi-step dependency relation networks for spatial temporal forecasting. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 1042-1050).
[6] Geng, X., Li, Y., Wang, L., Zhang, L., Yang, Q., Ye, J., & Liu, Y. (2019, July). Spatiotemporal multi-graph convolution network for ride-hailing demand forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 3656-3663).
[7] Ye, J., Sun, L., Du, B., Fu, Y., & Xiong, H. (2021, May). Coupled layer-wise graph convolution for transportation demand prediction. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 5, pp. 4617-4625).
[8] Song, C., Lin, Y., Guo, S., & Wan, H. (2020, April). Spatial-temporal synchronous graph convolutional networks: A new framework for spatial-temporal network data forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 01, pp. 914-921).
[9] Li, M., & Zhu, Z. (2021, May). Spatial-temporal fusion graph neural networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 5, pp. 4189-4196).
[10] Guo, S., Lin, Y., Feng, N., Song, C., & Wan, H. (2019, July). Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 922-929).
[11] Cirstea, R. G., Yang, B., Guo, C., Kieu, T., & Pan, S. (2022, May). Towards spatio-temporal aware traffic time series forecasting. In 2022 IEEE 38th International Conference on Data Engineering (ICDE) (pp. 2900-2913). IEEE.
[12] Fang, Z., Long, Q., Song, G., & Xie, K. (2021, August). Spatial-temporal graph ODE networks for traffic flow forecasting. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (pp. 364-373).
- Measurement: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE).
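
For reference, the three measurements can be computed as in the sketch below, using their standard definitions. Skipping zero-valued targets in MAPE is an assumption made here; conventions differ between benchmarks, and the sample values are made up for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred, eps=1e-8):
    """Mean absolute percentage error, in percent.

    Targets with magnitude below eps are skipped (an assumption).
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if abs(t) > eps]
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)

y_true = [10.0, 20.0, 40.0]   # illustrative values, not from the paper
y_pred = [12.0, 18.0, 40.0]
scores = {"MAE": mae(y_true, y_pred),
          "RMSE": rmse(y_true, y_pred),
          "MAPE": mape(y_true, y_pred)}
```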

EXPERIMENT AND RESULT: Overall Performance

EXPERIMENT AND RESULT: Visualization
Fig. Hyperparameter study of total mask and adaptive mask

CONCLUSION: Summarization
- Introduces a scalable and effective pre-training framework tailored for spatio-temporal prediction tasks:
  - A pre-training model focused on capturing spatio-temporal dependencies.
  - Leverages a customized parameter learner and hierarchical hypergraph networks to extract customized spatio-temporal features and region-wise semantic associations, respectively.
- Proposes an adaptive masking strategy that guides the model to learn inferential reasoning capabilities by considering intra-class and inter-class relations during the pre-training stage.