[20240805_LabSeminar_Huy]GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks.pptx

GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks
Zhonghang Li et al., NeurIPS 2023 (37th Conference on Neural Information Processing Systems)
Quang-Huy Tran, Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea
E-mail: [email protected]
2024-08-05

OUTLINE
MOTIVATION
METHODOLOGY
EXPERIMENT & RESULT
CONCLUSION

MOTIVATION
Overview and Limitation
Intelligent Transportation Systems (ITS) is a rapidly growing research field driven by advances in sensor and communication technology. Spatio-temporal (ST) prediction is a main component of decision-making and risk mitigation. GNNs have gained attention for spatial and temporal prediction tasks and achieve remarkable results.
Challenges:
a) Lack of customized representations of specific spatio-temporal patterns, which are time-dynamic and node-specific.
b) Insufficient consideration of spatial dependencies at different levels. Previous works focus on pairwise associations between regions but overlook semantic relevance at different spatial levels.

INTRODUCTION
Contribution
Present a novel pre-training framework, Generative Pre-Training for Spatio-Temporal prediction (GPT-ST):
a) Seamlessly integrates into existing spatio-temporal neural networks, resulting in performance enhancements.
b) Combines a model parameter customization scheme with self-supervised masked autoencoding, enabling effective spatio-temporal pre-training.
c) Utilizes a hierarchical hypergraph structure to capture spatial dependencies at different levels from a global perspective.
d) Together with the adaptive mask, the model acquires the capability to capture both intra- and inter-cluster spatial relations among regions, thereby generating robust spatio-temporal representations.

METHODOLOGY
Problem Definition
Spatio-Temporal Data: a three-way tensor X of shape R × T × F, where R, T, and F are the numbers of regions, time slots, and feature dimensions.
Spatio-Temporal Hypergraphs:
a) Vertices: each vertex is a region r in a specific time slot t.
b) Hyperedges: reflect the multipartite region-wise relations among the vertices connected by one hyperedge.
c) Vertex-hyperedge connections: learned rather than fixed, giving a learnable hypergraph.
Spatio-Temporal Pre-training: developing a pre-trained ST representation method.
a) Pre-training stage: adopt the masked autoencoding (MAE) task as the training objective. The objective combines a measure of prediction deviation, a mask tensor, a predictive linear layer, and an element-wise multiplication operation.
b) Downstream task stage: the representation generated by the pre-trained model from historical data (the time slots before the current time slot) is used to predict the next time slots.
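A minimal sketch of the masked autoencoding objective described above. The slide's equation did not survive extraction, so the function names, the mean-absolute deviation, and the choice to score only the masked entries are assumptions for illustration, not the paper's exact formulation.

```python
def apply_mask(x, mask):
    """Element-wise multiplication of an [R][T][F] data tensor with a mask tensor.

    mask entries: 1 keeps the value visible, 0 hides it from the encoder.
    """
    return [
        [[v * m for v, m in zip(x_t, m_t)] for x_t, m_t in zip(x_r, m_r)]
        for x_r, m_r in zip(x, mask)
    ]


def masked_reconstruction_loss(x, x_hat, mask):
    """Mean absolute deviation between x and x_hat over the masked entries only.

    Assumption: the loss ignores visible entries, as is common in masked
    autoencoding; the source slide does not state this.
    """
    total, count = 0.0, 0
    for x_r, h_r, m_r in zip(x, x_hat, mask):
        for x_t, h_t, m_t in zip(x_r, h_r, m_r):
            for v, h, m in zip(x_t, h_t, m_t):
                if m == 0:
                    total += abs(h - v)
                    count += 1
    return total / count if count else 0.0


# Toy tensor with R=2 regions, T=2 time slots, F=1 feature.
x = [[[1.0], [2.0]], [[3.0], [4.0]]]
mask = [[[1], [0]], [[1], [1]]]          # hide region 0 at time slot 1
encoder_input = apply_mask(x, mask)      # what the pre-training model sees
x_hat = [[[1.0], [2.5]], [[3.0], [4.0]]] # stand-in for the model's reconstruction
loss = masked_reconstruction_loss(x, x_hat, mask)
```

In a real pipeline the reconstruction would come from the encoder followed by the predictive linear layer; here it is a hard-coded stand-in so the loss can be checked by hand.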

METHODOLOGY
Main Architecture
a) Customized Temporal Pattern Encoding.
b) Hierarchical Spatial Pattern Encoding.
c) Cluster-aware Masking Mechanism.

METHODOLOGY
Customized Temporal Pattern Encoding
Initial Embedding Layer: initializes the representation of the spatio-temporal data X.
a) Z-score normalization and mask operation.
b) Linear transformation of the normalized, masked data into a representation for the r-th region in the t-th time slot; d is the hidden unit size, and the embedding vectors are learnable.
Temporal Hypergraph Neural Network: a hypergraph neural network for temporal pattern encoding on hyperedges.
Customized Parameter Learner: characterizes the diversity of temporal patterns with time-specific parameters and region-specific hypergraph parameters.
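The initial embedding step (Z-score normalization, masking, then a linear transformation into d hidden units) can be sketched as follows. The names `z_score` and `embed` and the weight matrix are hypothetical; the weights stand in for the learnable embedding vectors mentioned on the slide and are fixed here only to keep the example deterministic.

```python
from statistics import mean, pstdev


def z_score(values):
    """Normalize a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def embed(x_norm, mask_bit, weights):
    """Project one masked F-dimensional observation to a d-dimensional vector.

    x_norm:   normalized features of one region at one time slot (length F).
    mask_bit: 1 if the observation is visible, 0 if it is masked.
    weights:  F x d matrix, a stand-in for the learnable embedding vectors.
    """
    masked = [mask_bit * v for v in x_norm]
    d = len(weights[0])
    return [sum(masked[f] * weights[f][j] for f in range(len(masked))) for j in range(d)]


normalized = z_score([1.0, 2.0, 3.0])
weights = [[1, 0, 1], [0, 1, 1]]               # F=2 features, d=3 hidden units
visible = embed([1.0, 2.0], 1, weights)        # observation kept
hidden = embed([1.0, 2.0], 0, weights)         # observation masked out
```

A masked observation maps to the zero vector under this sketch, so the encoder has to infer it from neighbouring time slots and regions.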

METHODOLOGY
Hierarchical Spatial Pattern Encoding
Hypergraph Capsule Clustering Network:
a) Traffic-level data augmentation and graph topology-level structure augmentation.
b) Adaptively learns heterogeneity-aware region dependencies in terms of their traffic regularities.
Iterative Hypergraph Structure Learning: a dynamic routing mechanism.
a) In the j-th iteration, relation scores and hyperedge representations are mutually adjusted, to better reflect the semantic similarities between regions and the spatial cluster centroids represented by hyperedges.
b) Combines two sets of weights (b + H) and generates the final embeddings.
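To illustrate the mutual adjustment of relation scores and hyperedge representations, here is a sketch of generic capsule-style dynamic routing. It is not the paper's exact update rule (the slide's equations were lost); the squash non-linearity, the dot-product agreement, and the way the relation scores b are added to a structure matrix H before the softmax are assumptions based on the "(b + H)" note.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def squash(v):
    """Capsule squash: keeps direction, maps the norm into [0, 1)."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        return [0.0] * len(v)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in v]


def dynamic_routing(regions, structure, iterations=3):
    """regions: R x d embeddings; structure: R x K stand-in for the learnable H."""
    num_regions, dim = len(regions), len(regions[0])
    num_clusters = len(structure[0])
    b = [[0.0] * num_clusters for _ in range(num_regions)]  # relation scores
    for _ in range(iterations):
        # Region-to-cluster weights from the combined scores (b + H).
        weights = [
            softmax([b[r][k] + structure[r][k] for k in range(num_clusters)])
            for r in range(num_regions)
        ]
        # Cluster centroids (hyperedge representations) as weighted sums.
        centroids = []
        for k in range(num_clusters):
            s = [sum(weights[r][k] * regions[r][i] for r in range(num_regions))
                 for i in range(dim)]
            centroids.append(squash(s))
        # Raise the score of clusters whose centroid agrees with the region.
        for r in range(num_regions):
            for k in range(num_clusters):
                b[r][k] += sum(x * c for x, c in zip(regions[r], centroids[k]))
    return weights, centroids


regions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
structure = [[0.1, 0.0], [0.1, 0.0], [0.0, 0.1], [0.0, 0.1]]
weights, centroids = dynamic_routing(regions, structure)
assignment = [max(range(2), key=lambda k: w[k]) for w in weights]
```

The small bias in `structure` breaks the symmetry between clusters; without it, zero-initialized scores would leave every region equally attached to every cluster.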

METHODOLOGY
Hierarchical Spatial Pattern Encoding
Cross-Cluster Relation Learning: models the inter-cluster relations via a high-level hypergraph neural network.
a) Refines the cluster embeddings by message passing between cluster centroids and high-level hyperedges, using a reshaped embedding matrix obtained from the cluster embeddings and a high-level hypergraph structure.
b) After refining, propagates the clustered embeddings back to the regional embeddings with the low-level hypergraph structure.
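The two steps above (refine cluster embeddings through high-level hyperedges, then propagate back to regions) follow the usual hypergraph message-passing pattern. The sketch below shows that generic pattern only: the residual addition, the absence of activations, and all names (`cross_cluster_refine`, `high_structure`, `low_structure`) are assumptions, since the slide's formulas were not recoverable.

```python
def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


def cross_cluster_refine(cluster_emb, high_structure, low_structure):
    """cluster_emb: C x d; high_structure: E x C; low_structure: R x C."""
    # Clusters -> high-level hyperedges.
    hyperedge_emb = matmul(high_structure, cluster_emb)            # E x d
    # High-level hyperedges -> clusters, added as a residual message.
    messages = matmul(transpose(high_structure), hyperedge_emb)    # C x d
    refined = [
        [c + m for c, m in zip(c_row, m_row)]
        for c_row, m_row in zip(cluster_emb, messages)
    ]
    # Refined clusters -> regions via the low-level structure.
    region_emb = matmul(low_structure, refined)                    # R x d
    return refined, region_emb


cluster_emb = [[1.0, 0.0], [0.0, 1.0]]            # C=2 clusters, d=2
high_structure = [[0.5, 0.5]]                     # E=1 high-level hyperedge
low_structure = [[1, 0], [0, 1], [0.5, 0.5]]      # R=3 regions
refined, region_emb = cross_cluster_refine(cluster_emb, high_structure, low_structure)
```

The third region sits between the two clusters, so its embedding ends up as the average of the two refined cluster embeddings.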

METHODOLOGY
Cluster-aware Masking Mechanism
Design a cluster-aware masking mechanism to enhance intra- and inter-cluster relation learning:
a) Increase the masking proportion for certain categories to raise the prediction difficulty for those clusters.
b) Completely mask the signals of some clusters, facilitating cross-cluster knowledge transfer in the pre-trained model.
The adaptive masking strategy incorporates the previously learned clustering information to develop an easy-to-hard masking process.
At the beginning of training, randomly mask a portion of regions in each cluster; the masked values can then be easily predicted by referring to in-cluster regions that share similar patterns.
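One way to picture the easy-to-hard process is a schedule that starts with random masking and gradually spends more of a fixed mask budget on wholly masked clusters. The sketch below uses a hypothetical linear schedule driven by a `progress` value in [0, 1]; the paper's actual algorithm may differ, and `cluster_aware_mask` is an illustrative name.

```python
import random


def cluster_aware_mask(cluster_of, mask_ratio, progress, rng):
    """Return the set of region indices to mask.

    cluster_of: list mapping region index -> cluster id.
    mask_ratio: fraction of all regions to mask.
    progress:   0.0 at the start of training (easy), 1.0 at the end (hard).
    """
    clusters = {}
    for region, cluster in enumerate(cluster_of):
        clusters.setdefault(cluster, []).append(region)

    budget = round(mask_ratio * len(cluster_of))
    hard_budget = int(progress * budget)  # share spent on whole clusters

    masked = set()
    order = list(clusters)
    rng.shuffle(order)
    # Hard part: mask entire clusters while they fit in the hard budget.
    for cluster in order:
        if len(clusters[cluster]) <= hard_budget - len(masked):
            masked.update(clusters[cluster])

    # Easy part: fill the remaining budget with randomly chosen regions.
    remaining = [r for r in range(len(cluster_of)) if r not in masked]
    rng.shuffle(remaining)
    masked.update(remaining[: budget - len(masked)])
    return masked


cluster_of = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
early = cluster_aware_mask(cluster_of, 0.5, 0.0, random.Random(0))
late = cluster_aware_mask(cluster_of, 0.5, 1.0, random.Random(0))
```

Early in training a masked region usually has visible neighbours in its own cluster; late in training whole clusters disappear, so the model must rely on cross-cluster relations.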

METHODOLOGY
Algorithm: Process of the adaptive mask strategy

METHODOLOGY
Algorithm: Process during the pre-training stage

EXPERIMENT AND RESULT
Experiment Settings
Datasets: PEMS08, METR-LA, NYC Taxi, and NYC Citi Bike.
Baselines (STGNN): STGCN [1], GWN [2], TGCN [3], MTGNN [4], MSDR [5], STMGCN [6], CCRNN [7], STSGCN [8], STFGNN [9], ASTGCN [10], STWA [11], and STGODE [12].
Measurement: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE).
[1] Yu, B., Yin, H., & Zhu, Z. (2017). Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv preprint arXiv:1709.04875.
[2] Wu, Z., Pan, S., Long, G., Jiang, J., & Zhang, C. (2019). Graph WaveNet for deep spatial-temporal graph modeling. arXiv preprint arXiv:1906.00121.
[3] Zhao, L., Song, Y., Zhang, C., Liu, Y., Wang, P., Lin, T., ... & Li, H. (2019). T-GCN: A temporal graph convolutional network for traffic prediction. IEEE Transactions on Intelligent Transportation Systems, 21(9), 3848-3858.
[4] Wu, Z., Pan, S., Long, G., Jiang, J., Chang, X., & Zhang, C. (2020, August). Connecting the dots: Multivariate time series forecasting with graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 753-763).
[5] Liu, D., Wang, J., Shang, S., & Han, P. (2022, August). MSDR: Multi-step dependency relation networks for spatial temporal forecasting. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 1042-1050).
[6] Geng, X., Li, Y., Wang, L., Zhang, L., Yang, Q., Ye, J., & Liu, Y. (2019, July). Spatiotemporal multi-graph convolution network for ride-hailing demand forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 3656-3663).
[7] Ye, J., Sun, L., Du, B., Fu, Y., & Xiong, H. (2021, May). Coupled layer-wise graph convolution for transportation demand prediction. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 5, pp. 4617-4625).
[8] Song, C., Lin, Y., Guo, S., & Wan, H. (2020, April). Spatial-temporal synchronous graph convolutional networks: A new framework for spatial-temporal network data forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 01, pp. 914-921).
[9] Li, M., & Zhu, Z. (2021, May). Spatial-temporal fusion graph neural networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 5, pp. 4189-4196).
[10] Guo, S., Lin, Y., Feng, N., Song, C., & Wan, H. (2019, July). Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 922-929).
[11] Cirstea, R. G., Yang, B., Guo, C., Kieu, T., & Pan, S. (2022, May). Towards spatio-temporal aware traffic time series forecasting. In 2022 IEEE 38th International Conference on Data Engineering (ICDE) (pp. 2900-2913). IEEE.
[12] Fang, Z., Long, Q., Song, G., & Xie, K. (2021, August). Spatial-temporal graph ODE networks for traffic flow forecasting. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (pp. 364-373).
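For reference, the three evaluation metrics can be computed as in the sketch below. The function names are illustrative, and skipping zero-valued targets in MAPE is an assumption (a common convention for traffic data); the slides do not say how zeros are handled.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent, skipping zero targets (assumption)."""
    terms = [abs((t - p) / t) for t, p in zip(y_true, y_pred) if t != 0]
    return 100.0 * sum(terms) / len(terms) if terms else 0.0


y_true = [10.0, 20.0, 40.0]
y_pred = [12.0, 18.0, 40.0]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

RMSE penalizes large individual errors more than MAE, while MAPE expresses the error relative to the size of the true value, which is why the three are usually reported together.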

EXPERIMENT AND RESULT
Result – Overall Performance

EXPERIMENT AND RESULT
Result – Visualization
Fig. Hyperparameter study of total mask and adaptive mask

CONCLUSION
Summary
Introduce a scalable and effective pre-training framework tailored for spatio-temporal prediction tasks:
a) A pre-training model focused on capturing spatio-temporal dependencies.
b) It leverages a customized parameter learner and hierarchical hypergraph networks to extract customized spatio-temporal features and region-wise semantic associations, respectively.
Propose an adaptive masking strategy:
a) It guides the model to learn inferential reasoning capabilities by considering intra-class and inter-class relations during the pre-training stage.
