[20240513_LabSeminar_Huy]GraphFewShort_Transfer.pptx

thanhdowork · May 13, 2024

About This Presentation

Spatio-Temporal Graph Few-Shot Learning with Cross-City Knowledge Transfer


Slide Content

Quang-Huy Tran, Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea. E-mail: [email protected]. 2024-05-13. Paper: Spatio-Temporal Graph Few-Shot Learning with Cross-City Knowledge Transfer, Bin Lu et al., KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022.

OUTLINE MOTIVATION INTRODUCTION METHODOLOGY EXPERIMENT & RESULT CONCLUSION

MOTIVATION Overview: Spatio-temporal graph learning is a key method for urban computing tasks such as traffic flow, taxi demand, and air quality forecasting. However, due to the high cost of data collection, developing cities have little available data, which makes it infeasible to train a well-performing model. Knowledge transfer, e.g., few-shot learning, has made progress on this problem, but previous works face two challenges: transferring from a single source city risks negative transfer because of the large differences between cities; transferring from multiple source cities ignores the varied feature differences across cities and within cities.

INTRODUCTION Goal: transfer cross-city knowledge in graph-based few-shot learning scenarios. Research challenges: How to adapt feature extraction in the target city using the knowledge from multiple source cities? How to alleviate the impact of varied graph structures when transferring among different cities? The paper proposes a novel, model-agnostic Spatio-Temporal Graph Few-Shot Learning framework (ST-GFSL), which generates non-shared model parameters based on node-level meta knowledge to enhance city-specific feature extraction, and reconstructs the graph structure of different cities based on the meta knowledge.

METHODOLOGY PROBLEM SETTING Given an ST graph G = (V, E, A, X), where V is the set of nodes, E is the set of edges, A is the binary adjacency matrix, and X is the node feature matrix. For ST graph forecasting tasks, the goal is to learn a function f that approximates the true mapping from historically observed signals to future signals. For ST graph few-shot learning, suppose we have P source graph cities and a target city G_T. After training on the source cities, the model can leverage the meta knowledge from the multiple source graphs and is tasked to predict on a disjoint target scenario, where only few-shot structured data of G_T is available.
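A minimal formalization of the forecasting setup described above, assuming standard notation; the horizon symbols T and T' and the dimension symbols N, D are illustrative and not quoted from the slides:

```latex
% ST graph: node set, edge set, binary adjacency, node signal tensor
G = (V, E, A, X), \qquad A \in \{0,1\}^{N \times N}, \qquad X \in \mathbb{R}^{N \times \tau \times D}

% Forecasting: learn f_\theta mapping the last T observed signals to the next T' signals
[X_{t-T+1}, \dots, X_{t};\, G] \;\xrightarrow{\;f_\theta\;}\; [\hat{X}_{t+1}, \dots, \hat{X}_{t+T'}]
```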

METHODOLOGY Overall Architecture: Spatio-Temporal Neural Network (STNN) and Cross-City Knowledge Transfer.

METHODOLOGY Spatio-Temporal Meta Knowledge Learner (ST-Meta Learner) Employs a Gated Recurrent Unit (GRU) to encode temporal dependencies. Utilizes a spatial graph attention network (GAT) to encode the spatial correlations. Meta knowledge: the node-level spatio-temporal representation obtained by combining the GRU and GAT outputs.
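A minimal PyTorch sketch of such an ST-Meta learner (not the authors' implementation): a GRU encodes each node's temporal signal, a GAT layer encodes spatial correlations on the city graph, and a linear fusion produces the node-level meta knowledge. The concatenation-based fusion and all dimension names are assumptions.

```python
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv


class STMetaLearner(nn.Module):
    """Node-level meta knowledge from temporal (GRU) and spatial (GAT) encoders."""

    def __init__(self, in_dim: int, hidden_dim: int, meta_dim: int):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden_dim, batch_first=True)  # temporal encoder
        self.gat = GATConv(in_dim, hidden_dim, heads=1)           # spatial encoder
        self.fuse = nn.Linear(2 * hidden_dim, meta_dim)           # fusion into meta knowledge

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, seq_len, in_dim] historical signals per node
        _, h_t = self.gru(x)                          # [1, num_nodes, hidden_dim]
        temporal = h_t.squeeze(0)                     # [num_nodes, hidden_dim]
        spatial = self.gat(x[:, -1, :], edge_index)   # last-step features over the graph
        return self.fuse(torch.cat([temporal, spatial], dim=-1))  # [num_nodes, meta_dim]
```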

METHODOLOGY ST-Meta Graph Reconstruction The ST-Meta graph is reconstructed from the meta knowledge for structure-aware learning: the likelihood of an edge existing between two nodes is predicted from their meta-knowledge vectors. To guide the structure-aware learning of the meta knowledge, a graph reconstruction loss is introduced that compares the predicted edges with the original adjacency; the ST-meta graph can then be constructed from the predicted edge likelihoods.
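A minimal sketch of this step under the assumption that the edge likelihood is scored by an inner product of the two nodes' meta-knowledge vectors; the exact decoder in the paper may differ.

```python
import torch
import torch.nn.functional as F


def graph_reconstruction_loss(meta: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
    """meta: [num_nodes, meta_dim] node-level meta knowledge.
    adj:  [num_nodes, num_nodes] binary adjacency of the city graph."""
    logits = meta @ meta.t()                                  # pairwise edge scores
    return F.binary_cross_entropy_with_logits(logits, adj.float())


def reconstruct_st_meta_graph(meta: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Build the ST-meta graph by thresholding the predicted edge likelihoods."""
    probs = torch.sigmoid(meta @ meta.t())
    return (probs > threshold).float()
```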

METHODOLOGY Parameter Generation A generator function takes the node-level meta knowledge as input and outputs the non-shared feature extractor parameters, so that feature extractors for different scenarios obtain their own non-shared weights. Linear layer: two linear transformations with a reshape in between. Convolutional layer: two 2D convolutions with a reshape in between.
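A minimal sketch of a parameter generator for a linear feature extractor, following the "two linear transformations with a reshape in between" description; the hidden size and the way the generated weights are applied are illustrative assumptions.

```python
import torch
import torch.nn as nn


class LinearParamGenerator(nn.Module):
    """Generate node-specific (non-shared) weights for a linear feature extractor."""

    def __init__(self, meta_dim: int, in_dim: int, out_dim: int, hidden: int = 16):
        super().__init__()
        self.in_dim = in_dim
        self.proj1 = nn.Linear(meta_dim, in_dim * hidden)  # first linear transformation
        self.proj2 = nn.Linear(hidden, out_dim)            # second linear transformation

    def forward(self, meta: torch.Tensor) -> torch.Tensor:
        # meta: [num_nodes, meta_dim]
        h = self.proj1(meta).view(meta.size(0), self.in_dim, -1)  # reshape in the middle
        w = self.proj2(h)                                          # [num_nodes, in_dim, out_dim]
        return w.transpose(1, 2)                                   # [num_nodes, out_dim, in_dim]


# Applying the generated weights to node features x of shape [num_nodes, in_dim]:
# y = torch.einsum('noi,ni->no', generator(meta), x)
```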

METHODOLOGY ST-GFSL Learning Process Sample batches of task sets from the source datasets; each task belongs to one single city and is divided into a support set and a query set. When learning a task, a joint loss is considered that combines the prediction loss with the graph reconstruction loss. ST-GFSL is model-agnostic: the base spatio-temporal network can be any feature extractor. Training has two stages: base-model meta training and adaptation. Meta objective: minimize the sum of task losses on the query sets.
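A minimal first-order MAML-style sketch of the two-stage procedure above; `prediction_loss`, `reconstruction_loss`, and the loss weight `lam` are hypothetical helpers standing in for the paper's joint task loss.

```python
import copy
import torch


def joint_loss(model, batch, lam=0.5):
    # Assumed helpers: the base ST model exposes a forecasting loss and a
    # graph reconstruction loss; the joint task loss combines the two.
    return model.prediction_loss(batch) + lam * model.reconstruction_loss(batch)


def meta_train_step(model, tasks, inner_lr=1e-3, meta_lr=1e-3):
    """One meta-update over a batch of (support, query) tasks sampled from source cities."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    total = 0.0
    for support, query in tasks:
        adapted = copy.deepcopy(model)                           # task-specific copy
        # Inner loop: adapt on the support set with the joint loss.
        grads = torch.autograd.grad(joint_loss(adapted, support), adapted.parameters())
        with torch.no_grad():
            for p, g in zip(adapted.parameters(), grads):
                p -= inner_lr * g
        # Outer loop: query loss of the adapted model (first-order meta-gradient).
        loss_q = joint_loss(adapted, query)
        for mg, g in zip(meta_grads, torch.autograd.grad(loss_q, adapted.parameters())):
            mg += g
        total += loss_q.item()
    # Meta objective: minimize the summed query losses across tasks.
    with torch.no_grad():
        for p, mg in zip(model.parameters(), meta_grads):
            p -= meta_lr * mg / len(tasks)
    return total / len(tasks)
```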

EXPERIMENT AND RESULT EXPERIMENT - BASELINES Measurement: Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). Datasets: METR-LA, PEMS-BAY, Didi-Chengdu, and Didi-Shenzhen traffic datasets. Baselines: HA: Historical Average, which formulates traffic flow as a seasonal process and uses the average of previous seasons as the prediction. ARIMA: Auto-Regressive Integrated Moving Average, a well-known model for understanding and predicting future values of a time series. Target-only: directly training the model on the few-shot data in the target domain. Fine-tuned (Vanilla): first train the model on the source datasets, then fine-tune it on the few-shot data in the target domain.

EXPERIMENT AND RESULT EXPERIMENT – BASELINES (cont.) Fine-tuned (ST-Meta): compared with the "Fine-tuned (Vanilla)" method, the proposed meta-knowledge-based parameter generation is added to produce non-shared parameters for the model. AdaRNN [1]: a state-of-the-art transfer learning framework for non-stationary time series. MAML [2]: Model-Agnostic Meta-Learning. Several advanced spatio-temporal graph learning algorithms are also applied within the ST-GFSL framework: TCN [3]: a temporal convolution network based on 1D dilated convolutions. STGCN [4]: spatial-temporal graph convolution network, which combines graph convolution with 1D convolution. GWN [5]: Graph WaveNet, a convolutional architecture that combines graph convolution with dilated causal convolution and a self-adaptive graph.
[1] Du, Y., Wang, J., Feng, W., Pan, S., Qin, T., Xu, R., & Wang, C. (2021). AdaRNN: Adaptive learning and forecasting of time series. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (pp. 402-411).
[2] Finn, C., Abbeel, P., & Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. In International Conference on Machine Learning (pp. 1126-1135). PMLR.
[3] Lea, C., Flynn, M. D., Vidal, R., Reiter, A., & Hager, G. D. (2017). Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 156-165).
[4] Yu, B., Yin, H., & Zhu, Z. (2017). Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv preprint arXiv:1709.04875.
[5] Wu, Z., Pan, S., Long, G., Jiang, J., & Zhang, C. (2019). Graph WaveNet for deep spatial-temporal graph modeling. arXiv preprint arXiv:1906.00121.

EXPERIMENT AND RESULT RESULT – Overall Performance


CONCLUSION The paper proposes a spatio-temporal graph few-shot learning framework, ST-GFSL, for cross-city knowledge transfer. Non-shared feature extractor parameters are generated from node-level meta knowledge, which improves the effectiveness of spatio-temporal representation on multiple datasets and transfers cross-city knowledge via parameter matching from similar spatio-temporal meta knowledge. ST-GFSL integrates the graph reconstruction loss to achieve structure-aware learning. ST-GFSL applies not only to traffic speed prediction but also to other few-shot scenarios, such as taxi demand prediction and indoor environment monitoring in different warehouses.