[NS][Lab_Seminar_240909]Sparse Multi-Relational Graph Convolutional Network for Multi-type Object Trajectory Prediction.pptx

thanhdowork 60 views 24 slides Sep 09, 2024

Slide Content

Sparse Multi-Relational Graph Convolutional Network for Multi-type Object Trajectory Prediction
Jianhui Zhang et al., IJCAI 2024
Tien-Bach-Thanh Do, Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea
E-mail: osfa19730@catholic.ac.kr
2024/09/09

Introduction
Object trajectory prediction: a critical task in video surveillance, autonomous driving, and smart cities.
Challenges arise from interactions among multiple types of objects (pedestrians, cars, bicycles).
Problem: traditional approaches focus mainly on pedestrian interactions and ignore multi-type object dynamics.
Solution: a novel method, the Sparse Multi-Relational Graph Convolutional Network (SMGCN), handles multi-type object interactions in a sparse and dynamic environment.

Motivation & Challenges
Existing gaps: most models focus on pedestrian trajectories only, while complex real-world scenarios involve different object types with varying speeds and interactions.
Challenges: handling multiple object types with distinct motion behaviors; avoiding superfluous interactions between objects.

Key Contributions
Sparse Multi-Relational Graph Convolutional Network (SMGCN): captures multi-type object interactions.
Adaptive Sparsification: dynamically prunes unnecessary connections between objects, improving efficiency and accuracy.
Global Temporal Aggregation (GTA): mitigates error accumulation in long-term trajectory prediction by leveraging multi-round updates.

Model: SMGCN
Figure 3: Sparse multi-relational graph learning. It consists of two procedures: sparse multi-relational spatial graph learning and temporal graph learning. The multi-scale division divides the objects into several groups, and the fusion procedure generates two sequences of matrices, based on which the spatial self-attention generates the scores for the spatial interaction among objects and yields an interaction fusion matrix by 1 × 1 convolution. The temporal self-attention generates the scores for the temporal interaction among object trajectories.

Model: SMGCN
Sparse Multi-Relational Graph: models both inter-type and intra-type object interactions using spatial and temporal graphs.
Key components: a multi-relational spatial graph based on position, label, and distance; a temporal graph capturing object trajectories over time.
Adaptive Sparsification: computes a threshold dynamically to reduce graph density, so that only relevant interactions are considered.

Model: Multi-scale Division
Objects are divided into groups based on distance and relative displacement metrics.
Each sub-graph is represented by an adjacency matrix.
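
The distance-based part of this division can be sketched as follows. This is a minimal illustration, not the paper's procedure: the scale boundaries in `bounds` are hypothetical values, and the function name `multi_scale_adjacency` is introduced here only for the example. The relative displacement metric would be handled in the same way with a different pairwise measure.

```python
import math

def multi_scale_adjacency(positions, bounds):
    """Return one binary adjacency matrix per distance scale.

    positions: list of (x, y) tuples, one per object.
    bounds: increasing distance limits, e.g. [1.0, 3.0] (hypothetical values).
    Scale k holds the pairs whose distance d satisfies
    edges[k] <= d < edges[k + 1], where edges = [0.0] + bounds.
    """
    n = len(positions)
    edges = [0.0] + list(bounds)
    mats = [[[0] * n for _ in range(n)] for _ in bounds]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-edges in this sketch
            d = math.dist(positions[i], positions[j])
            for k in range(len(bounds)):
                if edges[k] <= d < edges[k + 1]:
                    mats[k][i][j] = 1
    return mats

positions = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0)]
near, far = multi_scale_adjacency(positions, [1.0, 3.0])
```

Here objects 0 and 1 fall into the near group, while object 2 is linked to both of them only at the far scale. Pairs farther apart than the last bound get no edge at any scale.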

Model: Fusion Interaction Score
After multi-scale division, each group's interaction scores are fused with position and label information to calculate the adjacency matrices for both distance and relative displacement.

Model: Self-Attention Mechanism
A self-attention mechanism is applied to score the interactions between objects.
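
As a rough illustration of attention-based interaction scoring, the sketch below uses standard scaled dot-product attention. It assumes the raw object features serve as both queries and keys; the learned projections and the exact inputs used in SMGCN are not given in the slides, so the function names and feature values here are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_scores(queries, keys):
    """Scaled dot-product scores: softmax(Q K^T / sqrt(d)), row by row."""
    d = len(queries[0])
    scores = []
    for q in queries:
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        scores.append(softmax(logits))
    return scores

# Hypothetical 2-dimensional features for three objects.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = attention_scores(features, features)
```

Each row of `scores` sums to 1, and entry (i, j) can be read as how strongly object i attends to object j.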

Model: Interaction Score Fusion
The interaction score matrices from the distance and displacement metrics are fused into a single interaction fusion matrix by 1 × 1 convolution.
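
A 1 × 1 convolution with a single output channel reduces to a per-entry weighted sum over the stacked input matrices, which the sketch below spells out. The weights and score values are hypothetical (in the model they would be learned), and `fuse_1x1` is a name chosen for this example only.

```python
def fuse_1x1(matrices, weights, bias=0.0):
    """Fuse C score matrices (each n x n) into one with a 1x1 convolution.

    With one output channel, a 1x1 convolution is a weighted sum over the
    input channels at each position; no spatial neighbourhood is involved.
    """
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [
        [bias + sum(w * m[i][j] for w, m in zip(weights, matrices))
         for j in range(n_cols)]
        for i in range(n_rows)
    ]

# Hypothetical score matrices for two objects.
distance_scores = [[0.0, 0.8], [0.8, 0.0]]
displacement_scores = [[0.0, 0.4], [0.2, 0.0]]
fused = fuse_1x1([distance_scores, displacement_scores], weights=[0.5, 0.5])
```

With equal weights the fused matrix is simply the average of the two inputs; learned weights would let the model favor one relation over the other.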

Model: Temporal Interaction
Temporal interactions between object trajectories are modeled similarly to spatial interactions. The position encoding tensor is introduced to account for object types in the temporal domain.

Model: Sparse Interaction Matrix
Densely connected graphs are pruned using an adaptive sparsification technique. The sparsification threshold 𝜂 is computed dynamically from the learned interaction matrix.
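
The formula for the threshold is not reproduced in this transcript, so the sketch below only illustrates the general idea of threshold-based pruning. It assumes, purely for illustration, that the threshold is the mean of all scores; this stand-in rule and the name `sparsify` are not taken from the paper.

```python
def sparsify(scores, threshold=None):
    """Zero out interaction scores below a threshold.

    If no threshold is given, the mean of all scores is used. That rule is
    an assumption made for illustration, not the formula from the paper.
    """
    if threshold is None:
        flat = [v for row in scores for v in row]
        threshold = sum(flat) / len(flat)
    pruned = [[v if v >= threshold else 0.0 for v in row] for row in scores]
    return pruned, threshold

# Hypothetical dense interaction scores for three objects.
scores = [[0.7, 0.2, 0.1],
          [0.3, 0.6, 0.1],
          [0.05, 0.05, 0.9]]
sparse, eta = sparsify(scores)
```

Because the threshold is derived from the matrix itself, a scene with uniformly weak interactions and a scene with a few dominant ones are pruned at different levels, which is the motivation for an adaptive rather than fixed value.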

Model: Graph Convolution Networks (GCN) and Temporal Convolution Networks (TCN)
GCN is applied to the sparse multi-relational spatial graph 𝐴s to extract spatial features.
TCN is applied to the temporal graph to learn time-dependent features.
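
For readers unfamiliar with graph convolution, a single layer can be sketched as below. The propagation rule used here (added self-loops, row normalization, then ReLU) is one common choice and is an assumption; the slides do not state the exact rule used in SMGCN. The adjacency, features, and weights are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1 (A + I) H W).

    adj: n x n non-negative adjacency (for example a sparsified score matrix).
    feats: n x f node features H.  weight: f x g weight matrix W.
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Row-normalize so each node averages over its neighbourhood.
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    out = matmul(matmul(a_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

adj = [[0.0, 1.0], [1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, to make the mixing visible
hidden = gcn_layer(adj, feats, weight)
```

With an identity weight matrix, each of the two connected nodes ends up with the average of both feature vectors, which shows how the adjacency controls the mixing of information between objects.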

Model: Global Temporal Aggregation (GTA)
Issue in trajectory prediction: errors accumulate as predictions extend further into the future.
Solution: the GTA mechanism integrates global temporal information after each prediction round, improving long-term accuracy.
Example: cars and bicycles take larger turns than pedestrians, so different GTA layers adjust their trajectories accordingly.

Model: Global Temporal Aggregation (GTA)

Model: Learning Objective
The predicted object trajectory is modeled as a bi-variate Gaussian distribution.
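
The loss itself is not reproduced in this transcript. A common way to train such a model, assumed here, is to minimize the negative log-likelihood of the ground-truth position under the predicted bivariate Gaussian. The sketch below evaluates that quantity for one point; the parameter names are chosen for this example.

```python
import math

def bivariate_gaussian_nll(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Negative log-likelihood of (x, y) under a bivariate Gaussian.

    mu_x, mu_y: predicted mean; sigma_x, sigma_y: predicted standard
    deviations (> 0); rho: predicted correlation, with -1 < rho < 1.
    """
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y
    one_minus_rho2 = 1.0 - rho * rho
    # Quadratic (Mahalanobis) term of the log-density.
    quad = (zx * zx + zy * zy - 2.0 * rho * zx * zy) / (2.0 * one_minus_rho2)
    # Log of the normalizing constant.
    log_norm = math.log(
        2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2)
    )
    return quad + log_norm

# A ground-truth point exactly at the predicted mean, and one farther away.
nll_at_mean = bivariate_gaussian_nll(1.0, 2.0, 1.0, 2.0, 1.0, 1.0, 0.0)
nll_off_mean = bivariate_gaussian_nll(2.0, 3.0, 1.0, 2.0, 1.0, 1.0, 0.0)
```

The loss is lowest when the true position coincides with the predicted mean and grows as the point moves away, while the variance terms let the model express uncertainty instead of committing to a single path.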

Experiments: Experimental Settings
Datasets:
ETH: real-world pedestrian movements
UCY: university campus pedestrian data
SDD (Stanford Drone Dataset): covers pedestrians, cyclists, cars, and more in an urban setting
Metrics:
Average Displacement Error (ADE): measures the average error over all predicted points
Final Displacement Error (FDE): measures the error at the final predicted point
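
The two metrics can be computed for a single trajectory as in the sketch below, assuming Euclidean distance between predicted and ground-truth points. The function name `ade_fde` and the sample coordinates are hypothetical.

```python
import math

def ade_fde(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length lists of (x, y) points over the prediction horizon.
    ADE is the mean point-wise Euclidean error; FDE is the error at the last step.
    """
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]

pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)  # ADE = 1.0, FDE = 2.0
```

In this example the prediction drifts steadily away from the ground truth, so FDE is larger than ADE, which is the typical signature of error accumulation over the horizon.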

Experiments: Single-type trajectory prediction
Table 1: Comparison with state-of-the-art methods on the ETH and UCY datasets under the ADE and FDE metrics (lower is better). SMGCN achieves the best average error on both ADE and FDE.

Experiments: Single-type trajectory prediction
Figure 4: Visualization of trajectory prediction. Based on the past trajectory, the model predicts future paths. The results show close alignment with the actual trajectories.

Experiments: Ablation Study - Effect of adaptive sparse threshold
Table 3: Impact of different sparse thresholds η. Results are reported as ADE and FDE. The adaptive η is calculated from the learnable sparse multi-relational matrix.

Experiments: Ablation Study - Effect of multi-relation spatial division
Table 4: Impact of each component of SMGCN on ADE. The final SMGCN incorporates both relative displacement and distance relations while leveraging a multi-layer GTA.

Experiments: Ablation Study - Effect of multi-relation spatial division
Table 5: Impact of each component of SMGCN on FDE. The final SMGCN incorporates both relative displacement and distance relations while leveraging a multi-layer GTA.

Experiments: Ablation Study - Ablation of the Queries Pool Size for GQS
Table 7: The ablation results of different queries candidate pool size of λ for GQS. The backbone model is Strip-MLP-SWMP.

Conclusion
SMGCN offers a novel, adaptive approach to trajectory prediction, particularly suited for multi-type object interactions.
Adaptive sparsification and Global Temporal Aggregation improve prediction accuracy and reduce error accumulation over time.
Future work: extend the model to 3D environments and incorporate more complex object types.