[NS][Lab_Seminar_240909]Sparse Multi-Relational Graph Convolutional Network for Multi-type Object Trajectory Prediction.pptx
thanhdowork
24 slides
Sep 09, 2024
About This Presentation
Sparse Multi-Relational Graph Convolutional Network for Multi-type Object Trajectory Prediction
Slide Content
Sparse Multi-Relational Graph Convolutional Network for Multi-type Object Trajectory Prediction. Tien-Bach-Thanh Do, Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea. E-mail: osfa19730@catholic.ac.kr. 2024/09/09. Jianhui Zhang et al., IJCAI 2024.
Introduction. Object trajectory prediction is a critical task in video surveillance, autonomous driving, and smart cities. Challenges arise from interactions among multiple types of objects (pedestrians, cars, bicycles). Problem: traditional approaches focus mainly on pedestrian interactions and ignore multi-type object dynamics. Solution: a novel method, the Sparse Multi-Relational Graph Convolutional Network (SMGCN), for handling multi-type object interactions in a sparse and dynamic environment.
Motivation & Challenges. Existing gaps: most models focus on pedestrian trajectories only, while complex real-world scenarios involve different object types with varying speeds and interactions. Challenges: handling multiple models with distinct motion behaviors, and avoiding superfluous interactions between objects.
Key Contributions. Sparse Multi-Relational Graph Convolutional Network (SMGCN): captures multi-type object interactions. Adaptive Sparsification: dynamically prunes unnecessary connections between objects, improving efficiency and accuracy. Global Temporal Aggregation (GTA): mitigates error accumulation in long-term trajectory prediction by leveraging multi-round updates.
Model: SMGCN. Figure 3: Sparse multi-relational graph learning. It consists of two procedures: sparse multi-relational spatial graph learning and sparse multi-relational temporal graph learning. The multi-scale division divides the objects into several groups, and the fusion procedure generates two sequences of matrices. From these, the spatial self-attention generates scores for the spatial interaction among objects and yields an interaction fusion matrix via a 1 × 1 convolution. The temporal self-attention generates scores for the temporal interaction among object trajectories.
Model: SMGCN. Sparse Multi-Relational Graph: models both inter-type and intra-type object interactions using spatial and temporal graphs. Key components: a multi-relational spatial graph based on position, label, and distance; a temporal graph capturing object trajectories over time. Adaptive Sparsification: computes the threshold dynamically to reduce graph density, so that only relevant interactions are considered.
Model: Multi-scale Division. Objects are divided into groups based on distance and relative displacement metrics. Each sub-graph is represented as an adjacency matrix.
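
The grouping step can be illustrated with a minimal Python sketch. It covers only the distance metric and assumes each scale is a band of Euclidean distance `(lo, hi]`. The band boundaries, the function names, and the binary edge weights are illustrative assumptions, not values or definitions taken from the paper.

```python
import math

def pairwise_distance(positions):
    """Euclidean distance between every pair of 2D positions."""
    n = len(positions)
    return [[math.dist(positions[i], positions[j]) for j in range(n)]
            for i in range(n)]

def multi_scale_adjacency(dist, scales):
    """Build one binary adjacency matrix per distance band (lo, hi].

    `scales` holds the upper bound of each band (hypothetical values).
    """
    n = len(dist)
    bounds = [0.0] + list(scales)
    graphs = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        adj = [[1 if i != j and lo < dist[i][j] <= hi else 0
                for j in range(n)]
               for i in range(n)]
        graphs.append(adj)
    return graphs

# Three objects on a line: two close together, one far away.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
graphs = multi_scale_adjacency(pairwise_distance(positions), scales=[2.0, 6.0])
```

Here `graphs[0]` links only the two nearby objects, while `graphs[1]` links the distant object to both of them. A relative-displacement metric could be banded in the same way.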
Model: Fusion Interaction Score. After multi-scale division, each group's interaction scores are fused with position and label information to calculate the adjacency matrices for both distance and relative displacement.
Model: Self-Attention Mechanism. To score interactions between objects, a self-attention mechanism is applied.
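
As a rough illustration of how self-attention turns object features into pairwise interaction scores, the following sketch uses standard scaled dot-product attention. The slide does not give the exact formulation, so the query/key projections `w_q` and `w_k`, the feature layout, and the row-wise softmax are assumptions.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def linear(x, w):
    """Multiply a row vector x (length d_in) by a matrix w (d_in x d_out)."""
    return [sum(x[k] * w[k][j] for k in range(len(x)))
            for j in range(len(w[0]))]

def attention_scores(features, w_q, w_k):
    """Score every object pair with scaled dot-product attention."""
    queries = [linear(f, w_q) for f in features]
    keys = [linear(f, w_k) for f in features]
    d_k = len(keys[0])
    logits = [[sum(q[t] * k[t] for t in range(d_k)) / math.sqrt(d_k)
               for k in keys]
              for q in queries]
    return [softmax(row) for row in logits]

# Toy features for three objects; identity projections keep the example readable.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
scores = attention_scores(features, identity, identity)
```

Each row of `scores` sums to 1 and can be read as how strongly one object attends to every other object.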
Model: Interaction Score Fusion. The interaction score matrices from the distance and displacement metrics are fused.
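
The Figure 3 caption describes this fusion as a 1 × 1 convolution. Applied to a stack of score matrices, a 1 × 1 convolution with one output channel reduces to a per-entry weighted sum across the stacked matrices. The sketch below shows only that reduction; the weights and bias are placeholder values, whereas in a trained model they would be learned.

```python
def fuse_scores(score_maps, weights, bias=0.0):
    """Fuse stacked n x n score matrices with a single-output 1x1 convolution.

    Every entry (i, j) is mixed across channels independently of its neighbours.
    """
    n = len(score_maps[0])
    return [[bias + sum(w * m[i][j] for w, m in zip(weights, score_maps))
             for j in range(n)]
            for i in range(n)]

# Two channels: distance-based and displacement-based scores (toy values).
distance_scores = [[0.0, 0.8], [0.6, 0.0]]
displacement_scores = [[0.0, 0.2], [0.4, 0.0]]
fused = fuse_scores([distance_scores, displacement_scores], weights=[0.5, 0.5])
```

With equal weights the fused matrix is simply the average of the two inputs.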
Model: Temporal Interaction. Temporal interactions between object trajectories are modeled similarly to spatial interactions. A position encoding tensor is introduced to account for object types in the temporal domain.
Model: Sparse Interaction Matrix. Densely connected graphs are pruned using an adaptive sparsification technique. The sparsification threshold η is computed dynamically.
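
The slide's formula for η is not preserved in this text, so the sketch below only illustrates the general idea of data-dependent pruning. It assumes, purely for illustration, that η is the mean of the off-diagonal interaction scores. The actual rule used by SMGCN may differ.

```python
def adaptive_threshold(scores):
    """ASSUMPTION: eta is the mean off-diagonal score (not the paper's formula)."""
    n = len(scores)
    off_diag = [scores[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(off_diag) / len(off_diag)

def sparsify(scores, eta):
    """Zero out interactions scoring below eta; self-connections are kept."""
    n = len(scores)
    return [[scores[i][j] if (i == j or scores[i][j] >= eta) else 0.0
             for j in range(n)]
            for i in range(n)]

# Dense toy interaction matrix for three objects.
scores = [[1.0, 0.9, 0.1],
          [0.8, 1.0, 0.2],
          [0.1, 0.3, 1.0]]
eta = adaptive_threshold(scores)
sparse = sparsify(scores, eta)
```

Because the threshold is derived from the scores themselves, a scene with uniformly weak interactions is pruned less aggressively than a fixed cutoff would prune it.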
Model: Graph Convolution Networks (GCN) and Temporal Convolution Networks (TCN). A GCN is applied to the sparse multi-relational spatial graph A_s to extract spatial features. A TCN is applied to the temporal graph to learn time-dependent features.
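
For readers unfamiliar with graph convolution, here is a minimal sketch of one layer in the widely used symmetric-normalization form, ReLU(D^{-1/2}(A + I)D^{-1/2} H W). Whether SMGCN normalizes its spatial graph this way is not stated on the slide, so treat the normalization, the added self-loops, and the ReLU as assumptions. The input adjacency is assumed to be non-negative and to have no self-loops.

```python
import math

def normalize_adjacency(adj):
    """Add self-loops, then apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj, features, weight):
    """One graph-convolution layer: aggregate neighbours, project, apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weight)]

# Two connected objects with one-hot features and an identity weight matrix.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
out = gcn_layer(adj, features, weight)
```

After one layer each object's feature vector is an even mix of its own features and its neighbour's, which is the smoothing effect that lets interaction information spread over the graph.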
Model: Global Temporal Aggregation (GTA). Issue in trajectory prediction: error accumulates over time as predictions extend further into the future. Solution: the GTA mechanism integrates global temporal information after each prediction round, improving long-term accuracy. Example: cars and bicycles take larger turns than pedestrians, so different GTA layers adjust their trajectories accordingly.
Model: Learning Objective. The predicted object trajectory is modeled as a bivariate Gaussian distribution.
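
A model that outputs a bivariate Gaussian per time step is commonly trained by minimizing the negative log-likelihood of the ground-truth position. The sketch below evaluates that loss for one point using the standard bivariate normal density. The parameter names (`mu_x`, `sigma_x`, `rho`, and so on) are conventional choices, not symbols taken from the slides.

```python
import math

def bivariate_gaussian_nll(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Negative log-likelihood of (x, y) under a bivariate Gaussian.

    Requires sigma_x > 0, sigma_y > 0 and -1 < rho < 1.
    """
    dx = (x - mu_x) / sigma_x
    dy = (y - mu_y) / sigma_y
    one_minus_rho2 = 1.0 - rho * rho
    z = dx * dx - 2.0 * rho * dx * dy + dy * dy
    log_norm = math.log(2.0 * math.pi * sigma_x * sigma_y
                        * math.sqrt(one_minus_rho2))
    return log_norm + z / (2.0 * one_minus_rho2)

# The loss is smallest when the true point sits at the predicted mean.
nll_at_mean = bivariate_gaussian_nll(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0)
nll_off_mean = bivariate_gaussian_nll(1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0)
```

In training, this quantity would be summed over all predicted time steps and objects.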
Experiments: Experimental Settings. Datasets: ETH (real-world pedestrian movements); UCY (university campus pedestrian data); SDD (Stanford Drone Dataset), which covers pedestrians, cyclists, cars, and more in an urban setting. Metrics: Average Displacement Error (ADE), the average error over all predicted points; Final Displacement Error (FDE), the error at the final predicted point.
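
Both metrics are straightforward to compute for a single object's trajectory, as the sketch below shows. It assumes trajectories are equal-length lists of 2D points and uses Euclidean distance. Averaging over objects and any best-of-N sampling protocol are left out.

```python
import math

def ade(pred, truth):
    """Average Displacement Error: mean distance over all predicted points."""
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)

def fde(pred, truth):
    """Final Displacement Error: distance at the last predicted point."""
    return math.dist(pred[-1], truth[-1])

# A prediction that drifts steadily away from a straight ground-truth path.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade_value = ade(pred, truth)
fde_value = fde(pred, truth)
```

In this toy case the per-step errors are 0, 1, and 2, so ADE is 1.0 and FDE is 2.0. The gap between the two reflects the error accumulation that GTA is meant to reduce.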
Experiments: Single-type trajectory prediction. Table 1: Comparison with state-of-the-art methods on the ETH and UCY datasets under the ADE and FDE metrics. SMGCN achieves the best average error scores in both ADE and FDE (lower is better).
Experiments: Single-type trajectory prediction. Figure 4: Visualization of trajectory prediction. Based on the past trajectory, the model predicts future paths. The results show close alignment with the actual trajectories.
Experiments: Ablation Study - Effect of adaptive sparse threshold. Table 3: Impact of different sparse thresholds η. Results are reported as ADE and FDE. The adaptive η is calculated from the learnable sparse multi-relational matrix.
Experiments: Ablation Study - Effect of multi-relation spatial division. Table 4: Impact of each component of SMGCN on ADE. The final SMGCN incorporates both relative displacement and distance relations while leveraging a multi-layer GTA.
Experiments: Ablation Study - Effect of multi-relation spatial division. Table 5: Impact of each component of SMGCN on FDE. The final SMGCN incorporates both relative displacement and distance relations while leveraging a multi-layer GTA.
Experiments: Ablation Study - Ablation of the Queries Pool Size for GQS. Table 7: Ablation results for different query candidate pool sizes λ for GQS. The backbone model is Strip-MLP-SWMP.
Conclusion. SMGCN offers a novel, adaptive approach to trajectory prediction that is particularly suited to multi-type object interactions. Adaptive sparsification and Global Temporal Aggregation improve prediction accuracy and reduce error accumulation over time. Future work: extend the model to 3D environments and incorporate more complex object types.