Reliable World Models: Physical Grounding and Safety Prediction
ivanruchkin
About This Presentation
Presented by Ivan Ruchkin at the IROS 2025 "Building Safe Robots: A Holistic Integrated View on Safety from Modelling, Control & Implementation" Workshop on October 20, 2025.
Abstract:
Autonomous neural controllers pose critical safety challenges in modern robots. To support intervention and adaptation, robots should predict the safety of their future trajectories. However, it is difficult to do for long horizons, especially for rich sensor data under partial observability and distribution shift. The first part of this talk presents a family of deep-learning pipelines for calibrated safety prediction in end-to-end vision-controlled systems. Inspired by world models from reinforcement learning, our pipelines build upon variational autoencoders and recurrent predictors to forecast latent trajectories from raw image sequences. To overcome distribution shift due to compounding prediction errors and changing environmental conditions, we incorporate unsupervised domain adaptation.
Unfortunately, learned latent representations in world models lack direct mapping to meaningful physical quantities, limiting their utility and interpretability in downstream planning, control, and safety verification. The second part of this talk argues for a fundamental shift from physically informed to physically interpretable world models. We crystallize four principles that leverage symbolic physical knowledge for interpretable world models: (1) structuring latent spaces, (2) aligning with invariances/equivariances, (3) exploiting supervision with varied strength and granularity, and (4) partitioning generative outputs.
The final, third part of this talk dives into an interpretable world model for trajectory prediction. We discuss a novel architecture that aligns learned latent representations with real-world physical quantities by combining a physically interpretable image autoencoding model and a partially known dynamical model. To eliminate the impractical reliance on precise ground-truth physical knowledge, the training incorporates weak distributional supervision. Through three case studies, we demonstrate that this architecture not only provides physical interpretability but also achieves superior state prediction accuracy.
Size: 4.78 MB
Language: en
Added: Oct 21, 2025
Slides: 29 pages
Slide Content
Ivan Ruchkin
Assistant Professor
Electrical & Computer Engineering
University of Florida
Building Safe Robots Workshop
IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS)
Hangzhou, China
October 20, 2025
Reliable World Models:
Physical Grounding and Safety Prediction
1
Subscribe to us
on LinkedIn!
—TEA Lab's mission: make autonomous systems safer and more trustworthy through rigorous engineering methods with strong guarantees
○Across many domains: autonomous driving, mobile robots, UAVs/UUVs
—Research areas: formal methods, AI/ML, cyber-physical systems, robotics
○Theory → algorithms & tools → simulations+physical deployment
TEA Lab: Trustworthy ∃ngineered ∀utonomy
tea.ece.ufl.edu
2
Research platform:
small-scale racing cars
ivan.ece.ufl.edu [email protected]fl.edu
Subscribe to us
on RedNote!
—Industry and government roles
○AFRL VFRP, NASA JPL, CMU SEI, Google Summer of Code, …
—2011–2018: PhD @ Carnegie Mellon University
○Pittsburgh, PA — software engineering & formal methods
—2006–2011: Undergrad @ Moscow State University
○Moscow, Russia — applied math & computer science
Ivan Ruchkin: brief bio
3
—2018–2022: Postdoc @ University of Pennsylvania
○Philadelphia, PA — guarantees for learning-enabled systems
—2022–present: Assistant professor @ University of Florida
○Gainesville, FL — trustworthy methods for autonomy
What is a World Model?
●Predictor of future observations from an initial observation and control actions
○Learned latent representation of the environment state
○Temporal model of the environment dynamics
[Diagram: real-world dynamics evolve the physical state, which is sensed as observation x_t; the world model keeps a latent physical state and predicts the next observation x̂_{t+1}; actions come from an existing controller or from control policy learning]
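To make the definition concrete, here is a minimal PyTorch sketch in the spirit of the pipelines described in the abstract: a VAE-style encoder, a recurrent latent predictor, and an observation decoder. All layer sizes, the GRUCell rollout, and the toy output resolution are illustrative assumptions, not the architecture from the talk.

```python
import torch
import torch.nn as nn

class LatentWorldModel(nn.Module):
    """Toy latent world model: VAE-style encoder, recurrent latent dynamics,
    and a decoder that generates the predicted future observation."""
    def __init__(self, latent_dim=32, action_dim=2):
        super().__init__()
        # Encoder: image observation -> mean and log-variance of the latent state
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(), nn.LazyLinear(2 * latent_dim),
        )
        # Temporal model: one GRU step per control action rolls the latent forward
        self.dynamics = nn.GRUCell(action_dim, latent_dim)
        # Decoder: rolled-out latent -> reconstructed future observation x̂
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2), nn.Sigmoid(),
        )

    def forward(self, obs, actions):
        # obs: (B, 3, H, W) initial image; actions: (B, T, action_dim) planned controls
        mu, logvar = self.encoder(obs).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        for t in range(actions.shape[1]):                      # T-step latent rollout
            z = self.dynamics(actions[:, t], z)
        return self.decoder(z)  # predicted observation after T actions (toy resolution)

model = LatentWorldModel()
x_hat = model(torch.rand(4, 3, 64, 64), torch.rand(4, 10, 2))
print(x_hat.shape)
```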
➔Part 1: World Models for Safety Prediction
➔Part 2: Principles for Physically Interpretable World Models
➔Part 3: Physical Interpretability under Weak Supervision?
Agenda
5
➔Part 1: World Models for Safety Prediction
➔Part 2: Principles for Physically Interpretable World Models
➔Part 3: Physical Interpretability under Weak Supervision?
Agenda
6
Part 1: How Safe Will I Be Given What I Saw?
Problem:
Given a system that contains state (x), observations (y), and a safety property φ(x):
-Is the system safe now? φ(x_i | y_i)
-Will the system be safe after k steps? φ(x_{i+k} | y_i)
-What's the probability of safety after k steps? P(φ(x_{i+k}) | y_i)
Challenges:
-Unknown states and dynamical models
-High-dimensional (1000+) observations
-Distribution shift in predictions
-Difficult to reliably quantify confidence
Contributions:
-Modular family of online safety predictors
-Conformal confidence calibration
-Three benchmarks: racing car, cart pole, and donkey car
[Diagram: predictor architectures compared in Part 1: monolithic predictor, latent monolithic predictor, latent forecaster]
Part 1: How Safe Will I Be Given What I Saw?
Safety interval prediction:
1.Train the predictors to forecast future trajectories & estimate safety probabilities
2.Robustify to distribution shift with unsupervised domain adaptation (UDA)
3.Compute statistically valid intervals for safety probability using conformal calibration (a generic recipe is sketched below)
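Step 3 can follow a standard split-conformal recipe. The sketch below shows one generic way to turn held-out calibration trajectories into a coverage-guaranteed interval around a predicted safety chance; the nonconformity score, the function name, and the toy data are illustrative and not necessarily the exact calibration used in the cited papers.

```python
import numpy as np

def conformal_interval(pred_cal, label_cal, pred_test, alpha=0.1):
    """pred_cal: predicted safety probabilities on a held-out calibration set,
    label_cal: observed outcomes (1 = trajectory stayed safe, 0 = violation),
    pred_test: predicted safety probability for a new trajectory."""
    # Nonconformity score: absolute gap between predicted chance and outcome
    scores = np.abs(pred_cal - label_cal)
    n = len(scores)
    # Finite-sample-corrected (1 - alpha) quantile of the calibration scores
    q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")
    lo, hi = max(0.0, pred_test - q), min(1.0, pred_test + q)
    return lo, hi  # covers the observed outcome with probability >= 1 - alpha

# Example: 500 calibration trajectories, 90%-coverage interval for one prediction
rng = np.random.default_rng(0)
p_cal = rng.uniform(0.5, 1.0, 500)
y_cal = rng.binomial(1, p_cal)
print(conformal_interval(p_cal, y_cal, pred_test=0.83, alpha=0.1))
```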
Part 1: Results on DonkeyCar
[Figure: F1 score and calibration across prediction horizons 0–100.
A. F1 Score: Monolithic (left) lose to Composite (right).
B. F1 Score: Attention-free (left) lose to Attention-based (right).
C. Expected Calibration Error (ECE): Calibration improves all architectures.]
See for details:
Z. Mao, C. Sobolewski, and I. Ruchkin. "How Safe Am I Given What I See? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy", in Proc. of L4DC, 2024.
Z. Mao, M. E. Umasudhan, and I. Ruchkin. "How Safe Will I Be Given What I Saw? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy", arXiv preprint 2508.09346, 2025.
1. Lack of Physical Interpretability
Latent representations are black boxes: we do not know the meaning of embeddings
2. Impossible Hallucinations
Predictors may hallucinate physically impossible states
3. Incompatible with Symbolic Techniques
Cannot perform state-based planning or formal verification
4. Weak Safety Guarantees
Hard to translate the guarantees from the latent space to the physical world
Limitations of Existing World Models
[Diagram: the world-model pipeline from the earlier slide (real-world dynamics, sensing, observation x_t, world model with latent physical state, predicted observation x̂_{t+1}, existing controller / control policy learning, action)]
➔Part 1: World Models for Safety Prediction
➔Part 2: Principles for Physically Interpretable World Models
➔Part 3: Physical Interpretability under Weak Supervision?
Agenda
11
Towards Physically Interpretable World Models
a. Latent embeddings z correspond to physical properties
b. Latent dynamics f simulate physical processes
[Diagram: real dynamics f_phys map physical variables z_t^phys to z_{t+1}^phys; learned dynamics f map latent states z_t to z_{t+1}]
12
Four Principles for Physical Interpretability
●Principle 1: Functionally organize the latent space
[Diagram: the world-model pipeline with three encoders (physical, interaction, and style), each mapping observation x_t to its own latent block z_t^1 … z_t^n, feeding trajectory prediction, formal verification, and control policy learning]
13
Principle 1: Functional Latent Organization
●Decouple the latent vector through separate encoders:
○Lets us use function-specific prediction mechanisms downstream
○Improves debuggability (style hallucinations → look at style branch)
●Suppose we aim to decouple latents that encode:
○Physical state of the ego (enc_1(x))
○Relative states between agents (enc_2(x))
○Residual, stylistic features (enc_3(x))
●The modular latent vector becomes:
○z = [enc_1(x)  enc_2(x)  enc_3(x)]^T
[Diagram: observation x_t passes through the physical, interaction, and style encoders, producing separate latent blocks z_t^1 … z_t^n]
14
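A minimal PyTorch sketch of Principle 1, assuming flattened observations and placeholder linear encoders; the class name and dimensions are illustrative, not the talk's architecture.

```python
import torch
import torch.nn as nn

class FunctionallyOrganizedEncoder(nn.Module):
    """z = [enc_1(x); enc_2(x); enc_3(x)] with one encoder per latent function."""
    def __init__(self, obs_dim=1024, phys_dim=8, inter_dim=8, style_dim=16):
        super().__init__()
        self.phys_enc = nn.Linear(obs_dim, phys_dim)    # ego physical-state latents
        self.inter_enc = nn.Linear(obs_dim, inter_dim)  # agent-interaction latents
        self.style_enc = nn.Linear(obs_dim, style_dim)  # residual stylistic latents

    def forward(self, x):
        # Concatenate the three blocks; downstream modules address known slices,
        # e.g., only z[:, :8] feeds the physical dynamics predictor.
        return torch.cat([self.phys_enc(x), self.inter_enc(x), self.style_enc(x)], dim=-1)

z = FunctionallyOrganizedEncoder()(torch.randn(4, 1024))
print(z.shape)  # torch.Size([4, 32])
```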
Four Principles for Physical Interpretability
●Principle 2: Learn invariant and equivariant representations
[Diagram: the Principle 1 pipeline extended with a latent predictor: physical dynamics and agent "forces" map the current latent blocks z_t^1 … z_t^n to predicted blocks z_{t+1}^1 … z_{t+1}^n]
15
Principle 2: I/O equivariance in encoders
●Observation transformations respect the branch’s intent, e.g.:
○Changes in image brightness should not affect the physical latents
○Changes in image brightness should affect the stylistic latents
●Enforce these I/O relationships through selective sampling:
○E.g., train on samples with varying image brightness but fixed state
●Let:
○g : X → X be an observation transformation (e.g., increase brightness) parameterized by θ ~ Θ
○h : Z → Z be the respective latent transformation parameterized by φ ~ Φ
●Compute loss as: [equation on slide]; a toy version is sketched below
16
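A hypothetical PyTorch rendering of the brightness example, assuming the encoder returns separate physical and style latent blocks. The transform g (pixel scaling), the latent response h (a log-brightness shift of one style coordinate), and the function name are illustrative stand-ins for the slide's actual loss.

```python
import torch

def equivariance_losses(encoder, x, theta):
    """encoder returns (z_phys, z_style); theta is a per-sample brightness factor."""
    g_x = torch.clamp(x * theta.view(-1, 1, 1, 1), 0.0, 1.0)  # g: brighten the image
    z_phys, z_style = encoder(x)
    z_phys_t, z_style_t = encoder(g_x)
    # Invariance: physical latents should not move under g
    loss_inv = ((z_phys_t - z_phys) ** 2).mean()
    # Equivariance: style latents should move consistently with theta
    # (here h shifts one style coordinate by log(theta); a stand-in for the true h)
    h_z_style = z_style.clone()
    h_z_style[:, 0] = h_z_style[:, 0] + torch.log(theta)
    loss_equiv = ((z_style_t - h_z_style) ** 2).mean()
    return loss_inv, loss_equiv

# Toy usage with a dummy encoder that splits a flattened image into two blocks
enc = lambda img: (img.flatten(1)[:, :4], img.flatten(1)[:, 4:8])
x = torch.rand(2, 3, 8, 8)
theta = torch.tensor([1.2, 0.8])
print(equivariance_losses(enc, x, theta))
```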
Four Principles for Physical Interpretability
●Principle 3: Use multi-level and multi-strength supervision signals
[Diagram: the same world-model pipeline as on the previous slide]
17
Principle 3: Multi-level/strength Supervision
18
●Employ multi-level/strength supervision signals in end-to-end training
●Types of supervision to encourage physically interpretable latents:
○Strong: learn a physically meaningful latent space from labeled data
○Semi: refine the latent space with partially labeled data
○Weak: use noisy/coarse labels to enforce trajectory smoothness
○Self: generate a supervision signal from unlabeled data (contrastive loss)
●Combine supervision levels:
○Determine available training information (trajectories, state constraints)
○Engineer a superimposed training loss that enforces physical grounding (a toy superimposed loss is sketched below)
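A toy superimposed loss in the spirit of this slide, assuming a strongly labeled subset, latent sequences for a weak smoothness term, and augmented views for a simplified self-supervised term; the weights and individual loss forms are placeholders, not the training objective from the talk.

```python
import torch
import torch.nn.functional as F

def combined_loss(z, z_labeled, labels, z_seq, z_aug):
    # Strong: regress latents of the labeled subset onto ground-truth states
    strong = F.mse_loss(z_labeled, labels)
    # Weak: penalize non-smooth latent trajectories (coarse physical prior)
    weak = ((z_seq[:, 1:] - z_seq[:, :-1]) ** 2).mean()
    # Self-supervised: pull latents of augmented views of the same frame together
    # (a simplified stand-in for a full contrastive loss with negatives)
    self_sup = 1.0 - F.cosine_similarity(z, z_aug, dim=-1).mean()
    return 1.0 * strong + 0.1 * weak + 0.5 * self_sup

# Example shapes: batch of 8 latents (dim 6), 3 labeled, sequences of length 10
loss = combined_loss(torch.randn(8, 6), torch.randn(3, 6), torch.randn(3, 6),
                     torch.randn(8, 10, 6), torch.randn(8, 6))
print(loss)
```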
Four Principles for Physical Interpretability
●Principle 4: Partition generated observations
[Diagram: the pipeline with partitioned generation: an object decoder and a style decoder produce the predicted observation x̂_{t+1} from their respective latent blocks (Principle 4: partitioning of observations)]
19
Principle 4: Output Space Partitioning
20
●Segment the generated observations through a set of generators
●Typically, world models decode the predicted latent into a single observation
○Verifying the correctness of a monolithic decoder is challenging due to
the high-dimensional space of raw observations
●For scalable verification: design a generator that relates the physical latents to
a low-dimensional, physically meaningful segment of the observation
○E.g., velocity/position latents reconstruct the cart body in CartPole (see the sketch below)
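A minimal sketch of such a partitioned generator, assuming a mask-based composition of an object branch (driven by the physical latents) and a style branch; the class name, sizes, and CartPole-flavored comments are illustrative.

```python
import torch
import torch.nn as nn

class PartitionedDecoder(nn.Module):
    """Separate generators for the physically meaningful segment and the rest."""
    def __init__(self, phys_dim=4, style_dim=16, img_shape=(3, 64, 64)):
        super().__init__()
        n_pix = int(torch.tensor(img_shape).prod())
        # Small generator: physical latents -> object segment (e.g., the cart body)
        self.object_dec = nn.Sequential(nn.Linear(phys_dim, n_pix), nn.Sigmoid())
        # Larger generator: style latents -> background/appearance segment
        self.style_dec = nn.Sequential(nn.Linear(style_dim, n_pix), nn.Sigmoid())
        # Mask predicted from physical latents decides which pixels the object owns
        self.mask_dec = nn.Sequential(nn.Linear(phys_dim, n_pix), nn.Sigmoid())
        self.img_shape = img_shape

    def forward(self, z_phys, z_style):
        obj = self.object_dec(z_phys).view(-1, *self.img_shape)
        sty = self.style_dec(z_style).view(-1, *self.img_shape)
        m = self.mask_dec(z_phys).view(-1, *self.img_shape)
        # Composite image: only the low-dimensional object branch needs to be
        # verified against the physical latents; the style branch is free-form.
        return m * obj + (1 - m) * sty

x_hat = PartitionedDecoder()(torch.randn(2, 4), torch.randn(2, 16))
print(x_hat.shape)  # torch.Size([2, 3, 64, 64])
```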
[Diagram: the full pipeline annotated with the four principles. Principle 1: latent structuring; Principle 2: in/equivariance; Principle 3: multi-level, multi-strength supervision; Principle 4: partitioning of observations]
Summarizing the Four Principles
21
See for details: J. Peper*, Z. Mao*, Y. Geng, S. Pan, and I. Ruchkin. “Four Principles for Physically Interpretable World Models”,
In Proc. of NeuS 2025. * Co-first authors.
➔Part 1: World Models for Safety Prediction
➔Part 2: Principles for Physically Interpretable World Models
➔Part 3: Physical Interpretability under Weak Supervision?
Agenda
22
Physical Grounding under Weak Supervision?
23
Problem 1: Physically meaningful auto-encoding
Can an autoencoder W encode observation y into a physically interpretable representation z, given weak distributional supervision in x?
Problem 2: Observation + interpretable state prediction
Given a past observation sequence y_{i-m:i}, can a predictor P predict not only the future sequence y_{i+1:i+k}, but also physically interpretable state representations z_{i+1:i+k}?
Problem 3: Parameter learning of dynamics
Given an interpretable sequence z_{i-m:i}, can we learn the parameters ψ of an unknown dynamics model ϕ?
Taxonomy of mappings between spaces:
[Diagram: state space x, observation space y, interpretable latent space z, and intermediate latent space z*; arrows distinguish intrinsic autoencoding (one stage), extrinsic autoencoding (two stages), an unknown mapping, and our goal]
Related Work on Physically Interpretable WMs
24
DVBF (Deep Variational Bayes Filter) – Karl et al., 2016
Type: Extrinsic, continuous latent world model.
Method: Extends VAE with latent stochastic dynamics.
Limitation: Lacks explicit physical grounding; needs strong supervision or access to full state variables.
Our Delta: PIWM replaces purely stochastic transitions with partially known physical dynamics (ϕ) and aligns the latent space to physical quantities via weak distributional supervision, improving interpretability and parameter recovery.

SINDyC (Sparse Identification of Nonlinear Dynamics with Control) – Brunton et al., 2016
Type: Classical dynamics discovery method.
Method: Learns explicit dynamics from state trajectories.
Limitation: Requires true state access; not feasible from high-dimensional images.
Our Delta: PIWM operates directly on visual data and recovers equivalent interpretable dynamics from weakly supervised image sequences.

Vid2Param – Asenov et al., 2019
Type: Intrinsic, continuous.
Method: Learns to regress physical parameters directly from videos.
Limitation: Requires full supervision of physical quantities; cannot perform multi-step prediction.
Our Delta: PIWM achieves parameter inference and multi-step physical trajectory prediction using only weak distributional supervision, removing dependence on exact ground-truth physical labels.

GOKU-Net – Linial et al., 2021
Type: Intrinsic, physics-aware.
Method: Combines generative ODEs with known physical structures to infer unobserved variables.
Limitation: Limited to known ODE systems and supervised training on structured data.
Our Delta: PIWM generalizes this idea to high-dimensional image observations and unknown dynamics, with partial physical priors and weak supervision to guide the latents.
Key Ideas for PIWM under Weak Supervision
25
[Diagram: observations y_{t-m:t} → encoder E → latents ẑ*_{t-m:t} (static representation, e.g., position, angle); partially known dynamics Φ with learnable parameter Ψ produces the dynamic representation (e.g., velocity) and the next latent ẑ'_{t+1}; a controller h supplies action a; decoder D outputs the predicted observation ŷ_{t+1}]
Physically Interpretable Latents: Learn latents directly aligned with real-world physical quantities, rather than abstract image features.
Weak Distributional Supervision: Replace exact physical labels with coarse distribution-based supervision to make training realistic.
Physics Knowledge Integration: Embed known physical equations into the model while learning only unknown parameters.
Extrinsic Two-Stage Design: Decouple visual perception and physical reasoning by first encoding images, then mapping to physical states.
Quantized Latent Regularization: Use discrete latent spaces to impose structure and stabilize learning under noisy supervision.
Unified Predictive Framework: Combine interpretable autoencoding and learnable dynamics for long-horizon trajectory prediction (a toy sketch follows below).
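To tie the listed ideas together, here is a toy sketch (not the authors' implementation): a 1-D position/velocity latent, a straight-through quantizer, partially known Euler-step dynamics with a learnable mass parameter ψ, and a weak distributional loss that matches only batch statistics of the latents to a coarse, known state distribution. All names, sizes, and the toy dynamics are assumptions for illustration.

```python
import torch
import torch.nn as nn

def quantize(z, step=0.1):
    # Straight-through rounding: a crude stand-in for quantized latent regularization
    return z + ((z / step).round() * step - z).detach()

class PIWMSketch(nn.Module):
    """Latents are meant to be [position, velocity] of a 1-D cart; the dynamics
    structure is known (Euler step) while the mass psi is learned."""
    def __init__(self, obs_dim=64 * 64, dt=0.05):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, 2))
        self.decoder = nn.Sequential(nn.Linear(2, 128), nn.ReLU(), nn.Linear(128, obs_dim))
        self.log_mass = nn.Parameter(torch.zeros(1))  # learnable dynamics parameter psi
        self.dt = dt

    def step_dynamics(self, z, force):
        pos, vel = z[:, :1], z[:, 1:]
        vel_next = vel + (force / self.log_mass.exp()) * self.dt  # known form, unknown mass
        pos_next = pos + vel * self.dt
        return torch.cat([pos_next, vel_next], dim=-1)

    def forward(self, obs, force):
        z = quantize(self.encoder(obs))        # interpretable (and quantized) latent z*
        z_next = self.step_dynamics(z, force)  # partially known physics with parameter psi
        return self.decoder(z_next), z, z_next

def weak_distributional_loss(z, state_mean, state_std):
    # Weak supervision: match batch statistics of the latents to a coarse, known
    # distribution over true states instead of per-sample ground-truth labels.
    return ((z.mean(0) - state_mean) ** 2 + (z.std(0) - state_std) ** 2).sum()

model = PIWMSketch()
obs, force = torch.rand(16, 64 * 64), torch.rand(16, 1)
obs_next_hat, z, z_next = model(obs, force)
print(weak_distributional_loss(z, torch.tensor([0.0, 0.0]), torch.tensor([1.0, 0.5])))
```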
PIWM Results: Qualitative
26
Physically grounded > pixel-accurate:
Physically grounded models outperform purely visual baselines in both prediction and explanation quality.
PIWM Results: Quantitative
27
Quantized & extrinsic models win:
-Discrete extrinsic PIWM achieves the lowest prediction error
and most stable long-horizon performance.
-Separating perception from physical reasoning yields stronger
generalization and noise robustness.
-Quantization acts as an effective structural prior, improving consistency under noisy conditions.
Interpretability without precise supervision:
-PIWM recovers true physical parameters using
only weak supervision.
Intrinsic models remain competitive: Continuous intrinsic PIWM performs
surprisingly well, showing potential for efficient end-to-end training.
See for details: Z. Mao, M. E. Umasudhan, and I. Ruchkin.
“Can Weak Quantization Make World Models Physically
Interpretable?”, In submission to ICLR 2026.
-More complex physical systems
-Multimodality with correlated errors
-More principles: sparsity, causality, smoothness
What’s Next for PIWMs?
28
1.World models predict calibrated safety chances
2.Four principles: structuring, in/equivariance,
multi-level supervision, output structuring
3.Physical interpretability is achievable under
weak supervision with quantized latents