Reliable World Models: Physical Grounding and Safety Prediction


About This Presentation

Presented by Ivan Ruchkin at the IROS 2025 "Building Safe Robots: A Holistic Integrated View on Safety from Modelling, Control & Implementation" Workshop on October 20, 2025.

Abstract:

Autonomous neural controllers pose critical safety challenges in modern robots. To support interve...


Slide Content

Reliable World Models:
Physical Grounding and Safety Prediction

Ivan Ruchkin
Assistant Professor
Electrical & Computer Engineering
University of Florida

Building Safe Robots Workshop
IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS)
Hangzhou, China
October 20, 2025

TEA Lab: Trustworthy ∃ngineered ∀utonomy
—TEA Lab's mission: make autonomous systems safer and more trustworthy through rigorous engineering methods with strong guarantees
○Across many domains: autonomous driving, mobile robots, UAVs/UUVs
—Research areas: formal methods, AI/ML, cyber-physical systems, robotics
○Theory → algorithms & tools → simulations + physical deployment
—Research platform: small-scale racing cars
tea.ece.ufl.edu | ivan.ece.ufl.edu | [email protected]fl.edu
Subscribe to us on LinkedIn and RedNote!

Ivan Ruchkin: brief bio
—2006–2011: Undergrad @ Moscow State University
○Moscow, Russia — applied math & computer science
—2011–2018: PhD @ Carnegie Mellon University
○Pittsburgh, PA — software engineering & formal methods
—2018–2022: Postdoc @ University of Pennsylvania
○Philadelphia, PA — guarantees for learning-enabled systems
—2022–present: Assistant Professor @ University of Florida
○Gainesville, FL — trustworthy methods for autonomy
—Industry and government roles
○AFRL VFRP, NASA JPL, CMU SEI, Google Summer of Code, …

What is a World Model?
●Predictor of future observations from an initial observation and control actions
○Learned latent representation of the environment state
○Temporal model of the environment dynamics
[Diagram: real-world dynamics evolve the physical state; sensing produces observation x_t; the world model consumes x_t and actions (from an existing controller or control-policy learning) and predicts observation x̂_{t+1}]
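
In symbols, a minimal formalization of this loop (the notation E, f, D is an assumption for illustration, not from the slides): an encoder embeds the observation, latent dynamics roll it forward under the action, and a decoder emits the predicted observation.

```latex
z_t = E(x_t), \qquad
\hat{z}_{t+1} = f(z_t, a_t), \qquad
\hat{x}_{t+1} = D(\hat{z}_{t+1})
```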

Agenda
➔Part 1: World Models for Safety Prediction
➔Part 2: Principles for Physically Interpretable World Models
➔Part 3: Physical Interpretability under Weak Supervision?


Part 1: How Safe Will I Be Given What I Saw?
Problem: given a system with states x, observations y, and a safety property φ(x):
-Is the system safe now? φ(x_i | y_i)
-Will the system be safe after k steps? φ(x_{i+k} | y_i)
-What is the probability of safety after k steps? P(φ(x_{i+k}) | y_i)

Challenges:
-Unknown states and dynamical models
-High-dimensional (1000+) observations
-Distribution shift in predictions
-Difficult to reliably quantify confidence

Contributions:
-Modular family of online safety predictors
-Conformal confidence calibration
-Three benchmarks: racing car, cart pole, and donkey car

[Diagram: predictor architectures — monolithic predictor, latent monolithic predictor, latent forecaster]

Part 1: How Safe Will I Be Given What I Saw?
Safety interval prediction:
1. Train the predictors to forecast future trajectories & estimate safety probabilities
2. Robustify to distribution shift with unsupervised domain adaptation (UDA)
3. Compute statistically valid intervals for safety probability using conformal calibration (sketch below)
Part 1: Results on DonkeyCar
A. F1 score over prediction horizons: monolithic predictors lose to composite ones
B. F1 score: attention-free predictors lose to attention-based ones
C. Expected Calibration Error (ECE): calibration improves all architectures
[Plots removed: F1 and ECE curves over prediction horizons 0–100]
See for details:
Z. Mao, C. Sobolewski, and I. Ruchkin. "How Safe Am I Given What I See? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy", in Proc. of L4DC, 2024.
Z. Mao, M. E. Umasudhan, and I. Ruchkin. "How Safe Will I Be Given What I Saw? Calibrated Prediction of Safety Chances for Image-Controlled Autonomy", arXiv preprint 2508.09346, 2025.

Limitations of Existing World Models
1. Lack of physical interpretability: latent representations are black boxes; we do not know the meaning of the embeddings.
2. Impossible hallucinations: predictors may hallucinate physically impossible states.
3. Incompatibility with symbolic techniques: cannot perform state-based planning or formal verification.
4. Weak safety guarantees: hard to translate guarantees from the latent space to the physical world.
[Diagram: the same world-model pipeline as on the "What is a World Model?" slide]

Agenda
➔Part 1: World Models for Safety Prediction
➔Part 2: Principles for Physically Interpretable World Models
➔Part 3: Physical Interpretability under Weak Supervision?

Towards Physically Interpretable World Models
a. Latent embeddings z correspond to physical properties
b. Latent dynamics f simulate physical processes
[Diagram: real dynamics f_phys map physical variables z_t^phys to z_{t+1}^phys, while learned dynamics f map latent states z_t to z_{t+1}]

Four Principles for Physical Interpretability
●Principle 1: Functionally organize the latent space
[Diagram: observation x_t feeds physical, interaction, and style encoders, each producing latents z_t^1 … z_t^n; downstream modules include trajectory prediction, formal verification, and control-policy learning]

Principle 1: Functional Latent Organization
●Decouple the latent vector through separate encoders:
○Lets us use function-specific prediction mechanisms downstream
○Improves debuggability (style hallucinations → look at the style branch)
●Suppose we aim to decouple latents that encode:
○Physical state of the ego (enc_1(x))
○Relative states between egos (enc_2(x))
○Residual, stylistic features (enc_3(x))
●The modular latent vector becomes (see the sketch below):
○z = [enc_1(x); enc_2(x); enc_3(x)]^T
[Diagram: observation x_t → physical, interaction, and style encoders → latents z_t^1 … z_t^n]
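
A minimal PyTorch-style sketch of this latent organization; the module names, MLP architectures, and dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FunctionallyOrganizedEncoder(nn.Module):
    """Three branch encoders whose outputs are concatenated into z."""

    def __init__(self, obs_dim: int, d_phys: int, d_inter: int, d_style: int):
        super().__init__()

        def mlp(d_out: int) -> nn.Module:
            return nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, d_out))

        # One encoder per latent function: ego state, interactions, style.
        self.enc_phys, self.enc_inter, self.enc_style = \
            mlp(d_phys), mlp(d_inter), mlp(d_style)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # z = [enc_1(x); enc_2(x); enc_3(x)] -- each block stays separately
        # addressable, so downstream modules (and debugging) can target a branch.
        return torch.cat([self.enc_phys(x), self.enc_inter(x),
                          self.enc_style(x)], dim=-1)
```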

Four Principles for Physical Interpretability
●Principle 2: Learn invariant and equivariant representations
[Diagram: the encoder branches' latents z_t^1 … z_t^n are advanced to z_{t+1}^1 … z_{t+1}^n by physical dynamics, a latent predictor, and agent "forces", then decoded into observation x̂_{t+1}]

Principle 2: I/O Equivariance in Encoders
●Observation transformations respect the branch's intent, e.g.:
○Changes in image brightness should not affect the physical latents
○Changes in image brightness should affect the stylistic latents
●Enforce these I/O relationships through selective sampling:
○E.g., train on samples with varying image brightness but fixed state
●Let:
○g : X → X be an observation transformation (e.g., increasing brightness) parameterized by θ ~ Θ
○h : Z → Z be the corresponding latent transformation parameterized by φ ~ Φ
●Compute the loss as shown below.
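
The formula itself did not survive extraction; a plausible reconstruction from the definitions above (an assumption, not necessarily the authors' exact loss) penalizes the mismatch between transforming the observation and transforming the latent:

```latex
\mathcal{L}_{\mathrm{equiv}} =
\mathbb{E}_{x,\;\theta \sim \Theta}
\left\| \mathrm{enc}\big(g_{\theta}(x)\big) -
        h_{\varphi}\big(\mathrm{enc}(x)\big) \right\|^{2}
```

Invariance is the special case h_φ = id: for example, the physical branch should satisfy enc_1(g_θ(x)) = enc_1(x) under brightness changes, while the style branch uses a non-trivial h_φ.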

Four Principles for Physical Interpretability
●Principle 3: Use multi-level and multi-strength supervision signals
[Diagram: the same encoder/predictor/decoder pipeline, with supervision signals applied across its levels]

Principle 3: Multi-Level/Strength Supervision
●Employ multi-level/strength supervision signals in end-to-end training
●Types of supervision to encourage physically interpretable latents:
○Strong: learn a physically meaningful latent space from labeled data
○Semi: refine the latent space with partially labeled data
○Weak: use noisy/coarse labels to enforce trajectory smoothness
○Self: generate a supervision signal from unlabeled data (contrastive loss)
●Combine supervision levels (see the sketch below):
○Determine available training information (trajectories, state constraints)
○Engineer a superimposed training loss that enforces physical grounding
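
One way to superimpose the levels is a weighted sum of per-level terms; in this sketch the weights, the smoothness penalty, and the `self_supervised_loss` hook are illustrative assumptions.

```python
import torch

def superimposed_loss(batch, model, w_strong=1.0, w_weak=0.1, w_self=0.01):
    """Combine supervision signals of different strengths into one loss.

    batch["obs"] has shape (B, T, obs_dim); state labels, when present,
    align with the latent dimensions.
    """
    z = model.encode(batch["obs"])  # latent trajectories, shape (B, T, d)
    loss = torch.zeros(())
    # Strong supervision: exact state labels, if available for this batch.
    if "state_labels" in batch:
        loss = loss + w_strong * torch.mean((z - batch["state_labels"]) ** 2)
    # Weak supervision: a coarse physical prior, here trajectory smoothness.
    loss = loss + w_weak * torch.mean((z[:, 1:] - z[:, :-1]) ** 2)
    # Self-supervision: e.g., a contrastive or reconstruction term.
    loss = loss + w_self * model.self_supervised_loss(batch["obs"])
    return loss
```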

Four Principles for Physical Interpretability
●Principle 4: Partition generated observations
[Diagram: predicted latents z_{t+1}^1 … z_{t+1}^n feed separate object and style decoders, whose outputs form partitioned segments of the generated observation x̂_{t+1}]

Principle 4: Output Space Partitioning
●Segment the generated observations through a set of generators
●Typically, world models decode the predicted latent into a single observation
○Verifying the correctness of a monolithic decoder is challenging due to the high-dimensional space of raw observations
●For scalable verification: design a generator that relates the physical latents to a low-dimensional, physically meaningful segment of the observation (see the sketch below)
○E.g., velocity/position latents reconstruct the cart body in CartPole
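
A minimal sketch of partitioned decoding, assuming each branch renders its own segment and the segments are composited with a mask; all names and architectures are illustrative.

```python
import torch
import torch.nn as nn

class PartitionedDecoder(nn.Module):
    """Decode physical and style latents into separate observation segments."""

    def __init__(self, d_phys: int, d_style: int, n_pixels: int):
        super().__init__()
        # A small generator per partition; keeping the physical branch
        # low-dimensional makes it more amenable to verification tools.
        self.object_decoder = nn.Sequential(nn.Linear(d_phys, 256), nn.ReLU(),
                                            nn.Linear(256, n_pixels))
        self.style_decoder = nn.Sequential(nn.Linear(d_style, 256), nn.ReLU(),
                                           nn.Linear(256, n_pixels))

    def forward(self, z_phys, z_style, object_mask):
        # Composite: physically grounded pixels (e.g., the cart body) come
        # only from the object decoder; everything else is styled background.
        obj = self.object_decoder(z_phys)
        sty = self.style_decoder(z_style)
        return object_mask * obj + (1 - object_mask) * sty
```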

Summarizing the Four Principles
●Principle 1: latent structuring
●Principle 2: in/equivariance
●Principle 3: multi-level, multi-strength supervision
●Principle 4: partitioning of observations
[Diagram: the full world-model pipeline annotated with where each principle applies]
See for details: J. Peper*, Z. Mao*, Y. Geng, S. Pan, and I. Ruchkin. "Four Principles for Physically Interpretable World Models", in Proc. of NeuS 2025. (* Co-first authors.)

Agenda
➔Part 1: World Models for Safety Prediction
➔Part 2: Principles for Physically Interpretable World Models
➔Part 3: Physical Interpretability under Weak Supervision?

Physical Grounding under Weak Supervision?

Problem 1: Physically meaningful autoencoding
Can an autoencoder W encode an observation y into a physically interpretable representation z, given weak distributional supervision in x?

Problem 2: Observation + interpretable state prediction
Given a past observation sequence y_{i-m:i}, can a predictor P predict not only the future sequence y_{i+1:i+k}, but also physically interpretable state representations z_{i+1:i+k}?

Problem 3: Parameter learning of dynamics
Given an interpretable sequence z_{i-m:i}, can we learn the parameters ψ of an unknown dynamics model ϕ? (A plausible objective is sketched below.)

Taxonomy of mappings between spaces:
-Spaces: state space x, observation space y, interpretable latent space z, intermediate latent space z*
-Intrinsic autoencoding (one stage) vs. extrinsic autoencoding (two stages, via z*)
-Unknown mapping: our goal
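
For Problem 3, a plausible least-squares formalization consistent with the notation above (an assumption, not the paper's exact objective; actions a_j enter if the dynamics are controlled):

```latex
\hat{\psi} = \arg\min_{\psi}
\sum_{j=i-m}^{i-1}
\left\| z_{j+1} - \phi_{\psi}\!\left(z_{j}, a_{j}\right) \right\|^{2}
```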

Related Work on Physically Interpretable WMs

DVBF (Deep Variational Bayes Filter) – Karl et al., 2016
-Type: extrinsic, continuous latent world model.
-Method: extends a VAE with latent stochastic dynamics.
-Limitation: lacks explicit physical grounding; needs strong supervision or access to full state variables.
-Our delta: PIWM replaces purely stochastic transitions with partially known physical dynamics (ϕ) and aligns the latent space to physical quantities via weak distributional supervision, improving interpretability and parameter recovery.

SINDyC (Sparse Identification of Nonlinear Dynamics with Control) – Brunton et al., 2016
-Type: classical dynamics-discovery method.
-Method: learns explicit dynamics from state trajectories.
-Limitation: requires true state access; not feasible from high-dimensional images.
-Our delta: PIWM operates directly on visual data and recovers equivalent interpretable dynamics from weakly supervised image sequences.

Vid2Param – Asenov et al., 2019
-Type: intrinsic, continuous.
-Method: learns to regress physical parameters directly from videos.
-Limitation: requires full supervision of physical quantities; cannot perform multi-step prediction.
-Our delta: PIWM achieves parameter inference and multi-step physical trajectory prediction using only weak distributional supervision, removing the dependence on exact ground-truth physical labels.

GOKU-Net – Linial et al., 2021
-Type: intrinsic, physics-aware.
-Method: combines generative ODEs with known physical structures to infer unobserved variables.
-Limitation: limited to known ODE systems and supervised training on structured data.
-Our delta: PIWM generalizes this idea to high-dimensional image observations and unknown dynamics, with partial physical priors and weak supervision to guide the latents.

Key Ideas for PIWM under Weak Supervision
[Diagram: observations y_{t-m:t} → encoder E → latent z*_{t-m:t} → static representation (e.g., position, angle) and dynamic representation (e.g., velocity) → partially known dynamics Φ (with learnable parameters Ψ), given the controller h's action → latent ẑ_{t+1} → decoder D → predicted observation ŷ_{t+1}]
●Physically interpretable latents: learn latents directly aligned with real-world physical quantities, rather than abstract image features.
●Weak distributional supervision: replace exact physical labels with coarse, distribution-based supervision to make training realistic (see the sketch below).
●Physics knowledge integration: embed known physical equations into the model while learning only the unknown parameters.
●Extrinsic two-stage design: decouple visual perception and physical reasoning by first encoding images, then mapping to physical states.
●Quantized latent regularization: use discrete latent spaces to impose structure and stabilize learning under noisy supervision.
●Unified predictive framework: combine interpretable autoencoding and learnable dynamics for long-horizon trajectory prediction.
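
Two of these ideas in a minimal sketch: a moment-matching form of weak distributional supervision and a VQ-style straight-through quantizer. Both the loss form and the function names are illustrative assumptions, not the paper's exact construction.

```python
import torch

def weak_distributional_loss(z_batch, prior_mean, prior_std):
    """Weak supervision: instead of exact labels, match batch statistics of
    the latents to a coarse prior over the physical quantity
    (e.g., 'velocity is roughly zero-mean with std 2')."""
    return ((z_batch.mean(dim=0) - prior_mean) ** 2 +
            (z_batch.std(dim=0) - prior_std) ** 2).sum()

def quantize(z, codebook):
    """Quantized latent regularization: snap each latent to its nearest
    codebook entry, passing gradients straight through."""
    dists = torch.cdist(z, codebook)        # (batch, num_codes)
    z_q = codebook[dists.argmin(dim=-1)]    # nearest code per latent
    return z + (z_q - z).detach()           # straight-through estimator
```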

PIWM Results: Qualitative
Physically grounded > pixel-accurate: physically grounded models outperform purely visual baselines in both prediction and explanation quality.

PIWM Results: Quantitative
Quantized & extrinsic models win:
-Discrete extrinsic PIWM achieves the lowest prediction error and the most stable long-horizon performance.
-Separating perception from physical reasoning yields stronger generalization and noise robustness.
-Quantization acts as an effective structural prior, improving consistency under noisy conditions.
Interpretability without precise supervision:
-PIWM recovers true physical parameters using only weak supervision.
Intrinsic models remain competitive:
-Continuous intrinsic PIWM performs surprisingly well, showing potential for efficient end-to-end training.
See for details: Z. Mao, M. E. Umasudhan, and I. Ruchkin. "Can Weak Quantization Make World Models Physically Interpretable?", in submission to ICLR 2026.

What's Next for PIWMs?
-More complex physical systems
-Multimodality with correlated errors
-More principles: sparsity, causality, smoothness

Summary
1. World models predict calibrated safety chances
2. Four principles: latent structuring, in/equivariance, multi-level supervision, output structuring
3. Physical interpretability is achievable under weak supervision with quantized latents