09 Making Large-scale Agent-based Models Fast.pdf

JooMarceloBrazoProtz 7 views 50 slides Mar 12, 2025
Slide 1
Slide 1 of 50
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50

About This Presentation

Making Large-scale Agent-based Models Fast


Slide Content

© Uni Münster

Jan Lehmann
Making Large-scale Agent-based Models Fast
Learnings From Developing GEMS
Johannes Ponge
MONID Summer School 2024

Spoiler: This will be the very last slide

Background: The OptimAgentProject
•Focus on public health decision making
•Large-scale model (full German population, ~84m)
•Project team consisting of 14 universities and Institutions
•Goal: Enable simulation-based evaluation of NPI-effectiveness*
in heterogeneous populations and provide estimations for
associated economic-, social-, and health costs.
*Non-pharmaceutical interventions

OptimAgentCollaborators

SP4
Population Model
SP2
Behavioral
Adaptation
SP5
Simulation
(GEMS)
SP6
Policy Making
SP3
Epidemiologic
Parametrization
SP1
Contact
Parametrization

6
The German Epidemic Microsimulation System
+ Contact Structures
+ Behavioral Adaptation
+ Intervention Planning
+ Economic Aspects
+ Open Julia Package
+ High Performance
+ 80m Population Model
Written in
Simulating one year of
infection dynamics with
80 million agents takes
roughly 30 minutes*
*On a 24-core server with 128GB Memory

The German Epidemic Microsimulation System
Population Model
Contact Model
Disease Model
Scenario Model

With the complexityand sizeof GEMS, runtime is one of THEcritical success factors.
Therefore, we started thinking about how to build an efficientmodel early on.

But How to Make Code Fast?
Do as few calculations as possible
Do the remaining ones in parallel
S
1S
2S
3S
4S
5
Running multiple concurrent
instances of the simulation
Simulation
”True” parallelized
simulations
Parallelcomputing:
Carrying out calculations simultaneously
Prerequisite:
Causal independence of individual tasks

But How to Make Code Fast?
S
1S
2S
3S
4S
5
Running multiple concurrent
instances of the simulation
S
1
”True” parallelized
simulations
Natural causal independence
Individual runs cannot be sped up
Scalaberuntime. More cores = faster execution
Implementation is complex
We decided to
try building a
parallel ABM

Two Views
Conceptual View Technical View
How to designthe model How to codethe model
This you do
throughout the
project
This you do
before writing a
single line of code
Building a model first and go “we‘ll make it fast later…” is a common pitfall

https://www.amazon.com/Have-Baby-Month-Nine-Women/dp/B096TTV3F9
If parallelization is part of
your “making-it-fast”-scheme
you need to organize your
problem (model) in a way
that can benefit from
distributing workloads

Conceptual View Technical View
How to designthe model How to codethe model
A thought-through conceptual design saves you a lot of headaches later on

The First Thing We Did Back in 2022
Two-day workshop with the OptimAgentconsortium
(infectious disease modelers, physicians, policy-makers, psychologists,
computer scientists, health economists, physicists, mathematicians,…)
Main question: What featuresdoes such an ABM for
public health decision-support need?

Collecting Requirements
Based on the workshop, we
derived 113 required features of
the framework across 23
categories grouped by:
•ProjectFeatures
•SystemFeatures
•ModelFeatures
•SimulationFeatures

We Then Did This For the Next Six Months…

Turns Out, Building a Fast Model of That Size and Complexity is Hard
•After weeks, we produced four variants that we considered viable
•However, they all had pros and cons and were mutually exclusive
•We struggled to agree on one of the options and decided to develop an evaluation framework to objectify
the choice for a parallelization strategy and the implications for the conceptual model along seven criteria:
Single-core
Performance
Load
Balancing
Memory
Efficiency
Model
Flexibility
Synchronization
Overhead
Scalability
Implementation
Complexity
Ponge, Johannes, et al. "Evaluating Parallelization Strategies for Large-Scale Individual-based Infectious Disease Simulations." 2023 Winter Simulation Conference (WSC). IEEE, 2023.

Some „Preflight“ Considerations We Initially Agreed On
•Simulating individual mobility trajectories requires extensive computational resources
•Infections are driven by contacts, not (necessarily) by movement
•Covid infections were 18 times more likely to happen indoors
•We simulate contacts in so-called settings(e.g.households or workplaces) and disregard physical mobility
•Individuals can be assigned to mobility hubs (e.g.airports) to account for “long-distance infections”
•Our model is an individual-based microsimulation with a superimposed network structure

2
3
1
7
10
8
9
5
6
4
Model Layout
Nodes represent individuals
Infectious Individual
Edges represent contextualized
transmission opportunities
(i.e. “household contact”)
“workplace contact”

Discrete Event Simulation
Event Queue
Time Action
2
3
1
7
10
8
9
5
6
4
01 | Infect Individual 9
04 | Infect Individual 217
07 | Infect Individual 3592

04 | Infect Individual 217
Discrete Event Simulation
Event Queue
01 | Infect Individual 9
Time Action
2
3
1
7
10
8
9
5
6
4
07 | Infect Individual 3592
Processing event:

04 | Infect Individual 217
Discrete Event Simulation
Event Queue
01 | Infect Individual 9
Time Action
2
3
1
7
10
8
9
5
6
4
07 | Infect Individual 3592
Processing event:
Generate new events:
03 | Infect Individual 5
10 | Recover Individual 9

03 | Infect Individual 5
04 | Infect Individual 217
Discrete Event Simulation
Event Queue
Time Action
2
3
1
7
10
8
9
5
6
4
07 | Infect Individual 3592
10 | Recover Individual 9
Enqueue new events
Very efficient; only simulating infectious individuals
Hard to parallelize; events must be processed in order

2
3
1
7
10
8
9
5
6
4
→Process events in a “safe” time window ??????
??????
Parallelization Strategies | Conservative Discrete Event Simulation (CDES)
Event Queue 1 Event Queue 2
A
Event Queue 3
B
C
D
→Segment the model „geographically“
→Give each segment its own event queue
→Threads exchange events through messages
Thread 2Thread 1 Thread 3

D
C
Parallelization Strategies | Conservative Discrete Event Simulation (CDES)
Event Queue 1 Event Queue 3
B
Event Queue 2
→Process events in a “safe” time window ??????
??????
→Segment the model „geographically“
→Give each segment its own event queue
→Threads exchange events through messages
2
3
1
7
10
8
9
5
6
4
Thread 2Thread 1 Thread 3
Parallelization with few to none conflicts as threads wait for each other
Load-balancing might be a serious issue

2
3
1
7
10
8
9
5
6
4
Parallelization Strategies | Optimistic Discrete Event Simulation (ODES)
Event Queue 1 Event Queue 2
A
Event Queue 3
B
C
D
→Give each section its own “history stack”
Thread 2Thread 1 Thread 3
History Stack 1 History Stack 2 History Stack 3

2
3
1
7
10
8
9
5
6
4
Parallelization Strategies | Optimistic Discrete Event Simulation (ODES)
Event Queue 1 Event Queue 2
A
Event Queue 3
B
C
D
→Give each section its own “history stack”
→Store events after processing
Thread 2Thread 1 Thread 3
History Stack 1 History Stack 2 History Stack 3
→Message events as they appear; no time-sync

2
3
1
7
10
8
9
5
6
4
Parallelization Strategies | Optimistic Discrete Event Simulation (ODES)
Event Queue 1 Event Queue 2
A
Event Queue 3
B
C
D
→Give each section its own “history stack”
→Store events after processing
Thread 2Thread 1 Thread 3
History Stack 1 History Stack 2 History Stack 3
→Message events as they appear; no time-sync
→“Rollback“ events to resolve conflicts
ROLLBACK
No idling threads
Expensive rollbacks; requires large memory

2
3
1
7
10
8
9
5
6
4
Parallelization Strategies | Generative Turns Fixed-Time Intervals (GTFTI)
→Tick-based time model (e.g. 1 tick = 1 day)
??????
??????
→Loop over all (susceptible) individuals in parallel
→Evaluate their infection risk based on their contacts
→Update their state in a new generation of the model
Near perfect parallelization
Many irrelevant calculations;
requires twice as much memory
2
3
1
7
10
8
9
5
6
4
??????
??????+1

Parallelization Strategies | Active Settings in a Fixed-Time-Interval (ASFTI)
Household
Layer
Workplace
Layer
School
Layer
→Tick-based time model (e.g. 1 tick = 1 day)
→Partition model by “setting” type
→Marking settings with infectious individuals “active”
→Calculate infections (in parallel) for active settings
Much fewer irrelevant calculations
Inducesmodel assumptions on
disjunctive nature of settings
The Active Setting approach allows us to “push” infections (instead of “pulling” them) while still being able to parallelize theinfection procedure. We discuss these ideas in:
Ponge, Johannes, et al. "Evaluating Parallelization Strategies for Large-Scale Individual-based Infectious Disease Simulations." 2023 Winter Simulation Conference (WSC). IEEE, 2023.

Results
Criteria/
Strategy
Sg.-Core
Perf.
Load
Balanc.
Memory
Eff.
Model
Flex.
Sync.
Overh.
Scala-
bility
Implem.
Comp.
Sum
CDES 3 1 3 3 2 2 2 16
ODES 3 1 1 3 1 2 1 12
GTFTI 1 3 1 2 3 3 3 16
ASFTI 2 3 2 2 2 3 2 16
Sum 9 8 7 10 8 10 8
No strategy that
„does it all“
The Active Settings in a Fixed-
Time-Interval (ASFTI) approach is
the only strategy without a “non-
acceptable” ratingin any category
1 point (non-acceptable losses*) to 3 points (little to no loss*)
*compared to a non-parallel implementation

The presence of an
infectious individual
“activates” a setting
Settings without
infectious individuals are
considered “inactive”; To
improve performance, we
do not simulate them
We Eventually Adopted the Active-Setting Approach for GEMS

Infection Procedure
Iterate over all active
settings
Iterate over all
infected individuals
(in that setting)
Draw contacts from
setting-dependent
contact structure
Evaluate contact
infection probability
Evaluate setting
infection probability

Conceptual View Technical View
How to designthe model How to codethe model

Choice of Programing Language
•The choice of programming language is
probably one of the most important when
it comes to runtime.
•While languages like R and Python are the
most common candidates (There’s a Stack
Overflow article for literally any problem
you might have), they are not particularly
“fast”.
•We decided to write GEMS in Julia.
Gao, Kaifeng, et al. "Julia language in machine learning: Algorithms, applications, and open issues." Computer Science Review37(2020): 100254.

Thinking Vectors
Example: You have a dataframewhich holds the ages and infection statuses of
10 million agents. You now want to extract the ages of infected individuals.
df=DataFrame(
age =rand(1:100, 10_000_000),
infected =rand(Bool, 10_000_000))
res =[]
foriin1:nrow(df) # iterate through dataframe
age =df[i,1]
inf =df[i,2]
ifinf ==true
push!(res, age) # extract age to result array, if infected
end
end
Naiveapproach:
Benchmark: 5.103593 seconds

res =filter(row ->row.infected==true, df).age;
Thinking Vectors
Betterapproach:
Using inbuilt methods
(such as filter) which are
often runtime optimized
Benchmark: 1.617579 seconds
res = df.age[df.infected]
Bestapproach:
Using pure vector
operations
Benchmark: 0.051579 seconds

Vectorization
•Modern CPUs can handle vectorizedinstructions.
•This is an effective way of achieving parallelismeven with
a single-core machine.
•However, your code must be organizedin way that can
benefit from vectorization.
https://www.dornerworks.com/blog/write-vectorized-code-and-optimize-your-cpu-performance/

Object Oriented Implementation vs. Vector-Based Implementation
“For further efficiency, agents are not
represented as individual objects, but rather
as indices of one-dimensional state arrays.
This avoids the need to use an explicit for-
loop over each agent on every integration
timestep, increasing performance by more
than an order of magnitude.”
-Kerr et al. 2021
Kerr, Cliff C., et al. "Covasim: an agent-based model of COVID-19 dynamics and interventions."PLOS Computational Biology17.7 (2021): e1009149.

Object Oriented Implementation vs. Vector-Based Implementation
Kerr, Cliff C., et al. "Covasim: an agent-based model of COVID-19 dynamics and interventions."PLOS Computational Biology17.7 (2021): e1009149.
This implementation only works if you have
no nested objects of arbitrary length
Let’s say, you want to store the
agents’ vaccination history:
Person A
-age: 65
- vaccinations
- day 10
- day 46
- day 205
Person B
-age: 17
- vaccinations
- day 35
- day 68
Person C
-age: 6
- vaccinations
How to “flatten” these
nested objects into
state-arrays?

Preventing Bottlenecks During Parallelization
Example: You have a vector of agent ages and want
to count the number of people younger than 20.
ages =rand(1:100, 10_000_000)
cnt=0
foriinages
ifi<=20
cnt+=1
end
end
Naiveapproach:
Benchmark: 0.957924 seconds

Preventing Bottlenecks During Parallelization
cnt=[0]
rlock=ReentrantLock()
Threads.@threadsforiinages
ifi<=20
lock(rlock) do
cnt[1] =cnt[1] +1
end
end
end
Parallelapproach:
Benchmark: 1.098423 seconds
Add a “lock” to prevent
multiple threads (cores)
accessing the same variable
simultaneously
Parallelize the loop

Preventing Bottlenecks During Parallelization
cnt=zeros(Int, Threads.nthreads())
Threads.@threadsforiinages
ifi<=20
cnt[Threads.threadid()] +=1
end
end
sum(cnt)
Bottleneck-freeapproach:
Benchmark: 0.133476 seconds
Instantiate a vector of
zerosfor the number of
threads (cores) you have
Let each thread use
its own counter
Finally, sum up all
counters

Using Profilers
Profilers visualizehow much the
individual parts of your code
contribute to the overall runtime.
It also visualizes where runtime is
caused by “fixable” problems (red
and yellow areas).
Total Runtime
Infect
Individual
Sample
Contacts
Collect
“Present”
Individuals
Calculate
Disease
Progression
Update an Individual’s Disease State
GEMS Profile (single core)
@profviewrun!(sim)

Book Recommendation
(if you plan to use Julia…)
Avik Sengupta, “Julia High Performance”, PacktPublishing, ISBN: 978-1788298117

Conceptual View Technical View
How to designthe model How to codethe model

Conclusion –Conceptual View
•Get a thorough understanding of what you want to achieve with your model before starting to design it
•Design a fast model rather than design a model and “make-it-fast-later”
•Understand the implicationsof your model architecture choice. Are you “closing” certain doors?
•Exploit characteristics exclusive to infectious disease models (e.g., by using incubation periods as slack)
•Discuss your ideas with as many peersas possible ☺

Conclusion –Technical View
•Think in vectors
•Work with flat data and dataframes
•Use inbuiltfunctions
•Prevent loops
•Prevent bottlenecks
•Prevent memoryallocations
•Use benchmarkingtools
•Use profilers

Main Takeaway
•If you want a fast model,
•Do as few calculations as possible
•Do the remaining ones in parallel

© Uni Münster

Jan Lehmann
Johannes Ponge
[email protected]
Tags