From Infrastructure to Insight: Technical Pathways to Value in Europe’s Compute Ecosystem.pdf
About This Presentation
By: Henrik Nortamo, Senior Application Specialist, CSC - IT Center for Science.
Added: Oct 10, 2025
Slide Content
From Infrastructure to Insight: Technical Pathways
to Value in Europe’s Compute Ecosystem
Henrik Nortamo, CSC - IT Center for Science
MINDTREK CONFERENCE, 7.10.2025
5.10.2025
About
CSC - IT Center for Science
• CSC is a non-profit state enterprise providing high-quality ICT expert services
• High-performance computing (HPC) and data ecosystem
o Currently operating three supercomputers in our data center in Kajaani
• Services for education and research
Henrik Nortamo
• 7+ years at CSC
• Mainly on the high-performance computing (HPC) side: everything from support, simulation and software stack to digital twins and web interfaces
• Now leading the technical work on a federation platform for European HPC, AI and quantum systems
Intro
• Both Europe and the rest of the world are investing heavily in compute
o AI is one of the main drivers, but other new paradigms such as quantum are also in focus
o An increasingly data-driven world will most likely sustain the need for large amounts of compute
• Several reasons to extract the maximum amount of value
o Monetary
o Environmental
o Temporal
Supercomputers & High-Performance Computing 101
• Computing is an extremely broad term, covering everything from embedded and edge computing to super and high-performance computing.
o We will focus on the latter: systems big enough to fill a room that give off enough heat to warm a sauna.
All you need to know about supercomputers
• They turn numbers into more useful numbers
• They are BIG:
o Number of components
▪ 100K GPUs (Colossus), 7M CPU cores (Fugaku)
o Floor space
▪ A big closet, a tennis court, or a whole warehouse
o Cost
▪ 144 M€ (LUMI), 600 M€ (Frontier), 4 G€ (Colossus)
o Power usage
▪ Big: 20 MW (Frontier). Insane: 280 MW (Colossus). For reference: Olkiluoto 3 is 1600 MW
• They utilize specialized hardware
• On the public side they are usually shared
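To put the power figures above in perspective, the following minimal sketch expresses each system's draw as a share of Olkiluoto 3's 1600 MW. It uses only the numbers quoted on the slide; the function and variable names are illustrative.

```python
# Power figures quoted on the slide, in megawatts.
POWER_MW = {"Frontier": 20, "Colossus": 280}
OLKILUOTO_3_MW = 1600  # reference value quoted on the slide


def share_of_reactor(system_mw, reactor_mw=OLKILUOTO_3_MW):
    """Return a system's power draw as a percentage of the reference output."""
    return 100 * system_mw / reactor_mw


for name, mw in POWER_MW.items():
    print(f"{name}: {mw} MW = {share_of_reactor(mw):.2f}% of Olkiluoto 3")
```

With these figures, Frontier comes to 1.25% and Colossus to 17.5% of the reference output.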
Examples of European (EU + member countries)
compute investments
+ ~20 G€ fund to build up to 5 AI gigafactories, plus probably a good deal of regional and national capacity that I'm not even aware of.
The two paths we will be dipping our toes into
Performance
Extracting raw performance from increasingly heterogeneous systems
Maintaining some degree of performance portability
Locality
Having data in the right place at the right time
Accessibility
Providing easy access and usage to an inherently distributed set of resources
Hold on to your hats: in the spirit of the developers track, things are about to become more concrete, and background knowledge is helpful.
Performance
Performance categories
• Generic compute
o Standard CPUs: x86 and ARM
• Specialized compute
o GPUs, TPUs, RISC-V accelerators
• Communication
o Network and internode communication
• Storage / IO
o Reading and writing
• Scheduling
o When, where and how things should run
I will only be talking about the highlighted category
Accelerator utilization
Most of the available compute power now comes from accelerators of some kind, mainly GPUs, but tensor processing units (TPUs) and inference hardware are already common.
There are many options for how to utilize them:
• Pragma-based approaches (e.g. annotating source code + compiler magic)
o OpenMP and OpenACC
• Vendor-specific programming interfaces (e.g. CUDA and HIP, for NVIDIA and AMD respectively)
• Generic interfaces, e.g. OpenCL
• Standard parallelism
o -> compiler magic
• External libraries
• High-level frameworks
o E.g. PyTorch, Numba, Kokkos
• Source-to-source translators (experimental)
If you are currently utilizing GPUs, your stack probably includes several of these.
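One way to picture how several of these layers coexist in one stack is a single entry point that dispatches to whichever implementation is available. The sketch below is an illustration only, not something from the talk: it touches no GPU, the NumPy path merely stands in for an accelerated library, and the function names (`detect_backends`, `saxpy`) are made up for this example.

```python
import importlib.util


def detect_backends(candidates=("torch", "cupy", "numba", "numpy")):
    """Return the candidate packages that can be imported in this environment."""
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


def saxpy_reference(a, x, y):
    """Pure-Python fallback: element-wise a * x + y."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def saxpy_numpy(a, x, y):
    """Array-library version; stands in for an accelerated backend."""
    import numpy as np  # imported lazily so the fallback works without it
    return (a * np.asarray(x, dtype=float) + np.asarray(y, dtype=float)).tolist()


def saxpy(a, x, y, backend=None):
    """Single entry point; picks an implementation unless one is requested."""
    if backend is None:
        backend = "numpy" if "numpy" in detect_backends() else "reference"
    implementations = {"reference": saxpy_reference, "numpy": saxpy_numpy}
    return implementations[backend](a, x, y)


print(detect_backends())
print(saxpy(2.0, [1.0, 2.0], [3.0, 4.0]))
```

The caller only ever sees `saxpy`; which implementation runs depends on the environment. That is the portability trade-off in miniature: one interface, several implementations, each with its own performance profile.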
Open source, source available, closed source
• Open source reduces the chance of vendor lock-in
o But it does not fully remove it, and it is less meaningful without the capability to make changes
• Interfaces vs implementations
o Nothing is fully open source; at some point you will encounter vendor-specific proprietary solutions
• Funding and system design
o Local vs global optima
Two short stories
• AMD's HIP programming interface: interfaces and de facto standards
• Accelerating 3 million lines of Fortran on AMD GPUs: open standard and closed implementation
Accessibility
Accessibility
• Compute resources owned or partially funded by one entity (EuroHPC) are spread across a large number of organizations, countries and systems.
• Not everyone is a programming wizard, nor do they need to be
-> Federation
-> Ease of use
But federating what?
Status quo: separate identity, credentials, onboarding and access methods for each system
Federation
• Identity
o Needs to be secure (highly vetted), scalable and generic
o Both hard and easy. E.g. Keycloak with social logins is trivial. Standardized attributes and 5000+ IdPs, not so much
• Access
o Separate portals, separate credentials -> Separate portals, same credentials -> Same portal, same credentials -> Automatic delegation. I.e., again, a one-stop shop
• Allocations
o In this context resources are not provisioned on the fly. They are applied for and granted, on a project and group basis. Current protocol: Excel into a local project management system
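The "standardized attributes" problem can be sketched as mapping each identity provider's attribute names onto one common schema. The schema below (`user_id`, `email`, `name`) and the alias table are assumptions made for illustration, not the platform's actual attribute profile; the alias names themselves are common OpenID Connect claims and LDAP/eduPerson attributes.

```python
# Hypothetical common schema: target attribute -> names different IdPs may use.
ATTRIBUTE_ALIASES = {
    "user_id": ("sub", "eduPersonPrincipalName", "uid"),
    "email": ("email", "mail"),
    "name": ("name", "displayName", "cn"),
}


def normalize_attributes(raw):
    """Map one IdP's attribute dict onto the common schema.

    Raises ValueError if no alias for a target attribute is present.
    """
    normalized = {}
    for target, aliases in ATTRIBUTE_ALIASES.items():
        for alias in aliases:
            value = raw.get(alias)
            if value:
                normalized[target] = value
                break
        else:  # no alias matched for this target attribute
            raise ValueError(f"missing attribute for {target!r}")
    return normalized


# Two made-up IdP responses using different attribute names.
oidc_style = {"sub": "abc123", "email": "ada@example.org", "name": "Ada"}
ldap_style = {"uid": "ada", "mail": "ada@example.org", "cn": "Ada"}

print(normalize_attributes(oidc_style))
print(normalize_attributes(ldap_style))
```

With a handful of providers a table like this is manageable; with thousands, each with its own gaps and quirks, the mapping and vetting become the hard part.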
What we are building: the EuroHPC Federation Platform
A platform which will provide:
• Federated identity and single sign-on: ~6000 IdPs + national identity providers (eIDAS) currently supported
• Resource allocation, management and monitoring across systems. Includes provisioning and user onboarding
• Direct access to systems and APIs
o SSH is still THE interface, but other APIs are gaining ground
• + High-level interfaces and GUIs for data management, workflows and interactive computing
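As a rough picture of project-based allocations tracked across systems, the sketch below models a grant as a small record and sums what remains per project. The record fields, project names and the "system-b" placeholder are hypothetical; this is not the platform's data model.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    """One granted allocation of a project on one system (hypothetical fields)."""
    project: str
    system: str
    granted_hours: float
    used_hours: float = 0.0

    @property
    def remaining_hours(self):
        # Never report a negative balance if usage overshoots the grant.
        return max(self.granted_hours - self.used_hours, 0.0)


def remaining_by_project(allocations):
    """Sum the remaining hours per project across all systems."""
    totals = {}
    for alloc in allocations:
        totals[alloc.project] = totals.get(alloc.project, 0.0) + alloc.remaining_hours
    return totals


allocations = [
    Allocation("project-a", "LUMI", granted_hours=1000, used_hours=250),
    Allocation("project-a", "system-b", granted_hours=500, used_hours=500),
    Allocation("project-b", "LUMI", granted_hours=200),
]
print(remaining_by_project(allocations))
```

Keeping one record per project and system is what lets a federated view answer "how much does this project have left, anywhere?" without consulting each site separately.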
Technical presentation:
https://archive.fosdem.org/2025/schedule/event/fosdem-2025-6718-eurohpc-fp-a-federated-platform-for-hpc-infrastructure-in-europe-built-with-open-source-software/
Building using open source
• The major components of the platform are all based on open-source technologies which are already in production use on several systems and have active communities.
o This both increases trust and makes maintenance manageable with the resources we have
o For some components we are making heavy additions, which would not be possible (or would be very expensive) without source code access
• Contributions to existing open-source projects are upstreamed when possible