Globus Compute with Integrated Research Infrastructure (IRI) workflows

globusonline, May 29, 2024

About This Presentation

As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Laboratory and the ALCF at Argonne National Laboratory are working closely with General Atomics to meet the computing requirements of the DIII-D experiment. As part of this work, the team is investigating ways to use Globus Compute to simplify cross-site workflows.


Slide Content

Slide 1: Globus Compute with IRI Workflows
Nick Tyler, Data & AI Services

Slide 2: NERSC Superfacility

Slide 3: NERSC and Superfacility
• National Energy Research Scientific Computing Center (NERSC)
  o Lawrence Berkeley National Laboratory
  o 50th anniversary!
  o Home to the Perlmutter supercomputer
• Superfacility
  o Superfacility API
    • REST API for interacting with HPC
    • Now with a Python client! (see the sketch below)
  o Spin
    • Kubernetes-based service platform
  o DTNs and connectivity with ESnet
    • Globus data transfers
    • Sharing results with the world
Download Perlmutter using Globus!
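The Superfacility API bullet is the kind of thing a short sketch can make concrete. The snippet below polls a system-status route with plain requests; the base URL and route are assumptions based on NERSC's public API documentation, not something shown on the slide.

```python
import requests

# Hypothetical sketch of calling the NERSC Superfacility API.
# Base URL and route are assumptions; check the current API docs.
SFAPI_BASE = "https://api.nersc.gov/api/v1.2"

resp = requests.get(f"{SFAPI_BASE}/status/perlmutter", timeout=30)
resp.raise_for_status()
print(resp.json())  # system name, status, outage notes (shape may vary)
```

Authenticated routes (job submission, file access) require an OAuth client; the Python client mentioned on the slide wraps those details.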

Slide 4: NERSC Superfacility and IRI
• Multi-site feedback loops
  o Expertise at the experimental facility
  o Computational power of HPC
    • NERSC, ALCF, OLCF
  o ESnet to connect the facilities
• Give the scientists real-time feedback
  o Powered by data
  o Add new techniques
    • AI modeling for detector simulation
• The goal is to empower scientists!

[Diagram: feedback loop connecting Simulation, Readout, Data, Analysis, and Feedback]

"To empower researchers to meld DOE’s world-class research tools, infrastructure, and user facilities seamlessly and securely in novel ways to radically accelerate discovery and innovation."
- Ben Brown (ASCR, DOE)

Slide 5: DIII-D and NERSC
• Get the CAKE workflow running at NERSC
  o Some parts need to run on the GA side
    • Need to interact with databases
    • Get readings from instruments
  o Use the computational power of NERSC
    • ONETWO: ion pressure and density
    • EFIT: equilibrium fitting code
• The current setup uses a lot of tunnels
  o Opening ports can be difficult
    • Depends on the site
  o Instead, call out to Globus services (see the sketch below)
    • Outbound HTTP connections
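As a sketch of that last point: with Globus Compute, both the submitting side and the executing endpoint make only outbound HTTPS calls to the Globus service, so no tunnels or open inbound ports are needed. Everything below (the function body, the endpoint ID, the polling loop) is an illustrative assumption, not the CAKE code.

```python
import time

from globus_compute_sdk import Client

def read_instrument(channel: str) -> float:
    """Hypothetical GA-side task: return one instrument reading."""
    return 0.0  # placeholder; a real task would query the DIII-D database

gcc = Client()
function_id = gcc.register_function(read_instrument)

# Hypothetical ID of a Globus Compute endpoint inside GA's network.
GA_ENDPOINT = "00000000-0000-0000-0000-000000000000"

task_id = gcc.run("density_probe_3", endpoint_id=GA_ENDPOINT,
                  function_id=function_id)

# get_result raises while the task is still pending, so poll until done
# (a real client would catch the SDK's pending-task error specifically).
while True:
    try:
        print(gcc.get_result(task_id))
        break
    except Exception:
        time.sleep(2)
```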

Slide 6: Globus Compute for DIII-D

Slide 7: DIII-D and NERSC
[Diagram: two setups, each driven from the DIII-D control room and backed by a database. One runs the workflow manager on local compute; the other runs the workflow manager on compute and submits a Slurm/PBS job array.]
• Simplify cross-site communications
• Takes an array of jobs and executes each task as a Globus Compute task
• Still just a working proof of concept
  o Would need to be fully integrated into the workflow tool

Slide 8: How it works
• Each task is a single line
  o Wrap each line in a simple bash script
  o Run the script
  o Capture the output
• Each line is built to match the site (see the sketch below)
  o Modules
  o Paths
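A sketch of what such a wrapper could look like, assuming each "line" is a shell command string. run_line here is a hypothetical stand-in for the function the proof of concept registers with Globus Compute; the module name and paths in the example line are made up.

```python
def run_line(line: str) -> dict:
    """Run one task line under bash and capture its output."""
    # Import inside the function so it resolves on the remote endpoint
    # when Globus Compute deserializes and executes this function there.
    import subprocess

    proc = subprocess.run(
        ["bash", "-lc", line],  # login shell, so site modules are available
        capture_output=True,
        text=True,
    )
    return {"returncode": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr}

# Building a line to match the site (hypothetical module and path):
line = "module load efit && efit < /path/to/site/specific/input"
```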

Slide 9: How it works
• Create an executor object from a Globus Compute endpoint ID
  o Add the lines to run
  o Run the lines
  o Get back stdout and stderr
• Running creates a list of Globus Compute futures (see the sketch below)
  o Pull the results from Globus as they complete
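Putting slides 8 and 9 together, a minimal sketch with the SDK's Executor: Globus Compute futures are standard concurrent.futures futures, so as_completed pulls results as they finish. The endpoint ID and task lines are hypothetical, and run_line is a compact version of the wrapper sketched under slide 8.

```python
from concurrent.futures import as_completed

from globus_compute_sdk import Executor

def run_line(line: str) -> dict:
    # See the slide 8 sketch; runs one line under bash on the endpoint.
    import subprocess
    proc = subprocess.run(["bash", "-lc", line],
                          capture_output=True, text=True)
    return {"returncode": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}

# Hypothetical ID of a Globus Compute endpoint (e.g. one at NERSC).
ENDPOINT_ID = "00000000-0000-0000-0000-000000000000"

# Hypothetical per-task lines, one per job-array element.
lines = [f"run_case --index {i}" for i in range(10)]

with Executor(endpoint_id=ENDPOINT_ID) as gce:
    futures = [gce.submit(run_line, line) for line in lines]
    for fut in as_completed(futures):  # results arrive as they complete
        result = fut.result()
        print(result["returncode"], result["stdout"], result["stderr"])
```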

Slide 10: Future Work with IRI, DIII-D, and Globus
• Integrate Globus Compute into the CAKE workflow
  o Waiting on funding and time
• A different workflow: Ionorb
  o Used to simulate the shot
    • Predictive analysis before the next shot
    • Make sure the next planned shot won't destroy the machine
• Christine at ALCF developed a Globus Flow (see the sketch below)
  o Demo of Globus Flows at SC
  o Optimizing it for the DIII-D workflow
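For a sense of what a Globus Flow here could look like, below is a minimal hypothetical flow definition with one Globus Compute step. The https://compute.actions.globus.org action URL is the Globus Compute action provider; the state name, parameter paths, and input schema are illustrative guesses, not Christine's actual flow.

```python
# Minimal hypothetical flow definition with a single Globus Compute step.
# Parameter names follow the Globus Compute action provider's general
# shape, but check its current schema before relying on them.
flow_definition = {
    "StartAt": "RunIonorb",
    "States": {
        "RunIonorb": {
            "Type": "Action",
            "ActionUrl": "https://compute.actions.globus.org",
            "Parameters": {
                "endpoint.$": "$.input.endpoint_id",
                "function.$": "$.input.function_id",
                "kwargs.$": "$.input.kwargs",
            },
            "ResultPath": "$.ionorb_result",
            "End": True,
        }
    },
}

# Registration would go through the Globus SDK's flows client, e.g.:
#   flows = globus_sdk.FlowsClient(...)  # auth setup omitted
#   flows.create_flow("DIII-D ionorb demo", flow_definition,
#                     input_schema={})
```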


LBL: Laurie Stephey, Eli Dart, Sam Williams, Pengfei Ding
ALCF: Christine Simpson, Tom Uram, Bill Allcock, Mike Papka
DIII-D: Sterling Smith, David Schissel, Torin Bechtel, Severin Denk, Mark Kostuk, Juan Diego Colmenares