First Steps with Globus Compute Multi-User Endpoints

globusonline · May 31, 2024

About This Presentation

In this presentation we will share our experiences of getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in their workflow to NeSI's HPC platform.


Slide Content

First steps with Globus Compute multi-user endpoint
Chris Scott (NeSI)
[email protected]

Outline
●Background about NeSI
●Details about our use case for Globus Compute
●Some challenges encountered
●Early experience with the multi-user endpoint

Examples of researchers using NeSI:
●Andrew Chen (Engineering): using NeSI supercomputers for advancing image processing capabilities using computer vision.
●Dr Olaf Morgenstern and Dr Erik Behrens (Earth Science): Deep South Challenge project using NeSI supercomputers for climate modelling.
●Yoshihiro Kaneko (Seismology): GNS Science using NeSI supercomputers to recreate earthquake events to better understand their processes and aftermath effects.
●Dr Kim Handley (Biological Sciences): Genomics Aotearoa project using NeSI supercomputers to better understand environmental processes on a microbial level.
●Dr Sarah Masters, Dr Deborah Crittenden, Nathaniel Gunby (Chemistry): using NeSI supercomputers to develop new analysis tools for studying molecules’ properties.
●Dr Richie Poulton (Psychology): using the NeSI Data Transfer platform to send MRI scan images from the Dunedin Multidisciplinary Health & Development Study Research Unit to a partner laboratory in the United States for analysis.
NeSI is a national collaboration of: [partner institution logos]

NeSI Services

High performance computing (HPC) and data analytics
•Fit-for-purpose national HPC platform (CPUs, GPUs) including specialised software and tools for data analytics
•Flexible environments for research DevOps / research software engineers
•Hybrid eResearch platforms for institutions and communities

Data services
•High speed, secure data transfer with end-to-end integration
•Hosting of large actively used research datasets, repositories, and archives

Training and researcher skill development
•In-person and online training to grow capabilities in the NZ research sector
•Partnership with The Carpentries (a global programme to teach foundational coding and data science skills to researchers)

Consultancy
•Embed Research Software Engineers into research teams to lift their computational capabilities, optimise tools & workflows, develop code and more

Mahuika and Māui are housed inside a purpose-built High Performance Computing Facility in Wellington.

Use case
●Pharmacology group at the University of Auckland
○Improving understanding of medicines in humans and improving dosing
○Want to manage their research from a familiar Windows environment
○Parts of their workflow require significant resources
■Non-linear mixed effects modelling (NONMEM)
■Fortran MPI code
■Bootstrapping

Remote Job Manager tool
●Implemented in Python using the Globus SDKs (transfer and compute)
●Provides an API that Wings for NONMEM (wfn) interfaces with (compatible with the previous version):
○Upload files and submit Slurm jobs
○Wait for Slurm jobs to complete and download files
●Globus HTTPS transfer to a Guest collection
●Globus Compute running on a login node (user managed)
●Globus authentication - token based
https://github.com/chrisdjscott/RemoteJobManager
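A rough sketch of the pattern described above (illustrative only, not the actual RJM code; the endpoint id, function names, and file paths are placeholders): one Globus Compute function running on the login-node endpoint submits the already-uploaded Slurm script, and a second checks whether the job has finished.

```python
# Illustrative sketch only (not the actual RJM implementation).
from globus_compute_sdk import Executor

def submit_slurm_job(script_path):
    """Runs on the endpoint (login node): submit an already-uploaded Slurm script."""
    import subprocess
    result = subprocess.run(["sbatch", "--parsable", script_path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()  # the Slurm job id, e.g. "123456"

def slurm_job_finished(job_id):
    """Runs on the endpoint: True once the job has left the Slurm queue."""
    import subprocess
    result = subprocess.run(["squeue", "-h", "-j", str(job_id)],
                            capture_output=True, text=True)
    return result.stdout.strip() == ""

ENDPOINT_ID = "00000000-0000-0000-0000-000000000000"  # placeholder endpoint id

with Executor(endpoint_id=ENDPOINT_ID) as gce:
    job_id = gce.submit(submit_slurm_job, "/home/user/nonmem_run/run.sl").result()
    print("submitted:", job_id, "finished:", gce.submit(slurm_job_finished, job_id).result())
```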

Notes about current use of Globus Compute
●Each researcher has to manage their own endpoint
○Start it, and restart it if it gets into a bad state (e.g. caused by instabilities on our platform, the login node being rebooted, etc.)
○Each user has a scron job that checks the state of the endpoint every hour and restarts it if necessary (a sketch of such a check is shown below)
○A lot of extra complexity and places for things to go wrong
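A minimal sketch of such an hourly check, assuming the endpoint is named "rjm" (a placeholder) and the globus-compute-endpoint CLI is on the PATH; this is not NeSI's actual scron script.

```python
# Illustrative hourly health check for a user-managed endpoint.
# Assumes `globus-compute-endpoint list` reports a status such as "Running";
# adjust the parsing for the CLI version actually installed.
import subprocess

ENDPOINT_NAME = "rjm"  # placeholder endpoint name

def endpoint_running(name: str) -> bool:
    out = subprocess.run(["globus-compute-endpoint", "list"],
                         capture_output=True, text=True, check=True)
    return any(name in line and "Running" in line for line in out.stdout.splitlines())

if not endpoint_running(ENDPOINT_NAME):
    # Attempt to bring the endpoint back up if it is stopped or unhealthy.
    subprocess.run(["globus-compute-endpoint", "restart", ENDPOINT_NAME], check=True)
```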

Notes about current use of Globus Compute
●Using the LocalProvider instead of the SlurmProvider
○Resources vary between NONMEM simulations (time, cores, memory), but endpoint resources are hardcoded when you create the endpoint
○wfn writes a template Slurm script that gets uploaded and submitted
○Poll for Slurm job completion (see the sketch below)
○Extra code/complexity that we wouldn't need if we could use the SlurmProvider
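For illustration, the client-side polling this workaround requires might look like the following hypothetical helper, built on the slurm_job_finished function and Executor from the earlier sketch (the polling interval and names are placeholders):

```python
# Illustrative polling loop: the kind of extra client-side code that a
# SlurmProvider-based endpoint would make unnecessary.
import time

def wait_for_slurm_job(gce, job_id, poll_seconds=60):
    """Block until the Slurm job has left the queue, checking via the endpoint."""
    while not gce.submit(slurm_job_finished, job_id).result():
        time.sleep(poll_seconds)
```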

Multi-user endpoint
●Early stages of testing the multi-user endpoint
●Two changes that make things much better for us:
○Centrally managed - we run the endpoint as a service and give the researchers the endpoint id
○Endpoint resources are no longer hard-coded
■Can use templating in the endpoint config to allow researchers to pass in certain parameters when they call the function (see the sketch below)
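A minimal sketch of what this could look like from the researcher's side, assuming an SDK version that supports the Executor's user_endpoint_config argument and an endpoint config template exposing variables such as cores_per_node and walltime (the endpoint id, parameter names, and function are placeholders, not our actual setup):

```python
# Illustrative only: the keys below must match the template variables defined
# by the administrator in the multi-user endpoint's config template.
from globus_compute_sdk import Executor

MEP_ID = "00000000-0000-0000-0000-000000000000"  # placeholder multi-user endpoint id

def run_step(control_file):
    """Placeholder for a computationally expensive workflow step."""
    import subprocess
    return subprocess.run(["echo", control_file], capture_output=True, text=True).stdout

with Executor(
    endpoint_id=MEP_ID,
    user_endpoint_config={
        "cores_per_node": 8,     # assumed template variable
        "walltime": "02:00:00",  # assumed template variable
    },
) as gce:
    print(gce.submit(run_step, "run1.ctl").result())
```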

Summary
●The multi-user endpoint looks like it will work really well for this group
○Centrally managed service (users don't have to run anything themselves)
○Can switch to the SlurmProvider, which will simplify things
●Next steps are to continue testing with the wider team and to update RJM to use the multi-user endpoint so the researchers can benefit from it