First Steps with Globus Compute Multi-User Endpoints
About This Presentation
In this presentation we share our experiences getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we previously wrote an application using Globus Compute that offloads computationally expensive steps in the researchers' workflows onto the NeSI (New Zealand eScience Infrastructure) cluster, while letting them manage their work from their familiar Windows environments. Among the challenges we encountered: each researcher had to set up and manage their own single-user Globus Compute endpoint, and the workloads had varying resource requirements (CPUs, memory and wall time) between runs. We hope the multi-user endpoint will help address these challenges, and we share an update on our progress here.
Slide Content
First steps with Globus Compute multi-user endpoint
Chris Scott (NeSI) [email protected]
Outline
● Background about NeSI
● Details about our use case for Globus Compute
● Some challenges encountered
● Early experience with multi-user endpoint
Examples of research supported by NeSI:
● Andrew Chen (Engineering): using NeSI supercomputers for advancing image processing capabilities using computer vision.
● Dr Olaf Morgenstern and Dr Erik Behrens (Earth Science): Deep South Challenge project using NeSI supercomputers for climate modelling.
● Yoshihiro Kaneko (Seismology): GNS Science using NeSI supercomputers to recreate earthquake events to better understand their processes and aftermath effects.
● Dr Kim Handley (Biological Sciences): Genomics Aotearoa project using NeSI supercomputers to better understand environmental processes on a microbial level.
● Dr Sarah Masters, Dr Deborah Crittenden, Nathaniel Gunby (Chemistry): using NeSI supercomputers to develop new analysis tools for studying molecules' properties.
● Dr Richie Poulton (Psychology): using the NeSI Data Transfer platform to send MRI scan images from the Dunedin Multidisciplinary Health & Development Study Research Unit to a partner laboratory in the United States for analysis.
NeSI is a national collaboration of: (partner institution logos shown on slide)
NeSI Services

High performance computing (HPC) and data analytics
• Fit-for-purpose national HPC platform (CPUs, GPUs) including specialised software and tools for data analytics
• Flexible environments for research DevOps / research software engineers
• Hybrid eResearch platforms for institutions and communities

Data services
• High speed, secure data transfer with end-to-end integration
• Hosting of large actively used research datasets, repositories, and archives

Training and researcher skill development
• In-person and online training to grow capabilities in the NZ research sector
• Partnership with The Carpentries (global programme to teach foundational coding and data science skills to researchers)

Consultancy
• Embed Research Software Engineers into research teams to lift their computational capabilities, optimise tools & workflows, develop code and more
Mahuika and Māui are housed inside a purpose-built High Performance Computing Facility in Wellington.
Use case
● Pharmacology group at the University of Auckland
○ Improving understanding of medicines in humans and improving dosing
○ Want to manage their research from a familiar Windows environment
○ Parts of their workflow require significant resources:
■ Non-linear mixed effects modelling (NONMEM)
■ Fortran MPI code
■ Bootstrapping
Remote Job Manager tool
● Implemented in Python using the Globus SDKs (transfer and compute)
● Provides an API that Wings for NONMEM (WFN) interfaces with (compatible with the previous version):
○ Upload files and submit Slurm jobs
○ Wait for Slurm jobs to complete and download files
● Globus HTTPS transfer to a Guest collection
● Globus Compute running on a login node (user managed)
● Globus authentication - token based
https://github.com/chrisdjscott/RemoteJobManager
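For illustration, a minimal sketch of the submit-side pattern described above (not the actual RJM code): a Python function that runs sbatch on the cluster is executed through Globus Compute, and the client waits on the result. The endpoint ID and script path are hypothetical placeholders; file upload/download over Globus HTTPS is handled separately via the transfer SDK.

```python
# Minimal sketch of the submit-a-Slurm-job pattern described above.
# Not the actual RJM code; endpoint ID and paths are placeholders.
from globus_compute_sdk import Executor

def submit_slurm_job(script_path):
    """Run sbatch on the cluster and return the Slurm job ID."""
    import subprocess  # imported inside: the function body runs remotely
    result = subprocess.run(
        ["sbatch", "--parsable", script_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()  # e.g. "1234567"

ENDPOINT_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

with Executor(endpoint_id=ENDPOINT_ID) as ex:
    future = ex.submit(submit_slurm_job, "/nesi/project/myproj/run_nonmem.sl")
    print("Submitted Slurm job", future.result())
```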
Notes about current use of Globus Compute
● Each researcher has to manage their own endpoint
○ Start it; restart it if it gets into a bad state (e.g. caused by instabilities on our platform), the login node gets rebooted, etc.
○ Each user has a scron job that checks the state of the endpoint every hour and restarts it if necessary (a sketch of such a check follows below)
○ Lots of extra complexity and places for things to go wrong
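A sketch of what such an hourly watchdog might look like, assuming the globus-compute-endpoint CLI's list and restart subcommands and a placeholder endpoint name; this is illustrative, not the actual NeSI scron script.

```python
#!/usr/bin/env python3
# Illustrative hourly endpoint watchdog (not the actual NeSI scron script).
# Assumes the globus-compute-endpoint CLI; "rjm" is a placeholder name.
import subprocess

ENDPOINT_NAME = "rjm"  # placeholder endpoint name

def endpoint_running(name):
    # `globus-compute-endpoint list` prints a table of endpoints and statuses
    listing = subprocess.run(
        ["globus-compute-endpoint", "list"],
        capture_output=True, text=True, check=True,
    ).stdout
    return any(name in line and "Running" in line
               for line in listing.splitlines())

if not endpoint_running(ENDPOINT_NAME):
    # restart whether the endpoint is stopped or wedged
    subprocess.run(["globus-compute-endpoint", "restart", ENDPOINT_NAME],
                   check=True)
```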
Notes about current use of Globus Compute (continued)
● Using LocalProvider instead of SlurmProvider
○ Resources vary between NONMEM simulations (time, cores, memory), but endpoint resources are hardcoded when you create the endpoint (see the config sketch below)
○ WFN writes a template Slurm script that gets uploaded and submitted
○ Poll for Slurm job completion
○ Extra code/complexity that we wouldn't need if we could use the SlurmProvider
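To make the constraint concrete, a single-user endpoint config fixes its provider resources when the endpoint is created, roughly along the lines of the sketch below (illustrative values only; the field names follow the Globus Compute / Parsl provider options).

```yaml
# Illustrative single-user endpoint config.yaml with a hardcoded provider.
# Every task submitted to this endpoint gets the same resources.
engine:
  type: GlobusComputeEngine
  provider:
    type: SlurmProvider
    partition: milan        # fixed when the endpoint is created
    cores_per_node: 8       # cannot vary per NONMEM run
    walltime: "01:00:00"    # same limit for every run
    min_blocks: 0
    max_blocks: 2
```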
Multi-user endpoint
● Early stages of testing the multi-user endpoint
● Two changes that make things much better for us:
○ Centrally managed - we run the endpoint as a service and give the researchers the endpoint ID
○ Endpoint resources are no longer hard-coded
■ Can use templating in the endpoint config to allow researchers to pass in certain parameters when they call the function (sketch below)
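As a sketch of how this works (the templating mechanism and the user_endpoint_config parameter follow the Globus Compute documentation, but the parameter names PARTITION, CORES and WALLTIME here are our own hypothetical choices), the admin's user config template might expose a few Jinja variables:

```yaml
# Illustrative user config template for a multi-user endpoint.
# Researchers supply PARTITION/CORES/WALLTIME per submission.
engine:
  type: GlobusComputeEngine
  provider:
    type: SlurmProvider
    partition: {{ PARTITION|default("milan") }}
    cores_per_node: {{ CORES|default(2) }}
    walltime: "{{ WALLTIME|default('01:00:00') }}"
```

A researcher then passes values that get rendered into the template when they submit work, for example:

```python
# Hypothetical client-side usage: the dict below is rendered into the
# endpoint config template above; the endpoint ID is a placeholder.
from globus_compute_sdk import Executor

ex = Executor(
    endpoint_id="00000000-0000-0000-0000-000000000000",  # placeholder MEP ID
    user_endpoint_config={"PARTITION": "milan", "CORES": 16,
                          "WALLTIME": "04:00:00"},
)
```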
● Multi-user endpoint looks like it will work really well for this group
○ Centrally managed service (users don't have to run anything themselves)
○ Can switch to the SlurmProvider, which will simplify things
● Next steps are to continue testing with the wider team and update RJM to use the multi-user endpoint so the researchers can benefit from it