We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Added: May 31, 2024
Slides: 27 pages
Introduction to Globus Compute
Josh Bryan – [email protected]
May 8, 2024
Agenda
●What is Globus Compute?
●Using the SDK
●Installing single user compute endpoint
●Multi-user Endpoints
○Architecture
○Installation
○Configuration
●Other topics
○WebApp
○Flows
○MPI
○Links and References
●Thanks!
What is Globus Compute?
•FaaS for HPC
•Programmatic access to compute resources
•“Fire and forget” reliable execution
•Consistent user interface across diverse execution systems
What is Globus Compute?
•Compute service: Highly available cloud-hosted service for managed, fire-and-forget function execution
•Compute endpoint: Abstracts access to compute resources, from edge device to supercomputer
•Compute SDK: Python interface for interacting with the service, using the familiar (to many) Globus look and feel
What is Globus Compute?
0. You register your code (as a Python function)
1. You request a function be executed on endpoints A and B
2. Globus Compute manages the reliable and secure execution on these endpoints
3. Globus Compute returns results or stores them until requested
[Diagram: the Compute Service connects you to two endpoints, A (a compute resource) and B (another compute resource).]
Globus Compute transforms any computing resource into a function-serving endpoint
•Python pip installable agent
•Elastic resource provisioning from local, cluster, or cloud system (via Parsl)
•Parallel execution using local fork or via common schedulers
–Slurm, PBS, LSF, Cobalt, K8s
Using the SDK
import sys
from globus_compute_sdk import Executor

def jittery_multiply(a: float, b: int):
    import random  # Normal python function except imports must be inline
    return a * b + random.uniform(-1.0, 1.0)

with Executor() as ex:
    ex.endpoint_id = '00001111-2222-4444-8888-ffffffffffff'  # my endpoint id
    futs = [
        ex.submit(jittery_multiply, float(a), b)
        for a, b in zip(sys.argv[1:], range(100))
    ]
    print([f.result() for f in futs])  # block until each result received

# $ python jittery_multiply.py 1 2 6.283
# [-0.031687527496242485, 2.0252510755026245, 13.39066106082362]
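The submit-then-result pattern above follows the standard concurrent.futures interface, so it can be tried locally before an endpoint is available. The following is a minimal sketch, assuming a local ThreadPoolExecutor as a stand-in for the Globus Compute Executor; nothing runs remotely here, and the helper name run_locally is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def jittery_multiply(a: float, b: int) -> float:
    # Inline import kept to mirror the remote-execution style of the slide;
    # it is not required when running locally.
    import random
    return a * b + random.uniform(-1.0, 1.0)


def run_locally(values):
    """Submit one task per value, then block until every result arrives."""
    # Assumption: ThreadPoolExecutor stands in for globus_compute_sdk.Executor.
    with ThreadPoolExecutor(max_workers=4) as ex:
        futs = [
            ex.submit(jittery_multiply, float(a), b)
            for b, a in enumerate(values)
        ]
        return [f.result() for f in futs]


# Same arguments as the command line shown on the slide.
results = run_locally(["1", "2", "6.283"])
```

Each result is the product plus a random offset in [-1, 1], so the values differ from run to run.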
Using the SDK
https://bit.ly/gw-tut
Follow link to https://jupyter.demo.globus.org/
Open globus-jupyter-notebooks/Compute_Introduction.ipynb
Installing single user compute endpoint
$ pip install globus-compute-endpoint
$ globus-compute-endpoint configure gw-demo-endpoint
Created profile for endpoint named <gw-demo-endpoint>
Configuration file: /home/ec2-user/.globus_compute/gw-demo-endpoint/config.yaml
Use the `start` subcommand to run it:
$ globus-compute-endpoint start gw-demo-endpoint
$ globus-compute-endpoint start gw-demo-endpoint
Starting endpoint; registered ID: 54460200-b652-4f43-a918-02882fa6114a
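The registered ID printed by the start subcommand is the value the SDK needs as its endpoint ID. A minimal sketch of capturing it from that output follows; the regular expression is based only on the line shown above, and the wording may differ between endpoint versions. The helper name extract_endpoint_id is hypothetical.

```python
import re
import uuid
from typing import Optional

# Assumption: the start-up line contains "registered ID: <uuid>" as shown above.
_ID_LINE = re.compile(r"registered ID:\s*([0-9a-fA-F-]{36})")


def extract_endpoint_id(output: str) -> Optional[str]:
    """Return the endpoint UUID found in `start` output, or None."""
    match = _ID_LINE.search(output)
    if match is None:
        return None
    try:
        # uuid.UUID validates the format and normalises to lowercase.
        return str(uuid.UUID(match.group(1)))
    except ValueError:
        return None


line = "Starting endpoint; registered ID: 54460200-b652-4f43-a918-02882fa6114a"
endpoint_id = extract_endpoint_id(line)
```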
Debugging compute endpoint
[root@ip-192-168-0-8 ~]# globus-compute-endpoint self-diagnostic -z
Successfully created globus_compute_diagnostic_2024-05-08.txt.gz
# tail .globus_compute/benchmark_sep/endpoint.log -n 4
2024-05-06 22:00:49,301272 INFO MainProcess-33827 MainThread-139642425939776 globus_compute_endpoint.endpoint.interchange:195 quiesce Waiting for quiesce complete
2024-05-06 22:00:49,301461 INFO MainProcess-33827 MainThread-139642425939776 globus_compute_endpoint.endpoint.interchange:198 quiesce Quiesce done
2024-05-06 22:00:49,301652 WARNING MainProcess-33827 GCReportingThread-139642096944896 globus_compute_endpoint.engines.base:63 run_in_loop ReportingThread exiting
2024-05-06 22:01:46,400715 WARNING MainProcess-34798 MainThread-140307336914752 globus_compute_endpoint.endpoint.endpoint:436 start_endpoint Endpoint registration blocked. [{"status":"Failed","code":"ENDPOINT_NOT_FOUND","error_args":["9c4b5428-1bde-4f6f-b0c3-5ff4a630e58b"],"reason":"Endpoint 9c4b5428-1bde-4f6f-b0c3-5ff4a630e58b could not be resolved","detail":"Endpoint 9c4b5428-1bde-4f6f-b0c3-5ff4a630e58b could not be resolved","http_status_code":404}]
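The last log entry above carries a JSON list describing why registration was blocked. A minimal sketch of pulling the error code and HTTP status out of such a message follows, assuming the payload always appears as a JSON list at the end of the message, as in this one example. The function name parse_registration_error is hypothetical and not part of the endpoint tooling.

```python
import json


def parse_registration_error(log_message: str):
    """Return (code, http_status, reason) tuples from a JSON list in a log message."""
    start = log_message.find("[{")
    end = log_message.rfind("]")
    if start == -1 or end < start:
        return []
    try:
        payload = json.loads(log_message[start:end + 1])
    except json.JSONDecodeError:
        return []
    return [
        (entry.get("code"), entry.get("http_status_code"), entry.get("reason"))
        for entry in payload
        if isinstance(entry, dict)
    ]


# Message text copied from the log entry shown above.
message = (
    "Endpoint registration blocked. "
    '[{"status":"Failed","code":"ENDPOINT_NOT_FOUND",'
    '"error_args":["9c4b5428-1bde-4f6f-b0c3-5ff4a630e58b"],'
    '"reason":"Endpoint 9c4b5428-1bde-4f6f-b0c3-5ff4a630e58b could not be resolved",'
    '"detail":"Endpoint 9c4b5428-1bde-4f6f-b0c3-5ff4a630e58b could not be resolved",'
    '"http_status_code":404}]'
)
errors = parse_registration_error(message)
```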
Multi-user Endpoints
Value-Add for Users
●No need to maintain multiple endpoints for different configurations
●Specify configuration at task submission
●No need to log in to the terminal
Value-Add for Admins
●Templatable User Endpoint Configurations (Jinja)
○e.g., pre-choose SlurmProvider, PBSProvider; enforce limits
●No orphaned user compute endpoints
○Enforced process tree
○Idle endpoints are shut down (per template configuration)
●Standard Globus Identity Mapping
●Lower barrier for users
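The idea of a templated user endpoint configuration with admin-enforced limits can be sketched as follows. This is an illustration only: it uses Python's string.Template as a stand-in for Jinja, and the template layout, the allowed partitions, and the node limit are all hypothetical values, not the files generated by the endpoint software.

```python
from string import Template

# Hypothetical template; a real deployment would use a Jinja YAML template.
USER_CONFIG_TEMPLATE = Template(
    "engine:\n"
    "  provider:\n"
    "    type: SlurmProvider\n"
    "    partition: $partition\n"
    "    nodes_per_block: $nodes\n"
)

# Hypothetical limits chosen by the administrator.
ALLOWED_PARTITIONS = {"cpu", "gpu"}
MAX_NODES = 4


def render_user_config(partition: str, nodes: int) -> str:
    """Validate user-supplied values, then fill in the template."""
    if partition not in ALLOWED_PARTITIONS:
        raise ValueError(f"partition {partition!r} is not allowed")
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    return USER_CONFIG_TEMPLATE.substitute(partition=partition, nodes=nodes)


rendered = render_user_config("cpu", 2)
```

The provider type is fixed by the template, while the user chooses only the values the administrator exposes, which is the division of responsibility the bullets describe.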
Multi-user Endpoints: Architecture
[Diagram: the Globus Compute Executor submits ([tasks], endpoint, user endpoint config). The Multi-User Endpoint receives Start Endpoint (identity id, uep id), maps the identity to a local user (ID map, templates), and spawns a User Endpoint. The User Endpoint receives [tasks], runs them through the Globus Compute Engine on Nodes 1 to n, and returns [results]. The figure numbers these steps 1 to 3.]
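The identity-to-local-user step in the architecture can be illustrated with a simplified mapping function. This is a sketch under stated assumptions: the rule format (a regular expression plus an output template) and the example.edu domain are hypothetical and do not reproduce the actual Globus identity mapping configuration file.

```python
import re
from typing import Optional

# Hypothetical rules: (regex on the identity username, local-user template).
MAPPING_RULES = [
    (re.compile(r"^([a-z0-9_]+)@example\.edu$"), r"\1"),
]


def map_identity(username: str) -> Optional[str]:
    """Return the local account for an identity, or None if no rule matches."""
    for pattern, template in MAPPING_RULES:
        match = pattern.match(username)
        if match:
            # Fill the output template with the captured groups.
            return match.expand(template)
    return None


local_user = map_identity("alice@example.edu")
```

An identity that matches no rule maps to None, which in this sketch means no user endpoint would be spawned for it.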
Multi-user Endpoints: Installation
$ globus-compute-endpoint configure multi-user --multi-user
Created multi-user profile for endpoint named <multi-user-demo>
Configuration file: /root/.globus_compute/multi-user/config.yaml
Example identity mapping configuration: /root/.globus_compute/multi-user/example_identity_mapping_config.json
User endpoint configuration template: /root/.globus_compute/multi-user/user_config_template.yaml
User endpoint configuration schema: /root/.globus_compute/multi-user/user_config_schema.json
User endpoint environment variables: /root/.globus_compute/multi-user/user_environment.yaml
Use the `start` subcommand to run it:
$ globus-compute-endpoint start multi-user
$ sudo yum install https://downloads.globus.org/globus-connect-server/stable/installers/repo/rpm/globus-repo-latest.noarch.rpm
$ sudo yum install globus-compute-agent
Multi-user Endpoints: Configuration
[root@ip-192-168-0-8 ~]# globus-compute-endpoint enable-on-boot multi_user
Systemd service installed at /etc/systemd/system/globus-compute-endpoint-multi_user.service. Run
sudo systemctl enable globus-compute-endpoint-multi_user --now
Enable on boot
Multi-user Endpoints: Requirements
●Ports
○443 outbound (optionally 5671) for both endpoint and SDK
○No inbound ports
●Memory
○~200 MB per active user
●Access to scheduler and shared filesystem
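The memory figure above lends itself to a quick sizing estimate. A minimal sketch follows, assuming memory scales linearly with active users; the 25% headroom default is a hypothetical safety margin, not a documented recommendation.

```python
MB_PER_ACTIVE_USER = 200  # approximate figure from the requirements above


def estimate_memory_mb(active_users: int, headroom: float = 0.25) -> float:
    """Estimate host memory in MB for a number of concurrently active users."""
    if active_users < 0:
        raise ValueError("active_users must be non-negative")
    # headroom is a hypothetical margin added on top of the per-user figure.
    return active_users * MB_PER_ACTIVE_USER * (1.0 + headroom)


estimate = estimate_memory_mb(50)  # 50 users * 200 MB * 1.25 = 12500.0 MB
```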
WebApp
Flows
•Flows Service: A platform for managed, secure, reliable task orchestration
•Flows comprise Actions that invoke Globus services; extensible to support your own services
•Run via web app, CLI, API
MPI
●Will support arbitrarily sized MPI applications within one batch job
●https://parsl.readthedocs.io/en/stable/userguide/mpi_apps.html
●Available in the next month
Links and References
●Try it:
○Go to https://jupyter.demo.globus.org/
○Open globus-jupyter-notebooks/Compute_Introduction.ipynb
●Docs:
○https://docs.globus.org/compute/
○https://globus-compute.readthedocs.io/en/latest/index.html
●GitHub:
○https://github.com/globus/globus-compute
Thanks to Compute and Parsl Teams
●Chris Janidlo
●Kevin Hunter Kesling
●Reid Mello
●Lei Wang
●Yadu Babuji
●Ryan Chard
●Ben Clifford