Linac Coherent Light Source (LCLS) Data Transfer Requirements
insideHPC
473 views
25 slides
Feb 25, 2018
About This Presentation
In this deck from the Stanford HPC Conference, Les Cottrell from the SLAC National Accelerator Laboratory at Stanford University presents: Linac Coherent Light Source (LCLS) Data Transfer Requirements.
"Funded by the U.S. Department of Energy (DOE), the LCLS is the world’s first hard X-ray free-electron laser. Its strobe-like pulses are just a few millionths of a billionth of a second long, and a billion times brighter than previous X-ray sources. Scientists use LCLS to take crisp pictures of atomic motions, watch chemical reactions unfold, probe the properties of materials and explore fundamental processes in living things.
Its performance to date, over the first few years of operation, has already provided a breathtaking array of world-leading results, published in the most prestigious academic journals, and has inspired other XFEL facilities to be commissioned around the world.
LCLS-II will build from the success of LCLS to ensure that the U.S. maintains a world-leading capability for advanced research in chemistry, materials, biology and energy. It is planned to see first light in 2020.
LCLS-II will provide a major jump in capability – moving from 120 pulses per second to 1 million pulses per second. This will enable researchers to perform experiments in a wide range of fields that are now impossible. The unique capabilities of LCLS-II will yield a host of discoveries to advance technology, new energy solutions and our quality of life.
Analysis of the data will require transporting huge amounts of data from SLAC to supercomputers at other sites to provide near real-time analysis results and feedback to the experiments.
The talk will introduce LCLS and LCLS-II with a short video, discuss its data reduction, collection, data transfer needs and current progress in meeting these needs."
Watch the video: https://youtu.be/LkwwGh7YdPI
Learn more: https://www6.slac.stanford.edu/ and http://hpcadvisorycouncil.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Size: 2.66 MB
Language: en
Added: Feb 25, 2018
Slides: 25 pages
Slide Content
Dr. Les Cottrell,
SLAC
<[email protected]>
Linac Coherent Light Source (LCLS)
Data Transfer Requirements
HPC talk
Stanford Feb 2018
LCLS-II, a major (~$1B) upgrade to LCLS, is currently underway. Online in 2020.
Video
Example experiment #1
‘Molecular Movie’ Captures Ultrafast Chemistry in Motion
Scientific Achievement
Time-resolved observation of
an evolving chemical reaction
triggered by light.
Method
LCLS X-ray pulses were
delivered at different time
intervals, measuring
the structural changes on an
X-ray area detector.
M. P. Minitti, J. M. Budarz, et al., Phys. Rev. Lett., 114, 255501 (2015) (cover article)
Significance and Impact
Results pave the way for a wide range of X-ray studies examining gas-phase chemistry and the structural dynamics associated with the chemical reactions they undergo.
Next example: Catalysis
Another related example
[Diagram: catalytic converter. Polluting gases (HC, CO, NOx) enter the catalyst, a porous substrate coated with precious metals; the exhaust gases react with the precious metals and leave as H2O and CO2.]
50% of the world’s gasoline goes through fluid catalytic cracking
Example experiment #2: Catalytic converter –
transient dynamics resolved at the atomic scale
•Surface catalysis of CO oxidation to CO2
•Sub-picosecond transient states, monitored
via appearance of new electronic states in
the O K-edge x-ray absorption spectrum.
H. Öström et al., Science, 2015
Data Analytics for high repetition rate Free Electron Lasers
FEL data challenge:
●Ultrafast X-ray pulses from LCLS are used like flashes from a high-speed strobe light, producing stop-action movies of atoms and molecules
●Both data processing and scientific interpretation demand intensive computational analysis
LCLS-II represents SLAC’s largest data challenge
LCLS-II will increase data throughput by orders of magnitude by 2026,
creating an exceptional scientific computing challenge
LCLS-II experiments will present challenging computing
requirements, in addition to the capacity increase
1. Fast feedback is essential (seconds/minute timescale) to reduce the time to complete the experiment, improve data quality, and increase the success rate
2. 24/7 availability
3. Short burst jobs, needing very short startup time
Very disruptive for computers that typically host simulations that run for days
4. Storage represents a significant fraction of the overall system, both in cost and complexity (see the back-of-the-envelope sketch after this list):
1. 1 PByte/day fast local buffer, 5-10 PBytes storage for local processing, 10-year long-term offline tape storage
2. 60-300 Teraflops in 2020, growing to 1 Pflop in 2022+
5. Throughput between storage and processing is critical
Currently most LCLS jobs are I/O limited
6. Speed and flexibility of the development cycle is critical
Wide variety of experiments, with rapid turnaround, and the need to tune data analysis during experiments
These aspects are instrumental also for other SLAC facilities (e.g. CryoEM, UED, SSRL, FACET-2)
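A quick back-of-the-envelope conversion, sketched in Python, shows what the storage figures above imply for sustained data rates (all inputs come from the list; decimal units are assumed):

```python
# Back-of-the-envelope conversion of the storage figures above.
# Assumes decimal units (1 PB = 1e15 bytes); all inputs are from the slide.

PB = 1e15                    # bytes per (decimal) petabyte
SECONDS_PER_DAY = 86_400

# 1 PB/day into the fast local buffer implies a sustained ingest rate of:
ingest_gbps = 1 * PB * 8 / SECONDS_PER_DAY / 1e9
print(f"fast buffer ingest ~ {ingest_gbps:.0f} Gbps sustained")   # ~93 Gbps

# 5-10 PB of local processing storage then covers roughly 5-10 days of data:
for capacity_pb in (5, 10):
    print(f"{capacity_pb} PB local store ~ {capacity_pb} days at 1 PB/day")
```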
LCLS-II data flow: from data production, to online
reduction, real-time analysis, and offline interpretation
Currently seeking user input on acceptable solutions for data reduction & analysis
[Data flow: megapixel detector → X-ray diffraction image → intensity map from multiple pulses → interpretation of system structure / dynamics]
Critical Requirement for Offsite Resources for LCLS Computing
[Offsite HPC resources: MIRA at Argonne, TITAN at Oak Ridge, CORI at NERSC]
●Several experiments require access to leading
edge computers for detailed data analysis.
This has its own challenges, in particular:
•Need to transfer huge amounts of
compressed data from SLAC to
supercomputers at other sites
•Providing near real-time
results/feedback to the experiment
•Has to be reliable, production quality
The requirements will need a long-term agreement between BES and ASCR
Requirement
•Today: 20 Gbps from SLAC to NERSC => 70 Gbps by 2019
•Experiments increase efficiency & networking
•2020: LCLS-II online, data rate 120 Hz => 1 MHz
•LCLS-II starts taking data at the increased data rate
•2024: 1 Tbps
•Imaging detectors get faster (following Moore's law)
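For context, a short Python sketch compares the pulse-rate jump with the planned growth in the SLAC-to-NERSC transfer rate (illustrative arithmetic only, using the figures on this slide; the gap between the two factors is what the online data reduction discussed earlier has to absorb):

```python
# Illustrative scaling arithmetic from the figures above (not an official
# projection): pulse-rate growth vs. planned SLAC-to-NERSC transfer growth.

pulse_scale = 1_000_000 / 120          # 1 MHz (LCLS-II) vs 120 Hz (LCLS)
wan_scale = 1e12 / 20e9                # 1 Tbps (2024) vs 20 Gbps (today)
print(f"pulse rate grows ~{pulse_scale:,.0f}x, WAN rate ~{wan_scale:.0f}x")

# 1 Tbps expressed as a daily volume (decimal petabytes):
pb_per_day = 1e12 / 8 * 86_400 / 1e15
print(f"1 Tbps ~ {pb_per_day:.1f} PB/day")                        # ~10.8 PB/day
```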
Offsite Data Transfer: Needs and Plans
LCLS-II needs are compatible with SLAC and NERSC plans
[Chart: projected LCLS-I and LCLS-II data rates plotted against SLAC plans, NERSC plans, and the ESnet6 upgrade timeline]
Data Transfer testing
•Zettar/zx:
•Provides an HPC data transfer solution (i.e. SW + transfer system reference design):
- state-of-the-art, efficient, scalable high-speed data transfer
•Over carefully selected demonstration hardware
•Today using existing equipment
•DTNs at NERSC & SLAC
•With Lustre file systems at the ends
•Today a 100 Gbps link
•Using the widely used bbcp and xrootd tools
•Currently ~55 Gbps, exploring limitations
•Upgrading the border to 2*100 Gbps in progress
Data transfer performance: test bed of two clusters at SLAC with a 5000-mile link
Space efficient: 6U per cluster
Energy efficient
Shared link: < 80 Gbps
NG demonstration
[Diagram: next-generation demonstration setup. Storage servers (> 25 TBytes in 8 SSDs) and Data Transfer Nodes (DTNs) connected by IP over InfiniBand (IPoIB, 4* 56 Gbps); the DTNs uplink via 4* 2 * 25 GbE into a 2x100G LAG (n(2)*100GbE), connecting at 100 Gbps to the other cluster or the high-speed Internet]
Memory to memory between clusters with 2*100 Gbps
•No storage involved, just DTN-to-DTN mem-to-mem
•Extended locally to 200 Gbps
•Here repeated 3 times
•Note the uniformity of the 8* 25 Gbps interfaces
•Can simply use TCP, no need for exotic proprietary protocols
•Network is not a problem
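A minimal sketch of this kind of memory-to-memory probe using nothing but plain TCP sockets in Python (this is not the zx or DTN tooling used in the tests; the hostnames, port and transfer size are assumptions):

```python
# Minimal memory-to-memory TCP throughput probe (a sketch, not the actual
# zx/DTN tooling).  One process listens and discards bytes, the other streams
# a fixed amount of data straight from RAM; no storage is touched, so the
# result reflects only the network and the TCP stack.

import socket
import sys
import time

CHUNK = 4 * 1024 * 1024          # 4 MiB send/receive buffer (assumption)
TOTAL = 20 * 1024**3             # 20 GiB per run (assumption; adjust as needed)

def receiver(port: int) -> None:
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            received = 0
            start = time.monotonic()
            while True:
                data = conn.recv(CHUNK)
                if not data:
                    break
                received += len(data)
            elapsed = time.monotonic() - start
            print(f"received {received / 1e9:.1f} GB "
                  f"at {received * 8 / elapsed / 1e9:.1f} Gbps")

def sender(host: str, port: int) -> None:
    payload = bytearray(CHUNK)   # data comes straight from memory
    sent = 0
    start = time.monotonic()
    with socket.create_connection((host, port)) as conn:
        while sent < TOTAL:
            conn.sendall(payload)
            sent += CHUNK
    elapsed = time.monotonic() - start
    print(f"sent {sent / 1e9:.1f} GB at {sent * 8 / elapsed / 1e9:.1f} Gbps")

if __name__ == "__main__":
    # usage:  python tcp_probe.py recv 5001
    #         python tcp_probe.py send <receiver-host> 5001
    if sys.argv[1] == "recv":
        receiver(int(sys.argv[2]))
    else:
        sender(sys.argv[2], int(sys.argv[3]))
```

The real tests drive several parallel streams per DTN, but even a single well-tuned TCP stream illustrates the point that no proprietary protocol is needed.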
Storage
•On the other hand, file-to-file transfers are at the mercy of
the back-end storage performance.
•Even with generous compute power and network
bandwidth available, the best designed and implemented
data transfer software cannot create any magic with a
slow storage backend
XFS READ performance of 8*SSDs in a file server
measured by the Unix fio utility
[Charts over a ~15:16-15:18 window: Read Throughput (up to ~20 GBps), SSD busy (up to ~800%), and Queue Size (~1500-3500)]
Data size = 5*200 GiB files, similar to typical LCLS large file sizes
Note: when reading, the SSDs stay busy, the behavior is uniform across drives, and plenty of objects in the queue yields close to the raw throughput available
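For a rough sanity check outside fio, a simple sequential-read loop like the Python sketch below reports aggregate throughput for a handful of large files (the file paths and chunk size are assumptions; fio remains the right tool here because it can drive many parallel jobs, direct I/O and deep queues, which is what keeps the SSDs busy in the measurement above):

```python
# Simple stand-in for a sequential-read benchmark: read a set of large files
# in big chunks and report aggregate throughput.  Unlike fio, this runs a
# single stream with no O_DIRECT or queue-depth control, so expect lower
# numbers; it only illustrates the measurement.

import sys
import time

CHUNK = 16 * 1024 * 1024   # 16 MiB sequential reads (assumption)

def read_throughput(paths):
    total = 0
    start = time.monotonic()
    for path in paths:
        with open(path, "rb", buffering=0) as f:
            while True:
                data = f.read(CHUNK)
                if not data:
                    break
                total += len(data)
    elapsed = time.monotonic() - start
    return total, elapsed

if __name__ == "__main__":
    # e.g.  python read_bench.py /lustre/testfile0 ... /lustre/testfile4
    nbytes, secs = read_throughput(sys.argv[1:])
    print(f"read {nbytes / 1e9:.1f} GB in {secs:.1f} s "
          f"= {nbytes / secs / 1e9:.2f} GB/s")
```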
XFS + parallel file system WRITE performance for 16
SSDs in 2 file servers
[Charts: SSD busy, SSD write throughput (~10 GBps), and queue size of pending writes (~50)]
Writes are much slower than reads
The file system layers can't keep the queue full (a factor of ~1000 fewer items queued than for reads)
Conclusion
Network is fine, can drive 200 Gbps, no need for proprietary protocols
Insufficient IOPS for writes: < 50% of raw capability
- Today limited to 80-90 Gbps file transfer
Work with local vendors
•State of art components fail, need fast replacements
•Worst case waited 2 months for parts
Use the fastest SSDs for writes
•We used Intel DC P3700 NVMe 1.6 TB drives
•The biggest are also the fastest, but also the most expensive
•1.6 TB at $1,677 vs 2.0 TB at $2,655: ~20% improvement for a ~60% cost increase
Parallel file system is bottleneck
•Needs enhancing for modern hardware & OS
Demonstration: a PetaByte in < 1.5 days at ~70 Gbps on a < 80 Gbps shared link
(roughly 1/3rd of ESnet's traffic)
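The headline figure is easy to verify with a quick sketch (decimal petabyte assumed):

```python
# Sanity check: how long does one petabyte take at a sustained ~70 Gbps?

PETABYTE_BITS = 1e15 * 8        # 1 PB (decimal) in bits
RATE_BPS = 70e9                 # ~70 Gbps sustained

seconds = PETABYTE_BITS / RATE_BPS
print(f"{seconds / 3600:.1f} hours = {seconds / 86_400:.2f} days")
# -> about 32 hours, i.e. roughly 1.3 days, consistent with "< 1.5 days"
```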
Impact on all ESnet traffic
[Chart: total ESnet traffic (OSCARS, LHCONE, Other) on a 100G-200G scale over an overnight window]
When running, the data transfer contributes ~1/3 of total ESnet traffic
Summary
What is special:
•Scalable. Add more NICs, more DTNs, more storage servers,
links as needed/available…
•Power, & space efficient; low cost
•HA: tolerant to loss of components
•Storage-tiering friendly
•Reference designs
•Easy to use, production-ready software
Proposed future Petabyte Club
A member of the Petabyte Club MUST be an organization that is capable of using a shared production point-to-point WAN link to attain a production data transfer rate >= 150 PiB-mile/hour
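A small sketch of how the proposed metric could be computed (the 70 Gbps and 5000-mile figures are taken from the test-bed slides earlier in the deck and are used only as an illustration, not as a claim of membership):

```python
# Proposed Petabyte Club metric: sustained rate times WAN distance,
# expressed in PiB-mile/hour (PiB = 2**50 bytes).

PIB = 2**50

def pib_mile_per_hour(rate_gbps: float, miles: float) -> float:
    bytes_per_hour = rate_gbps * 1e9 / 8 * 3600
    return bytes_per_hour / PIB * miles

# Illustration with the SLAC test-bed numbers (~70 Gbps over a 5000-mile link):
print(f"{pib_mile_per_hour(70, 5000):.0f} PiB-mile/hour")        # ~140
```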
Future
Upgrade the SLAC border to ESnet to 2x100 Gbps (200 Gbps) - Spring 2018
Then on to SLAC-to-NERSC:
•Very special environment, hard to modify
•Focus has been different from what we need:
•vector computing and communication-intensive problems at the expense of faster CPUs
- but LCLS is embarrassingly parallel
•Not easy to use for cluster computing such as xrootd or zx
•Unless Cray supports porting the application - unlikely
•Using the Burst Buffer (BB) is not easy from an application
•The BB will probably limit data transfer performance
•Normally used from batch, so a challenge for near-real-time
•Can bypass the NERSC DTNs and go directly to the Cori nodes, which have direct access to the BB
•However, DTNs provide common ways to customize and are more agile
•Harder to do for Cori nodes
•Complex supercomputer; components tend to be behind the very latest technology curve
Looking to exceed a PB/day by SC2018