Snapshot of DAQ challenges for Diamond Martin Walsh.
Snapshot of DAQ challenges for Diamond
Martin Walsh
Your role in all this...
- SAC is the advisory body for Diamond: how do we make the most of your collective knowledge, experience and wisdom?
- Providing the information you need
- Organisation and content of meetings
- The most effective forum for discussion
- Efficient transmission of your advice to us
- Informing you how Diamond has acted on your advice – and what the result has been
Harwell Campus
ISIS
CLF
RAL Space
Mary Lyon centre mouse functional genomics
International Space Innovation Centre (ISIC)
MRC Harwell
RAL (Rutherford Appleton Laboratory)
Public Health England
The European Centre for Space Applications and Telecommunications (ECSAT)
Research Complex
Beamlines by Village
Macromolecular Crystallography
Soft Condensed Matter
Spectroscopy
Materials
Engineering and Environment
Surfaces and Interfaces
eBIC
Per Beamline Data Rates
Legend: < 100 GB/day, < 1 TB/day, > 1 TB/day
Tomography Beamlines have collected nearly 2PB of data, more than the rest of Diamond beamlines put together.
New arrivals: EM, 2 microscopes @ 5 TB/day; XFEL from 2017
Some Numbers for 2014-15
• Total number of user proposals: 642; delivered shifts: 7,964 (1 shift = 8 hours)
• Total number of users: 7,696 (4,988 on site + 2,708 remote)
– MX remote use now exceeds 50% of use
• Total number of unique PhDs: 857
• Total journal papers published: 3,883 (677 published in 2014)
One major player – integrated structural biology
[Figure: techniques arranged along two axes, increasing resolution and increasing biological complexity and integrity. Sample types: cell/tissue, solution, crystalline. Techniques: fluorescence microscopy (B24/CLF), B24 X-ray microscopy, cellular cryo-electron tomography (cryoET), B22 infrared microspectroscopy, B21 X-ray solution scattering, single particle cryo-EM, electron microscopy, X-ray crystallography, XFEL.]
Life Science & DLS
• B22 Infrared
• B24 Cryo X-ray microscopy
• I18, I20, B18, I14 X-ray spectroscopy
• I22/B21 SAXS
• B23 CD
• Spectroscopy
MX village
• (I02, I03, I04)
• (I24, I04-1)
• (I23, VMX)
National facility for EM in life & Physical Sciences
UK Hub for XFEL sample and software developments
• I08 X-ray STXM
• I13 X-ray tomography & coherent diffraction
Cell Biology
OPPF-UK
MPL
RC@H
Diamond beamlines: Macromolecular Crystallography, Scattering, X-ray spectroscopy
ISIS beamlines: SANS, Neutron Reflection (NR)
Computational environment / HPC
CCP4, CCP-EM
Synchrotron Imaging
UK XFEL Hub@Diamond
Cryo-EM/ET: Electron Bio-Imaging Centre (eBIC)
An integrated Approach to Structural Biology
Fluorescence microscopy (CLF) (STFC & DLS)
Diamond Data Rates/Volumes History
• Early 2007:
– Diamond first user.
– No detector faster than ~10 MB/s.
• Early 2009:
– First Lustre system (DDN S2A9900).
– First Pilatus 6M system @ 60 MB/s.
• Early 2011:
– Second Lustre system (DDN SFA10K).
– First 25 Hz Pilatus 6M system @ 150 MB/s.
• Early 2013:
– First GPFS system (DDN SFA12K).
– First 100 Hz Pilatus 6M system @ 600 MB/s.
– ~10 beamlines with 10 GbE detectors (mainly Pilatus and PCO Edge).
• Early 2016:
– Delivery of Eiger 16M for MX (initially at 6.75 GB/s, potential 13.5 GB/s).
[Plot: Peak Detector Performance (MB/s), log scale from 10 to 10000, against year, 2007 to 2015. Doubling time = 7.5 months.]
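The doubling-time arithmetic behind a growth plot like this can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the milestone values are the rounded figures from the history list, and pure exponential growth is assumed. The handful of milestones listed gives a longer doubling time than the 7.5 months quoted on the plot, which was presumably fitted to data points not reproduced in this transcript.

```python
import math

# Milestones taken from the history list: (year, peak detector rate in MB/s).
# The 2016 entry uses the initial 6.75 GB/s Eiger 16M figure (assumption:
# 1 GB/s = 1000 MB/s).
MILESTONES = [(2007, 10), (2009, 60), (2011, 150), (2013, 600), (2016, 6750)]


def doubling_time_months(year0, rate0, year1, rate1):
    """Months per doubling between two points, assuming exponential growth."""
    doublings = math.log2(rate1 / rate0)
    return 12 * (year1 - year0) / doublings


def projected_rate(rate0, months, doubling_months):
    """Rate after `months`, starting from rate0 with a fixed doubling time."""
    return rate0 * 2 ** (months / doubling_months)


if __name__ == "__main__":
    (y0, r0), (y1, r1) = MILESTONES[0], MILESTONES[-1]
    estimate = doubling_time_months(y0, r0, y1, r1)
    print(f"Doubling time from first and last milestone: {estimate:.1f} months")
    # Two-year projection from the 2016 rate using the slide's quoted 7.5 months.
    print(f"Projected rate after 24 months: {projected_rate(r1, 24, 7.5):.0f} MB/s")
```

Whichever doubling time is used, the point of the slide stands: peak detector rates grow by orders of magnitude within a single storage system's lifetime.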
Challenges
• Hardware life cycles are fast, and hardware problems can be solved with sufficient money.
– So detector data rates are not the problem.
• Software life cycles are slow: our analysis routines have a clear lineage, often dating back 40 years.
– Software is a problem.
• Synchrotrons have to support a diverse range of techniques.
– Systems and skills developed for one beamline are not appropriate for all beamlines.
– Need to be able to attract talented software scientists AND software engineers.
• Remote access to large-scale facilities such as synchrotrons, XFELs and national facilities (e.g. electron microscopy, HPC).
– Need for dedicated light paths between these facilities to deal with the data volumes generated.
Use case: Structural Biology
Numbers:
– Raw data, macromolecular crystallography
• Currently 0.5-1 TB/day/beamline @ Diamond (3-6 TB/day)
• 2016: detector technology will easily enable a 10x increase in data. Upgrades to beamlines will enable better exploitation (hardware can currently produce >100 TB/day if samples are available).
• Near future: expect to produce 5 PB of MX data per year, and this is at Diamond alone!
• Including SR MX beamlines across Europe, expect to reach or exceed 25 PB/year
• European XFEL: SFX beamline at full operations has the potential to generate 300 TB/day
– Raw data, electron microscopy
• Electron microscopy currently at 1 TB/day/microscope; high-resolution experiments to start in November '16, which will produce >5 TB/day/microscope
• High-resolution EM work from Jan 2016 will generate >10 TB/day of data
– Data reduction and analysis
• Requirement for light paths to be established between large-scale and national facilities: SR, cryo-EM, data centers etc.
• UK will have a dedicated light path from the European XFEL SFX beamline to Diamond; plans needed to extend ...
• Currently, large investment in software for data analysis is required to exploit developments in parallelized systems and new HPC storage
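To see why dedicated light paths come up, the daily volumes above can be converted into the sustained network bandwidth needed to move them off site. The Python sketch below is a back-of-envelope illustration: the function name and labels are hypothetical, decimal terabytes are assumed (the slides do not say TB or TiB), and transfers are assumed to be spread evenly over 24 hours with no protocol overhead.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_TB = 10**12  # assumption: decimal terabytes


def sustained_gbit_per_s(tb_per_day):
    """Average link rate in Gbit/s needed to move tb_per_day within one day."""
    bits_per_day = tb_per_day * BYTES_PER_TB * 8
    return bits_per_day / SECONDS_PER_DAY / 1e9


# Daily volumes quoted in the use case (labels are paraphrased).
VOLUMES_TB_PER_DAY = {
    "MX beamline today (upper figure)": 1,
    "EM microscope, high resolution": 5,
    "Current MX hardware with enough samples": 100,
    "European XFEL SFX at full operations": 300,
}

if __name__ == "__main__":
    for label, volume in VOLUMES_TB_PER_DAY.items():
        rate = sustained_gbit_per_s(volume)
        print(f"{label}: {volume} TB/day needs about {rate:.2f} Gbit/s sustained")
```

Under these assumptions, 300 TB/day corresponds to roughly 28 Gbit/s around the clock, which is more than a single 10 GbE link can carry even before any overhead.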
The future
• A lot of software will need to be redeveloped:
– Incorporate modern paradigms like map-reduce.
– May include middle-layer processing that runs close to distributed data chunks.
– Intermediate data will be cached between processing steps.
• Synchrotrons/structural biology infrastructure will become turnkey sites.
– Users may not come to site/facility.
– Results will be in the form of processed, not raw, data.
– There must be trust between the site and the user, backed up by data provenance and full metadata.
– High-speed light links between centers required.
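As a toy illustration of the map-reduce idea with cached intermediate results, the Python sketch below reduces each data chunk to a small partial summary (the map step) and then merges the partials (the reduce step). Everything here is hypothetical: the chunk keys, the statistics chosen and the in-memory dict cache stand in for whatever a real facility pipeline would use, and nothing in it reflects Diamond's actual software.

```python
from functools import reduce


def map_chunk(chunk):
    """Map step: reduce one chunk of pixel values to a small partial result."""
    return {"total": sum(chunk), "count": len(chunk), "peak": max(chunk)}


def combine(a, b):
    """Reduce step: merge two partial results (associative, so order is free)."""
    return {
        "total": a["total"] + b["total"],
        "count": a["count"] + b["count"],
        "peak": max(a["peak"], b["peak"]),
    }


def summarise(chunks, cache=None):
    """Summarise a dict of {chunk_key: values}, reusing cached partial results."""
    if not chunks:
        raise ValueError("no chunks to summarise")
    cache = {} if cache is None else cache
    partials = []
    for key, chunk in chunks.items():
        if key not in cache:  # only touch the raw data once per chunk
            cache[key] = map_chunk(chunk)
        partials.append(cache[key])
    merged = reduce(combine, partials)
    # Build a new dict so cached partial results are never modified.
    return {**merged, "mean": merged["total"] / merged["count"]}


if __name__ == "__main__":
    frames = {"frame_0": [1, 2, 3], "frame_1": [4, 5]}
    shared_cache = {}
    print(summarise(frames, shared_cache))
    # A second pass with one extra chunk only maps the new chunk.
    frames["frame_2"] = [6]
    print(summarise(frames, shared_cache))
```

Because the map step runs independently per chunk, it can be placed next to wherever each chunk is stored, and only the small partial results need to travel.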
Overview
- First Impressions
- Science highlights
- Technical developments
- Industrial engagement
- Plans for the future
- Finance
Thanks for your attention
Example data access rates
Example of 12GB/s…typically at 3-4GB/s
Tomography rates
MX/EM data storage 2015