NOAO Brown Bag, Tucson, AZ
March 11, 2008
Jeff Kantor, LSST Corporation
Requirements Flowdown with LSST SysML and UML Models
Presentation Outline
• LSST Data Management introduction
• Requirements flow-down
• Enterprise Architect SysML/UML demonstration
Data Management is a distributed system that leverages world-class facilities and cyber-infrastructure
• Long-Haul Communications: Chile - U.S. & w/in U.S.; 2.5 Gbps avg, 10 Gbps peak
• Archive Center: NCSA, Champaign, IL; 100 to 250 TFLOPS, 75 PB
• Data Access Centers: U.S. (2) and Chile (1); 45 TFLOPS, 87 PB
• Mountain Summit/Base Facility: Cerro Pachon, La Serena, Chile; 25 TFLOPS, 150 TB
1 TFLOPS = 10^12 floating point operations/second
1 PB = 2^50 bytes or ~10^15 bytes
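To put the unit definitions and link rates above in concrete terms, here is a minimal Python sketch that converts a sustained link rate into a daily data volume. It assumes decimal gigabits and a link that runs at the average rate for a full 24 hours; the function name `bytes_per_day` is illustrative and not from the talk.

```python
# Unit definitions as given on the slide.
PB_BINARY = 2**50     # bytes
PB_DECIMAL = 10**15   # bytes (the slide's approximation)

SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(gbps: float) -> float:
    """Bytes moved in one day at a sustained rate in gigabits/second.

    Assumes decimal gigabits (10^9 bits) and continuous transfer.
    """
    bits_per_second = gbps * 10**9
    return bits_per_second / 8 * SECONDS_PER_DAY


avg_daily = bytes_per_day(2.5)  # the 2.5 Gbps average long-haul rate
print(f"{avg_daily / 10**12:.1f} TB/day at 2.5 Gbps sustained")

# How far apart the two PB definitions are.
pb_gap_percent = (PB_BINARY / PB_DECIMAL - 1) * 100
print(f"2^50 bytes is {pb_gap_percent:.1f}% larger than 10^15 bytes")
```

The gap between the two PB definitions is worth keeping in mind when comparing the storage figures quoted for different sites.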
LSST Data Management provides a unique national resource for research & education
• Astronomy and astrophysics
  – Scale and depth of LSST database is unprecedented in astronomy
    • Provides calibrated databases for frontier science
    • Breaks new ground with combination of depth, width, epochs/field
    • Enables science that cannot be anticipated today
• Cyber-infrastructure and computer science
  – Requires multi-disciplinary approach to solving challenges
    • Massively parallel image data processing
    • Peta-scale data ingest and data access
    • Efficient scientific and quality analysis of peta-scale data
DM system complexity exists but overall is tractable
• Complexities we have to deal with in DM
  – Very high data volumes (transfer, ingest, and especially query)
  – Advances in scale of algorithms for photometry, astrometry, PSF estimation, moving object detection, shape measurement of faint galaxies
  – Provenance recording and reprocessing
  – Evolution of algorithms and technology
• Complexities we DON’T have to deal with in DM
  – Tens of thousands of simultaneous users (e.g. online stores)
  – Fusion of remote sensing data from many sources (e.g. earthquake prediction systems)
  – Millisecond or faster time constraints (e.g. flight control systems)
  – Very deeply nested multi-level transactions (e.g. banking OLTP systems)
  – Severe operating environment-driven hardware limitations (e.g. space-borne instruments)
  – Processing that is highly coupled across entire data set with large amount of inter-process communication (e.g. geophysics 3D Kirchhoff migration)
Performance - Nightly processing timeline for a visit meets alert latency requirement
[Figure: timeline, in seconds, for the two exposures of a visit. Milestones for each exposure: exposure begins, shutter close, readout complete, transfer to Base complete, image processing/detection complete. Exposure 2 continues with association complete and alert generate complete. T0 marks the start of the 60 second latency timer; the last milestone falls at T0 + 51s. Duration labels: 15s per exposure; 2s, 6s, 20s for Exposure 1; 2s, 3s, 6s, 20s, 10s, 10s for Exposure 2.]
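The latency claim on this slide is a simple budget check, sketched below in Python. It assumes that the six duration labels which sum to the slide's "T0 + 51s" mark are the consecutive steps that follow T0; the slide text does not say which label belongs to which step, so the list is left unnamed.

```python
from itertools import accumulate

LATENCY_LIMIT_S = 60  # "60 second latency timer" from the slide

# Duration labels (seconds) assumed to follow T0, in the order they are
# added up here. Mapping each value to a named step is NOT given in the
# source, so no step names are attached.
POST_T0_DURATIONS_S = [2, 3, 6, 20, 10, 10]


def latency_margin(durations, limit):
    """Return (total elapsed seconds, seconds of margin against the limit)."""
    total = sum(durations)
    return total, limit - total


total_s, margin_s = latency_margin(POST_T0_DURATIONS_S, LATENCY_LIMIT_S)
checkpoints_s = list(accumulate(POST_T0_DURATIONS_S))

print(f"elapsed after T0: {total_s}s, margin: {margin_s}s")
print(f"cumulative checkpoints: {checkpoints_s}")
```

Under these assumptions the budget leaves 9 seconds of slack against the 60 second requirement.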
Computing needs show moderate growth
[Figure: computing needs for the Base, Archive Center, and Data Access Center, with a trend line.]
Database Volumes
• Detailed spreadsheet-based analysis done
• Expecting:
  – 6 petabytes of data, 14 petabytes data+indexes
  – all tables: ~16 trillion rows (16 x 10^12)
  – largest table: 3 trillion rows (3 x 10^12)
[Figure: Database Size (data+indexes, cumulative), in PB (axis 0 to 16), by year, 2014 to 2024.]
[Figure: Database Size (data only, per table, cumulative), in PB (axis 0 to 6), by year, 2014 to 2024. Tables shown: DIASource-stars, DIASource-galaxies, Source-stars, Source-galaxies, VarObj-stars, VarObj-galaxies, Objects-stars, Objects-galaxies.]
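The headline figures on this slide imply an average row size and an index overhead, which the short Python sketch below works out. It assumes 1 PB = 2^50 bytes, as defined earlier in the talk, and treats the "~16 trillion" row count as exact; the variable names are illustrative.

```python
PB = 2**50  # bytes, per the definition used earlier in the talk

data_bytes = 6 * PB              # "6 petabytes of data"
data_plus_index_bytes = 14 * PB  # "14 petabytes data+indexes"
total_rows = 16 * 10**12         # "~16 trillion rows", treated as exact
largest_table_rows = 3 * 10**12  # "largest table: 3 trillion rows"

# Average payload per row across all tables.
avg_row_bytes = data_bytes / total_rows

# Index storage expressed as a fraction of the data itself.
index_overhead = (data_plus_index_bytes - data_bytes) / data_bytes

# Fraction of all rows held by the largest table.
largest_table_row_share = largest_table_rows / total_rows

print(f"average row size: {avg_row_bytes:.0f} bytes")
print(f"index overhead:   {index_overhead:.0%} of data size")
print(f"largest table:    {largest_table_row_share:.1%} of all rows")
```

These are averages only; the per-table breakdown in the second chart would be needed to size any individual table.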
Long-haul communications are feasible
[Figure labels: Cerro Pachon, La Serena]
• Over 2 terabytes/second dark fiber capacity available
• Only new fiber is Cerro Pachon to La Serena (~100 km)
• 2.4 gigabits/second needed from La Serena to Champaign, IL
• Quotes from carriers include 10 gigabit/second burst for failure recovery
• Specified availability is 98%
• Clear channel, protected circuits
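The availability and burst figures above can be related with a little arithmetic, sketched here in Python. The 24-hour outage is a hypothetical input, and the model assumes new data keeps arriving at the needed rate while the backlog is cleared; neither assumption comes from the talk.

```python
HOURS_PER_YEAR = 365 * 24


def max_downtime_hours(availability: float,
                       period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime allowed in a period at the given availability fraction."""
    return (1.0 - availability) * period_hours


def catch_up_hours(outage_hours: float, needed_gbps: float,
                   burst_gbps: float) -> float:
    """Hours at burst rate needed to clear the backlog from an outage.

    Assumes data keeps arriving at needed_gbps during recovery, so only
    the spare capacity (burst - needed) works down the backlog.
    """
    spare_gbps = burst_gbps - needed_gbps
    if spare_gbps <= 0:
        raise ValueError("burst rate must exceed the needed rate")
    backlog = outage_hours * needed_gbps  # in Gbps-hours
    return backlog / spare_gbps


downtime = max_downtime_hours(0.98)                  # 98% availability
recovery = catch_up_hours(24.0, needed_gbps=2.4, burst_gbps=10.0)

print(f"allowed downtime: {downtime:.1f} hours/year")
print(f"recovery after a hypothetical 24 h outage: {recovery:.1f} hours")
```

The spare capacity between the needed and burst rates is what determines how quickly the system recovers, which is why the burst figure matters for failure recovery.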
Complete, traceable flow-down from science to system, to data management subsystem
[Diagram labels: Science Requirements Document; Telescope, Camera, Survey Reference Designs; Data Management Requirements; Data Management Design; DMS Sizing Models: Processing, Storage, Communications. Links between boxes are labeled Allocation and Traceability.]
Science and system requirements flow-down using SysML
[Diagram labels: Science Requirements Document; System Requirements, Telescope, Camera, Survey Reference Designs; Data Management Requirements; Data Management Design. Links between boxes are labeled Allocation and Traceability.]
Key Specified and Derived Requirements
• The mission, instrument design, observing cadence and observatory operational requirements drive the DM requirements
Data Products
• Images
• Catalogs
• Alerts
• Quality and Performance Statistics
Algorithms/Pipelines
• Astrometric/Photometric Calibration
• Source Detection
• Source - Object Association
• Moving Object Detection/Orbit Matching
• Alert Processing
• Deep Detection
• Calibration
• Classification
Architectural
• Scalability
• Reliability/Availability
• Evolution
System Modeling Language (SysML)
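Traceability in a flow-down of this kind amounts to checking that every lower-level requirement can be followed back to a science requirement. The Python sketch below illustrates that check on a toy model. The `Requirement` class, the requirement IDs, and the requirement texts are all hypothetical; they stand in for what a SysML tool stores and are not taken from the LSST model.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    level: str  # "science", "system", or "dm"
    text: str
    derived_from: list = field(default_factory=list)  # parent requirement ids


def trace_to_science(req_id, reqs):
    """Return ids of science-level requirements reachable from req_id."""
    found, seen, stack = set(), set(), [req_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        req = reqs[current]
        if req.level == "science":
            found.add(current)
        stack.extend(req.derived_from)
    return found


def untraced(reqs):
    """Ids of non-science requirements with no path to a science requirement."""
    return sorted(r.req_id for r in reqs.values()
                  if r.level != "science"
                  and not trace_to_science(r.req_id, reqs))


# Hypothetical example data, for illustration only.
model = {r.req_id: r for r in [
    Requirement("SRD-1", "science", "Timely alerts on transient sources"),
    Requirement("SYS-1", "system", "Alert latency limit", ["SRD-1"]),
    Requirement("DM-1", "dm", "Nightly pipeline meets latency", ["SYS-1"]),
    Requirement("DM-2", "dm", "Orphan requirement with no parent"),
]}

print(trace_to_science("DM-1", model))  # the science requirement behind DM-1
print(untraced(model))                  # requirements that fail the check
```

A modeling tool performs this kind of check on its own relationship data; the point of the sketch is only that allocation and traceability links form a graph that can be validated mechanically.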
Performance requirements analyzed & feasible
DMS Sizing Models
Processing (Computational Requirements)
• Sustained & peak processing analyzed
• Tradeoffs considered:
  – Store vs. recompute
  – Types of parallelism
  – Reliability vs cost
Storage (Storage and Input/Output Requirements)
• Sustained & peak I/O rates and storage needs analyzed
• Tradeoffs considered:
  – Store vs. recompute
  – DBMS vs File System
  – Multi-dimensional access
  – Reliability vs cost
Communications (Data Transfer, Replication, and Access Requirements)
• Sustained & peak bandwidth analyzed
• Tradeoffs considered:
  – Transfer and process vs process and transfer
  – Media transfer vs network
  – Reliability vs cost
[Diagram labels: Data Management Requirements; Data Management Design]
Systems Engineering Model for Requirements Flowdown, Traceability & Configuration Control
SysML = System Modeling Language
SE Model
Operational Model
Structural/Component Model
Requirements Model
Other Engineering Analysis Models
Performance & Constraints Model
FEMAP, NX Nastran
Requirements Flow
[Diagram labels: SRD; Telescope Site Req.; Camera Req.; Data Management Req.; Functional Req. (FPRD); Operational Req. (OCDD); Interface Requirements; Outside Constraints; System Requirements; SysML Model; DM Subsystems Req.; UML Model; T&S Subsystems Req.; SysML Model; Camera Subsystems Req.; LSST Board & Science Council; Project Office Change Control Board (shown twice); Subsystem Group]
Requirements Hierarchy
Rigorous process for software engineering based on wide industry experience (Iconix)
Algorithm/Pipeline Data Product Prototypes
Image courtesy of Iconix Software Engineering, Inc. Unauthorized use not permitted.
Unified Modeling Language (UML)
Demo of Enterprise Architect Tool for SysML and UML
System Engineering (SysML model is in “DM SysML.pdf”)
• Science Requirements
• DM Functional/Performance Requirements
• Use Cases
Software Engineering (UML model is in “DM UML.pdf”)
• Use Cases/Robustness Diagrams
• Class Diagrams/Sequence Diagrams
• Code