1 Dr. Frederica Darema Senior Science and Technology Advisor NSF Research and Technology Advances in Systems Software for Emerging Computer Systems EDGE 2006 Workshop


Dr. Frederica Darema
Senior Science and Technology Advisor
NSF

Research and Technology Advances in Systems Software for Emerging Computer Systems

EDGE 2006 Workshop


Outline

• The BIG PICTURE
• Applications Directions
• Computing Platforms Directions
• Research and Technology Directions
• Examples of some advances
• Future Challenges and Opportunities


Science, Engineering, and “Commercial” Applications

Environments: how are they shaping in the future

What does it entail for: Multicore Processors

and… for Computing in the Larger-Scale


Applications Directions

Past
– Mostly monolithic
– Mostly one programming language
– Computation Intensive
– Batch
– Hours/days

Present / Future
– Multi-Modular
– Multi-Language
– Multi-Developers
– Multi-Source Data
– Computation Intensive
– Data Intensive
– Real Time
– Few Minutes/hours
– Visualization
– Interactive Steering
– Integrated Simulations & Experiments

Dynamic Data Driven Applications Systems


Platforms Directions

Past
– Vector Processors
– SIMD MPPs
– Distributed Memory MPs
– Shared Memory MPs

Present/Future
– Distributed Platforms, Heterogeneous Computers and Networks
• Heterogeneity
  – architecture (computer & network)
  – node power (supernodes, MCP)
• Latencies
  – variable (internode, intranode)
• Bandwidths
  – different for different links
  – different based on traffic
– Petaflops Platform (Grid-in-a-Box)

[Figure labels: Distributed Platform (MPP, NOW, SAR, tac-com, database, firecntl, alg accelerator, SP, ….)]


EXAMPLE OF EMERGING DIRECTIONS:

Dynamic Data Driven Application Systems (DDDAS)

New Direction for applications/simulations and measurement methodology

Multi-agency DDDAS program – NSF, NIH, NOAA, in cooperation with the EU/IST & e-Sciences Programs

(www.cise.nsf.gov/dddas)


What is DDDAS (Symbiotic Measurement & Simulation Systems)

OLD (serialized and static)
• Theory (First Principles)
• Simulations (Math. Modeling, Phenomenology)
• Experiment, Measurements, Field-Data
• User

NEW PARADIGM (Dynamic Data-Driven Simulation Systems)
• Theory (First Principles)
• Simulations (Math. Modeling, Phenomenology, Observation Modeling, Design)
• Experiment, Measurements, Field-Data (on-line/archival)
• User
• Dynamic Feedback & Control Loop

Challenges:
• Application Simulations Development
• Algorithms
• Measurement Instruments Interfaces
• Computing Systems Support
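The feedback and control loop of the new paradigm can be illustrated with a minimal sketch. Nothing in it comes from the DDDAS program itself: the `Sensor` class, the functions `step_model`, `assimilate`, and `choose_sampling_interval`, the scalar state, and the fixed blending gain are all assumptions made for this example. It only shows the two directions of the loop: measurements are injected into a running simulation, and the simulation in turn steers how often the measurement is taken.

```python
import math


class Sensor:
    """Hypothetical instrument that samples a hidden 'true' signal."""

    def __init__(self):
        self.reads = 0

    def read(self, t):
        self.reads += 1
        return math.sin(t)  # stand-in for field data


def step_model(state, dt):
    """Toy simulation step: the model (wrongly) assumes slow decay."""
    return state * (1.0 - 0.1 * dt)


def assimilate(state, measurement, gain=0.5):
    """Blend the simulated state with a measurement (fixed gain, assumed)."""
    return state + gain * (measurement - state)


def choose_sampling_interval(mismatch, fast=1, slow=4, threshold=0.2):
    """Control direction: sample more often when model and data disagree."""
    return fast if abs(mismatch) > threshold else slow


def run(steps=40, dt=0.25):
    sensor, state, history = Sensor(), 0.0, []
    next_sample = 0
    for k in range(steps):
        state = step_model(state, dt)
        if k >= next_sample:
            measured = sensor.read(k * dt)
            mismatch = measured - state
            state = assimilate(state, measured)             # data drives the simulation
            interval = choose_sampling_interval(mismatch)   # simulation steers measurement
            next_sample = k + interval
        history.append(state)
    return history, sensor.reads
```

In the serialized, static paradigm the sensor would be read on a fixed schedule decided before the run; here the schedule is an output of the computation.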


Beyond Grid Computing “Extended Grid”:
the Application Platform is the computational & measurement system

[Figure labels: Applications; Computational Platforms; Instruments; Sensors; Archival/Stored Data; Measurements; Computational Grids]


LEAD: Users INTERACTING with Weather

Interaction Level II: Tools and People Driving Observing Systems – Dynamic Adaptation

[Figure labels: Users; ADaM, ADAS, Tools; NWS National Static Observations & Grids; Experimental Dynamic Observations; Mesoscale Weather; Local Observations; Local Physical Resources; Remote Physical (Grid) Resources; Virtual/Digital Resources and Services]


Examples of Computational Req’s - examples from DDDAS applications -

* often results needed in Real-Time or near-RT *

• Water Pollution/Contaminant Transport/Detection:
  Today’s problem: 500 nodes; 4.4 Pflops; 1.2 GB mem; .02 GB/s -> Large/Projected problem: 10,000 nodes; 212 Pflops; 10.2 GB mem; .9 GB/sec
• Chemical Pollution/Contaminant Transport/Detection:
  Today’s problem: 2000 nodes (Lemieux); 4 TB mem; 5 hrs -> Large/Projected problem: 10K nodes; 20 TB mem; 1 hr
  results in Real-Time (or near RT): 50-100K nodes
• Protein Folding:
  Today’s problem: 1024 nodes (IBM BlueGene/L); 6/7 days for 1 protein (w 150 amino acids)
• Electric Power Grid:
  Today’s problem: 100 Gflops; 50 MB mem
• Aircraft modeling:
  Today’s problem: Full FEM & CFD: 384,000 cpu-hrs; 320 GB mem
  ROM: 72 secs; 78 KB
• Fire Propagation:
  Today’s problem: FireModel: 100 procs (BG, Teragrid clusters); 30 GB mem; 1 hr-5 hrs
  Coupled Weather/Fire: 100-1000 nodes; 200-400 GB mem
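To make the jump from "today's" to the "projected" problem concrete, the short sketch below computes growth factors for the water-pollution example. The numbers are copied from the slide as given; the dictionary keys and the helper `growth_factors` are names invented for this illustration.

```python
# Figures copied from the slide for the water-pollution example; units as given there.
today = {"nodes": 500, "pflops": 4.4, "mem_gb": 1.2, "bw_gb_s": 0.02}
projected = {"nodes": 10_000, "pflops": 212, "mem_gb": 10.2, "bw_gb_s": 0.9}


def growth_factors(small, large):
    """Return large/small for every key the two dictionaries share."""
    return {k: large[k] / small[k] for k in small if k in large}


factors = growth_factors(today, projected)
for name, factor in factors.items():
    print(f"{name}: x{factor:.1f}")
```

On these figures the node count grows 20 times while the quoted compute rate grows about 48 times and bandwidth 45 times, so the projected problem assumes more capable nodes and links, not only more of them.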


So…
• Processing at multiple levels
• Computation and data processing both: at the application and the instruments/sensors side
• Multicores in high-end platforms, workstations, visualization servers, data servers, etc.
• Multicores at application side
• Multicores at the data collection side

…. MULTICORES EVERYWHERE!!!


Some Challenges

• Programmability
  – models of concurrency, multiple heterogeneous models
• optimized performance
  – application
  – system
• scalability across multiple levels
  – application algorithms
  – systems software
• fault tolerance, recovery, reliability, security
• power management
• verification, validation
• …..

These challenges have been articulated for years on the past and present platforms. Multicores add to the complexity of all the above.


The need for a holistic approach

• Large-Scale Systems do not entail only “flops” (Giga-, Tera-, Peta-, Zetta-,…)

• Large-scale “parallel” systems are the POWERFUL nodes/platforms - in balance with other resources in the system

• Analogy: the “stars” and the “galaxy” within the “cosmos”

• Methods and Tools needed at all levels, and they need to work together synergistically


Systems Software
(NGS: 1998-2004)
(CSR/AES&SMA: 2004-to date)

Large-Scale Systems (e.g. Enabling DDDAS)

[Figure labels: Performance Engineering; Dynamic Compilers & Application Composition; Dynamic Data-Driven Application Systems (Symbiotic Measurement & Simulation Systems); Multidisciplinary Research; Applications Modeling & Measurements; CS Research]


System Modeling and Analysis (SMA)
(a component of the Computer Systems Research (CSR) Program)

Develop methods and tools for modeling, measuring, analyzing, evaluating, and predicting the performance and dependability of complex computing and communications systems, taking a “system level view”

Topics of Interest
• Hardware and Software modeling
  – methods, tools and measurements, providing multimodal, hierarchical or multilevel modeling and analysis capabilities of such systems;
  – methods that describe components of the system, but also the system as a whole, and enable assessment of the effects of individual hardware and software layers and components of these systems;
  – ability to describe the system in multiple levels of detail (characteristics and time-scales);
  – combine different methods of describing components and layers


System Modeling and Analysis (SMA)

Topics of Interest (cont’d)
• Novel modeling and measurement approaches
  – Develop capabilities to describe, analyze and predict the behavior of the components as well as the systems; analysis and prediction of the effects of changes in the application, system software, hardware; multilevel approaches and multi-modal approaches
• Performance Frameworks
  – combine tools in “plug-and-play” fashion
  – multiple views of the system
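One way to picture a "plug-and-play" performance framework is as a system-level model assembled from interchangeable component models. The sketch below is a hypothetical illustration only: the workload counters, the three analytic component models, the `SystemModel` class, and the no-overlap composition rule are assumptions of this example, not tools from the SMA program.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A workload is described by a few illustrative counters (assumed names).
Workload = Dict[str, float]
# A component model maps a workload to predicted seconds.
Model = Callable[[Workload], float]


def cpu_model(flops_per_s: float) -> Model:
    return lambda w: w["flops"] / flops_per_s


def memory_model(bytes_per_s: float) -> Model:
    return lambda w: w["bytes_moved"] / bytes_per_s


def network_model(latency_s: float, bytes_per_s: float) -> Model:
    return lambda w: w["messages"] * latency_s + w["bytes_sent"] / bytes_per_s


@dataclass
class SystemModel:
    """System-level view built from interchangeable component models."""
    components: Dict[str, Model]

    def breakdown(self, w: Workload) -> Dict[str, float]:
        """Per-component view of the same prediction."""
        return {name: model(w) for name, model in self.components.items()}

    def predict(self, w: Workload) -> float:
        # Simplest composition: components do not overlap in time (assumption).
        return sum(self.breakdown(w).values())

    def with_component(self, name: str, model: Model) -> "SystemModel":
        """Plug in a different model for one component, leaving the rest."""
        return SystemModel({**self.components, name: model})


workload = {"flops": 2e9, "bytes_moved": 4e9, "messages": 1000, "bytes_sent": 1e8}
base = SystemModel({
    "cpu": cpu_model(1e9),
    "memory": memory_model(2e9),
    "network": network_model(1e-4, 1e8),
})
faster_net = base.with_component("network", network_model(1e-5, 1e9))
```

Here `breakdown` gives a per-component view and `predict` a system-level view, while `with_component` asks a "what changes if the network changes" question without touching the other models.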


Multiple views of the system
The applications’ view

[Figure labels, by layer]
• Distributed Applications
• Collaboration Environments
• Services: Visualization; Scalable I/O; Data Management; Archiving/Retrieval Services; Authentication/Authorization; Dependability Services; Other Services . . .
• Distributed Systems Management
• Distributed, Heterogeneous, Dynamic, Adaptive Computing Platforms and Networks
• Device Technology; CPU Technology; Memory Technology . . .
• Models: Application Models; OS Scheduler Models; Architecture / Network Models; Memory Models; IO / File Models . . .
• Languages; Libraries; Tools; Compilers


Advanced Execution Systems (AES)
(a component of the Computer Systems Research (CSR) Program)

Seeks to create systems software to facilitate the development and runtime support of complex applications executing on large, heterogeneous high-end computing and grid platforms. AES emphasizes runtime compiling systems and application composition systems that interface with the underlying operating systems services and incorporate systems modeling and analysis methods and tools.

Topics of Interest
• Novel Compiler Technology that goes beyond the standard static notion of a compiler
  – for example by embedding a portion of the compiler in the runtime and endowing the system with resource awareness and adaptive mapping capabilities;
  – new compiler techniques for determining functional and data dependencies across multiple levels of memory hierarchy and across platforms;
  – mechanisms for matching an application’s resource needs to underlying resources when both are changing as the application executes
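The idea of matching changing application needs to changing resources can be sketched as a mapping that is recomputed when the platform changes. This is a toy illustration under stated assumptions: the greedy heuristic, the `tolerance` threshold, and the function names `map_tasks`, `makespan`, and `remap_if_better` are choices made for this example and do not describe any AES-funded system.

```python
def makespan(task_work, node_speed, placement):
    """Time until the most loaded node finishes, for a given placement."""
    load = {node: 0.0 for node in node_speed}
    for task, node in placement.items():
        load[node] += task_work[task] / node_speed[node]
    return max(load.values())


def map_tasks(task_work, node_speed):
    """Greedy mapping: take tasks largest first, put each on the node
    that would finish it earliest given what is already placed there.

    task_work: {task: work units}; node_speed: {node: work units per second}.
    """
    finish = {node: 0.0 for node in node_speed}
    placement = {}
    for task, work in sorted(task_work.items(), key=lambda kv: -kv[1]):
        best = min(node_speed, key=lambda n: finish[n] + work / node_speed[n])
        finish[best] += work / node_speed[best]
        placement[task] = best
    return placement, max(finish.values())


def remap_if_better(task_work, node_speed, placement, tolerance=0.1):
    """Re-evaluate the current placement under new speeds; remap only if
    the new mapping is at least `tolerance` (fraction) faster."""
    current = makespan(task_work, node_speed, placement)
    new_placement, new_span = map_tasks(task_work, node_speed)
    if new_span < (1.0 - tolerance) * current:
        return new_placement, new_span
    return placement, current


tasks = {"a": 8, "b": 4, "c": 4}
speeds = {"n0": 2.0, "n1": 2.0}
placement, span = map_tasks(tasks, speeds)

# The platform changes while the application runs: node n0 slows down.
degraded = {"n0": 0.5, "n1": 2.0}
stale_span = makespan(tasks, degraded, placement)
placement2, span2 = remap_if_better(tasks, degraded, placement)
```

A runtime-resident part of the compiler would play the role of `remap_if_better`, using measured rather than assumed node speeds; the tolerance stands in for the cost of moving work.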


Advanced Execution Systems (AES)

Topics of Interest
• Programming models and tools
  – expressing application partitioning across distributed, heterogeneous computing platforms; application-level checkpointing and recovery
• Application composition system (ACS) technology
  – constructing applications to fit the available resources and to adapt to changes in the underlying execution environment;
  – methods for automatically selecting application components;
  – creating knowledge bases for application components; interfacing with the underlying computing platform models to determine suitable application components;
  – and developing appropriate application component libraries and interfaces so the run-time portion of the RCS can link to such libraries.
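Automatic component selection from a knowledge base might look roughly like the sketch below. All of it is hypothetical: the `Variant` record, the entries in `KNOWLEDGE_BASE`, their resource figures and time estimates, and the rule "fastest variant that fits" are assumptions for illustration, not an actual ACS design.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Variant:
    """One implementation of an application component (hypothetical record)."""
    name: str
    needs_cores: int
    needs_mem_gb: float
    est_seconds: float  # in a real system this would come from a platform model


# Hypothetical knowledge base: component role -> candidate implementations.
KNOWLEDGE_BASE: Dict[str, List[Variant]] = {
    "solver": [
        Variant("direct_dense", 4, 32.0, 20.0),
        Variant("iterative_sparse", 16, 8.0, 12.0),
        Variant("iterative_serial", 1, 4.0, 90.0),
    ],
    "io": [
        Variant("parallel_io", 8, 2.0, 3.0),
        Variant("serial_io", 1, 1.0, 15.0),
    ],
}


def select(role: str, cores: int, mem_gb: float) -> Optional[Variant]:
    """Pick the fastest variant of a role that fits the available resources."""
    fitting = [v for v in KNOWLEDGE_BASE[role]
               if v.needs_cores <= cores and v.needs_mem_gb <= mem_gb]
    return min(fitting, key=lambda v: v.est_seconds, default=None)


def compose(roles: List[str], cores: int, mem_gb: float) -> Dict[str, Optional[Variant]]:
    """Select each role independently (ignores contention between roles)."""
    return {role: select(role, cores, mem_gb) for role in roles}


large = compose(["solver", "io"], cores=16, mem_gb=16.0)
small = compose(["solver", "io"], cores=2, mem_gb=4.0)
```

Calling `compose` again when the available cores or memory change gives a different application build from the same component library, which is the adaptation the slide describes.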


The AES component develops technology for integrated feedback & control: Runtime Compiling System (RCS) and Dynamic Application Composition

[Figure labels: Application Model; Application Program; Application Intermediate Representation; Compiler Front-End; Compiler Back-End; Performance Measurements & Models; Distributed Programming Model; Application Components & Frameworks; Dynamic Analysis Situation; Launch Application(s); Dynamically Link & Execute; Adaptable Computing Systems Infrastructure; Distributed Computing Resources; Distributed Platform (MPP, NOW, SAR, tac-com, database, firecntl, alg accelerator, SP, ….)]


Examples of areas funded
• Programming Models, Languages, Environments
  – “legacy models” (MPI), to high-level, domain, hierarchical multithreading, software component libraries, dynamic workflow, streaming environments (languages/compilers)
• Compiler methods and tools
  – program analysis methods
  – program transformation methods
  – program phase detection
  – dynamic detection
  – combine: static, dynamic, and feedback methods; continuous optimization methods
  – scheduling, scalability across hierarchies
  – checkpoint & recovery (system level, application level)
• Real-Time systems and integration (with server, high-end, etc… environments)
• Systems management including power-management
  – optimization & constraints (performance & power optimization)
• Validation, Verification, Testing
• System Modeling and Analysis
  – Modeling of applications, algorithms, platforms (at all levels)
  – performance, dependability (performability), reliability
  – multi-modal modeling, power modeling (at all levels: application, computational platforms, processor/multicores), performance specification (languages, compilers); performance frameworks
  – Fast real-time or near-real-time simulation methods

Have seen an increase in all these areas with respect to MULTICORES


Summary Thoughts
• MultiCore Processors provide an opportunity for: enhanced capabilities in computation, communication and data management
• Multicores present the promise of populating all levels of computational platforms and environments
• They should be viewed in the presence of other resources’ heterogeneity, dynamicity, adaptivity
• Multicores cannot exist in isolation – they will be “nodes” in other systems, high-end platforms, servers, real-time systems, instruments, and grids (Information Power Grid, TeraGrid)
• Complexity of applications and platforms presents a significant opportunity for innovative research and technology in systems software (methods & tools)
• Multicores will resurrect and build upon ideas/methods started in 80’s shared memory “parallel processing” and the recent advances for distributed systems
• Need to advance the technologies that will automate the mapping of such complex and dynamic applications on complex platforms with multiple and heterogeneous levels of processors, memory, and networks
• An important item: do we nurture a critical mass of people that will work on these challenges? (where are the compiler people to address/contribute to these challenges?!!!)

{ I personally hope that the opportunities of MultiCore Processors will attract the attention and the people needed }