Paul Watson, Newcastle University, UK (paul.watson@ncl.ac.uk): Cloud Computing for e-Science


Page 1: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Paul Watson
Newcastle University, UK

paul.watson@ncl.ac.uk

Cloud Computing for e-Science

Page 2: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Current Problems (slide by permission of Arjuna Technologies)

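[Figure: two plots of Resources against Time, each comparing Demand with Capacity. Dynamic Business Demand meets Static IT Supply; the gaps between the curves are labelled Over-provision and Under-provision, with New demand appearing and Extinct demand disappearing over time.]

Silos = Inflexibility

• Application Silos
• Capacity Planning
• Capital Expenditure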
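The gap between a fixed capacity and a changing demand can be put into numbers. The sketch below is only an illustration: the demand series, the capacity value and the function name provisioning_gap are invented for the example and do not come from the slide.

```python
def provisioning_gap(demand, capacity):
    """Return (over, under) for a fixed capacity.

    over  = total capacity left unused across all time steps
    under = total demand that could not be served
    """
    over = sum(max(capacity - d, 0) for d in demand)
    under = sum(max(d - capacity, 0) for d in demand)
    return over, under


# Hypothetical demand per time step, in arbitrary resource units.
demand = [2, 4, 9, 12, 6, 3]
static_capacity = 8  # assumed fixed IT supply

over, under = provisioning_gap(demand, static_capacity)
print(f"over-provisioned: {over} unit-steps, under-provisioned: {under} unit-steps")
```

With a static supply, both numbers are usually non-zero at once: capacity is wasted in quiet periods and still falls short at the peak.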

Page 3: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

• e.g. Amazon


Page 4: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Cloud Computing - Plan

• What is Cloud Computing?
• Why Cloud Computing?
• e-Science Central
• Cloud Issues

Page 5: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

What is Cloud Computing?

“.. a broad array of web-based services aimed at allowing users to obtain a wide range of functional capabilities on a ‘pay-as-you-go’ basis that previously required tremendous hardware/software investments and professional skills to acquire.”

Irving Wladawsky-Berger, Chairman Emeritus, IBM Academy of Technology

Page 6: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

What’s New?

• Illusion of infinite computing resources on demand

• No up-front commitment by users

• Pay for use of resources on a short-term basis as needed

(from “Above the Clouds: A Berkeley View of Cloud Computing”)

Page 7: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Example – Amazon Web Services

• Based on Xen VMs
– run any OS & software stack

• CPU: 1.0 GHz x86 instance @ $0.10/hour
• Blob Storage @ $0.12/GB-month
• External Data Transfer @ $0.10/GB

• Also queue, key store, block store, range of instances
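As a rough illustration of pay-as-you-go pricing, the prices on this slide can be combined into a simple bill. The three rates are the ones quoted above; the workload (10 instances, 48 hours, 500 GB stored, 200 GB transferred) and the function name monthly_cost are assumptions made up for the example.

```python
# Prices as quoted on the slide (US dollars).
CPU_PER_INSTANCE_HOUR = 0.10   # 1.0 GHz x86 instance
STORAGE_PER_GB_MONTH = 0.12    # blob storage
TRANSFER_PER_GB = 0.10         # external data transfer


def monthly_cost(instances, hours_each, stored_gb, transferred_gb):
    """Estimate one month's bill for a simple compute + storage workload."""
    compute = instances * hours_each * CPU_PER_INSTANCE_HOUR
    storage = stored_gb * STORAGE_PER_GB_MONTH
    transfer = transferred_gb * TRANSFER_PER_GB
    return compute + storage + transfer


# Hypothetical workload: 10 instances for 48 hours each,
# 500 GB kept for the month, 200 GB moved out of the cloud.
total = monthly_cost(instances=10, hours_each=48, stored_gb=500, transferred_gb=200)
print(f"estimated bill: ${total:.2f}")
```

Nothing is paid up front, and the compute part of the bill stops as soon as the instances are shut down.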

Page 8: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Cloud Services Continuum (based on Robert Anderson)

[Figure: a continuum of cloud services ordered by Complexity & Flexibility, spanning Software (SaaS), Platform (PaaS) and Infrastructure (IaaS). Services placed on it: Google Docs; Salesforce.com; Google AppEngine; Windows Azure .net services; Windows Azure (Sharepoint, SQL Services); e-Science Central; Amazon (Elastic Map Reduce, Simple DB, Simple Queue Service); Amazon EC2 & S3.]

http://et.cairene.net/2008/07/03/cloud-services-continuum/

Page 9: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

CARMEN Project – Science Cloud Example

[Figure: participating sites: Stirling, St. Andrews, Newcastle, York, Sheffield, Cambridge, Imperial, Plymouth, Warwick, Leicester, Manchester]

• UK EPSRC e-Science Pilot
• €4.5M (2006-10)
• 20 Investigators

Page 10: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Research Challenge

Understanding the brain is the greatest informatics challenge

• Enormous implications for science:

• Biology

• Medicine

• Computer Science

Page 11: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Epilepsy Exemplar

Data analysis guides surgeon during operation

Further analysis provides evidence

WARNING! The next 2 slides show an exposed human brain

Page 12: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.
Page 13: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.
Page 14: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

CARMEN e-Science Requirements

• Store
– very large quantities of data (100TB+)

• Analyse
– suite of neuroinformatics services
– support data intensive analysis

• Automate
– workflow

• Share
– under user-control

Page 15: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Background: North East Regional e-Science Centre

• 25 Research Projects across many domains:
– Bioinformatics, Ageing & Health, Neuroscience, Chemical Engineering, Transport, Geomatics, Video Archives, Artistic Performance Analysis, Computer Performance Analysis, ...

• Same key needs: e-Science Central

Page 16: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

e-Science Central

• Dynamic Resource Allocation
• Pay-as-you-Go*

• Web based
• Works anywhere

• Controlled Sharing
• Collaboration
• Communities

Page 17: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Science Cloud Architecture

[Figure: users access the Science Cloud over the Internet (typically via browser) to upload data & services and to run analyses; the cloud provides data storage and analysis.]

Page 18: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

[Figure: e-Science Central layered architecture. Apps sit on an App API; below it the Science Cloud Platform provides Workflow Enactment, Social Networking, Security, Processing, Storage and Analysis Services; the platform runs on Cloud Infrastructure.]

Page 19: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Editing and Running a Workflow in Browser

Page 20: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Viewing the output of a Workflow

Workflow

Result File

Page 21: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Viewing results

Page 22: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Blogs and links

Communicating Results

Linking to results & workflows

Page 23: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Data Provenance
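The workflow and provenance slides are screenshots, so the transcript carries no detail of how they work. As a minimal sketch of the general idea only, the code below chains analysis steps and records, for each step, what went in and what came out. Every name here (Workflow, ProvenanceRecord, remove_offset, count_spikes) is hypothetical and is not taken from e-Science Central.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProvenanceRecord:
    """What one workflow step consumed and produced."""
    step: str
    input_value: object
    output_value: object


@dataclass
class Workflow:
    """An ordered chain of analysis steps (hypothetical, for illustration)."""
    steps: List[Callable] = field(default_factory=list)

    def run(self, data):
        provenance = []
        for step in self.steps:
            result = step(data)
            provenance.append(ProvenanceRecord(step.__name__, data, result))
            data = result  # the output of one step feeds the next
        return data, provenance


def remove_offset(samples):
    """Subtract the mean so the signal is centred on zero."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


def count_spikes(samples, threshold=1.0):
    """Count samples above an assumed threshold."""
    return sum(1 for s in samples if s > threshold)


workflow = Workflow(steps=[remove_offset, count_spikes])
result, provenance = workflow.run([0.0, 0.0, 4.0, 0.0])

for record in provenance:
    print(record.step, record.input_value, "->", record.output_value)
```

Keeping the record next to the result is what lets a later reader trace a result file back to the workflow and data that produced it.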

Page 24: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Microsoft Azure Cloud for e-Science Demo

• Ongoing Experiments with Microsoft Azure Cloud
– running Chemical analyses

Thanks to:

- Paul Appleby & Team at the Microsoft Technology Centre, Reading

- & MS External Research e-Science Group

Page 25: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Microsoft Azure Cloud Demo

Page 26: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

When Clouds may not be appropriate

• Large data transfers

– time & cost

• “Traditional” High Performance Computing

– CPU, I/O, network bandwidth, low latency

• Confidentiality

• High Availability
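The "time & cost" of a large data transfer can be estimated with simple arithmetic. The sketch below takes the 100TB figure from the CARMEN requirements and the $0.10/GB external transfer price from the Amazon example; the sustained link speed of 100 Mbit/s and the function name transfer_estimate are assumptions for illustration.

```python
def transfer_estimate(data_gb, link_mbit_per_s, price_per_gb):
    """Return (days, cost) for moving data_gb over a link at a sustained rate."""
    bits = data_gb * 8e9                      # decimal gigabytes to bits
    seconds = bits / (link_mbit_per_s * 1e6)  # Mbit/s to bit/s
    days = seconds / 86400
    cost = data_gb * price_per_gb
    return days, cost


# 100 TB = 100,000 GB; the 100 Mbit/s sustained rate is an assumption.
days, cost = transfer_estimate(data_gb=100_000, link_mbit_per_s=100, price_per_gb=0.10)
print(f"about {days:.0f} days and ${cost:,.0f} in transfer charges")
```

Under these assumptions the transfer takes roughly three months, which is why data of this size is often better analysed where it is stored.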

Page 27: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Private Clouds

[Figure: a Private Cloud (Arjuna Agility) with Dept A running App1 and Dept B running App1 & 2, linked by a Service Agreement; a Public Cloud is shown alongside.]

Page 28: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

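Federating Private & Public Clouds

[Figure: the Private Cloud (Arjuna Agility) with Dept A running App1 and Dept B running App1 & 2, linked by a Service Agreement. A second Service Agreement connects the Private Cloud to a Public Cloud (e.g. Amazon), which also runs App1.]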

Page 29: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

[Figure: the Private Cloud (Arjuna Agility) with Dept A running App1 and Dept B running App1 & 2, federated with two Public Clouds (e.g. Amazon and e.g. FlexiScale), each also running App1.]
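One simple way to picture federation is a placement rule that fills the private cloud first and sends the overflow to public clouds. The sketch below is a generic illustration of that idea, not a description of how Arjuna Agility works; the cloud names, capacities and the function name place are all hypothetical.

```python
def place(instances_needed, clouds):
    """Place instances on clouds in preference order.

    clouds is a list of (name, free_capacity) pairs, most preferred first.
    Returns a dict mapping cloud name to the number of instances placed there.
    """
    placement = {}
    remaining = instances_needed
    for name, free in clouds:
        take = min(remaining, free)
        if take:
            placement[name] = take
            remaining -= take
    if remaining:
        raise RuntimeError(f"{remaining} instances could not be placed")
    return placement


# Hypothetical capacities: a small private cloud and two public providers.
clouds = [("private", 8), ("public-a", 50), ("public-b", 50)]

print(place(5, clouds))   # fits entirely in the private cloud
print(place(20, clouds))  # overflow goes to the first public cloud
```

A real federation would also have to respect the service agreements between the parties, which this sketch leaves out.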

Page 30: Paul Watson Newcastle University, UK paul.watson@ncl.ac.uk Cloud Computing for e-Science.

Summary

• Cloud computing creates new opportunities
– capital expenditure → operational expenditure
– handling dynamic changes in demand
– but not appropriate in all cases
– federation the future?

• Clouds can revolutionise e-science
– reduce time from idea to realisation
– exploring with e-Science Central (demo available)

• We shouldn’t underestimate complexity
– building scalable distributed systems is still hard
– cloud platforms important in reducing the complexity