OGFCloudBOFFeb27-08



© 2007 Open Grid Forum

Cloud Computing BOF

OGF22 Birds of a Feather Session

Hyatt Regency Cambridge

February 27 2008

Geoffrey Fox

Indiana University



Cloud Agenda

• Geoffrey Fox (Indiana U.) Remarks on Cloud Computing

• Martin Swany (Internet2) Clouds and Dynamic Networking

• Steven Newhouse (Microsoft) Personal View on Clouds

• Kate Keahey (Argonne, Chicago) First Steps in the Clouds

• Next Steps


What are Clouds?

• Clouds are “Virtual Clusters” (“Virtual Grids”) of possibly “Virtual Machines”
  • They may cross administrative domains or may “just be a single cluster”; the user cannot and does not want to know

• Clouds support access (lease of) computer instances
  • Instances accept data and job descriptions (code) and return results that are data and status flags

• Each Cloud is a “Narrow” (perhaps internally proprietary) Grid

• When does Cloud concept work
  • Parameter searches, LHC style data analysis ..
  • Common case (most likely success case for clouds) versus corner case?

• Clouds can be built from Grids

• Grids can be built from Clouds
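
The lease model in the bullets above (instances accept data and a job description, and return result data plus a status flag) can be sketched roughly as follows. This is only an illustration: the names `Cloud`, `Instance`, `JobResult`, `lease`, `release` and `run`, and the status values "ok" and "failed", are hypothetical and do not come from the slides or from any real cloud API. The job runs in-process, whereas a real cloud would dispatch it to a machine the user never sees.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class JobResult:
    """What an instance hands back: result data plus a status flag."""
    data: bytes
    status: str  # hypothetical flag values: "ok" or "failed"


class Instance:
    """A leased compute instance (hypothetical; runs the job in-process)."""

    def run(self, data: bytes, job: Callable[[bytes], bytes]) -> JobResult:
        # Accept input data and a job description (here simply a callable).
        try:
            return JobResult(data=job(data), status="ok")
        except Exception:
            # Any failure is reported through the status flag, not raised.
            return JobResult(data=b"", status="failed")


class Cloud:
    """Hides where instances live; the caller only sees lease and release."""

    def __init__(self, capacity: int) -> None:
        self._free = capacity

    def lease(self) -> Instance:
        if self._free == 0:
            raise RuntimeError("no instances available")
        self._free -= 1
        return Instance()

    def release(self, instance: Instance) -> None:
        self._free += 1


cloud = Cloud(capacity=1)
instance = cloud.lease()
result = instance.run(b"abc", bytes.upper)
cloud.release(instance)
```

The caller never learns whether the instance sits in one cluster or spans administrative domains, which is the point the first bullet makes.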


Cloud References

• http://en.wikipedia.org/wiki/Cloud_computing
  • Includes references to Amazon, Apple, Dell, Enomalism, Globus, Google, IBM, KnowledgeTreeLive, Nature, New York Times, Zimdesk
  • Others like Microsoft Windows Live Skydrive important

• http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud

• http://uc.princeton.edu/main/index.php?option=com_content&ta Policy Issues

• http://www.cra.org/ccc/home.article.bigdata.html
  • Hadoop (MapReduce) and “Data Intensive Computing”
  • See Data intensive computing minitrack at HICSS-42 January 2009

• http://ianfoster.typepad.com/blog/2008/01/theres-grid-in.html

• OGF Thought Leadership blog

• OGF22 talks by Charlie Catlett and Irving Wladawsky-Berger


Big-Data Computing Study Group

CCC Role Versus OGF?

Hadoop and MapReduce are “just” workflow?


Google MapReduce

Simplified Data Processing on Clusters/Clouds

• http://labs.google.com/papers/mapreduce.html

• This is a dataflow model between services where services can do useful document oriented data parallel applications including reductions

• The decomposition of services onto cluster engines (clouds) is automated

• The large I/O requirements of datasets change efficiency analysis in favor of dataflow

• Services (count words in example) can obviously be extended to general parallel applications

• There are many alternatives to language expressing either dataflow and/or parallel operations and/or workflow
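
The word-count example mentioned above can be sketched in a few lines to show the map, shuffle and reduce stages of the dataflow. This is a single-process illustration only, not Google's or Hadoop's implementation; the function names `map_words`, `shuffle`, `reduce_counts` and `word_count` are hypothetical. In a real deployment the framework would distribute the map and reduce calls across cluster nodes automatically.

```python
from collections import defaultdict


def map_words(document):
    """Map stage: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle stage: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce stage: collapse the values for one key into a total."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_words(doc))
    return dict(reduce_counts(w, c) for w, c in shuffle(pairs).items())


counts = word_count(["the cloud", "the grid the cloud"])
```

Because each `map_words` call touches only one document and each `reduce_counts` call only one key, both stages are data parallel, which is what lets the decomposition onto cluster engines be automated.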


Technical Questions about Clouds I

• What is performance overhead?
  • On individual CPU
  • On system including data and program transfer

• What is cost gain
  • From size efficiency; “green” location (rumor that Google has purchased the Niagara Falls including Canada!)

• Is Cloud Security adequate: can clouds be trusted?

• Can one do parallel computing on clouds?
  • Looking at “capacity” not “capability” i.e. lots of modest sized jobs
  • Marine corps will use Petaflop machines – they just need ssh and a.out


Technical Questions about Clouds II

• How is data compute affinity tackled in clouds?
  • Co-locate data and compute clouds?
  • Lots of optical fiber i.e. “just” move the data?

• What happens in clouds when demand for resources exceeds capacity – is there a multi-day job input queue?

• Are there novel cloud scheduling issues?

• Do we want to link clouds (or ensembles as atomic clouds); if so how and with what protocols

• Is there an intranet cloud e.g. “cloud in a box” software to manage personal (cores on my future 128 core laptop) department or enterprise cloud?
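
The “just move the data” question above comes down to simple arithmetic, which the sketch below makes concrete. All numbers (a 1 TB dataset, a 10 Gb/s link, a one-hour local queue wait) are illustrative assumptions, not figures from the talk. The helper names `transfer_seconds` and `should_move_data` are hypothetical, and the model ignores latency, protocol overhead and link sharing.

```python
def transfer_seconds(dataset_bytes: int, link_bits_per_second: int) -> float:
    """Time to ship a dataset over a link, ignoring latency and overhead."""
    return dataset_bytes * 8 / link_bits_per_second


def should_move_data(dataset_bytes: int,
                     link_bits_per_second: int,
                     local_wait_seconds: float) -> bool:
    """True if shipping the data elsewhere beats waiting for local compute."""
    return transfer_seconds(dataset_bytes, link_bits_per_second) < local_wait_seconds


one_terabyte = 10**12        # bytes (assumed dataset size)
ten_gigabit = 10 * 10**9     # bits per second (assumed link speed)
seconds = transfer_seconds(one_terabyte, ten_gigabit)
move = should_move_data(one_terabyte, ten_gigabit, local_wait_seconds=3600)
```

Under these assumptions the transfer takes 800 seconds, so moving the data wins against a one-hour queue; on a 100 Mb/s link the same dataset would take 80,000 seconds and co-location would be the better choice.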


Standards for Compute and Storage Clouds

• We no longer need interoperability of services and messages (SOAP) but rather interoperability of clouds
  • Maybe each cloud so big that interoperability between clouds not so critical

• Interoperability certainly for application specific data and perhaps also for job specifications
  • WFS, GML for Geo-data; IVOA standards; DST LHC experiment formats
  • JSDL, BES etc.

• Each Cloud will be proprietary but they might want raw infrastructure standards so they can easily swap in and out different vendors’ disk drives

• Clouds very very loosely coupled; services loosely coupled


MSI Challenge Problem

• There are > 330 MSI’s – Minority Serving Institutions

• 2 examples
  • ECSU is a small state university in North Carolina
    • HBCU with 4000 students
    • Working on PolarGrid (Sensors in Arctic/Antarctic linked to “TeraGrid”)
  • Navajo Tech in Crownpoint NM is a community college with technology leadership for Navajo Nation
    • “Internet to the Hogan and Dine Grid” links Navajo communities by wireless
    • Wish to integrate TeraGrid science into Navajo Nation education curriculum

• Current Grid technology too complicated if you are not an R1 institution
  • Hard to deploy campus grids broadly into MSI’s

• Clouds provide virtual campus resources?


Next Steps at OGF

• Clouds are just starting and build on/are related to Grids

• Clear need for best practice in use and technology

• Likely to be need for new standards and novel use of existing/projected standards

• New Cloud Community Group?
  • Chairs, participants?

• Workshop?

• OGF23 activity?

• Identify key players not currently involved with OGF?