INDIGO Computing IaaS PaaS · INDIGO approach • Based on Open Source solutions • widely...

An integrated IaaS and PaaS architecture for scientific computing Giacinto Donvito – INFN-Bari On behalf of INDIGO Collaboration RIA-653549


INDIGO: The consortium

• 26 partners from 11 countries
• The hexagon contains the partners involved in development activities
• 11 M€ of total funding by the EC
• 4 big private companies

Integrating distributed data infrastructures with INDIGO-DataCloud

From the EC Evaluation Summary Report: “The consortium is of exceptional quality, and complementary, and with good relevant experience and skills”

INDIGO topic of interest

• Since the very beginning we identified key issues with both Grid and Cloud technologies that prevented scientific communities from exploiting data and compute resources easily and optimally.
• We therefore decided to propose the development of a software platform centered around two of the EINFRA-1-2014 pillars:
  • Large-scale virtualization of data/compute center resources. This became the focus of INDIGO WP4.
  • Development and adoption of a standards-based computing platform (with an open software stack). This became the focus of INDIGO WP5.

How do we see distributed computing in the future

1. Ease of access and use for small and big collaborations alike.
2. Software and economic sustainability.
3. Robustness (no single points of failure).
4. Modular, scalable architecture.
5. Open source software, vendor independence, hybrid infrastructures.

User communities

INDIGO WPs

• WP1 – Administrative and Technical Management. Leader & Partners: INFN, CSIC
• WP2 – Definition of Support to Research Communities. Leader & Partners: LifeWatch, EGI.eu
• WP3 – Software Management and Pilot services. Leader & Partner: LIP, CEA
• WP4 – Resource Virtualization. Leader & Partner: DESY, KIT
• WP5 – PaaS Platform. Leader & Partner: INFN, Cyfronet
• WP6 – Portal, Workflows and User Interfaces. Leader & Partners: PSNC, INFN

[Figure labels: Final product/Deployed App; Feedback/Revision]

INDIGO approach

• Based on Open Source solutions
  • widely supported by big communities
• Whenever possible, exploit general solutions instead of specific tools/services
  • or put effort into increasing the generality of tools developed in a given community
  • this will be important for the sustainability of the architecture
• Ensure that the framework offered to final users, as well as to developers, will have a low learning curve
  • existing software suites like ROOT, OCTAVE/MATLAB, MATHEMATICA or R-STUDIO will be supported and offered in a transparent way

Scientific Computational Portal “as a Service”

[Figure labels: End user; App dev; TOSCA Spec; AAI; LBaaS; VPNaaS; VMI repo; Container repo; Scheduling; Site A and Site B, each with Use Case Portal / Access services, Storage and a TOSCA end-point; LRMS front-end; LRMS WNs; Site-level Scheduling]

Communities: ELIXIR, Haddock, CIRMMP, FedCloud, DARIAH, INAF-LBT, CMCC–ENES, CTA, ALGAE-BLOSSOM, INGV-MOIST

Software as a Service Use Case

[Figure labels: End user; App dev; Interactive console; External application; TOSCA Spec; AAI; VMI repo; Container repo; Scheduling; Site A and Site B, each with Data Analytics Services, Data, WNs and a TOSCA end-point]

Communities: CIRMMP, INAF-LBT, CMCC–ENES, ALGAE-BLOSSOM, INGV-MOIST

Example: R-Studio

• R-Studio Server provides a browser-based interface (RStudio IDE) to a version of R running on a remote Linux server.
• How to deploy it using INDIGO?

Requirements
• The user selects the template for R-Studio and provides further requirements (optional)
  • CPU, RAM, etc.

[Figure labels: INDIGO PaaS (Authorization; Geographical Scheduling (Brokering); Provisioning & Configuration); TOSCA; Partner’s CMP; VM/Container; R-Studio Server; Port: 8787; Web Access; Public IPs & Firewall]
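To make the "select a template and add optional requirements" step more concrete, here is a minimal sketch that builds a TOSCA-style description of one R-Studio Server node as a Python dictionary. It is an illustration only: the layout is loosely modeled on the TOSCA Simple Profile, the node type `example.nodes.RStudioServer` and the function name `rstudio_template` are hypothetical, and the actual INDIGO template is not shown in the slides. Only the port (8787) and the optional CPU/RAM requirements come from the slide.

```python
import json


def rstudio_template(num_cpus=1, mem_size="2 GB", port=8787):
    """Return a TOSCA-style topology (as a dict) for one R-Studio Server.

    The node type name below is hypothetical; only the port and the
    CPU/RAM requirements are taken from the slide.
    """
    return {
        "tosca_definitions_version": "tosca_simple_yaml_1_0",
        "topology_template": {
            "node_templates": {
                "rstudio_server": {
                    "type": "example.nodes.RStudioServer",  # hypothetical type
                    "properties": {"port": port},
                    "requirements": [{"host": "rstudio_host"}],
                },
                "rstudio_host": {
                    "type": "tosca.nodes.Compute",
                    "capabilities": {
                        "host": {
                            "properties": {
                                "num_cpus": num_cpus,
                                "mem_size": mem_size,
                            }
                        }
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # The user overrides the optional requirements (CPU, RAM).
    print(json.dumps(rstudio_template(num_cpus=4, mem_size="8 GB"), indent=2))
```

Keeping the requirements as function arguments mirrors the slide: the template is fixed, and the user only supplies the optional sizing.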

INDIGO overall architecture

Color codes:
• Yellow: implementation based on an already available solution, to be improved/changed
• Orange: completely new services to be implemented

Interfaces among WPs

[Figure labels: WP4; WP5; WP6; TOSCA; Cloud Services; Applications submission; Data Management/Access; Cloud Services/application execution; Local Resource; Heat/IM; Storage access; REST/WebDav/Posix]

This will be described in this talk

WP5: PaaS Platform Development

• The PaaS layer is provided by WP5
• It is realized by means of a highly available and scalable µServices cluster
  • Hosted by Kubernetes
• It is able to provide two main approaches:
  • High-level automated IaaS
  • Real PaaS services and application execution

Figure 14: WP5 architecture


Deployment of Customized Virtual Infrastructure using INDIGO

[Figure: workflow diagram]

PaaS components: GUI-based Portlets; Repository; TOSCA-compliant Templates; IAM Service; Orchestrator Service; Brokering/Policy Service; QoS/SLA Service; Managed Services/Applications (MSA) Service; Monitoring Service; Data Management Services; Infrastructure Manager

Resources: Grid; HPC Clusters (SAGA, POSIX / WebDav); EUDAT (CDMI); External Cloud (Cloud APIs, CDMI); Partner IaaS (OpenNebula with IM, OpenStack with HEAT, CDMI, TOSCA); Virtual Infrastructure for User Application/Service

Steps shown in the diagram:
1. Access
2. Authenticate
3. Select / Customize
4. Deploy
5. Status
6. Prioritize
7. Agree
8.a Transfer (Optional)
8.b Transfer (Optional)
9.a Delegate Deploy
9.b.1 Delegate Deploy
9.b.2 Provision & Configure
10. Deliver VMs
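Steps 6 to 9 of this workflow (prioritize, agree, delegate the deployment) can be pictured with a small sketch. It assumes a simplified model that is not taken from the slides: each site has a rank from the QoS/SLA service and an availability flag from monitoring, and the deployment is delegated to Heat on OpenStack sites and to the Infrastructure Manager (IM) on OpenNebula sites, as the Partner IaaS box in the diagram suggests. The class `Site`, its fields and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    cmf: str         # cloud management framework: "openstack" or "opennebula"
    sla_rank: int    # hypothetical rank from the QoS/SLA service, lower is better
    available: bool  # hypothetical status from the Monitoring service


# Which component receives the delegated deployment for each framework.
DEPLOYERS = {"openstack": "Heat", "opennebula": "IM"}


def choose_site(sites):
    """Pick the available site with the best (lowest) SLA rank."""
    candidates = [s for s in sites if s.available]
    if not candidates:
        raise RuntimeError("no site available")
    return min(candidates, key=lambda s: s.sla_rank)


def plan_deployment(sites, template_name):
    """Return where the template goes and which component deploys it."""
    site = choose_site(sites)
    return {
        "site": site.name,
        "deployer": DEPLOYERS[site.cmf],
        "template": template_name,
    }


if __name__ == "__main__":
    sites = [
        Site("site-a", "openstack", sla_rank=2, available=True),
        Site("site-b", "opennebula", sla_rank=1, available=True),
    ]
    print(plan_deployment(sites, "rstudio"))
```

The real brokering presumably weighs more inputs (data location, policies, quotas); the sketch only shows the shape of the decision.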

Deployment of a Managed Service/Application using INDIGO

[Figure: workflow diagram]

PaaS components: GUI-based Portlets; Repository; TOSCA-compliant Templates; IAM Service; Orchestrator Service; Brokering/Policy Service; QoS/SLA Service; Managed Services/Applications (MSA) Service; Monitoring Service; Data Management Services; Infrastructure Manager

Resources: Grid; HPC Clusters (SAGA, POSIX / WebDav); EUDAT (CDMI); External Cloud (Cloud APIs, CDMI); Partner IaaS (OpenNebula with IM, OpenStack with HEAT, CDMI, TOSCA); Mesos Cluster (Masters, Workers) running the App/Service/Job

Steps shown in the diagram:
1. Access
2. Authenticate
3. Select / Customize
4. Deploy
5. Status
6. Prioritize
7. Agree
8. State
9.a Provision Worker Nodes
9.b.1 Delegate Deploy
9.b.2 Provision & Configure
10. Deliver WNs
11. Deploy

On Standards

• OASIS TOSCA (Topology and Orchestration Specification for Cloud Applications) 1.0 (11/2013)
  • Interoperable description of application and infrastructure cloud services, the relationships between parts of the service, and the operational behavior of these services (e.g., deploy, patch, shutdown), independent of the supplier creating the service and of any particular cloud provider or hosting technology.

[Figure labels: WP4; WP5; WP6; TOSCA]

Source: http://www.slideshare.net/OpenTOSCA/tosca-and-opentosca-tosca-introduction-and-opentosca-ecosystem-overview?ref=http://www.slideshare.net/OpenTOSCA/slideshelf

More on TOSCA (I)

Source: https://wiki.openstack.org/w/images/a/a1/TOSCA_in_Heat_-_20130415.pdf

On Mesos / Marathon / Chronos

• Marathon/Chronos run on top of a Mesos cluster
  • This enables scale-out of services by deploying additional containers on spare nodes.
• Marathon will be used to instantiate the Long Running Services and keep them monitored and running
  • It will take care of recovery in case of failure
• Chronos will be used to deal with application execution:
  • It is able to deal with dependencies and with re-executing the application in case of failure
• The Mesos cluster will be used to share the resources among different use cases
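The split between Long Running Services (Marathon) and application execution (Chronos) can be illustrated by the kind of definitions each one receives. The sketch below only builds the JSON payloads and sends nothing. The field names (`id`, `cpus`, `mem`, `instances`, `container` for Marathon; `name`, `command`, `schedule`, `parents`, `retries` for Chronos) follow those REST APIs as I understand them and should be checked against the deployed versions; the helper names, image name, commands and schedule are made up for illustration.

```python
import json


def marathon_app(app_id, image, cpus=0.5, mem=512, instances=1):
    """Long Running Service: Marathon keeps `instances` copies running."""
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem,  # memory in MB
        "instances": instances,
        "container": {"type": "DOCKER", "docker": {"image": image}},
    }


def scale_out(app, extra):
    """Return a copy of the app definition with `extra` more instances."""
    scaled = dict(app)
    scaled["instances"] = app["instances"] + extra
    return scaled


def chronos_job(name, command, schedule=None, parents=None, retries=2):
    """Application execution: either scheduled or dependent on parent jobs."""
    if (schedule is None) == (parents is None):
        raise ValueError("give exactly one of schedule or parents")
    job = {"name": name, "command": command, "retries": retries}
    if schedule is not None:
        job["schedule"] = schedule  # ISO 8601 repeating interval
    else:
        job["parents"] = list(parents)  # run after these jobs succeed
    return job


if __name__ == "__main__":
    app = marathon_app("/rstudio", "example/rstudio:latest", instances=2)
    prepare = chronos_job("prepare", "echo prepare",
                          schedule="R1/2015-10-09T00:00:00Z/PT1H")
    analyse = chronos_job("analyse", "echo analyse", parents=["prepare"])
    print(json.dumps([scale_out(app, 1), prepare, analyse], indent=2))
```

The `retries` field is where re-execution on failure shows up for Chronos; for Marathon, recovery follows from the requested instance count, which it tries to maintain.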

Enhanced IaaS

[Figure: WP4.1 architecture]

October 9, 2015

IaaS on steroids

• Features missing in (some) Open Source IaaS Cloud Management Frameworks
  • Lack of, or insufficient, support for running containers as lightweight resources
  • Naïve scheduling policies
  • Lack of support for standards-based orchestration
• Objectives
  • Improve IaaS to accommodate PaaS workloads
  • Aim for features interesting for the broader community, not only for the project
    → integration with upstream tools

Lack of container support

• Containers can be used as lightweight virtual machines.
  • They offer better portability.
  • They have a lower performance overhead.
• Support in OpenStack is in place, but features are missing.
• Support in ONE (OpenNebula) is not available.
  • Develop a new driver to promote containers as first-class resources.
• Not aiming at container orchestration (e.g. Kubernetes).
  • This will be done by other components and areas.

Naïve scheduling

• CMF developments are driven by industry needs.
  • Normally scheduling is done on a FIFO basis.
• Scientific applications have unique resource provisioning requirements.
• User challenges:
  • Static quotas, difficult to overcome.
  • Opportunistic usage is not possible.
• Resource provider challenges:
  • Lack of proper dynamic partitioning → not possible to get good utilization.
  • Not possible to prioritize different workloads.
• Two different approaches: fair-share scheduling and spot-instances support

Fair-share scheduling

• VMs do not have a duration.
  • They cannot be terminated.
• No queuing mechanism.
  • Difficult to prioritize tasks.
• Synergy is being developed.
  • External service that intercepts user requests.
  • Instances launched through Synergy are associated with a wall time.
  • They are queued according to previous usage (fair-sharing).
  • They are terminated after the wall time has expired.
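The two ideas behind this slide, queuing by previous usage and terminating at the end of the wall time, can be sketched in a few lines. This is not Synergy's algorithm, which the slides do not describe; it is a deliberately naive illustration in which the user with the least accumulated usage goes first. The names `Request`, `queue_order` and `must_terminate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    user: str
    wall_time: float  # seconds the instance is allowed to run


def queue_order(requests, past_usage):
    """Order requests so that users with less previous usage go first.

    sorted() is stable, so requests from users with equal usage keep
    their submission order.
    """
    return sorted(requests, key=lambda r: past_usage.get(r.user, 0.0))


def must_terminate(start_time, wall_time, now):
    """True once the instance has used up its wall time."""
    return now - start_time >= wall_time


if __name__ == "__main__":
    requests = [Request("alice", 3600), Request("bob", 600)]
    usage = {"alice": 50.0, "bob": 5.0}  # e.g. core-hours already consumed
    for req in queue_order(requests, usage):
        print(req.user, req.wall_time)
```

A production fair-share scheduler would typically also decay old usage over time and take configured shares per project into account.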

Enhanced IaaS

[Figure: Fair-share scheduling III]

Spot-instances support

• Implemented in commercial cloud offerings: AWS and Google Compute Engine.
• Instances that can be terminated by higher-priority requests → priority can be determined by the bid price or by another mechanism.
• They can always be terminated by non-spot instances.
• It makes opportunistic usage of the cloud possible.
  • Users can fill the idle resources, but higher-priority tasks can enter.
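The preemption rule on this slide (a non-spot request can always displace a spot instance) can be sketched as a tiny admission function. It assumes a single pool with a fixed number of slots and ignores bid prices, so it is only an illustration of the rule, not the prototype mentioned in the talk; the function name `admit` and the two kinds `"spot"` and `"normal"` are hypothetical.

```python
def admit(running, capacity, request):
    """Try to place `request` ("normal" or "spot") in a pool of `capacity` slots.

    Returns (new_running, admitted, preempted), where `preempted` is the
    kind of instance that was terminated to make room, or None.
    """
    running = list(running)  # do not modify the caller's list
    if len(running) < capacity:
        running.append(request)
        return running, True, None
    if request == "normal" and "spot" in running:
        running.remove("spot")  # terminate one spot instance
        running.append("normal")
        return running, True, "spot"
    return running, False, None


if __name__ == "__main__":
    pool = ["spot", "normal"]
    print(admit(pool, 2, "normal"))
    print(admit(pool, 2, "spot"))
```

Spot requests never displace anything in this sketch, which is what lets them fill idle capacity without affecting regular users.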

Standards-based orchestration

• For OpenStack, Heat uses the Heat Orchestration Template (HOT).
• For ONE, the Infrastructure Manager (IM) uses the Resource and Application Description Language (RADL).
• Support for using TOSCA in both orchestrators is being developed:
  • Approach: translation between TOSCA templates and the native languages → initial translator already available for OpenStack
  • Initial support at the CLI level
  • Studying the possibility of enabling the APIs to receive TOSCA documents.
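The translation approach can be illustrated with a toy translator from a TOSCA-style topology to a HOT-style template, both held as Python dictionaries. This is not the translator referred to on the slide. It handles only `tosca.nodes.Compute` nodes, uses a simplified integer `mem_mb` property instead of TOSCA's `mem_size`, and relies on an illustrative flavor table; `FLAVORS`, `pick_flavor` and `tosca_to_hot` are hypothetical names. The HOT side uses the `OS::Nova::Server` resource type with its `flavor` and `image` properties.

```python
# Illustrative flavor table: (name, vcpus, ram_mb), smallest first.
FLAVORS = [
    ("m1.small", 1, 2048),
    ("m1.medium", 2, 4096),
    ("m1.large", 4, 8192),
]


def pick_flavor(num_cpus, mem_mb):
    """Return the smallest flavor that satisfies the request."""
    for name, vcpus, ram_mb in FLAVORS:
        if vcpus >= num_cpus and ram_mb >= mem_mb:
            return name
    raise ValueError("no flavor large enough")


def tosca_to_hot(tosca, image):
    """Translate the compute nodes of a TOSCA-style dict into HOT resources."""
    resources = {}
    nodes = tosca["topology_template"]["node_templates"]
    for name, node in nodes.items():
        if node["type"] != "tosca.nodes.Compute":
            continue  # this sketch ignores every other node type
        props = node["capabilities"]["host"]["properties"]
        resources[name] = {
            "type": "OS::Nova::Server",
            "properties": {
                "flavor": pick_flavor(props["num_cpus"], props["mem_mb"]),
                "image": image,
            },
        }
    return {"heat_template_version": "2013-05-23", "resources": resources}


if __name__ == "__main__":
    tosca = {"topology_template": {"node_templates": {
        "worker": {
            "type": "tosca.nodes.Compute",
            "capabilities": {"host": {"properties": {"num_cpus": 2, "mem_mb": 3000}}},
        },
    }}}
    print(tosca_to_hot(tosca, image="example-image"))
```

The same pattern would apply to RADL for the Infrastructure Manager: one abstract description, one small translation per native language.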

Progress so far (IaaS)

• Container support
  • Initial driver for ONE has been developed.
  • OpenStack testbed being deployed.
  • OCCI support ongoing.
• Fair-share scheduling
  • Preparing code for initial publishing.
  • Testbed deployed and currently under testing.
  • Problems with SLURM Priority Multi-factor, adapting to a new algorithm.

Progress so far (IaaS)

• Standards-based orchestration
  • Contacts with OpenStack developers to include TOSCA in Heat
  • Improved TOSCA parsers to support INDIGO use cases
• Spot-instances support
  • Initial blueprint presented to OpenStack, but discussion still ongoing.
  • Initial prototype working.
  • Studying how to expose the functionality through OCCI.


Progress so far (PaaS)

• PaaS architecture fully defined
  • PaaS Orchestrator technology is already defined
  • Implementation of the first workflow is already under way
• Define additional non-normative node types for TOSCA
• Monitoring technology defined
• Technology for deploying Long Running Services and Applications is defined
• Hosting of µServices

Conclusions

• The first official release will be at the end of July next year
• We will start making some services available as soon as they are ready enough to be tested
• All the changes to the already available projects will be pushed back to the official releases.
  • OpenStack, OpenNebula, dCache, Onedata, Mesos, Accounting, QoS/SLA, etc.

Questions?

Giacinto [email protected]