The Science DMZ: Recent Developments
Eli Dart, Network Engineer, ESnet Science Engagement, Lawrence Berkeley National Laboratory
WRNP17
Belém, Brazil
May 16, 2017
© 2017, Energy Sciences Network
Overview
• Science DMZ As Platform
• Modern Research Data Portal
• Pacific Research Platform
  – PRP
  – NRP
• Note: This talk assumes you already understand the Science DMZ
  – If you haven’t encountered the Science DMZ, several folks in RNP can help you, including Leandro Ciuffo and Alex Moura
  – Or check out the fasterdata knowledge base:
    • http://fasterdata.es.net/science-dmz/
2 – ESnet Science Engagement ([email protected]) - 5/15/17
Science DMZ As A Platform
• Once there are many Science DMZs in your network, more things become possible
• Easy file transfer is good, but what else can we do?
  – Update the architecture of data portals
  – Build services between institutions
  – Interconnect facilities
• Several efforts underway to do these things
Science Data Portals
• Large repositories of scientific data
  – Climate data
  – Sky surveys (astronomy, cosmology)
  – Many others
  – Data search, browsing, access
• Many scientific data portals were designed 15+ years ago
  – Single-web-server design
  – Data browse/search, data access, user awareness all in a single system
  – All the data goes through the portal server
    • In many cases by design
    • E.g. embargo before publication (enforce access control)
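Legacy Portal Design
[Figure: legacy portal design. Components shown: WAN, border router, firewall, enterprise network, perfSONAR hosts, filesystem (data store), and portal server, joined by 10GE links. The browsing path, query path, and data path all pass through the portal server. Portal server applications: web server, search, database, authentication, data service.]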
• Very difficult to improve performance without architectural change
  – Software components all tangled together
  – Difficult to put the whole portal in a Science DMZ because of security
  – Even if you could put it in a DMZ, many components aren’t scalable
• What does architectural change mean?
Example of Architectural Change – CDN
• Let’s look at what Content Delivery Networks did for web applications
• CDNs are a well-deployed design pattern
  – Akamai and friends
  – Entire industry in CDNs
  – Assumed part of today’s Internet architecture
• What does a CDN do?
  – Store static content in a separate location from dynamic content
    • Complexity isn’t in the static content – it’s in the application dynamics
    • Web applications are complex, full-featured, and slow
      – Databases, user awareness, etc.
      – Lots of integrated pieces
    • Data service for static content is simple by comparison
  – Separation of application and data service allows each to be optimized
Classical Web Server Model
• Web browser fetches pages from web server
  – All content stored on the web server
  – Web applications run on the web server
    • Web server may call out to local database
    • Fundamentally all processing is local to the web server
  – Web server sends data to client browser over the network
• Perceived client performance changes with network conditions
  – Several problems in the general case
  – Latency increases time to page render
  – Packet loss + latency cause problems for large static objects
[Figure: classical web server model. A web server at a hosting provider serves WEB traffic across a transit network to a browser on residential broadband, over a long-distance, high-latency path.]
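The point that packet loss plus latency hurts large static objects can be illustrated with a small calculation. The sketch below uses the simplified Mathis et al. model for a single TCP flow (rate ≤ MSS / (RTT · √p)); the model is not named on the slide, and the MSS, round-trip times, and loss rate are made-up illustrative values, not measurements.

```python
import math


def mathis_throughput_bps(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on one TCP flow's throughput, in bits per second.

    Simplified Mathis et al. model: rate <= MSS / (RTT * sqrt(p)).
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8) / (rtt_s * math.sqrt(loss_rate))


if __name__ == "__main__":
    # Hypothetical paths: same loss rate, very different round-trip times.
    for label, rtt in [("nearby server", 0.002), ("distant server", 0.100)]:
        bps = mathis_throughput_bps(mss_bytes=1460, rtt_s=rtt, loss_rate=1e-4)
        print(f"{label}: ~{bps / 1e6:.0f} Mbit/s")
```

Under this model the bound scales inversely with round-trip time, so the same loss rate costs far more on the long path than on the short one.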
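Solution: Place Large Static Objects Near Client
[Figure: the web server at the hosting provider still reaches the browser on residential broadband across the transit network, over a long-distance, high-latency path (WEB). A CDN data server delivers DATA to the browser over a short-distance, low-latency path.]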
• CDN provides static content “close” to client
  – Latency goes down
    • Time to page render goes down
    • Static content performance goes up
  – Load on web server goes down (no need to serve static content)
  – Web server still manages complex behavior
    • Local reasoning / fast changes for application owner
• Significant win for web application performance
Client Simply Sees Increased Performance
• Client doesn’t see the CDN as a separate thing
  – Web content is all still viewed in a browser
    • Browser fetches what the page tells it to fetch
    • Different content comes from different places
    • User doesn’t know/care
• CDNs provide an architectural solution to a performance problem
  – Not brute-force
  – Work smarter, not harder
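[Figure: two views of the same exchange across the ’Net. In one, the browser fetches WEB content from the web server (rich, slow) and DATA from the CDN data server (simple, fast). In the other, the browser sees only a single web server.]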
Architectural Examination of Data Portals
• Common data portal functions (most portals have these)
  – Search/query/discovery
  – Data download method for data access
  – GUI for browsing by humans
  – API for machine access – ideally incorporates search/query + download
• Performance pain is primarily in the data handling piece
  – Rapid increase in data scale eclipsed legacy software stack capabilities
  – Portal servers often stuck in enterprise network
• Can we “disassemble” the portal and put the pieces back together better?
  – Use Science DMZ as a platform for the data piece
  – Avoid placing complex software in the Science DMZ
Legacy Portal Design
[Figure: legacy portal design, repeated for comparison. WAN, border router, firewall, enterprise network, perfSONAR hosts, filesystem (data store), and portal server on 10GE links; the browsing, query, and data paths all pass through the portal server, which runs web server, search, database, authentication, and data service.]
Next-Generation Portal Leverages Science DMZ
[Figure: next-generation portal design. Components shown: WAN, border router, Science DMZ switch/router, firewall, enterprise network, perfSONAR hosts, API DTNs (data access governed by portal), filesystem (data store), and portal server, joined by 10GE links. Two paths are marked: the data transfer path, which runs through the DTNs, and the portal query/browse path, which runs through the portal server. Portal server applications: web server, search, database, authentication.]
Put The Data On Dedicated Infrastructure
• We have separated the data handling from the portal logic
• Portal is still its normal self, but enhanced
  – Portal GUI, database, search, etc. all function as they did before
  – Query returns pointers to data objects in the Science DMZ
  – Portal is now freed from ties to the data servers (run it on Amazon if you want!)
• Data handling is separate, and scalable
  – High-performance DTNs in the Science DMZ
  – Scale as much as you need to without modifying the portal software
• Outsource data handling to computing centers or campus central storage
  – Computing centers are set up for large-scale data
  – Let them handle the large-scale data, and let the portal do the orchestration of data placement
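The idea that a query returns pointers to data objects rather than the data itself can be sketched in a few lines. Everything named here is hypothetical and for illustration only: the `DTN_BASE_URL` host, the `DataObject` catalog, and the `search` function are not from the talk, and a real portal would add authentication and access control.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# Hypothetical base URL of the data service running on DTNs in the Science DMZ.
DTN_BASE_URL = "https://dtn.example.org/data"


@dataclass(frozen=True)
class DataObject:
    name: str
    path: str        # location on the shared filesystem, relative to the data root
    size_bytes: int


# Hypothetical catalog held by the portal; only metadata lives here.
CATALOG: List[DataObject] = [
    DataObject("climate-run-01", "climate/run01.nc", 12_000_000_000),
    DataObject("sky-survey-tile-7", "sky/tile7.fits", 3_500_000_000),
]


def search(term: str) -> List[Dict[str, Union[str, int]]]:
    """Portal-side query: return metadata plus a pointer, never the bytes."""
    needle = term.lower()
    return [
        {
            "name": obj.name,
            "size_bytes": obj.size_bytes,
            # The client fetches this URL directly from a DTN, bypassing the portal.
            "url": f"{DTN_BASE_URL}/{obj.path}",
        }
        for obj in CATALOG
        if needle in obj.name.lower()
    ]


if __name__ == "__main__":
    for hit in search("climate"):
        print(hit["name"], hit["size_bytes"], hit["url"])
```

Because the portal only hands out locations, the data service behind those URLs can be scaled or moved without touching the portal code.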
The Pacific Research Platform Creates a Regional End-to-End Science-Driven “Big Data Freeway System”
NSF CC*DNI Grant $5M 10/2015-10/2020
• PI: Larry Smarr, UC San Diego Calit2
• Co-PIs:
  – Camille Crittenden, UC Berkeley CITRIS
  – Tom DeFanti, UC San Diego Calit2
  – Philip Papadopoulos, UC San Diego SDSC
  – Frank Wuerthwein, UC San Diego Physics and SDSC
PRP Provides Interoperability
• Science DMZs at participating sites ensure interoperability
• PRP engineers work to ensure they interoperate
  – Globus data transfer between DTNs
  – perfSONAR
• Some variation in DTNs
  – Some have FIONA DTNs
    • FIONA == Flash I/O Network Appliance
    • Designed by PRP engineers at UC San Diego
    • https://fasterdata.es.net/science-dmz/DTN/fiona-flash-i-o-network-appliance/
  – Some have DTNs connected to HPC storage
• Key – they all interoperate, removing integration burden from scientists
PRP Science Drivers
• Multiple science areas
  – Astronomy and astrophysics
  – Biomedical applications
  – Life sciences
  – Particle physics
  – Virtual reality and data visualization
• http://prp.ucsd.edu/
National Research Platform (NRP)
• Replicate the PRP on a national scale
• Interoperable, high-performance cyberinfrastructure
  – Built to serve domain science
  – Scale up to ~200 institutions
• First workshop to be held this summer
  – Domain science input
  – Policy questions
  – Architecture, scalability
  – Include campus IT, regional networks, national networks, funding agencies, etc. in a common conversation.
Petascale DTN Project
• Another example of building on the Science DMZ
• Supports all data-intensive applications which require large-scale data placement
• Collaboration between HPC facilities
  – ALCF, NCSA, NERSC, OLCF
• Goal: per-Globus-job performance at 1PB/week level
  – 15 gigabits per second
  – With checksums turned on, etc.
  – No special shortcuts, no arcane options
• Reference data set is 4.4TB of astrophysics model output
  – Mix of file sizes
  – Many directories
  – Real data!
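The relationship between the 1 PB/week goal and the 15 Gbit/s figure can be checked with simple arithmetic. This is a rough sketch: whether the slide means a decimal petabyte (10^15 bytes) or a binary one (2^50 bytes) is an assumption, so both are computed.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800


def sustained_gbps(total_bytes: float, seconds: float) -> float:
    """Average rate in Gbit/s needed to move total_bytes in the given time."""
    return total_bytes * 8 / seconds / 1e9


if __name__ == "__main__":
    # Assumption: the slide does not say which definition of "PB" it uses.
    for label, size in [("decimal PB (10**15 bytes)", 10**15),
                        ("binary PiB (2**50 bytes)", 2**50)]:
        rate = sustained_gbps(size, SECONDS_PER_WEEK)
        print(f"{label}: ~{rate:.1f} Gbit/s sustained")
```

The two results come out at roughly 13.2 and 14.9 Gbit/s, so a sustained 15 Gbit/s meets the 1 PB/week level under either reading.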
Petascale DTN Project
[Figure: transfer rates measured between the DTN clusters at ALCF (alcf#dtn_mira), NERSC (nersc#dtn), OLCF (olcf#dtn_atlas), and NCSA (ncsa#BlueWaters), March 2017, L380 data set. Link rates shown (Gbps): 10.0, 17.6, 14.8, 19.3, 17.4, 17.0, 32.4, 25.3, 18.3, 16.3, 24.1, 24.0.]
Data set: L380
Files: 19260
Directories: 211
Other files: 0
Total bytes: 4442781786482 (4.4T bytes)
Smallest file: 0 bytes (0 bytes)
Largest file: 11313896248 bytes (11G bytes)
Size distribution:
  1 - 10 bytes: 7 files
  10 - 100 bytes: 1 files
  100 - 1K bytes: 59 files
  1K - 10K bytes: 3170 files
  10K - 100K bytes: 1560 files
  100K - 1M bytes: 2817 files
  1M - 10M bytes: 3901 files
  10M - 100M bytes: 3800 files
  100M - 1G bytes: 2295 files
  1G - 10G bytes: 1647 files
  10G - 100G bytes: 3 files
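To put the measured rates in perspective, the sketch below converts the slowest and fastest link rates on the slide into wall-clock time for the L380 data set, and checks that the size-distribution buckets add up to the stated file count. The byte total, rates, and bucket counts are copied from the slide; the helper name `transfer_minutes` is illustrative, and the calculation assumes the rate is sustained for the whole transfer.

```python
TOTAL_BYTES = 4_442_781_786_482   # "Total bytes" for the L380 data set
TOTAL_FILES = 19_260              # "Files" for the L380 data set

# File counts per size bucket, in the order listed on the slide.
BUCKET_COUNTS = [7, 1, 59, 3170, 1560, 2817, 3901, 3800, 2295, 1647, 3]


def transfer_minutes(total_bytes: int, gbps: float) -> float:
    """Minutes needed to move total_bytes at a sustained rate of gbps Gbit/s."""
    return total_bytes * 8 / (gbps * 1e9) / 60


if __name__ == "__main__":
    print("bucket counts match file total:", sum(BUCKET_COUNTS) == TOTAL_FILES)
    for gbps in (10.0, 32.4):  # slowest and fastest rates shown on the slide
        print(f"{gbps:5.1f} Gbit/s -> ~{transfer_minutes(TOTAL_BYTES, gbps):.0f} minutes")
```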
Thanks!
[email protected] (ESnet)
Lawrence Berkeley National Laboratory
http://fasterdata.es.net/
http://my.es.net/
http://www.es.net/
Extra Slides
What Is Science Engagement?
• Technology people working with scientists to help solve problems
  – Improve data transfer performance
  – Improve data workflows (e.g. to require less human effort)
  – Improve experiment operations
  – …and more…
• Using experience gained from helping scientists to improve cyberinfrastructure
  – Network design
  – Tool design
  – System design
Engagement Is Important: Old Model
• Scientist as integrator
  – Requires scientists to discover new technologies
  – Requires scientists to become expert in new technologies
  – Requires scientists to assemble distinct technologies into an integrated solution that works for them
  – Some scientists do this brilliantly – most do not
Engagement Is Important: New Model
• Scientist as collaborator
  – Technologists understand technology
  – Technologists understand enough of the science to see how technology fits
  – Technologists help scientists adopt a useful solution
  – This is much more productive, and requires science engagement