Member of the Helmholtz Association
Using UNICORE in Computing Centres
2017-03-16 Björn Hagemeier
Part: About Us
Forschungszentrum Jülich and JSC
JUQUEEN
IBM Blue Gene/Q
28 racks, 458,752 cores
PowerPC A2, 1.6 GHz
16 cores per node
5.8 Petaflop/s peak
460 TByte main memory
5D network
JURECA
1872 compute nodes
Intel Haswell with 2x12 cores @ 2.5 GHz
75 compute nodes equipped with 2 NVIDIA K80 GPUs
DDR4 memory (2133 MHz)
1605 nodes with 128 GiB memory
128 nodes with 256 GiB memory
64 nodes with 512 GiB memory
12 visualization nodes
2 NVIDIA K40 per node
10 nodes with 512 GiB memory
2 nodes with 1024 GiB memory
Total of 45,216 cores
100 GiB/s storage connection
JUST: Juelich Storage Cluster
IBM GPFS
20.3 PB online storage
220 GB/s
File server for
HPC systems: JUQUEEN, JURECA
DEEP (Dynamical Exascale Entry Platform)
Tape Libraries
Actual capacity: ∼99 PB
Theoretical capacity: 141 PB (16600x8.5TB)
Tape drives: 48
Libraries: 2
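The headline figures on the hardware slides can be cross-checked with a little arithmetic. The sketch below is only a consistency check. It assumes 1024 nodes per Blue Gene/Q rack, a 1.6 GHz clock and 8 floating-point operations per core per cycle for JUQUEEN, and it assumes that the 75 GPU nodes of JURECA are counted separately from the three memory classes; none of these assumptions is stated on the slides.

```python
# Consistency check of the hardware figures quoted on the slides.
# Assumptions (not on the slides): 1024 nodes per Blue Gene/Q rack,
# 1.6 GHz clock, 8 floating-point operations per core per cycle.

# JUQUEEN: racks * nodes per rack * cores per node
juqueen_cores = 28 * 1024 * 16
juqueen_peak_pflops = juqueen_cores * 1.6e9 * 8 / 1e15

# JURECA: three memory classes plus the GPU-equipped compute nodes
jureca_compute_nodes = 1605 + 128 + 64 + 75
# Compute and visualization nodes, each with 2 sockets of 12 cores
jureca_cores = (jureca_compute_nodes + 12) * 2 * 12

# Tape libraries: 16600 cartridges of 8.5 TB each, expressed in PB
tape_capacity_pb = 16600 * 8.5 / 1000

print("JUQUEEN cores:", juqueen_cores)
print("JUQUEEN peak (PFlop/s):", round(juqueen_peak_pflops, 2))
print("JURECA compute nodes:", jureca_compute_nodes)
print("JURECA cores:", jureca_cores)
print("Tape capacity (PB):", tape_capacity_pb)
```

Under these assumptions the results agree with the slides: 458,752 cores and roughly 5.8 Petaflop/s for JUQUEEN, 1872 compute nodes and 45,216 cores for JURECA, and about 141 PB theoretical tape capacity.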
Part: UNICORE
UNICORE As We See It Today
A federation software suite
Secure and seamless access to compute and data resources
Focus on scientific applications and workflows
Complies with typical HPC centre policies
Complete solutions: APIs, clients, services, ...
Java/Python based, supports UNIX, MacOS, Windows and many resource management systems (Torque, Slurm, SGE, ...)
Long development history (since 1997)
Open source, BSD licensed, visit http://www.unicore.eu
Concepts
UNICORE ≡ UNIform Access to COmputing REsources
Site: a resource such as an HPC system including storage
Job: submitted through JSDL including data staging, resource requirements, executable definition and parameters
Hadoop (Yarn) jobs possible, too, in conjunction with HDFS
Resources: features of sites in terms of capacity and capability
Storages: a view into file systems at a certain base directory (mount point). Can be storage external to the site, e.g. Swift, S3, HDFS, CDMI, XtreemFS
Applications: abstractions of applications hiding site-local specificities, e.g. installation paths or module activations
Workflows: a series of job executions guided by control structures, i.e. visual programming
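To make the "Job" concept more concrete, here is a minimal sketch that assembles a JSDL document with an executable, arguments, a CPU requirement and one stage-in. The element names follow a reading of JSDL 1.0 and its POSIX extension; the helper `build_job`, the job name and the example URL are hypothetical, and the result has not been validated against a schema or submitted to a UNICORE server.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespaces (core and POSIX application extension)
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("posix", POSIX)


def q(namespace, tag):
    """Return the qualified tag name in ElementTree's {namespace}tag form."""
    return f"{{{namespace}}}{tag}"


def build_job(name, executable, arguments, cpus, stage_in):
    """Hypothetical helper: build a small JSDL job definition.

    stage_in maps a file name in the job directory to a source URI.
    """
    job = ET.Element(q(JSDL, "JobDefinition"))
    desc = ET.SubElement(job, q(JSDL, "JobDescription"))

    ident = ET.SubElement(desc, q(JSDL, "JobIdentification"))
    ET.SubElement(ident, q(JSDL, "JobName")).text = name

    # Executable definition and parameters
    app = ET.SubElement(desc, q(JSDL, "Application"))
    posix = ET.SubElement(app, q(POSIX, "POSIXApplication"))
    ET.SubElement(posix, q(POSIX, "Executable")).text = executable
    for argument in arguments:
        ET.SubElement(posix, q(POSIX, "Argument")).text = argument

    # Resource requirements
    resources = ET.SubElement(desc, q(JSDL, "Resources"))
    total = ET.SubElement(resources, q(JSDL, "TotalCPUCount"))
    ET.SubElement(total, q(JSDL, "Exact")).text = str(cpus)

    # Data staging: files to fetch before the job starts
    for filename, uri in stage_in.items():
        staging = ET.SubElement(desc, q(JSDL, "DataStaging"))
        ET.SubElement(staging, q(JSDL, "FileName")).text = filename
        ET.SubElement(staging, q(JSDL, "CreationFlag")).text = "overwrite"
        source = ET.SubElement(staging, q(JSDL, "Source"))
        ET.SubElement(source, q(JSDL, "URI")).text = uri
    return job


job = build_job(
    name="demo",
    executable="/bin/echo",
    arguments=["hello"],
    cpus=4,
    stage_in={"input.dat": "https://example.org/data/input.dat"},
)
print(ET.tostring(job, encoding="unicode"))
```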
Architecture
UNICORE Main Services
Compute
TargetSystemFactory
TargetSystem
JobManagement
Reservations
Storage and Data
StorageFactory
StorageManagement
FileTransfer
Metadata
Workflow
Workflow enactment
Task Execution
Resource Broker
Registry
Default Setup
Access to resource manager and file system via TargetSystemInterface (TSI) daemon installed on the cluster login node(s)
Job Execution
Storage Access
The UNICORE Storage Management Service (“SMS”) provides a filesystem-like view of data
Typical functions
mkdir, delete, ls, chmod etc.
Start file transfers
Import/export of data from/to the user’s local machine
Send/receive of data from other servers
Various supported file transfer protocols
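The idea of a filesystem-like view rooted at a base directory can be illustrated with a small local sketch. The class `StorageView` and its methods are hypothetical and only mimic the listed functions on a local directory; this is not the SMS implementation or its API.

```python
import tempfile
from pathlib import Path


class StorageView:
    """Hypothetical filesystem-like view rooted at a base directory."""

    def __init__(self, base_dir):
        self.base = Path(base_dir).resolve()

    def _resolve(self, relative):
        # Interpret every path relative to the base directory and refuse
        # anything that would escape from it (for example via "..").
        target = (self.base / relative.lstrip("/")).resolve()
        if target != self.base and self.base not in target.parents:
            raise PermissionError(f"{relative!r} is outside the storage")
        return target

    def mkdir(self, relative):
        self._resolve(relative).mkdir(parents=True, exist_ok=True)

    def ls(self, relative="/"):
        return sorted(p.name for p in self._resolve(relative).iterdir())

    def delete(self, relative):
        target = self._resolve(relative)
        if target.is_dir():
            target.rmdir()  # only removes empty directories
        else:
            target.unlink()


with tempfile.TemporaryDirectory() as tmp:
    sms = StorageView(tmp)
    sms.mkdir("/results/run1")
    print(sms.ls("/"))  # ['results']
```

Clients only ever see paths relative to the base directory, which is what lets the same view sit on top of a mount point or an external store.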
File Transfers
Metadata Management Service (MMS)
Automatic extraction
Manual editing of metadata
Searching
Applications
General
Identified by name and version
Site specifics
Pre and post commands for environment setup and tear down
Acquire and return licenses
MPI
Support for application metadata
Generic Applications: Automated generation of GUIs
UNICORE Rich Client and Portal support application metadata
Example
<jsdl:Argument Description="Check input file"
Type="boolean"
Default="..."
ValidValues="true false"
DependsOn="..."
Excludes="..."
IsEnabled="false"
IsMandatory="false">+v$CHECK?</jsdl:Argument>
Possible types: string, boolean, int, double, filename, choice.
Used to be defined by site administrators.
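As a rough illustration of how a client could use such metadata to build and validate an input field, the sketch below parses the attributes of the example argument and checks a user-supplied value. The namespace URI bound to the `jsdl` prefix, the helper names `describe` and `validate`, and the validation rules are assumptions; the attributes shown as "..." on the slide are simply left out.

```python
import xml.etree.ElementTree as ET

# The namespace binding is an assumption; the slide shows only the prefix.
SNIPPET = """
<jsdl:Argument xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
    Description="Check input file"
    Type="boolean"
    ValidValues="true false"
    IsEnabled="false"
    IsMandatory="false">+v$CHECK?</jsdl:Argument>
"""


def to_bool(text):
    if text not in ("true", "false"):
        raise ValueError(f"not a boolean: {text!r}")
    return text == "true"


# One converter per type named on the slide
CONVERTERS = {
    "string": str,
    "filename": str,
    "choice": str,
    "int": int,
    "double": float,
    "boolean": to_bool,
}


def describe(xml_text):
    """Extract the pieces a GUI generator would need for one input field."""
    elem = ET.fromstring(xml_text)
    valid = elem.get("ValidValues")
    return {
        "label": elem.get("Description"),
        "type": elem.get("Type", "string"),
        "valid_values": valid.split() if valid else None,
        "mandatory": elem.get("IsMandatory") == "true",
        "template": elem.text,
    }


def validate(field, raw):
    """Convert a raw user input string according to the field description."""
    if raw is None or raw == "":
        if field["mandatory"]:
            raise ValueError(f"{field['label']}: a value is required")
        return None
    if field["valid_values"] is not None and raw not in field["valid_values"]:
        raise ValueError(f"{field['label']}: {raw!r} is not allowed")
    return CONVERTERS[field["type"]](raw)


field = describe(SNIPPET)
print(field["label"], "->", validate(field, "true"))
```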
User-Defined Applications
Allow mixing system and user-defined applications
Encourage users to play with and develop their own application definitions
Repository of common application definitions
Realized by merging system and user-specific IDB contributions
Users cannot change a site’s resources and thus cannot go beyond administrator limits
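The merging idea can be sketched with plain dictionaries. Everything here is hypothetical: the dictionary layout, the function `merge_idb`, the sample entries, and in particular the policy that a user definition replaces a system definition of the same name and version. The only point taken from the slide is that resources come from the site alone.

```python
def merge_idb(system_idb, user_idb):
    """Combine system and user application definitions (illustrative only).

    Both arguments are dicts with an "applications" mapping keyed by
    (name, version). Only the system IDB may contribute "resources", so a
    user contribution cannot raise the limits set by the administrator.
    """
    applications = dict(system_idb.get("applications", {}))
    # Assumed policy: a user entry replaces a system entry with the same key.
    applications.update(user_idb.get("applications", {}))
    return {
        "applications": applications,
        "resources": dict(system_idb.get("resources", {})),
    }


# Hypothetical sample data
system_idb = {
    "applications": {("Python", "3.5"): {"executable": "/usr/bin/python3"}},
    "resources": {"max_nodes": 64},
}
user_idb = {
    "applications": {("MyTool", "0.1"): {"executable": "$HOME/bin/mytool"}},
    "resources": {"max_nodes": 100000},  # ignored by the merge
}

merged = merge_idb(system_idb, user_idb)
print(sorted(merged["applications"]))
print(merged["resources"])
```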
Workflow Features
Simple graphs (DAGs)
Workflow variables
Loops and control constructs
while, for-each, if-else
Conditions
Exit code, file existence, file size, workflow variables
Clients
UNICORE Rich Client
Command-line client
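A workflow as a simple graph with an exit-code condition can be sketched in a few lines. This is a toy model using Python's standard `graphlib` module (Python 3.9 or later), not the UNICORE workflow engine; the function `run_workflow`, the task names and the rule that a task is skipped when a predecessor did not exit with 0 are assumptions made for the illustration.

```python
from graphlib import TopologicalSorter


def run_workflow(dependencies, tasks):
    """Run tasks in dependency order and skip tasks whose predecessors failed.

    dependencies: {task name: set of task names it depends on}
    tasks: {task name: callable returning an exit code}
    Returns {task name: exit code, or None if the task was skipped}.
    """
    exit_codes = {}
    for name in TopologicalSorter(dependencies).static_order():
        predecessors = dependencies.get(name, ())
        # Condition on exit codes: every predecessor must have returned 0.
        if any(exit_codes.get(p) != 0 for p in predecessors):
            exit_codes[name] = None
            continue
        exit_codes[name] = tasks[name]()
    return exit_codes


# Hypothetical four-step chain in which the second step fails
dependencies = {
    "simulate": {"prepare"},
    "analyse": {"simulate"},
    "plot": {"analyse"},
}
tasks = {
    "prepare": lambda: 0,
    "simulate": lambda: 1,
    "analyse": lambda: 0,
    "plot": lambda: 0,
}

print(run_workflow(dependencies, tasks))
```

Because "simulate" returns a non-zero exit code, "analyse" and "plot" are skipped rather than executed.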
Authentication and Authorization (AAI, for short)
In addition to its own, home-grown user management solution, aka XUUDB, UNICORE supports SAML-based authentication.
PULL and PUSH modes are possible
Typically only need a few attributes
role (user, server, admin), xlogins, groups
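The few attributes named above can be pictured as a small record attached to each authenticated identity. The sketch below is purely illustrative: the class `Attributes`, the function `select_xlogin`, the sample DN and the rule that the first xlogin is the default are assumptions, not the XUUDB data model.

```python
from dataclasses import dataclass

ROLES = ("user", "server", "admin")


@dataclass(frozen=True)
class Attributes:
    """Hypothetical attribute record for one authenticated identity."""
    role: str
    xlogins: tuple = ()
    groups: tuple = ()

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role}")


def select_xlogin(attributes, requested=None):
    """Pick the local account to run under (assumed rule: first is default)."""
    if not attributes.xlogins:
        raise PermissionError("no local account is mapped to this identity")
    if requested is None:
        return attributes.xlogins[0]
    if requested not in attributes.xlogins:
        raise PermissionError(f"{requested!r} is not allowed for this identity")
    return requested


# Hypothetical mapping from a certificate DN to its attributes
USERS = {
    "CN=Jane Doe,O=Example": Attributes(
        role="user", xlogins=("jdoe", "jdoe_test"), groups=("project1",)
    ),
}

jane = USERS["CN=Jane Doe,O=Example"]
print(select_xlogin(jane))
print(select_xlogin(jane, "jdoe_test"))
```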
UNITY IdM: Identity Relationship Management
Complete solution for identity, federation and inter-federation management
Can serve as SP and IdP at the same time
Use SAML 2, OAuth 2, OIDC, LDAP as upstream IdPs
Serve as IdP for SAML 2 (Web SSO, SOAP, PAOS bindings), SAML 2 Web & SOAP UNICORE Profile, OIDC, OAuth 2
UNITY IdM: Infrastructures
UNICORE Portal @JSC: https://unicore-portal.fz-juelich.de:8443/
DFN AAI for authentication
Still need proper account at JSC
Part: Installation
Installation: General
Latest releases of most important components linked on main website http://www.unicore.eu/
Detailed download section at http://www.unicore.eu/download/ contains all components
Packages are hosted on SourceForge
Installation: Basic
Core Server Bundle
https://sourceforge.net/projects/unicore/files/Servers/Core/
Content
Gateway
UNICORE/X
Registry
TSI
XUUDB
Requirements
OpenJDK 8 or Oracle Java 8
Python 2.7 or 3.x for the TSI
Installation: Workflow
https://sourceforge.net/projects/unicore/files/Servers/Workflow/
Content
Workflow Engine
Resource broker aka “Service Orchestrator”
Installation: Federation
Common Registry
All services need to publish their availability to a common registry
Can publish to multiple registries
Clients support multiple registries
Authentication
Individual registrations (certificate)
Identity federation, e. g. via UNITY
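The registry bullets describe a publish and discover pattern that a toy in-memory model can illustrate. The class `Registry`, the helper functions, the service type string and the URLs are all hypothetical; the real UNICORE Registry is a network service with its own interface.

```python
class Registry:
    """Hypothetical in-memory registry of service endpoints."""

    def __init__(self, name):
        self.name = name
        self._entries = {}

    def publish(self, service_type, url):
        self._entries.setdefault(service_type, set()).add(url)

    def lookup(self, service_type):
        return set(self._entries.get(service_type, ()))


def publish_everywhere(registries, service_type, url):
    """A service may announce itself to several registries."""
    for registry in registries:
        registry.publish(service_type, url)


def discover(registries, service_type):
    """A client merges the answers of all registries it knows about."""
    found = set()
    for registry in registries:
        found |= registry.lookup(service_type)
    return sorted(found)


# Hypothetical site-local and federation-wide registries
site = Registry("site")
federation = Registry("federation")

publish_everywhere([site, federation], "TargetSystemFactory",
                   "https://hpc-a.example.org/services/tsf")
federation.publish("TargetSystemFactory",
                   "https://hpc-b.example.org/services/tsf")

print(discover([site], "TargetSystemFactory"))
print(discover([site, federation], "TargetSystemFactory"))
```

A client that knows only the site registry sees one endpoint, while a client configured with both registries sees the endpoints of both sites without duplicates.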
Acknowledgements
Most slides shamelessly copied from my colleague Bernd Schuller.
Other team members
Valentina Huber, Andre Giesler, Maria Petrova-El Sayed, Jedrzej Rybicki, Rajveer Saini and many others at JSC
Krzysztof Benedyczak, Marcelina Borcz, Rafał Kluszczyński, Piotr Bała and others at ICM / Warsaw University
Richard Grunzke and others at Technical University Dresden
Students: Burak Bengi, Maciej Golik, Konstantine Muradov
... many others who reported bugs, suggested features, contributed code and provided patches