Post on 23-Feb-2016
1
Supporting Research With Flexible Computation Resources
David Wallom
Associate Director – Innovation, Oxford e-Research Centre
Technical Director – UK NES
Former VP Communities – OGF
- Federating Clouds in the UK NES, Oxford e-Research Centre and leading to EGI
SDCD 2012: Supporting Science with Cloud Computing 19th November 2012
2
UK NGS Cloud Activities
• NGS Agile Deployment Environments: EPSRC funded, 2 years
• Staff:
  – David Wallom (OeRC, Oxford);
  – David Fergusson (NeSC, Edinburgh);
  – Steve Thorn (NeSC, Edinburgh);
  – Matteo Turilli (OeRC, Oxford).
• Goals:
  – EC2 compatible, open source solution;
  – development of a dedicated pool of images, supporting both end user and NGS requirements such as training;
  – collecting data about feasibility, costs, stability;
  – identifying use cases and gathering further requirements.
3
Cloud Infrastructure for Research
Centralisation vs Federation
• Centralisation: one large, dedicated datacentre that serves the national HEI demand
• Federation: heterogeneous set of local infrastructures coordinated nationally in order to satisfy the HEI demand
Evaluation criteria
• Funding
• Scalability
• Flexibility
• Maintenance
• Support
• Accountability
• Obsolescence
• Competitiveness
• Security
4
Eucalyptus Vs Nimbus, OpenNebula, OpenStack
Eucalyptus Pros
• Very good implementation of EC2 and EBS APIs;
• Enterprise support offered by Canonical through UEC;
• Dedicated installation in UEC;
• Modular design;
• Xen and KVM compatible;
• Open source and commercial.
Eucalyptus Cons
• Design limitations;
• AAA.
The others
• Limited EC2 API implementation;
• No native support for EBS;
• Globus WS4 (Nimbus);
• Early development stage;
• Slow development.
• To keep an eye on
• OpenNebula 2.2 (to be tested);
• OpenStack Compute and OpenStack Object Storage.
5
NGS Cloud Prototypes
Oxford III
• 6 x 2 AMD 2 core; 8GB RAM.
• 1 x 4 AMD 2 core; 32GB RAM.
• CentOS 5.4;
• Eucalyptus 1.6.2 installed from rpm repositories;
• Ganglia and Nagios monitoring systems;
• 5 default VM templates = 44/44/22/22/11 VMs (editable);
• 2TB ECB, 80GB Walrus.
6
NGS Cloud Prototypes
Oxford IV
• 3 x 4 Xeon 6 core; 48GB RAM.
• 2 x 1 Xeon 2 core; 32GB RAM.
• Ubuntu 10.10;
• Ubuntu Enterprise Cloud;
• 2+2 bonded public NICs on CC;
• 12TB ECB, 12TB Walrus on SED disks;
• TPM on every motherboard.
7
NGS Cloud Prototypes
Edinburgh II
• 32 x Sun Fire X4100
• Dual-core, 2.8 GHz Opteron, 8 GB RAM, 70 GB RAID1
• 64 cores
• 1 Headnode (Cloud and Cluster controllers)
• 31 Nodes (Node controller)
• Max 2 VMs per core: 124 slots (2GB RAM)
• VLANs for VM isolation
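The 124-slot figure follows from simple arithmetic. The sketch below reproduces it under two assumptions: only the 31 node-controller machines host VMs, and each machine contributes two cores (as the 64-core total for 32 machines implies). The function and variable names are illustrative only.

```python
def vm_slots(worker_nodes: int, cores_per_node: int, vms_per_core: int) -> int:
    """Return the maximum number of VMs that can run at the same time."""
    return worker_nodes * cores_per_node * vms_per_core

# Edinburgh II figures from the slide: 32 machines, one of them the head node.
total_machines = 32
head_nodes = 1
cores_per_node = 64 // total_machines  # 64 cores over 32 machines

slots = vm_slots(total_machines - head_nodes, cores_per_node, vms_per_core=2)
print(slots)  # 124
```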
8
Managing and Monitoring
Tools
• Hybridfox + euca-tools: overall cloud usage and status + testing;
• Landscape: Canonical's management solution for UEC, not open source. RightScale was not tried because it is a fairly expensive, hosted service;
• Linux CLI: dedicated scripts to monitor logs and daemons status.
Issues
• Public IP Database corruption (addressed in version 2);
• No user quota on the open source version of Eucalyptus;
• No accounting on the open source version of Eucalyptus;
• Very verbose, non-persistent logs;
• Lack of error feedback in some conditions.
9
User Support
Tools
• Ticketing system: web-based platform (footprints). Addressed around 200 tickets in 1 year;
• Web site: subscription instructions, links to Eucalyptus documentation and to the support e-mail;
• Mailing list: used mainly to announce new services, scheduled or unscheduled downtime, planned upgrades.
Issues
• Access through institutional firewall via proxy;
• Available resources (limitation of Eucalyptus design);
• Instructions on how to build a dedicated image;
• Almost no issues about research and cloud computing.
• Difficult to manage user access with separate cloud systems…
10
NGS Cloud Usage 2010/2011
• 106 registered users: uptake has been very fast and users stayed engaged throughout the whole testing period;
• 26 institutions: 23 HEIs (both universities and colleges) and 3 companies;
• 30 projects;
• 10 research areas.
[Chart: research areas. Life sciences, Teaching, Mathematics, Cloud R&D, Physics, Ecology, Geography, Medicine, Social Science, Engineering]
11
Exemplar Case Studies
• Evolutionary Genomics: “analysis and Information management of Next Generation Sequencing (NGS) of Genomic data poses many challenges in terms of time and size. We are exploring the translation of high quality NGS scientific analysis pipelines to make best use of Cloud infrastructure”;
• Geospatial Science: “geospatial data is a mix of raster and vector data. As rasterizing is CPU-hungry process, and all maps displayed on the screen of the final user are rasters, it is more efficient to do the process on the server side. I am investigating how this process can be dispersed across many, if not unlimited instances in a cloud”;
• Agent-based modelling of crime: “at the moment I have a tomcat server that hosts some web services used to run social simulation model, it needs access to the file system to run fortran scripts, create files etc. There are loads of problems with running our own server at uni and I think a virtual machine that I could have control over would be much better”.
12
Flexible Services for the Support of Research (FleSSR)
6 Partners
• Academic and industrial;
• 3 cloud infrastructures.
Goals
Building federated cloud infrastructure, extending the use of UK NGS central services with cloud brokering and accounting.
Use cases
• Multi Platform Software Development;
• On demand Research data storage.
13
FleSSR Architecture
[Diagram labels: Oxford; Reading; Eduserv; Zeel/i Broker; STFC/NES Accounting Database]
14
FleSSR Infrastructure
• Local/Global: services depend on either local or global access. Cloud brokering is not mandatory for AWS-like service access;
• Multiple identities: every user may have multiple identities, both local and global;
• Only personal identities: group identities are not implemented. The management of every single identity is left to the legally responsible user;
• Multiple AA technologies: AA may differ depending on local and global policies/technologies;
• Multiple accounting: usage is accounted separately for every single identity, so an individual may get multiple invoices.
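One way to picture the identity and accounting rules above is the minimal sketch below. It assumes a hypothetical record layout (person, identity, scope, CPU hours); none of these field names or example identities come from FleSSR itself. Usage is totalled per identity, so a person holding both a local and a global identity ends up with two invoices.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class UsageRecord:
    person: str       # legally responsible user (hypothetical field)
    identity: str     # one of that person's identities
    scope: str        # "local" or "global"
    cpu_hours: float

def invoices_per_identity(records):
    """Total the usage per (person, identity): one invoice per identity."""
    totals = defaultdict(float)
    for record in records:
        totals[(record.person, record.identity)] += record.cpu_hours
    return dict(totals)

# Invented example data: one person with a local and a global identity.
records = [
    UsageRecord("alice", "alice@local-cloud", "local", 10.0),
    UsageRecord("alice", "alice@broker", "global", 4.5),
    UsageRecord("alice", "alice@local-cloud", "local", 2.0),
]

invoices = invoices_per_identity(records)
for (person, identity), hours in sorted(invoices.items()):
    print(f"{person} / {identity}: {hours} CPU hours")
```

Group identities are deliberately absent from the sketch, matching the "only personal identities" rule.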
15
FleSSR Use Case: Multi Platform Software Development
[Diagram labels: Zeel/i Broker; Instance configuration manager; Build manager; CVS / SVN repository; FleSSR cloud; Build instances 1 to 5]
16
FleSSR Use Case: On demand Research data storage
[Diagram labels: Zeel/i Broker; Volume Manager; FleSSR cloud; VM EBS Interface; EBS Volume]
17
FleSSR Output
Code
• Instance configuration and build manager: Perl command line utility + Java client utilising the Zeel/i API;
• Personal EBS volume manager: web-based Java client for EBS volume handling + tailored VM image with multiple data interfaces (SFTP, WebDAV, GlusterFS, rsync, ssh);
• Eucalyptus open-source accounting system: Perl aggregators and parsers for standard Eucalyptus open-source log files + MySQL accounting database + PHP accounting client.
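The accounting pipeline in the last bullet (parse logs, aggregate, store in a database) can be pictured with the sketch below. The FleSSR tools were written in Perl with MySQL and PHP. This sketch swaps in Python and the standard-library `sqlite3` module purely for illustration. The log line format, table name and field names are all invented; real Eucalyptus log lines look different.

```python
import re
import sqlite3

# Hypothetical log format, invented for this sketch.
LINE = re.compile(
    r"^(?P<ts>\d+) user=(?P<user>\S+) instance=(?P<inst>\S+) state=(?P<state>\S+)$"
)

def parse(lines):
    """Yield (timestamp, user, instance, state) for every line that matches."""
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            yield int(match["ts"]), match["user"], match["inst"], match["state"]

def aggregate(lines):
    """Store one row per completed run and return running seconds per user."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE vm_usage (user TEXT, instance TEXT, seconds INTEGER)")
    started = {}
    for ts, user, inst, state in parse(lines):
        if state == "running":
            started[inst] = (user, ts)
        elif state == "terminated" and inst in started:
            owner, start_ts = started.pop(inst)
            db.execute("INSERT INTO vm_usage VALUES (?, ?, ?)", (owner, inst, ts - start_ts))
    rows = db.execute(
        "SELECT user, SUM(seconds) FROM vm_usage GROUP BY user ORDER BY user"
    )
    return dict(rows.fetchall())

sample = [
    "1000 user=alice instance=i-01 state=running",
    "a noisy line that the parser skips",
    "1600 user=alice instance=i-01 state=terminated",
    "2000 user=bob instance=i-02 state=running",
    "2300 user=bob instance=i-02 state=terminated",
]
totals = aggregate(sample)
print(totals)
```

Skipping unmatched lines rather than failing is a deliberate choice here, given how verbose the logs are reported to be.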
Use cases
• SKA community testing of Use case;
• Institutional ICT team testing WebDAV, GridFTP & GlusterFS solution as Use case 2.
18
Aiming to support multiple heterogeneous user communities, the EGI Federated Cloud Task Force
With thanks to Matteo Turilli, EGI FCTF Chair
19
EGI New Challenges and Cloud Computing
Personalised environments for individual research communities in the European Research Area.
[Diagram labels: EGI.eu coordination; core software and support; community platforms (gLite, UNICORE, dCache, ARC, Globus); community services; IaaS from NGI and commercial providers, each with VM management, data, image sharing, monitoring, accounting and notification; EGI-wide message bus]
20
Task Force Members and Technologies
Members
• 63 individuals.
• 23 institutions.
• 13 countries.
Technologies
• 7 OpenNebula.
• 3 StratusLab.
• 3 OpenStack.
• 1 Okeanos.
• 1 WNoDeS.
Stakeholders
• 15 Resource Providers.
• 7 Technology Providers.
• 6 User Communities.
• 3 Liaisons.
[Map labels: BSC, CNRS, LMU, OeRC, Masaryk, TUD, IFAE, Cyfronet, SixSq, CESNET, TCD, SRCE, DANTE, FZJ, GRNET, GWDG, Utrecht, STFC, SARA, KTH, INFN, FCTSG, EGI.eu]
21
Federation Model
[Diagram labels: Hardware; Cloud Management; User Communities; Federated interfaces; Federated services]
• Standards and validation: emerging standards for the interfaces and images – OCCI, CDMI, OVF.
• Resource integration: Cloud Computing to be integrated into the existing production infrastructure.
• Heterogeneous implementation: no mandate on the cloud technology.
• Provider agnosticism: the only condition to federate resources is to expose the chosen interfaces and services.
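The provider-agnostic rule above can be expressed as a small check: a provider qualifies if it exposes the chosen interfaces, whatever cloud stack it runs. The sketch below is illustrative only. The required set, the site names and the extra "EC2" interface are assumptions made for the example, not the task force's actual admission procedure.

```python
from dataclasses import dataclass, field

# Assumed set of chosen federation interfaces (OCCI for VM management,
# CDMI for data); the real requirement may differ.
REQUIRED_INTERFACES = frozenset({"OCCI", "CDMI"})

@dataclass
class Provider:
    name: str
    cloud_stack: str                      # deliberately ignored by the check
    interfaces: set = field(default_factory=set)

def can_federate(provider: Provider, required=REQUIRED_INTERFACES) -> bool:
    """A provider qualifies if it exposes every required interface."""
    return required <= provider.interfaces

# Hypothetical sites, invented for this example.
providers = [
    Provider("site-a", "OpenNebula", {"OCCI", "CDMI"}),
    Provider("site-b", "OpenStack", {"OCCI", "CDMI", "EC2"}),
    Provider("site-c", "OpenNebula", {"EC2"}),
]

eligible = [p.name for p in providers if can_federate(p)]
print(eligible)  # ['site-a', 'site-b']
```

Note that `cloud_stack` never enters the decision, which is the point of the "no mandate on the cloud technology" bullet.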
22
Federation Test bed – Sep 2012
Composed of 4 services, 2 management interfaces, 7 cloud infrastructures operated by 6 Resource Providers. 3 more providers are in the process of being federated.
23
Federation Demo – Sep 2012
Services
• Information: GLUE 2.0, BDII
• Monitoring: Nagios
• Accounting: OGF UR, UR+ & StAR
• Message Bus
• VM metadata: Marketplace
Resource Providers
• Venus-C: CDMI 1.0
• GWDG (ON/OS): OCCI 1.1, CDMI 1.0
• CESNET (ON): OCCI 1.1, CDMI 1.0
• CYFRONET (ON): OCCI 1.1
• KTH (ON): OCCI 1.1, CDMI 1.0
• CESGA (ON): OCCI 1.1
• FZJ (OS): OCCI 1.1
• IN2P3-CC (OS): OCCI 1.1
The diagram also attaches LDAP and MP/UR Clients labels to the resource providers.
ON = OpenNebula. OS = OpenStack. MP = Marketplace. UR = Usage Records.
24
Use Cases
• Structural biology – We-NMR project: Gromacs training environments.
• Musicology – Peachnote project: music score search engine and analysis platform.
• Linguistics – CLARIN project: scalable ‘British National Corpus’ service (BNCWeb).
• Ecology – BioVel project: remote hosting of OpenModeller service.
• Software development – SCI-BUS project: simulated environments for portal testing.
• Space science – ASTRA-GAIA project: data integration with scalable workflows.
25
EGI FCTF Conclusions
Output
• Adoption of standards for VM and data management.
• Interoperability across multiple cloud management platforms.
• Federation model compatible and consistent with current EGI infrastructure.
• Contribution to EGI user communities engagement and support.
• Documentation made available to the community.
Cycle #3, Sep 2012 – Mar 2013: Integration
• Focus on dev tools for management interfaces and clients for the test bed.
• Integration of the test bed services into the EGI infrastructure.
• Cloud brokering evaluation and deployment.
• Focus on use cases coordination and implementation.
• Opening of the test bed to early adopters.
26
Usage so far
• Compute Capacity
  – >900 VM slots
• Data
  – ~16TB
• Marketplace
  – 11 VM templates stored and available
• VM instantiation/Usage
  – >3200 VMs (accounted for in the EGI central accounting facility)
27
Federation Conclusions
• Utilisation of virtual infrastructure is the only scalable method to support a large number of disparate user communities across multiple different application design models;
• Federation is a robust and scalable model of national/European cloud infrastructure for research;
• Federation is only possible through the availability of open standards;
• Successful pilot tests of multiple prototypes of cloud infrastructure allowed quicker development of the final model for EGI;
• Research & Development played a crucial role in customising open-source cloud infrastructure solutions to the specific needs of academic research;
• Cloud is one part of an e-infrastructure ecosystem, not e-infrastructure on its own.
28
Questions?