"Grid middleware is easy to install, configure, secure, debug and manage - across multiple sites"


"One can't believe impossible things"

UK OGSA Evaluation Project

(UCL, Imperial, Newcastle, Edinburgh)

(Full list of project members)

Paul Brebner

University College London

[email protected]

"Grid middleware is easy to install, configure, secure, debug and manage - across multiple sites"

Grid Complexity – The Grid will be BIG

Grid Complexity - growing

Grid Complexity – built on the internet

Grid Complexity – but more complex

Grid Simplicity – Start with something simple

• OGSA

– OGSI

• GT3.2 – exemplar of a Grid SOA

• Initially evaluate installation, configuration, and security

• Then performance and scalability, deployment, architectural choices, etc.


Grid Realism – But realistic test-bed

• Heterogeneous platforms

– Linux, Solaris, Windows

• Cross-organisational

– Four nodes

– Independently administered

– Firewalls and access restrictions

• Security

– UK e-Science CA

Grid Confusion – What is Globus?

• How is Globus intended to be used?

– 1: Science as first-order services: Middleware for building and hosting Grid Applications, by exposing science code as Grid services.

– 2: Middleware as services: As a set of high level Grid services, composed to provide new Grid functionality. Science isn’t a first-order service, but is managed by Grid services.

Grid Confusion – Science services or Grid services

[Diagram, built up over three slides. Labels: Client; option 1, with E=mc² and D=A+2B+C²; option 2, with D=A+2B+C² and E=mc².]

Grid Confusion – How to evaluate

• Do we evaluate GT3 as middleware for hosting Grid services, or as a toolkit for constructing Grid middleware?

• If the first, only need GT3 Core – just the container. If the second, need “All Services” (and more – there’s no scheduler).


Grid Simplicity – Incremental

• Start with Core Package

• Add Security

• Then try “All Services”

• Simple enough – in theory

Grid Steps – single node

[Diagram, built up over four slides. Steps added in turn: Install, Configure, Deploy, Run. Other labels: OS/HW, GT3, Install.]

Grid Steps – Multiple sites

[Diagram, built up over five slides. Labels: GT3 nodes (one, then four); steps added in turn: Interoperate, Secure, Manage.]

Grid Reality – What we found

• Port number management

• Host access

• Remote visibility of installation, container, services

• Installation by System Administrators

• Tomcat or Test container

• Compilation issues on Solaris

• Exponential increase in testing complexity as number of nodes increases.

Grid Reality – What we found

• Port number management

– Port number conflicts (with other services)

– What port is the container running on?
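The port-conflict question can be checked locally before a container is started. The following is a minimal sketch in Python, not something taken from the evaluation itself: `port_is_free` simply tries to bind a TCP port, which fails if another service already holds it. The function names, the loopback default and the candidate port list are assumptions made for illustration.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use": another service owns the port.
            return False
        return True


def first_free_port(candidates, host="127.0.0.1"):
    """Return the first candidate port that can be bound, or None."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None
```

For example, `first_free_port([8080, 8081, 8082])` picks a port for a container without colliding with a service that already occupies 8080. It does not answer the second question (which port an already running container uses); that still has to come from the container's own configuration.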

Grid Reality – What we found

• Host access

– Is the container visible on that port externally?

– From which machines?

– For which users?

– Non-trivial to test/debug if/when something goes wrong
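The first two host-access questions can be probed with a plain TCP connection attempt, run from each client machine in turn. This is a minimal sketch under that assumption; `can_connect` is a hypothetical helper name, and the check says nothing about the third question (which users are allowed), which depends on the security configuration rather than on network reachability.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

A `False` result does not distinguish a firewall from a stopped container or a wrong port number, which is part of why the slide calls this non-trivial to debug.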

Grid Reality – What we found

• Remote visibility of installation, container, services

– What infrastructure is installed?

– What packages and versions?

– How is it configured?

– What state is it in?

Grid Reality – What we found

• Installation by System Administrators

– Division of roles

– Didn’t meet expectations

– Extra effort to support multiple roles

• System Administrators – install, configure and secure

• Globus Administrators – test, maintain

• Globus Developers – develop, deploy, test/use Grid services

Grid Reality – What we found

• Tomcat or Test container

– Differences in deployment, configuration, and management

– With Tomcat, increased potential for centralised management, and sand-boxing of run-time environment

Grid Reality – What we found

• Compilation issues on Solaris

– Took longer than expected

– Only Linux testing and support can be taken for granted

Grid Reality – What we found

• Exponential increase in testing complexity as number of nodes increases

– Testing (and maintaining) interoperability between m client machines, and n servers gets complicated.

– How well will this scale for 100s, 1000s of nodes?
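To make the growth concrete, the sketch below enumerates interoperability test cases for m clients and n servers. Counting pairs alone gives m × n cases; each independent on/off setting that must also be covered doubles the count, which is one way the total can grow very quickly. The site names come from the test-bed described earlier; the option names (`security`, `tomcat`) and the function itself are assumptions chosen for illustration.

```python
from itertools import product


def interop_test_cases(clients, servers, options=()):
    """List one test case per (client, server, combination of option settings)."""
    settings = list(product([False, True], repeat=len(options)))
    return [
        (client, server, dict(zip(options, setting)))
        for client in clients
        for server in servers
        for setting in settings
    ]


sites = ["UCL", "Imperial", "Newcastle", "Edinburgh"]
pairs_only = interop_test_cases(sites, sites)
with_options = interop_test_cases(sites, sites, options=("security", "tomcat"))
print(len(pairs_only), len(with_options))  # prints "16 64"
```

With four sites acting as both clients and servers there are 16 pairs; adding two on/off settings gives 64 cases, and every further setting doubles the total again.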

Grid Reality – Security

• In theory just had to

– obtain (and update) host, client, and CA certificates

– convert

– install

– configure

– generate (and update) proxies.

• However, parts of “All Services” package also needed.

Grid Security - What we found

• Interactions between security for multiple installations

• Essential to test non-secure interoperability first

• Windows client-side security

• Testing and viewing security configuration

• Debugging secure calls

• Client side security is programmatic

• Security management scalability

– Construction and maintenance of user accounts and grid-map file entries.

Grid Security - What we found

• Interactions between security for multiple installations

– For testing may want

• multiple versions, or duplicates (with different configurations) of same versions.

• One container with no security, and another container with security

– May want test/production environments

Grid Security - What we found

• Essential to test non-secure interoperability first

– Trying to test interoperability and security simultaneously wasn’t fun

Grid Security - What we found

• Windows client-side security

– Still haven’t got it working

– Not obvious exactly what parts of Globus are needed for client side code with security (no “client plus security” package).

Grid Security - What we found

• Testing and viewing security configuration

– Need to be able to view/edit and check security configuration for containers and services

– Confusion about hierarchical security settings

• Virtual Organisations, clusters, servers, containers, factories, services, methods, and instances.

– Remotely

– Validate security deployment before run-time

Grid Security - What we found

• Debugging secure calls (or any stateful service)

– Proxy interceptor approach (e.g. TCPMON) won’t work with stateful services

• As grid handle returned to client contains the port number of the instance, not the proxy

– But proxies are an important design pattern for SOAs…

– GT4/WS-RF may be different

• Handle resolvers, WS-Addressing and WS-RenewableReferences
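The handle problem can be illustrated with plain URL handling. In this sketch the handle string, host names and port numbers are all made up, and the two helpers are hypothetical; they only show why a client that follows the returned handle talks to the instance directly and so never passes through the intercepting proxy. Whether rewriting the handle would be sufficient in GT3, especially with security enabled, is not something the slides claim.

```python
from urllib.parse import urlsplit, urlunsplit


def bypasses_proxy(handle, proxy_port):
    """True if calls made through this handle would not go via the proxy port."""
    return urlsplit(handle).port != proxy_port


def via_proxy(handle, proxy_host, proxy_port):
    """Return the handle with its host:port replaced by the proxy's."""
    parts = urlsplit(handle)
    return urlunsplit(parts._replace(netloc=f"{proxy_host}:{proxy_port}"))


# Hypothetical handle returned to the client after creating an instance.
handle = "http://server.example.org:8080/ogsa/services/example/CounterService/instance-1"
print(bypasses_proxy(handle, 8081))        # prints "True"
print(via_proxy(handle, "localhost", 8081))
```

The last line prints `http://localhost:8081/ogsa/services/example/CounterService/instance-1`, the address a client would have to use for the interceptor to see the traffic.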

Grid Security - What we found

• Client side security is programmatic

– Client side code modifications required to call services/methods with required protocols

– Should be declarative

– Sensitive to server side security credentials

Grid Security - What we found

• Security management scalability

– Construction and maintenance of user accounts and grid-map file entries.

– For each server, each user needs an account, and an entry in the container gridmap file (mapping client certificate to account)

– May also need service specific gridmap files

– Not scalable for large numbers of users, servers, services.

• Alternatives?

– Tool support

– Role based authentication

– Shared accounts or certificates
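The bookkeeping behind the scalability concern can be sketched as follows. A gridmap entry pairs a quoted certificate subject with a local account name; the subject names and accounts below are invented, and both functions are hypothetical helpers rather than Globus tooling. The count assumes, as the slide states, one account and one container gridmap entry per user per server, plus one extra entry for each service-specific gridmap file.

```python
def gridmap_lines(users):
    """Format gridmap entries from a mapping of certificate subject -> local account."""
    return [f'"{subject}" {account}' for subject, account in sorted(users.items())]


def items_to_maintain(n_users, n_servers, n_service_gridmaps=0):
    """Count the accounts and gridmap entries needed across all servers."""
    accounts = n_users * n_servers
    entries = n_users * n_servers * (1 + n_service_gridmaps)
    return {"accounts": accounts, "gridmap_entries": entries}


# Made-up subjects and accounts, for illustration only.
users = {
    "/C=UK/O=eScience/OU=Example/CN=alice example": "alice",
    "/C=UK/O=eScience/OU=Example/CN=bob example": "bob",
}
for line in gridmap_lines(users):
    print(line)
print(items_to_maintain(50, 4))
print(items_to_maintain(50, 4, n_service_gridmaps=3))
```

For 50 users on the four-node test-bed this gives 200 accounts and 200 container entries, rising to 800 entries if three services each keep their own gridmap file, all of which must be kept in step by hand unless tool support exists.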

Grid Recommendations

• If Globus is middleware, then need:

– Platform independent, automatic, installation.

– Tool support for configuration and deployment creation, validation, viewing and editing.

– Management console for grid, nodes, globus packages, containers and services.

– Support for remote, location independent, cross-organisational, multiple role scenarios.

Grid Recommendations (continued)

• If Globus is middleware, then need:

– Remote deployment and management of services.

– Remote distributed debugging of grid installations, services, and applications.

– Tool support, and more scalable processes for security.

Grid Alternatives

• Next we plan to evaluate the two architectural choices in more detail

– Science exposed as services, vs science code managed by higher level grid services.

• Explore alternative mechanisms for:

– Load balancing and resource management

– Directory services (service and resource discovery)

– Data movement approaches (e.g. SOAP Attachments vs GridFTP)

Grid Performance

• First approach (initial results)

– Scientific benchmark (SciMark2.0) modified to measure throughput, and invoked as a Stateful Grid Service

– Metric is Calls Per Minute (CPM), where each call is one unit of work.

– No data movement, just computation and memory load.

– JVM: 512MB Heap and -server (of course)

• Good performance and scalability

– Security has minimal overhead

– Problem with client side timeouts as response times increase
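The CPM metric itself is simple to compute: completed calls divided by elapsed time, scaled to a minute. The sketch below shows one way a multi-threaded client could measure it. It is not the project's benchmark harness; `measure_cpm` and its parameters are hypothetical, and `call` stands in for one remote invocation of the benchmark service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def calls_per_minute(completed_calls, elapsed_seconds):
    """Throughput in calls per minute."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completed_calls * 60.0 / elapsed_seconds


def measure_cpm(call, threads, calls_per_thread):
    """Run `call` from several client threads and return the overall CPM."""

    def worker():
        for _ in range(calls_per_thread):
            call()

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(worker) for _ in range(threads)]
        for future in futures:
            future.result()  # re-raise any exception from a worker thread
    elapsed = time.perf_counter() - start
    return calls_per_minute(threads * calls_per_thread, elapsed)
```

Because each client thread spends its time waiting on a remote service, threads are adequate here even in Python; the computation being measured runs in the server's JVM, not in the client.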

Grid Performance

[Chart: ART (s). Y-axis: Time (s), 0 to 200. X-axis: Threads, 0 to 70. Series: UCL (4 cpu Sun), Newcastle (2 cpu Intel), Imperial (2 cpu Intel), Edinburgh (4 hyperthread cpu Intel), All.]

Tomcat. Fastest: 3.6s (Edinburgh). Slowest: 25s (UCL).

Grid Performance

[Chart: Throughput (CPM). Y-axis: CPM, 0 to 80. X-axis: Threads, 0 to 80. Series: UCL (4 cpu Sun), Newcastle (2 cpu Intel), Imperial (2 cpu Intel), Edinburgh (4 hyperthread cpu Intel), All (12 cpus), Theoretical Maximum.]

95% of predicted maximum throughput

Grid Performance

• Tomcat vs Test container

– No difference on 3 out of 4 nodes

– But 67% faster on one node (Newcastle, slowest Intel box)

• Attachments will work with GT3 and Tomcat

– But not with security

– Limit of 1GB (DIME)

– Bug in Axis – doesn’t clean up temporary files.

Grid Performance

• Stateful instances can be problematic

– Intermittent unreliability

• On some runs, 1 exception in 300 calls (reliability of .9967)

– But non-repeatable, SOAP/network related?

• What is the safe response to exceptions? Can’t just retry.

– Possible to kill container (relies on clients being well behaved):

• By invoking same instance/method more than once.

• By consuming container resources

– But instances can be passivated/activated in theory

– Could be used to enable fine-grain (per instance) control over resource usage.
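The reliability figure follows directly from the counts: 1 − 1/300 ≈ 0.9967. The short sketch below reproduces it and, as an added illustration, estimates how often a 300-call run would complete with no exception at all if failures were independent at that rate. The independence assumption is mine and is not claimed in the slides, which describe the failures as non-repeatable.

```python
def reliability(exceptions, calls):
    """Fraction of calls that completed without an exception."""
    return 1.0 - exceptions / calls


def p_clean_run(per_call_reliability, calls):
    """Probability that a run sees no exception, assuming independent failures."""
    return per_call_reliability ** calls


r = reliability(1, 300)
print(round(r, 4))  # prints "0.9967"
print(p_clean_run(r, 300))
```

Under that assumption a little over a third of 300-call runs would be exception-free, which is at least consistent with exceptions appearing on some runs and not others.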

Grid Deployment

• How to install and configure Grid infrastructure and services - scalably and securely?

• Install GT3 infrastructure and security manually

– MMJFS allows executable code to be staged automatically (But not services - could provide a deployment service).

• Install bootstrapping code, and then install and deploy all other code and security automatically.

– Using SmartFrog (HP) in the lab, and then test-bed.

– Configuring GT3 security remotely is an open-issue, as is “trust” with System Administrators.

Grid Dreams - Debugging

• Debugging distributed systems is tricky

– Need better support for cross-cutting non-functional concerns such as deployment and debugging.

– (One) problem with debugging services is not knowing the context of errors (to aid diagnosis or cure) – a service is just an interface.

• Deployment aware debugging:

– Starting from functional work-flows, generate deployment-flows, which are executed prior to, or concurrent with, functional work-flows.

– If failure in functional work-flow, then corresponding deployment-flow is examined to determine likely causes, and parts are re-executed.

Grid Dreams - Debugging

• Backtrack through deployment steps (Like peeling an onion)

– Some steps will need to be reversed

– Track dependencies, and redundant operations.

• This approach may fix an (interesting) sub-class of problems:

• Those which can be fixed by simply redoing (or replicating) (part of) the installation, E.g.

– Intermittent failure of container or services

– Resource starvation or overload

• Security problems that can be fixed with reconfiguration or refresh of certificates/proxies.

– But not:

• network, or all configuration and security/access problems.
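The backtracking idea can be sketched as a loop that redoes ever larger suffixes of the deployment-flow until the functional work-flow succeeds. This is a minimal interpretation of the slides, not the project's design: the `Step` type, the `backtrack_and_retry` function and the convention that `workflow()` returns `True` on success are all assumptions, and the sketch does no dependency tracking or removal of redundant operations.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    apply: Callable[[], None]
    undo: Optional[Callable[[], None]] = None  # None: nothing to reverse


def backtrack_and_retry(steps: List[Step], workflow: Callable[[], bool]):
    """Peel back the deployment-flow one layer at a time.

    At depth k the last k steps are reversed (where reversible) and
    re-executed, then the functional work-flow is tried again.
    Returns the names of the steps redone in the successful attempt,
    or None if redoing the whole flow did not help.
    """
    for depth in range(1, len(steps) + 1):
        suffix = steps[-depth:]
        for step in reversed(suffix):
            if step.undo is not None:
                step.undo()
        for step in suffix:
            step.apply()
        if workflow():
            return [step.name for step in suffix]
    return None
```

A container that has died intermittently would be fixed at depth 1 by redoing only the final "start" step, whereas a stale configuration would need depth 2. A network fault would exhaust every depth and return `None`, matching the "but not" case above.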

UK OGSA Evaluation Project

• Thank you – Questions/Comments?

• Email: [email protected]

– After November: [email protected]

• Not (quite) the End…

Postscript – The Secret Life of Grid?

Our experiences evaluating Grid technology remind me of an Australian book (“The Secret Life of Wombats”) about a schoolboy who used to sneak out of his dormitory after everyone was asleep to go “wombatting”. He spent his nights secretly crawling down wombat burrows with a flashlight – a potentially lethal activity (not just from cave-ins, as wombats are ferocious when cornered!) – and wrote copious notes resulting in a substantial increase in knowledge of these “mysterious and often misunderstood creatures”.

UK OGSA Evaluation Project Report 1.0

Evaluation of Globus Toolkit 3.2 (GT3.2) Installation

http://sse.cs.ucl.ac.uk/UK-OGSA/Report1.doc