Distributed Services Part One Lecture on Prof. Walter Kriha Computer Science and Media Faculty HdM...

Distributed ServicesPart One

Lecture on

Prof. Walter Kriha

Computer Science and Media Faculty

HdM Stuttgart

Overview

• Distributed Services at Amazon and Google

• Distributed Services (Naming, Finding, Live Cycle, Relationships, Notification, Concurrency)

• Functional and Non-Functional Aspects of DS

• Services, Lose Coupling and Aspects

• Remote Locking Strategies

What is a Distributed Service?

A distributed middleware that provides a certain function with:

- high scalability

- high availability through redundant parts

- high throughput und performance

To Distributed Applications

Distributed Services at Amazon

“The big architectural change that Amazon went through in the past five years was to move from a two-tier monolith to a fully-distributed, decentralized, services platform serving many different applications. “

Werner Vogels, Amazon CTO,

At the core of the Amazon strategy are the Web Services. The Amazon team takes the concepts of search, storage, lookup and management the data and turns them into pay-per-fetch and pay-per-space web services. This is a brilliant strategy and Amazon is certainly a visionary company. But what impresses me the most as an engineer is their ability to take very complex problems, solve them and then shrink wrap their solutions with a simple and elegant API. Alex Iskold, SOAWorld Mag.

Amazon Service Arc.

A.Iskold, SOAWorld

Amazon Service Arc.

http://www.infoq.com/news/2009/02/Cloud-Architectures

Amazon Client Facing Services

• Storage Services (S3, huge storage capacity)

• Computation Services (EC2, Virtual Machines)

• Queuing Services

• Load-balancing services

• Elastic map reduce

• Cloudfront (memcache like fast cache)

• And so on....

Services: Best Practice• Keep services independent (see Playfish: a game is a

service)

• Measure services

• Define SLAs for services

• Allow agile development of services

• Allow hundreds of services and aggregate them on special servers

• Stay away from middleware and frameworks which force their patterns on you

• Keep teams small and organized around services

• Manage dependencies carefully

• Create APIs to offer your services to customers

more: www.highscalability.com, Inverview with W.Vogels

http://www.highscalability.com/

Distributed Services at Google

- Scheduling Service

- Map/Reduce Execution Environment

- F1 NewSQL DB

- Spanner Replicated DB

- BigTable NoSQL Storage

- Distributed File System

- Chubby Lock Service

Example: GFS

Meta-data server

Meta-data server

Processor blade

Processor blade

Processor blade

clientclientclientclient

client

What are the non-functionals for a distributed file system?

Core Distributed Services

- Finding Things (Name Service, Registry, Search)- Storing Things (all sorts of DBs, data grids, block storage etc.)- Events Handling and asynchronous processing (Queues)- Locking Things and preventing concurrent access (Lock service)- Request scheduling and control (request (de)-multiplexing- Time handling- Providing atomic transactions- Replicating Things- Object handling (Lifecycle services for creation, destruction, relationship service)Etc.

Frequently Used Implementation Pattern

Consensus Protocol withFailure Detection

Master

Replicating nodes

N+1 or N+2 redundant cluster offering a specific service in a highly available way. Alternatives are totally distributed services with no master.

Client

Distributed Services Hierarchy

Failure Models

Time and Causality Handling, reliable Communication

Consensus/Leadership Algorithms

Lock Service, Time Service

Distributed File Services

Distributed Table Storage Services

Scheduler Security

Global Replication Services

Applications (Search etc.)

Cycles?

Query

Finding distributed objects : name space design

company

channels

testproduction

topics queues

Factoryfinder

development

factory

channels

topics queues

Factoryfinder

Factory:/Development/factory

channels

topics queues

Factoryfinder

factory

The organization of a company name space is a design and architecture issue. System Management will typically enforce rules and policies and control changes. Different machines will host parts of the name space (zones)

Hard link alias

Soft link alias

Zone=name server

(Non)-functional requirements

The different views of a system are also called „aspects“ today. Aspect-oriented programming (AOP) tries to keep them conceptually separate but „weaves“ them together at or before runtime execution.

Definition: the functions of a system that serve e.g. the business DIRECTLY are called “functional requirements”

“Non-Functional Requirements”: the functions of a system that are necessary to achieve speed, reliability, availability, security etc.

Many programs fulfill the functional requirements but die on the non-functional requirements.

functional requirements of a naming service

- re-bind/resolve methods to store name/value pairs.

- Provide query interface

-Support for name aliases to allow several logical hierarchies (name space forms a DAG)

-Support for composite names (“path names”)

-Support location – independence of resources by separating address and location of objects.

A naming service offers access to many objects based on names. This requires a sound security system to be in place. Is it a clever system design to offer lots of services/objects only to require a grand access control scheme?

Non-functional requirements of a name service-Persistent name/value combinations (don’t lose mapping in a crash)

-Transacted: Manipulation of a naming service often involves several entries (e.g. during application installation) This needs to happen in an all-or-nothing way.

-Federated Naming Services: transparently combine different naming zones into one logical name service.

-Fault-Tolerance: Use replication to guarantee availability on different levels of the name space. Avoid inconsistencies caused by caching and replication.

-Speed: Guarantee fast look-up through clustering etc. Writes can be slower. Client side caching support. Reduce communication costs.

A naming service is a core component of every distributed system. It is easily a Single-point-of-failure able to stop the whole system.

A fault-tolerant JNDI name service I

From: Wang Yu, uncover the hood of J2EE Clustering, http://www.theserverside.com/tt/articles/article.tss?l=J2EEClustering

A fault-tolerant JNDI name service II

From: Wang Yu, uncover the hood of J2EE Clustering, http://www.theserverside.com/tt/articles/article.tss?l=J2EEClustering

Name Space Distribution (Federation)

An example partitioning of the DNS name space, including Internet-accessible files, into three layers.From: van Steen/Tanenbaum, Chapter on Naming. Federation is when systems communicate ABOUT their data instead of just replicating/copying everything.

Speed!

Availability!

Iterative lookup,

Caching policy A

Recursive lookup

Caching policy B

Company level

Examples of Naming Services

• Domain Name System (DNS)• X.500 Directory• Lightweight Directory Access Protocol (LDAP)• CORBA Naming Service• Java Registry• J2EE JNDI (mapped to CORBA Naming Service)

Finding distributed objects : no service case

buyer

Seller A:Beans, green

Seller BBeans, yellow

Seller CApples, green

Find, query offers and make deal

Without intermediate helpers a client needs to FIND the proper server, QUERY the offer and DEAL with a seller. This is tedious!

Finding distributed objects : directory service

buyer


Seller BBohnen, gelb


query offers and find dealer

A directory allows a buyer to find products and dealers quickly. Still, the buyer needs to understand the offer(s) – possibly all different with respect to language and quality-of-service – and deal with a selected dealer

Directory:Beans,green, Seller ABeans, yellow, Seller BApples,green, Seller C

buyerSeller B

Bohnen, gelb

Make deal with dealer

Register products

Finding distributed objects : trading service

buyer


Seller BBohnen, gelb


Send query to trader

A trader takes a service specification from an “importer” and matches it against service offers that “exporters” have registered. The trader offers a service definition language to formulate queries and offers. An offer can be changeable or not. Trader can also translate between different languages and offer a constraint definition language. But a trader will NOT perform the deal itself (like a broker does). Federation problems: duplicated offers and looping requests.

Trader: match query Against offers. Return Matching offer.Offer AOffer BOffer C

Bohnen,gelb

Register, withdraw or change offers

More TradersFederated traders

Finding distributed objects: query service

client query Query processor

Query definition language (e.g. SQL, OQL, XQL)

result

1 Kriha, 10.000 2 Maier, 15000

Find all accounts with name “maier” and balance <200)

A characteristic of a query is that we expect to get ALL the information back that matches our constraints. And we can use expressions that will be evaluated by the storage component. Compare this with “search” on the next slide.

database

Problems of object query services

client queryQuery

processor

Object query language

1 Kriha, 10.000 2 Maier, 15000

Find all person objects with person.name()==“maier” and account.balance()>200)

-will the query processor INSTANTIATE all objects to perform the expressions person.name() and account.balance? This is VERY SLOW. Fix: “query push-down” (translate OQL into SQL and let the DB do its job). This requires the query processor to know which methods are side-effect free!

-Can the results be also in DATA record form or must it be objects? Again, performance is better with data. But then: How is security preserved in the data case? (assuming an J2EE method level security in place?). Be sure to define your security on a meta-level and derive both data and object level security from it.

-Can lazy evaluation used with result object creation (e.g. only when really used by the client)

-How long are results VALID (cursor stability) after transactions?

Result objects

Discovery Services and Semantics

The dream of discovery is that clients and providers are “loosely coupled” – clients can evaluate services and decide which ones they will use. This core paradigm of WebServices and Service-Oriented Architectures (SOA) requires much more than interface specifications. It requires semantic descriptions and ontologies so that clients and services can really understand each other.

Do we need translator services which map different vocabularies?

See: Lecture on WebServices and SOA later...

Functional requirements of a lifecycle service

• Help in the creation of objects

• Avoid that clients know object creation details

• Help copying objects (across machines? Different Hardware?)

• Help migrate objects without breaking clients

• Help delete objects if clients do not use them anymore.

Creating distributed objects : lifecycle service

clientNamingService

client

client

AccountFinder

AccountFactory

client Account

Resolve(/test/finders/AccountFinder)

getAccountFactory()

createAccount(4711, Kriha, 0)

Credit(100.000)

Using interfaces (naming service, factoryfinder, factory, account) allows system administrators to hide different versions or implementations from clients.

In a remote environment finders also support migration and copy of objects.

Factory finders and factories

company

channels

testproduction

topics queues

Factoryfinder

development

factory

MyAccountFactory

MyAcctFinder

Factoryfinder

factory

channels

topics queues

Factoryfinder

factory

Proper configuration of finders and factories allows the creation of individual object nets.

Protect distributed objects : relationships

“Person” and “Preferences”. If person is deleted, the preferences should go too.

Referential integrity rules

person

preferences

“has” relationship

Within the database referential integrity rules protect e.g. containment relationships. We’ve got nothing like this in object space.

Functional Requirements for a Relationship Service

• Allow definition of relations between objects without modifying those objects

• Allow different types of relations

• Allow graphs of relations

• Allow the traversal of relationship graphs

• Support reference and containment relations

More on relationships: W.Emmerich, Engineering Distributed Objects

Relationships: employee leaves company

employee

Mail system: MailsHost:

Passwords

Host:Authorizations

(Internet Access etc.)

Host:Authentications

File Server:Disk Space Host:

Passwords

Appliacations:passwords

House Security:Door access rights

Human Resource DB

Can you make sure that once an employee leaves, ALL her rights are cancelled, her disc-space archived and erased, the databases for authentication etc. are updated, application specific DBs as well? And last but not least that the badge does no longer work? That all equipment has been returned? This are RELATIONS between an employee and resources which need to be expressed in a machine readable form.

Relationship creation

Object AReference to B

Object B Programmed relationship

Node

Node

Relationtype

Role A

Role B

Dynamic relationship

Object A

Object B

The objects A and B are not aware of any relations defined on them.

Relationships: the good and the bad

• Powerful modeling tool for meta-information (e.g. topic-maps)

• Helps with creation, migration, copy and deletion of composite objects and maintains referential integrity

The good:

• Tends to create many and small server objects

• Performance killer (many CORBA vendors did not implement this service for a long time). EJB: supported with local objects only (in same container)

The bad:

Local Notification: Observer Pattern

Observed:

Public Register(Observer, Event)

Private Notify() {

For each observer in List, call observer.update(Event)

}

Observer A:

Update(Event) {}

Observer B:

Update(Event) {}

Register

Receive updates

Even in the local case these observer implementations have problems: Updates are sent on one thread. If an observer does not return, the whole mechanism stops. If an observer calls back to the observed DURING an update call, a deadlock is easily created. The solution does not scale and is not reliable (if the observer crashes, all registrations are lost). And of course, it does not work remotely because the observer addresses are local.

Distributed Notifications and Events

Various combinations of push and pull models are possible. Receivers can install filters using a constraint language to filter content. This reduces unwanted notifications.

SenderOn machine

A

ReceiverOn machine

B

Topic B

Topic A

Topic C

Notification or event channel

push

pull

pull

push

filters

Machine X

Protect distributed objects : concurrency service

Class counter {

Int count = 0;

Increment() {

Int temp = count;

Temp++;

Count= temp;}

Client A

Client B

Client A calls increment(). Count is 0, temp becomes 0.

Client A’s thread has used it’s slice and is preempted.

Client B calls increment(). Count is 0, temp becomes 0.

Client B adds 1 to temp and writes it back to count. Count is 1.

Now comes Client A again. Also adds 1 to temp and writes it back to count. Count is 1 and NOT 2 now. We’ve lost one update.

The lost update problem!

Protect distributed objects : lost updates

Class counter {

Int count = 0;

Increment() {

Int temp = count;

Temp++;

Count= temp;}

Client A

Client B

Client A calls increment(). Count is 0, temp becomes 0.


Client B calls increment(). Count is 0, temp becomes 0.

Client B adds 1 to temp and writes it back to count. Count is 1.

Now comes Client A again. Also adds 1 to temp and writes it back to count. Count is 1 and NOT 2 now. We’ve lost one update.

The lost update problem! Would it help to use count++ ?

Protect distributed objects : inconsistent analysis

Class counter {

Int count1 = 1;

Int count2 = 0;

Swap() {

Int temp = count1;

Count1=count2;

Count2=temp;}

Int sum() {

Return count1 + count2; }

}

Client A

Client B

Client A calls swap(). After storing count1 in temp it is set to count2 (0).


Client B calls sum(). Count1 is now 0, and count2 is still 0.

Client B comes back from sum() with result 0.

Now comes Client A again. Writes temp back to count2 . Count2 is now 1 but sum() has reported 0 for both. The analysis of sum() is wrong.

The inconsistent analysis problem!

Use of locking against concurrent access

• Binary locks: e.g. synchronize(object). Will block all clients except of one.

• Modal locks (read lock, write lock): Clients who only want to read can get read locks – many concurrent read locks are possible.

Binary locks are very simple to use but performance suffers badly because they cannot distinguish between reads and writes.

Lock compatibility matrix

OK NO

NO NO

Read lock

Read lock

Write lock

Write lock

The concurrency service will not allow concurrent locks other than read locks. A write lock will exclude all other locks.

Lock granularity

databaseLock whole DB

Lock table

Lock row

Besides lock mode the granularity of locks will determine overall throughput. The smaller the better.

Optimistic Locking

data Timestamp=1122

databaseLock row, read it (with timestamp), release lock

Overall throughput is better because locks are held only a very short time. The timestamp compare logic should be a framework mechanism of the client session objects.

Use row data copy

Write to DB. For all data read acquire locks and compare the timestamps. If one is newer, abort store operation

Client session

Two-phase locking

clientresource

resource

resource

resource

Allocate all locks

client

client

Manipulate data

release all locks

A basic requirement for the 2-phase locking protocol is that all locks are allocated first. After the first lock is released NO other locks may be acquired! This will guarantee serializability

Deadlocks

Client AResource A

Resource B

Allocate lock for A

Client A

Deadlocks can be detected (e.g. by a database). To prevent deadlocks, always allocate resource locks IN THE SAME ORDER. Process termination must release all locks held by a process.

Client B

Client B

Allocate lock for B

Try to allocate lock for B: not granted, held by client B

Try to allocate lock for A: not granted, held by client A

Service Context

Some services need so-called context information to flow with a call. Two prominent ones are:

a) Security (needs to “flow” user information, access rights etc.)

b) Transactions (need to flow information about on-going transactions to participants)

This additional information needs to be standardized if different vendor implementations of services should interoperate.

Do you know other “context related” design problems?

Next Session:

• Persistence

• Transactions

• Security

•Gray/Reuter, Transaction Processing (Chapter on Transaction Models)

•Java Data Objects Specification (explains on the first 20 pages the concepts of identity, copy, transactional state and develops a state diagram for objects (stateless vs. statefull, transacted etc.)

•Security FAQ on www.faq.org (?), primer on www.rsa.com

• Silvano Maffeis, ELEKTRA, a high-available CORBA ORB

http://www.rsa.com/

Exercises

• Map the services to the middleware of your project: How do p2p systems find objects, persist objects etc.?

• Why is “mandandentfähigkeit” such a hard problem?

• What are the non-functional requirements of a notification service?

• Why would count++ not help to prevent the lost update problem?

• Why are read locks necessary?

Resources

• Understanding LDAP, www.redbooks.ibm.com (free)

• www.io.de distributed web search paper

• Van Steen/Tanenbaum, Chapter on Naming

• Van Steen/Tanenbaum, Chapter on Security (homework for next session)

• Martin Fowler et al., UML Distilled (small and nice)

Werner Vogels, Amazon Architecure, http://queue.acm.org/detail.cfm?id=1142065

Alex Iskold, SOAWorld on Amazon – the real Web Services Comp. http://soa.sys-con.com/node/262024

http://queue.acm.org/detail.cfm?id=1142065

http://soa.sys-con.com/node/262024

Distributed Services Part One Lecture on Prof. Walter Kriha Computer Science and Media Faculty HdM...

Documents

Transcript of Distributed Services Part One Lecture on Prof. Walter Kriha Computer Science and Media Faculty HdM...