Ditching the Middleware

Ditching the Middleware David Wood 3 Round Stones [email protected] NoSQL Now! 2011

description

Since March 1998, the U.S. Government Printing Office has used persistent uniform resource locators (PURLs) to give libraries and other parties stable URLs to online Federal information. The existing system was designed in 2009 as a three-tier architecture with a presentation tier, a RESTful XML application tier, and a SQL database tier. It was also designed as a centralized system, which makes it difficult for other organizations to contribute to the service or replicate it. Furthermore, the rigidity of the architecture makes improvements expensive. Therefore, a new data-oriented MVC architecture was designed for quick reads and a flexible UI. The view layer has direct access to the data model, making improvements to the UI and the model quick and cheap. Come and see how adopting the architecture of the open source project Callimachus reduced development and maintenance costs while improving overall scalability and throughput.

Transcript of Ditching the Middleware

Page 1: Ditching the Middleware

Ditching the Middleware

David Wood

3 Round Stones

[email protected]

NoSQL Now! 2011

Page 2: Ditching the Middleware

#nosql, like organic food, is defined by what it doesn't contain.

Comparing noSQL solutions is like comparing organic apples and oranges...

Eric van der Vlist (@evlist)

25 July 2011

Page 3: Ditching the Middleware

Agenda

• An architecture for communal, horizontal scaling

• Efficiency improvements over other architectures

• Multi-party participation

Page 4: Ditching the Middleware

Three Tier Architecture

• Separable tiers; easy to replace

• Separation allows for load balancing of tiers

• Secure database independently

[Diagram tiers: Data Access, Business Logic, Presentation Logic]

Page 5: Ditching the Middleware

Data model changes affect all tiers:

• your presentation

• your processing model

• your data schema

Page 6: Ditching the Middleware
Page 7: Ditching the Middleware

Scaling Options

Vertical

Horizontal

Page 8: Ditching the Middleware

Vertical Scaling Limits

• July 2011 (changes constantly):

• Amazon EC2: 68 GB RAM

• Rackspace Cloud: 16 GB RAM

• Custom build: M5000 with 512GB RAM

(CPU and disk harder to quantify/compare)

Page 9: Ditching the Middleware

Horizontal Scaling Limits

Data throughput vs. optimistic reads?

Page 10: Ditching the Middleware

Choices

[Diagram, three tiers: Data Access, Business Logic, Presentation Logic]

[Diagram, MVC: View, Model, Controller]

Page 11: Ditching the Middleware

Choices

View Model

Controller

< > Objects

Page 12: Ditching the Middleware

Callimachus

• Open Source community project to build a Linked Data Management System

• Visit callimachusproject.org

Page 13: Ditching the Middleware

[Diagram labels: HTTP GET request; Web server; Resource; Class; Viewable; Controller; Template Engine; XML template; apply.xsl; SPARQL query; RDF Store; RDF response; HTML; HTTP response]

Page 14: Ditching the Middleware

Data-View-Controller

Model, View, Controller

< >

Page 15: Ditching the Middleware

View Templates

• Written in XHTML+RDFa (declarative pattern);

• Parsed to create SPARQL queries;

• Query results are filled into the same template.
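
The three bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the template, the `dc:` prefix, and the helper names (`template_to_sparql`, `fill_template`, `var_name`) are hypothetical and much simpler than real Callimachus templates, and no RDF store is involved; the result row is supplied by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified template; real Callimachus templates are richer.
TEMPLATE = (
    '<div about="?doc">'
    '<h1 property="dc:title" />'
    '<p property="dc:description" />'
    '</div>'
)

# Assumed prefix declaration for the generated query.
PREFIX = "PREFIX dc: <http://purl.org/dc/terms/>"


def var_name(prop):
    """Map a property such as 'dc:title' to a variable name such as 'title'."""
    return prop.split(":", 1)[1]


def template_to_sparql(template):
    """Read the RDFa-style attributes and emit a single SELECT query."""
    root = ET.fromstring(template)
    subject = root.get("about")
    props = [e.get("property") for e in root.iter() if e.get("property")]
    variables = " ".join("?" + var_name(p) for p in props)
    patterns = " ".join(f"{subject} {p} ?{var_name(p)} ." for p in props)
    return f"{PREFIX} SELECT {variables} WHERE {{ {patterns} }}"


def fill_template(template, row):
    """Write one result row back into the same template."""
    root = ET.fromstring(template)
    for elem in root.iter():
        prop = elem.get("property")
        if prop:
            elem.text = row[var_name(prop)]
    return ET.tostring(root, encoding="unicode")


query = template_to_sparql(TEMPLATE)
# A hand-written row standing in for a query result from an RDF store.
html = fill_template(TEMPLATE, {"title": "Ditching the Middleware",
                                "description": "NoSQL Now! 2011"})
```

The point of the sketch is that the same document serves as both the query definition and the output layout, so a change to the template changes both at once.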

Page 16: Ditching the Middleware
Page 17: Ditching the Middleware

RDFa queries

Results fill into template

Page 18: Ditching the Middleware

Some (Single System) Advantages

• Minimal abstraction between interface and data (speed, fewer points of failure)

• Minimized queries per page: O(1), not O(n)

• Maximize caching opportunities due to small data elements

• Native optimistic reads
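
One way to read the "O(1), not O(n)" bullet is sketched below: a page built from one pattern query needs a single round trip, while a page built by looking up each item separately needs one round trip per item. The `CountingStore` class and its data are hypothetical stand-ins for an RDF store, not part of Callimachus.

```python
class CountingStore:
    """Toy in-memory triple store that counts how many queries it serves."""

    def __init__(self, triples):
        self.triples = triples
        self.queries = 0

    def match(self, subject=None, predicate=None):
        """Return triples matching the given subject and/or predicate."""
        self.queries += 1
        return [t for t in self.triples
                if (subject is None or t[0] == subject)
                and (predicate is None or t[1] == predicate)]


# Hypothetical data: five PURLs, each with one target.
TRIPLES = [(f"purl:{i}", "target", f"http://example.org/{i}") for i in range(5)]


def render_per_item(store, subjects):
    """One query per listed item: n round trips for n items."""
    return [store.match(s, "target")[0][2] for s in subjects]


def render_single_query(store):
    """One pattern query for the whole page: a single round trip."""
    return [t[2] for t in store.match(predicate="target")]


per_item_store = CountingStore(TRIPLES)
single_store = CountingStore(TRIPLES)
per_item_page = render_per_item(per_item_store, [t[0] for t in TRIPLES])
single_page = render_single_query(single_store)
```

Both functions produce the same page content; only the number of store round trips differs, which is what `queries` records.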

Page 19: Ditching the Middleware

Some (Multi System) Advantages

• Each node has its own store (reduced contention)

• Mirroring of data segmentations for write control

• Theoretical ability to combine data from multiple writes; eventual consistency (RDF as fifth normal form with sufficient provenance information; currently unimplemented)

Page 20: Ditching the Middleware

Case Study: Persistent URLs

Page 21: Ditching the Middleware

Persistent URLs

• Redirection service via HTTP 3xx redirects

• purl.org operated by OCLC since 1995

• Supposed to be “persistent”

[Diagram: the client sends an HTTP GET to the PURL Service and receives 302 (Found); it then sends an HTTP GET to the Web Service and receives 200 (OK)]
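
The redirect exchange on this slide can be sketched without any networking. The paths, URLs, and function names below are hypothetical; the sketch only models the two hops named on the slide, a 302 (Found) from the PURL service followed by a 200 (OK) from the web service.

```python
# Hypothetical PURL table: persistent path -> current location.
PURLS = {"/docs/report": "http://example.org/2011/report.html"}

# Hypothetical content held by the target web service.
PAGES = {"http://example.org/2011/report.html": "<html>report</html>"}


def purl_service(path):
    """Answer a GET with a redirect to the current location, if known."""
    target = PURLS.get(path)
    if target is None:
        return 404, {}, ""
    return 302, {"Location": target}, ""


def web_service(url):
    """Answer a GET with the document itself."""
    body = PAGES.get(url)
    if body is None:
        return 404, {}, ""
    return 200, {}, body


def client_get(path):
    """Ask the PURL service first, then follow a 302 to the web service."""
    status, headers, body = purl_service(path)
    if status == 302:
        status, headers, body = web_service(headers["Location"])
    return status, body
```

When the document moves, only the entry in `PURLS` changes; the path that clients hold stays the same, which is the sense in which the URL is persistent.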

Page 22: Ditching the Middleware

Persistent URLs at US GPO

Data Master

Data Slave

Dynamic DNS

Normal Ops

Failover / Spillover

Page 23: Ditching the Middleware

PURL Federation

Dynamic DNS

Normal Ops for A.org

e.g. Failover / Spillover for A.org

Write-Segmented data source

PURL Services by different parties

A.org

B.com

C.net

Page 24: Ditching the Middleware

PURL Federation

Dynamic DNS

Normal Ops for C.net

to any host

PURL Services by different parties

A.org

B.com

C.net
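
The federation diagrams can be read as follows: each party writes only to its own segment, every node holds a copy of all segments, and dynamic DNS sends requests for a domain to its own node when that node is up and to another party's node otherwise. The sketch below models that reading. The `Node` class, the `write`, `route`, and `resolve` helpers, and the immediate mirroring to all peers are assumptions made for illustration, not the PURL Federation's actual design.

```python
class Node:
    """One party's PURL service in the federation."""

    def __init__(self, domain):
        self.domain = domain   # the only segment this party may write
        self.store = {}        # copy of every party's PURLs
        self.up = True


def write(nodes, via, domain, path, target):
    """Accept a write only on the node that owns the domain, then mirror it."""
    if via.domain != domain:
        raise PermissionError(f"{via.domain} may not write to {domain}")
    for node in nodes:  # naive, immediate mirroring (an assumption)
        node.store[(domain, path)] = target


def route(nodes, domain):
    """Stand-in for dynamic DNS: prefer the owner, else any live node."""
    live = [n for n in nodes if n.up]
    if not live:
        raise RuntimeError("no PURL service is available")
    for node in live:
        if node.domain == domain:
            return node
    return live[0]


def resolve(nodes, domain, path):
    """Look up a PURL on whichever node the router selects."""
    return route(nodes, domain).store.get((domain, path))


a, b, c = Node("A.org"), Node("B.com"), Node("C.net")
nodes = [a, b, c]
write(nodes, a, "A.org", "/x", "http://example.org/x")
```

Marking `a.up = False` makes `route(nodes, "A.org")` fall through to another party's node, which still resolves `/x` from its mirrored copy.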

Page 25: Ditching the Middleware

Callimachus: callimachusproject.org

Persistent URLs: purlz.org

Page 26: Ditching the Middleware

Thanks to:

• http://www.flickr.com/photos/vialetter/254383430/

• Callimachus and PURL contributors and sponsors, especially James Leigh

Page 27: Ditching the Middleware

This work is Copyright © 2011 3 Round Stones Inc. It is licensed under the Creative Commons Attribution 3.0 Unported License. Full details at: http://creativecommons.org/licenses/by/3.0/

You are free:

to Share — to copy, distribute and transmit the work

to Remix — to adapt the work

Under the following conditions:

Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).

• For any reuse or distribution, you must make clear to others the license terms of this work.

• Any of the above conditions can be waived if you get permission from the copyright holder.

• Nothing in this license impairs or restricts the author's moral rights.

• Some content in the work may be licensed under different terms; this is noted separately.