Developing node-mdb: a Node.js - based clone of SimpleDB

Developing node-mdb SimpleDB emulation using Node.js and GT.M Rob Tweed M/Gateway Developments Ltd http://www.mgateway.com Twitter: @rtweed


Talk given at the London Ajax Users Group, June 14 2011

Transcript of Developing node-mdb: a Node.js - based clone of SimpleDB

Page 1: Developing node-mdb: a Node.js - based clone of SimpleDB

Developing node-mdb

SimpleDB emulation using Node.js and GT.M

Rob Tweed
M/Gateway Developments Ltd

http://www.mgateway.com
Twitter: @rtweed

Page 2: Developing node-mdb: a Node.js - based clone of SimpleDB

Could you translate that title?

• SimpleDB:
  – Amazon’s NoSQL cloud database

• Node.js:
  – Evented server-side JavaScript (using V8)

• GT.M:
  – Open source, global-storage-based NoSQL database

• node-mdb:
  – Open source emulation of SimpleDB

Page 3: Developing node-mdb: a Node.js - based clone of SimpleDB

SimpleDB

• Amazon’s cloud database
  – Pay as you go

• Secure HTTP interface
• Schema-free NoSQL database
• Spreadsheet-like database model
  – Domains (= tables)
    • Items (= rows)
      – Attributes (= cells)
        » Values (1+ per attribute allowed)

• SQL-like query API
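As a rough illustration of the spreadsheet-like model above, the sketch below pictures one domain as nested JavaScript objects. The domain, item and attribute names and their values are made up for the example, and getValues() is a hypothetical helper, not part of SimpleDB or node-mdb.

```javascript
// Hypothetical picture of the data model: domain -> item -> attribute -> values.
var domains = {
  books: {                                   // domain (= table)
    item1: {                                 // item (= row), keyed by item name
      title: ['Example Title'],              // attribute (= cell) with one value
      author: ['A. Author', 'B. Author']     // an attribute may hold several values
    },
    item2: {
      title: ['Another Title']               // schema-free: no author attribute here
    }
  }
};

// Return every value stored for an attribute, or an empty list if it is absent.
function getValues(domainName, itemName, attributeName) {
  var domain = domains[domainName] || {};
  var item = domain[itemName] || {};
  return item[attributeName] || [];
}

var authors = getValues('books', 'item1', 'author');
var missing = getValues('books', 'item2', 'author');
```

The point of the sketch is that items in the same domain need not share attributes, and that a single attribute can carry more than one value.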

Page 4: Developing node-mdb: a Node.js - based clone of SimpleDB

Why emulate SimpleDB?

• Because I could!

• Kind of cool project

Page 5: Developing node-mdb: a Node.js - based clone of SimpleDB

Why emulate SimpleDB?

• To provide a free, locally available database that behaves identically to SimpleDB
  – Lots of off-the-shelf clients available
    • Standalone
      – Bolso
      – Mindscape’s SimpleDB Management Tools
    • Language-specific clients
      – boto (Python)
      – Official AWS clients for Java, .NET
      – Node.js
      – etc.

Page 6: Developing node-mdb: a Node.js - based clone of SimpleDB

Why emulate SimpleDB?

• To perform local tests prior to committing to production on SimpleDB

• To provide a live, local backup database

• A SimpleDB database for private clouds

• To provide an immediately consistent SimpleDB database
  – SimpleDB is “eventually consistent”

Page 7: Developing node-mdb: a Node.js - based clone of SimpleDB

Why the GT.M database?

• I’m familiar with it
• Free Open Source NoSQL database
• Schema-free
• “Globals”:
  – Sparse persistent multi-dimensional arrays
• Hierarchical database
• Completely dynamic storage
  – No pre-declaration or specification needed

• Result: trivial to model SimpleDB in globals

• node-mdb: a good way to demonstrate the capabilities of the otherwise little-known GT.M

• More info: Google
  – “GT.M database”
  – “universalnosql”

Page 8: Developing node-mdb: a Node.js - based clone of SimpleDB

Why write it using Node.js?

• M/DB originally written in late 2008
  – Implemented using GT.M’s native scripting language (M)
  – Apache + m_apache gateway to GT.M for HTTP interface

• I’ve been working with Node.js for about a year now
  – Rewriting M/DB in JavaScript would make it more widely interesting and comprehensible

• Some performance issues reported with M/DB when being pushed hard

Page 9: Developing node-mdb: a Node.js - based clone of SimpleDB

Why Node.js?

• Conclusion:
  – Re-implementing M/DB using Node.js should provide better performance and scalability
  – Fewer moving parts:
    • Apache + m_apache + GT.M / multi-threaded
    • Node.js + GT.M as child processes / single-thread
  – Cool Node.js project to attempt
  – Great example of non-trivial use of Node.js + database

Page 10: Developing node-mdb: a Node.js - based clone of SimpleDB

How does SimpleDB work?

[Diagram: an incoming SDB HTTP request reaches the HTTP server; the request is authenticated (HMAC-SHA) using the security key ID and secret key; the API action is executed against the SimpleDB database (copies 1, 2, … n); an HTTP response is generated; the outgoing SDB HTTP response carries an error, or success and/or data/results]

Page 11: Developing node-mdb: a Node.js - based clone of SimpleDB

Node.js can emulate all this

[Diagram: the SimpleDB request flow: incoming SDB HTTP request, HTTP server, authenticate request (HMAC-SHA, using the security key ID and secret key), execute API action, generate HTTP response, outgoing SDB HTTP response (error, or success and/or data/results), with SimpleDB database copies 1, 2, … n behind it]

Page 12: Developing node-mdb: a Node.js - based clone of SimpleDB

GT.M can emulate this

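[Diagram: the SimpleDB request flow (incoming SDB HTTP request, HTTP server, authenticate request using the security key ID and secret key, execute API action, generate HTTP response, outgoing SDB HTTP response with an error, or success and/or data/results), backed by a single database: SimpleDB database copy 1]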

Page 13: Developing node-mdb: a Node.js - based clone of SimpleDB

Node.js characteristics

• Single threaded process

• Event loop

• Non-blocking I/O
  – Asynchronous calls to functions that handle I/O
  – Event-driven call-back functions when function completes
    • Data fetched
    • Data saved
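A minimal sketch of what non-blocking I/O with a call-back looks like in practice. fetchData() is a hypothetical stand-in for a real I/O call; it uses setImmediate() so that its result is delivered by the event loop after the current code has finished.

```javascript
var log = [];

// Hypothetical stand-in for an I/O call: the result arrives later via the event loop.
function fetchData(key, callback) {
  setImmediate(function () {
    callback(null, { key: key, value: 'some data' });
  });
}

log.push('request issued');

fetchData('example', function (error, results) {
  // Runs only once the "I/O" has completed
  log.push('data fetched: ' + results.value);
});

// This line runs before the call-back above: the single thread is not blocked
log.push('carrying on');
```

The call to fetchData() returns immediately, so 'carrying on' is logged before 'data fetched'.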

Page 14: Developing node-mdb: a Node.js - based clone of SimpleDB

Result: deeply nested call-backs

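[Diagram: the HTTP server, authenticate request (security key ID and secret key), execute API action and generate HTTP response stages, each nested inside the call-back of the one before, ending in an error, or success and/or data/results]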

Page 15: Developing node-mdb: a Node.js - based clone of SimpleDB

Flattening the call-back nesting

[Diagram: the http server, processSDBRequest(), executeAPI() and sendResponse() drawn as separate, flat stages]

http.createServer(function(req, res) {…});

var processSDBRequest = function() {…};

var executeAPI = function() {…};
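The flattening idea can be sketched as follows: each stage is a named top-level function that hands on to the next, instead of an anonymous function nested inside the previous call-back. The lookup() helper and the SDB fields used here are assumptions for illustration only, and lookup() calls back synchronously just to keep the sketch short.

```javascript
var trace = [];

// Hypothetical async-style helper (calls back immediately to keep the sketch short).
function lookup(key, callback) {
  callback(key ? null : new Error('missing'), 'result for ' + key);
}

// Stage 3: send the response (recorded on the SDB object instead of a socket).
var sendResponse = function (SDB, result) {
  trace.push('sendResponse');
  SDB.output = result;
};

// Stage 2: run the requested action, then hand on to sendResponse().
var executeAPI = function (SDB) {
  trace.push('executeAPI');
  lookup(SDB.action, function (error, result) {
    sendResponse(SDB, error ? 'Error' : result);
  });
};

// Stage 1: check the request, then hand on to executeAPI().
var processSDBRequest = function (SDB) {
  trace.push('processSDBRequest');
  lookup(SDB.accessKeyId, function (error) {
    if (error) return sendResponse(SDB, 'Error');
    executeAPI(SDB);
  });
};

var SDB = { accessKeyId: 'rob', action: 'ListDomains' };
processSDBRequest(SDB);
```

Each function contains at most one level of call-back, and the SDB object carries the request state from stage to stage.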

Page 16: Developing node-mdb: a Node.js - based clone of SimpleDB

Node.js HTTP Server

var http = require("http");
var url = require("url");

http.createServer(function(request, response) {
  request.content = '';
  // Accumulate the request body as it arrives
  request.on("data", function(chunk) {
    request.content += chunk;
  });
  request.on("end", function() {
    var SDB = {
      startTime: new Date().getTime(),
      request: request,
      response: response
    };
    var urlObj = url.parse(request.url, true);
    // Name/value pairs come from the body (POST) or the query string (GET)
    if (request.method === 'POST') {
      SDB.nvps = parseContent(request.content);
    }
    else {
      SDB.nvps = urlObj.query;
    }
    var uri = urlObj.pathname;
    if ((uri.indexOf(sdbURLPattern) !== -1) || (uri.indexOf(mdbURLPattern) !== -1)) {
      processSDBRequest(SDB);
    }
    else {
      var uriString = 'http://' + request.headers.host + request.url;
      var error = {code: 'InvalidURI', message: 'The URI ' + uriString + ' is not valid', status: 400};
      returnError(SDB, error);
    }
  });
}).listen(httpPort);
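The slide does not show parseContent(). One plausible reading, assuming the POST body is form-encoded as SimpleDB clients normally send it, is that it decodes the body into name/value pairs. The sketch below does that with Node's built-in querystring module; it is an assumption, not the node-mdb implementation.

```javascript
var querystring = require('querystring');

// Hypothetical stand-in for parseContent(): decode a form-encoded POST body
// into an object of name/value pairs.
var parseContent = function (content) {
  return querystring.parse(content);
};

var body = 'Action=ListDomains&AWSAccessKeyId=rob' +
           '&Timestamp=2011-06-06T22%3A39%3A30%2B00%3A00';
var nvps = parseContent(body);
```

After parsing, nvps has the same shape as the urlObj.query object used for GET requests, so the rest of the code can treat both request types alike.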

Page 17: Developing node-mdb: a Node.js - based clone of SimpleDB

processSDBRequest()

var processSDBRequest = function(SDB) {
  var accessKeyId = SDB.nvps.AWSAccessKeyId;
  // Same error for a missing access key Id and for an unknown one
  var authError = {
    code: 'AuthMissingFailure',
    message: 'AWS was not able to authenticate the request: access credentials are missing',
    status: 403
  };
  if (!accessKeyId) {
    returnError(SDB, authError);
  }
  else {
    // Look up the secret key stored against this access key Id
    MDB.getGlobal('MDBUAF', ['keys', accessKeyId], function (error, results) {
      if (!error) {
        if (results.value !== '') {
          accessKey[accessKeyId] = results.value;
          validateSDBRequest(SDB, results.value);
        }
        else {
          returnError(SDB, authError);
        }
      }
    });
  }
};

Page 18: Developing node-mdb: a Node.js - based clone of SimpleDB

validateSDBRequest()

var validateSDBRequest = function(SDB, secretKey) {
  // crypto.createHmac() expects the digest name 'sha256' for SignatureMethod HmacSHA256
  var type = 'sha256';
  var stringToSign = createStringToSign(SDB, true);
  var hash = digest(stringToSign, secretKey, type);
  if (hash === SDB.nvps.Signature) {
    processSDBAction(SDB);
  }
  else {
    errorResponse('SignatureDoesNotMatch', SDB);
  }
};

Page 19: Developing node-mdb: a Node.js - based clone of SimpleDB

stringToSign()

POST{lf}192.168.1.134:8081{lf}/{lf}AWSAccessKeyId=rob&Action=ListDomains&MaxNumberOfDomains=100&SignatureMethod=HmacSHA1&SignatureVersion=2&Timestamp=2011-06-06T22%3A39%3A30%2B00%3A00&Version=2009-04-15

ie: reconstruct the same string that the SDB client used to sign the request

then use rob’s secret key to sign it:
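A minimal sketch of how such a string could be rebuilt on the server, following the AWS Signature Version 2 rules: HTTP verb, lower-cased host, path, then the name/value pairs sorted by name and URL-encoded, with the Signature parameter itself left out. buildStringToSign() and encode() are hypothetical names; the slide's createStringToSign() is not shown, so this is not necessarily how node-mdb does it.

```javascript
// Percent-encode per RFC 3986: encodeURIComponent() leaves ! ' ( ) * unescaped.
function encode(value) {
  return encodeURIComponent(value).replace(/[!'()*]/g, function (c) {
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
}

// Hypothetical helper: rebuild the Signature Version 2 string to sign.
function buildStringToSign(method, host, path, nvps) {
  var pairs = Object.keys(nvps)
    .filter(function (name) { return name !== 'Signature'; })  // never sign the signature
    .sort()                                                     // byte order of names
    .map(function (name) { return encode(name) + '=' + encode(nvps[name]); });
  return [method, host.toLowerCase(), path, pairs.join('&')].join('\n');
}

var stringToSign = buildStringToSign('POST', '192.168.1.134:8081', '/', {
  Version: '2009-04-15',
  Action: 'ListDomains',
  Timestamp: '2011-06-06T22:39:30+00:00',
  SignatureVersion: '2',
  SignatureMethod: 'HmacSHA1',
  MaxNumberOfDomains: '100',
  AWSAccessKeyId: 'rob',
  Signature: 'supplied-by-client'
});
```

With the parameters from the example above, the sketch yields the same string, with a real line feed wherever the slide shows {lf}.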

Page 20: Developing node-mdb: a Node.js - based clone of SimpleDB

digest()

var crypto = require("crypto");

var digest = function(string, secretKey, type) {
  var hmac = crypto.createHmac(type, secretKey);
  hmac.update(string);
  return hmac.digest('base64');
};
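To show the signing step end to end, the sketch below signs the same string on the "client" and the "server" and compares the results. The secret key and string are made-up values, sign() mirrors the digest() function above, and signaturesMatch() is a hypothetical addition that compares in constant time using crypto.timingSafeEqual() rather than ===.

```javascript
var crypto = require('crypto');

// Same technique as digest(): HMAC the string with the secret key, base64 output.
function sign(stringToSign, secretKey, algorithm) {
  return crypto.createHmac(algorithm, secretKey).update(stringToSign).digest('base64');
}

// Hypothetical helper: constant-time comparison of two signatures.
function signaturesMatch(expected, supplied) {
  var a = Buffer.from(expected);
  var b = Buffer.from(String(supplied));
  // timingSafeEqual() throws on unequal lengths, so check the length first
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

var secretKey = 'example-secret';                       // made-up value
var stringToSign = 'POST\nexample.host\n/\nAction=ListDomains';

var clientSignature = sign(stringToSign, secretKey, 'sha256');
var serverSignature = sign(stringToSign, secretKey, 'sha256');
var forgedSignature = sign(stringToSign, 'wrong-secret', 'sha256');
```

Only a client that holds the same secret key can produce a signature the server will accept, which is why the secret key itself never travels with the request.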

Page 21: Developing node-mdb: a Node.js - based clone of SimpleDB

Ready to execute an API!

[Diagram: the SimpleDB request flow: incoming SDB HTTP request, HTTP server, authenticate request (security key ID and secret key), execute API action against SimpleDB database copies 1, 2, … n, generate HTTP response, outgoing SDB HTTP response (error, or success and/or data/results)]

Page 22: Developing node-mdb: a Node.js - based clone of SimpleDB

SimpleDB APIs (Actions)

• CreateDomain
• ListDomains
• DeleteDomain
• PutAttributes (BatchPutAttributes)
• GetAttributes
• DeleteAttributes (BatchDeleteAttributes)
• Select
• DomainMetadata

Page 23: Developing node-mdb: a Node.js - based clone of SimpleDB

Accessing the GT.M Database

• Accessed via node-mwire
  – TCP-based wire protocol
  – Extension of Redis protocol
  – Adapted redis-node module

• APIs allow you to set/get/delete/edit Globals

Page 24: Developing node-mdb: a Node.js - based clone of SimpleDB

GT.M Globals

• Globals = unit of persistent storage
  – Schema-free
  – Hierarchically structured
  – Sparse
  – Dynamic
  – “persistent associative array”

Page 25: Developing node-mdb: a Node.js - based clone of SimpleDB

GT.M Globals

• A Global has:
  – A name
  – 0, 1 or more subscripts
  – String value

globalName[subscript1,subscript2,..subscriptn]=value
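To make the name, subscripts and value idea concrete, here is a small in-memory model of a global. It is purely illustrative: the Global constructor and its methods are hypothetical, it is neither GT.M nor the node-mwire API, and nothing is persisted. It assumes, as the authentication code earlier suggests, that reading a node that was never set gives an empty string.

```javascript
// Hypothetical in-memory model of a global (not GT.M, not node-mwire).
function Global(name) {
  this.name = name;
  this.nodes = new Map();      // sparse: only nodes that have been set exist
}

// Turn a list of subscripts into a single lookup key.
Global.prototype.key = function (subscripts) {
  return JSON.stringify(subscripts);
};

Global.prototype.set = function (subscripts, value) {
  this.nodes.set(this.key(subscripts), String(value));   // values are strings
};

Global.prototype.get = function (subscripts) {
  var key = this.key(subscripts);
  return this.nodes.has(key) ? this.nodes.get(key) : '';
};

// Add to a numeric node and return the new value (a missing node counts as 0).
Global.prototype.increment = function (subscripts, by) {
  var next = Number(this.get(subscripts) || 0) + by;
  this.set(subscripts, next);
  return next;
};

var mdb = new Global('MDB');
var id = mdb.increment(['rob', 'domains'], 1);      // allocate a domain id
mdb.set(['rob', 'domains', id, 'name'], 'books');
mdb.set(['rob', 'domainIndex', 'books', id], '');   // index node: the subscripts are the data
```

No schema or pre-declaration is involved: a node comes into existence when it is first set, at whatever depth of subscripts is needed.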

Page 26: Developing node-mdb: a Node.js - based clone of SimpleDB

SDB Domain in Globals

CreateDomain
  AWSAccessKeyId = ‘rob’
  DomainName = ‘books’

Page 27: Developing node-mdb: a Node.js - based clone of SimpleDB

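Multiple Domains in Globals

MDB['rob','domains',1,'name']='books'
MDB['rob','domains',1,'created']=1304956337618
MDB['rob','domains',1,'modified']=1304956337618
MDB['rob','domains',2,'name']='accounts'
MDB['rob','domains',2,'created']=1304956337423
MDB['rob','domains',2,'modified']=1304956337423
MDB['rob','domainIndex','books',1]=''
MDB['rob','domainIndex','accounts',2]=''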

Page 28: Developing node-mdb: a Node.js - based clone of SimpleDB

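Creating a new domain (1)

Existing nodes for 'books' (id 1):

MDB['rob','domains',1,'name']='books'
MDB['rob','domains',1,'created']=1304956337618
MDB['rob','domains',1,'modified']=1304956337618
MDB['rob','domainIndex','books',1]=''

increment() returns 2, the id for the new domain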

Page 29: Developing node-mdb: a Node.js - based clone of SimpleDB

Creating a new domain (2)

setGlobal() creates the nodes for the new domain (id 2), alongside the existing 'books' nodes:

MDB['rob','domains',2,'name']='accounts'
MDB['rob','domains',2,'created']=1304956337423
MDB['rob','domains',2,'modified']=1304956337423
MDB['rob','domainIndex','accounts',2]=''

Page 30: Developing node-mdb: a Node.js - based clone of SimpleDB

Key Node.js async patterns for db I/O

• Dependent pattern:
  – Can’t set the global nodes until the value of the increment() is returned

• Parallel pattern:
  – Global nodes can be created in parallel
  – No interdependence
  – BUT:
    • Need to know when they’re all completed

Page 31: Developing node-mdb: a Node.js - based clone of SimpleDB

Dependent pattern

[Diagram: the MDB global for 'rob'; the 'domains' counter is incremented (IncrBy) from 1 to 2 before the 'name', 'created' and 'modified' nodes of the new domain are created]

MDB.increment([accessKeyId, 'domains'], 1, function (error, results) {
  var id = results.value;
  //….now create the other global nodes inside callback
});

Page 33: Developing node-mdb: a Node.js - based clone of SimpleDB

Parallel Pattern (semaphore)

var count = 0;
MDB.setGlobal([accessKeyId, 'domains', id, 'name'], domainName, function (error, results) {
  count++;
  if (count === 4) sendCreateDomainResponse(count, SDB);
});
MDB.setGlobal([accessKeyId, 'domains', id, 'created'], now, function (error, results) {
  count++;
  if (count === 4) sendCreateDomainResponse(count, SDB);
});
MDB.setGlobal([accessKeyId, 'domains', id, 'modified'], now, function (error, results) {
  count++;
  if (count === 4) sendCreateDomainResponse(count, SDB);
});
MDB.setGlobal([accessKeyId, 'domainIndex', nameIndex, id], '', function (error, results) {
  count++;
  if (count === 4) sendCreateDomainResponse(count, SDB);
});
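The repeated counter check can be factored into a small helper. The sketch below is one possible way to do it, not code from node-mdb: runInParallel() is a hypothetical helper, and setNode() is a stand-in for MDB.setGlobal() that records the node and calls back immediately.

```javascript
// Hypothetical helper: start every task, call done() once when the last one completes.
function runInParallel(tasks, done) {
  var remaining = tasks.length;
  var failed = false;
  if (remaining === 0) return done(null);
  tasks.forEach(function (task) {
    task(function (error) {
      if (failed) return;                       // already reported a failure
      if (error) { failed = true; return done(error); }
      remaining--;
      if (remaining === 0) done(null);          // the semaphore has reached zero
    });
  });
}

// Stand-in for MDB.setGlobal(): records the node, then calls back.
var saved = {};
function setNode(subscripts, value, callback) {
  saved[subscripts.join(',')] = value;
  callback(null);
}

var completions = 0;
runInParallel([
  function (cb) { setNode(['rob', 'domains', 2, 'name'], 'accounts', cb); },
  function (cb) { setNode(['rob', 'domains', 2, 'created'], 1304956337423, cb); },
  function (cb) { setNode(['rob', 'domains', 2, 'modified'], 1304956337423, cb); },
  function (cb) { setNode(['rob', 'domainIndex', 'accounts', 2], '', cb); }
], function (error) {
  if (!error) completions++;                    // here the response would be sent
});
```

The helper counts down rather than up, so the number of tasks does not have to be hard-coded in each call-back.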

Page 34: Developing node-mdb: a Node.js - based clone of SimpleDB

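New domain nodes created

MDB['rob','domains',1,'name']='books'
MDB['rob','domains',1,'created']=1304956337618
MDB['rob','domains',1,'modified']=1304956337618
MDB['rob','domains',2,'name']='accounts'
MDB['rob','domains',2,'created']=1304956337423
MDB['rob','domains',2,'modified']=1304956337423
MDB['rob','domainIndex','books',1]=''
MDB['rob','domainIndex','accounts',2]=''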

Page 35: Developing node-mdb: a Node.js - based clone of SimpleDB

Send CreateDomain Response

[Diagram: the SimpleDB request flow: incoming SDB HTTP request, HTTP server, authenticate request (security key ID and secret key), execute API action against SimpleDB database copies 1, 2, … n, generate HTTP response, outgoing SDB HTTP response (error, or success and/or data/results)]

Page 36: Developing node-mdb: a Node.js - based clone of SimpleDB

CreateDomain Response

<?xml version="1.0"?>
<CreateDomainResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/">
  <ResponseMetadata>
    <RequestID>e4e9fa45-f9dc-4e5b-8f0a-777acce6505e</RequestID>
    <BoxUsage>0.0020000000</BoxUsage>
  </ResponseMetadata>
</CreateDomainResponse>

var okResponse = function(SDB) {
  var nvps = SDB.nvps;
  var xml = responseStart({action: nvps.Action, version: nvps.Version});
  xml = xml + responseEnd(nvps.Action, SDB.startTime, false);
  responseHeader(200, SDB.response);
  SDB.response.write(xml);
  SDB.response.end();
};
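The responseStart() and responseEnd() helpers are not shown on the slide. As a rough idea of what building this envelope involves, the sketch below assembles the same XML from an action name, an API version, a request id and a box usage figure. buildResponseStart() and buildResponseEnd() are hypothetical functions with their own signatures, not the node-mdb helpers.

```javascript
// Hypothetical helper: opening element, named after the action, with the versioned namespace.
function buildResponseStart(action, version) {
  return '<' + action + 'Response xmlns="http://sdb.amazonaws.com/doc/' + version + '/">';
}

// Hypothetical helper: response metadata followed by the closing element.
function buildResponseEnd(action, requestId, boxUsage) {
  return '<ResponseMetadata>' +
           '<RequestID>' + requestId + '</RequestID>' +
           '<BoxUsage>' + boxUsage + '</BoxUsage>' +
         '</ResponseMetadata>' +
         '</' + action + 'Response>';
}

var xml = buildResponseStart('CreateDomain', '2009-04-15') +
          buildResponseEnd('CreateDomain', 'e4e9fa45-f9dc-4e5b-8f0a-777acce6505e', '0.0020000000');
```

An action that returns data, such as ListDomains, would place its result elements between the two parts.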

Page 37: Developing node-mdb: a Node.js - based clone of SimpleDB

Node.js HTTP Server Response

http.createServer(function(request, response) {
  //…numerous call-backs deep:
  response.writeHead(status, {
    "Server": "Amazon SimpleDB",
    "Content-Type": "text/xml",
    "Date": dateNow.toUTCString()
  });
  response.write('<?xml version="1.0"?>\n');
  response.write(xml);
  response.end();
});

Entire request/response SDB round-trip completed

Page 38: Developing node-mdb: a Node.js - based clone of SimpleDB

Demo using Bolso

• List Domains

• Create Domain

• Add an item (row) and some attributes (columns + cells)

Page 39: Developing node-mdb: a Node.js - based clone of SimpleDB

Node.js Gotchas

• Async programming is not immediately intuitive!

• Loops
  – Calling functions that use call-backs inside a for..in loop will go horribly wrong!

• Understanding closures
  – How externally-defined variables can be used inside call-back functions
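A small sketch of the loop gotcha and of how a closure avoids it. saveLater() is a hypothetical async-style call that queues its call-back; the queue is drained by hand at the end to imitate the event loop delivering completions after the loops have finished. The attribute names are made up.

```javascript
// Hypothetical async-style call: the call-back is queued and only runs later.
var pending = [];
function saveLater(value, callback) {
  pending.push(callback);
}

var attributes = { title: 'a', author: 'b', year: 'c' };

// Gotcha: one shared loop variable, read only when the call-backs finally run.
var seenWithVar = [];
for (var name in attributes) {
  saveLater(attributes[name], function () {
    seenWithVar.push(name);          // every call-back sees the final value of name
  });
}

// Fix: a function call per iteration gives each call-back its own name.
var seenWithClosure = [];
Object.keys(attributes).forEach(function (name) {
  saveLater(attributes[name], function () {
    seenWithClosure.push(name);
  });
});

// Imitate the event loop: completions arrive after both loops have ended.
pending.forEach(function (callback) { callback(); });
```

By the time the first set of call-backs runs, the for..in loop has long finished, so all three report the last attribute name.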

Page 40: Developing node-mdb: a Node.js - based clone of SimpleDB

Example

• BatchPutAttributes
  – Intuitively a for..in loop around PutAttributes
  – Had to be serialised
    • Completion of one PutAttributes calls the next
  – Copy state of SDB object and use for..in?
    • var SDBx = SDB;
    • SDBx is a pointer to SDB, not a clone of it!
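Both points can be sketched in a few lines. putAttributes() below is a hypothetical stand-in that records the item and calls back immediately, and putAllInSeries() is one possible way to serialise the calls; neither is taken from node-mdb. The second half shows why var SDBx = SDB does not give an independent copy.

```javascript
// Hypothetical stand-in for PutAttributes: records the item, then calls back.
var order = [];
function putAttributes(item, callback) {
  order.push(item.name);
  callback(null);
}

// Serialise: the completion of one call starts the next.
function putAllInSeries(items, done) {
  var index = 0;
  function next(error) {
    if (error || index === items.length) return done(error || null);
    putAttributes(items[index++], next);
  }
  next(null);
}

var finished = false;
putAllInSeries([{ name: 'item1' }, { name: 'item2' }, { name: 'item3' }], function (error) {
  finished = !error;
});

// Assignment copies the reference, not the object.
var SDB = { nvps: { ItemName: 'item1' } };
var SDBx = SDB;
SDBx.nvps.ItemName = 'changed';                    // SDB.nvps.ItemName changes too

// A deep copy of plain data is independent (this would not suit an object that
// also holds the live request and response).
var SDBcopy = JSON.parse(JSON.stringify(SDB));
SDBcopy.nvps.ItemName = 'copy only';
```

Serialising keeps the per-item state in one place at a time, which sidesteps the need to clone the SDB object for each iteration.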

Page 41: Developing node-mdb: a Node.js - based clone of SimpleDB

Conclusions

• node-mdb is now nearly complete
• Only BatchDeleteAttributes not implemented
• Other APIs emulate SimpleDB 100%
• Free Open Source
  – https://github.com/robtweed/node-mdb
  – Give it a try!
  – Use mdb.js for examples to build your own Node.js database applications
• Check out GT.M!

• Follow me on Twitter at @rtweed

• Slides: http://www.mgateway.com/node-mdb-pres.html