Tin — a Database Engine so Tiny, its Name had to be Shortened

description

The slides for my presentation at no:sql(east) in Atlanta on October 29, 2009.
Video is at http://nosqleast.com/2009/#speaker/anglade
Code will be at http://github.com/timanglade/tin

Transcript of Tin — a Database Engine so Tiny, its Name had to be Shortened

Page 1: Tin — a Database Engine so Tiny, its Name had to be Shortened

A database engine so tiny, its name had to be shortened

/taɪn/ Tin

Page 2: Tin — a Database Engine so Tiny, its Name had to be Shortened
Page 4: Tin — a Database Engine so Tiny, its Name had to be Shortened

I’m Tim. Nice to meet you.

Page 5: Tin — a Database Engine so Tiny, its Name had to be Shortened
Page 6: Tin — a Database Engine so Tiny, its Name had to be Shortened

Because Geeks have girlfriends too!

Page 7: Tin — a Database Engine so Tiny, its Name had to be Shortened
Page 8: Tin — a Database Engine so Tiny, its Name had to be Shortened

</shameless-self-plug>

Page 11: Tin — a Database Engine so Tiny, its Name had to be Shortened

THIS IS SLIDE 1

Hi.

Hat-tip to @steadicat

Page 12: Tin — a Database Engine so Tiny, its Name had to be Shortened

Sequential Data (1D)

Page 18: Tin — a Database Engine so Tiny, its Name had to be Shortened

i.e.
stocks
bank transactions
sensor data
twitter feeds
facebook walls

Page 19: Tin — a Database Engine so Tiny, its Name had to be Shortened

SSD DO NOT WANT

I CAN HAZ HDD?

Page 20: Tin — a Database Engine so Tiny, its Name had to be Shortened

The Origins

Page 21: Tin — a Database Engine so Tiny, its Name had to be Shortened

Once upon a time, at a major stock exchange ...

Page 23: Tin — a Database Engine so Tiny, its Name had to be Shortened

\m/ \m/

Oracle & SQL Server to the limit

Page 24: Tin — a Database Engine so Tiny, its Name had to be Shortened

And even then ...

Page 25: Tin — a Database Engine so Tiny, its Name had to be Shortened

So…

Page 26: Tin — a Database Engine so Tiny, its Name had to be Shortened

Disk is cheap.

Page 27: Tin — a Database Engine so Tiny, its Name had to be Shortened

CPU & RAM are costly.

Page 28: Tin — a Database Engine so Tiny, its Name had to be Shortened

( drumroll… )

Page 29: Tin — a Database Engine so Tiny, its Name had to be Shortened

Text & Filesystem

Page 30: Tin — a Database Engine so Tiny, its Name had to be Shortened

Where did the hunch come from?

Page 31: Tin — a Database Engine so Tiny, its Name had to be Shortened

Industrial Research

Stock data in glorified CSVs + Arrays of text files (elsewhere) —›

Page 32: Tin — a Database Engine so Tiny, its Name had to be Shortened

Text & Filesystem

Page 33: Tin — a Database Engine so Tiny, its Name had to be Shortened
Page 34: Tin — a Database Engine so Tiny, its Name had to be Shortened
Page 37: Tin — a Database Engine so Tiny, its Name had to be Shortened

1. Output to CSV
   0932, GOOG, 750, 698

2. Upload to S3
   s3cmd.rb put GOOG-20091029.txt

3. Serve over HTTP
   http://stocks.com/GOOG-20091029.txt
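
A minimal sketch of how steps 1 to 3 fit together, assuming one file per symbol per day as in the example file name. The `Tick` struct and its field names are hypothetical: the slide only shows a time, a symbol and two numbers without saying what the numbers mean. The upload itself is left to the `s3cmd.rb` call shown above.

```ruby
require "date"

# Hypothetical record type; the slide does not name the two numeric columns.
Tick = Struct.new(:time, :symbol, :value_a, :value_b)

# Step 1: render one tick in the "0932, GOOG, 750, 698" layout.
def csv_line(tick)
  [tick.time, tick.symbol, tick.value_a, tick.value_b].join(", ")
end

# Steps 2 and 3: the name of the uploaded object, which is also the URL path.
def daily_filename(symbol, date)
  "#{symbol}-#{date.strftime('%Y%m%d')}.txt"
end

tick = Tick.new("0932", "GOOG", 750, 698)
puts csv_line(tick)
puts daily_filename(tick.symbol, Date.new(2009, 10, 29))
```

Because the file name is derived only from the symbol and the date, a reader can compute the URL without asking any server where the data lives.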

Page 38: Tin — a Database Engine so Tiny, its Name had to be Shortened
Page 40: Tin — a Database Engine so Tiny, its Name had to be Shortened

4. ???
   [client: discard lines, replay log, etc.]

5. Profit!
   Pay your AWS charge. Compare with Oracle costs. Laugh while you count your bills.
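
Step 4 is deliberately left open on the slide. One possible reading, sketched below under stated assumptions: the client downloads the whole file, discards the lines outside the window it cares about, and replays what is left to get the latest row per symbol. The helper names `parse_line`, `select_window` and `replay` are made up for this illustration.

```ruby
# Parse one "0932, GOOG, 750, 698" line into its fields.
def parse_line(line)
  line.strip.split(/\s*,\s*/)
end

# Discard every row whose HHMM timestamp is outside [from, to].
# Zero-padded HHMM strings compare correctly as plain strings.
def select_window(text, from, to)
  text.each_line
      .map { |line| parse_line(line) }
      .reject(&:empty?)
      .select { |row| row[0] >= from && row[0] <= to }
end

# Replay the log in order; the last row seen for each symbol wins.
def replay(rows)
  rows.each_with_object({}) { |row, state| state[row[1]] = row }
end

body = "0932, GOOG, 750, 670\n0933, AMZN, 240, 230\n0943, GOOG, 749, 677\n"
p replay(select_window(body, "0930", "0945"))
```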

Page 41: Tin — a Database Engine so Tiny, its Name had to be Shortened

WARNING WARNING WARNING WARNING

Page 43: Tin — a Database Engine so Tiny, its Name had to be Shortened

This is STUPID.

Page 45: Tin — a Database Engine so Tiny, its Name had to be Shortened

The rest: a little less so.

But still…

Page 46: Tin — a Database Engine so Tiny, its Name had to be Shortened

So why am I here?

Page 47: Tin — a Database Engine so Tiny, its Name had to be Shortened

Is anybody else doing this?

Page 48: Tin — a Database Engine so Tiny, its Name had to be Shortened

The Specifics

Page 49: Tin — a Database Engine so Tiny, its Name had to be Shortened

MAIN CONCEPTS

SHARDING
PSEUDO-INDEXES
TRIAGE MODE
OFFSETTING
REDUNDANCY
LOG REPLAY

Page 50: Tin — a Database Engine so Tiny, its Name had to be Shortened

Sharding.

Page 51: Tin — a Database Engine so Tiny, its Name had to be Shortened

0932, GOOG, 750, 670
0933, AMZN, 240, 230
0939, ADBE, 130, 120
0943, GOOG, 749, 677

ADBE.txt
0939, ADBE, 130, 120

AMZN.txt
0933, AMZN, 240, 230

GOOG.txt
0932, GOOG, 750, 670
0943, GOOG, 749, 677

Page 52: Tin — a Database Engine so Tiny, its Name had to be Shortened

0932, GOOG, 750, 670
0933, AMZN, 240, 230
0939, ADBE, 130, 120
0943, GOOG, 749, 677

ADBE-093X.txt
0939, ADBE, 130, 120

AMZN-093X.txt
0933, AMZN, 240, 230

GOOG-093X.txt
0932, GOOG, 750, 670

GOOG-094X.txt
0943, GOOG, 749, 677
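
The two-level split above (by symbol, then by ten-minute window) can be sketched as a single grouping step. This is an illustration only, assuming the shard name is built from the symbol and the first three digits of the HHMM timestamp, which is what the file names on the slide suggest. `shard_name` and `shard` are hypothetical helpers.

```ruby
# Shard name for one row: symbol plus ten-minute bucket, e.g. "GOOG-093X.txt".
def shard_name(row)
  time, symbol = row.split(",").map(&:strip)
  "#{symbol}-#{time[0, 3]}X.txt"
end

# Group raw log lines by the shard they belong to, keeping their order.
def shard(lines)
  lines.group_by { |line| shard_name(line) }
end

log = [
  "0932, GOOG, 750, 670",
  "0933, AMZN, 240, 230",
  "0939, ADBE, 130, 120",
  "0943, GOOG, 749, 677"
]

shard(log).each do |file, rows|
  puts file
  rows.each { |row| puts "  #{row}" }
end
```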

Page 53: Tin — a Database Engine so Tiny, its Name had to be Shortened

Two Laws

1. Directories are for deletion
2. Too large a file hinders redundancy

Page 54: Tin — a Database Engine so Tiny, its Name had to be Shortened

Queries.

Page 55: Tin — a Database Engine so Tiny, its Name had to be Shortened

Skip the middleman.

Page 56: Tin — a Database Engine so Tiny, its Name had to be Shortened

REST is the interface

Skip the middleman.

Page 57: Tin — a Database Engine so Tiny, its Name had to be Shortened

Standard URLs
http://stocks.com/GOOG/2009/10/29/094X.txt

Shortcuts
http://stocks.com/GOOG/today(.txt)
http://stocks.com/GOOG/last(.txt)

Ranges
http://stocks.com/GOOG/???????

Page 58: Tin — a Database Engine so Tiny, its Name had to be Shortened

Use HTTP Headers

HTTP/1.1 206 Partial content
Date: Wed, 15 Nov 1995 06:25:24 GMT
Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT
Content-Range: bytes 21010-47021/47022
Content-Length: 26012
Content-Type: image/gif

Necessitates ASCII or UTF-16 + Padding
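
Why the padding matters: if every record takes the same number of bytes, a record number translates directly into a byte offset, so a plain `Range: bytes=...` request can fetch records N to M. The sketch below covers the ASCII case only. The 32-byte width and the helper names are assumptions made for the example, not values from the talk.

```ruby
RECORD_WIDTH = 32 # bytes per record, newline included (assumed value)

# Pad an ASCII line with spaces so it occupies exactly RECORD_WIDTH bytes.
def pad_record(line)
  raise ArgumentError, "ASCII only" unless line.ascii_only?
  raise ArgumentError, "record too long" if line.bytesize >= RECORD_WIDTH
  line.ljust(RECORD_WIDTH - 1) + "\n"
end

# Range header value covering records first..last (zero-based, inclusive).
def byte_range(first, last)
  "bytes=#{first * RECORD_WIDTH}-#{(last + 1) * RECORD_WIDTH - 1}"
end

file = ["0932, GOOG, 750, 670", "0933, AMZN, 240, 230"].map { |l| pad_record(l) }.join
puts file.bytesize
puts byte_range(1, 1)
```

A variable-width encoding would break this arithmetic, which is presumably why the slide restricts itself to fixed-width encodings.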

Page 59: Tin — a Database Engine so Tiny, its Name had to be Shortened

Content-Range

—› RFC 2616, section 3.12

Page 60: Tin — a Database Engine so Tiny, its Name had to be Shortened

Create your own Units

range-unit = day | hour | record
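
RFC 2616 lets a server define range units other than bytes, as long as the unit is a token. A rough sketch of how a server might read such a header follows. The `record=10-20` syntax is an assumption modelled on the byte-range form, and `parse_range` is a hypothetical helper, since the slide only lists the unit names.

```ruby
RANGE_UNITS = %w[day hour record].freeze

# Parse a Range header value such as "record=10-20" into [unit, first, last].
# Returns nil when the unit is unknown or the value is malformed.
def parse_range(value)
  match = /\A(\w+)=(\d+)-(\d+)\z/.match(value.to_s.strip)
  return nil unless match && RANGE_UNITS.include?(match[1])

  first = match[2].to_i
  last = match[3].to_i
  return nil if first > last

  [match[1], first, last]
end

p parse_range("record=10-20")
p parse_range("bytes=0-99")
```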

Page 61: Tin — a Database Engine so Tiny, its Name had to be Shortened

Handling Query Strings

Page 62: Tin — a Database Engine so Tiny, its Name had to be Shortened

twitter.com/timanglade?type=reply&to=barood

twitter.com/timanglade?to=barood&type=reply

twitter.com/timanglade/cdf83ef5422c3146ebb14dac0aa84f69.txt

—› 301 redirect
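
The idea on this slide seems to be that two query strings with the same parameters in a different order should end up at one cacheable file. A minimal sketch, assuming the file name is a hash of the sorted parameters: the 32 hex digits on the slide look like an MD5 digest, but the exact string that was hashed is not given, so this code is not expected to reproduce that particular value. `canonical_path` is a hypothetical helper.

```ruby
require "digest/md5"
require "uri"

# Sort the parameters so equivalent queries yield the same string,
# then hash that string to name the file to redirect to.
def canonical_path(path, query)
  pairs = URI.decode_www_form(query).sort
  canonical = URI.encode_www_form(pairs)
  "#{path}/#{Digest::MD5.hexdigest(canonical)}.txt"
end

puts canonical_path("/timanglade", "type=reply&to=barood")
puts canonical_path("/timanglade", "to=barood&type=reply")
```

Both calls print the same path, which is the target the server would answer with in its 301 redirect.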

Page 63: Tin — a Database Engine so Tiny, its Name had to be Shortened

Cache everything.

Page 64: Tin — a Database Engine so Tiny, its Name had to be Shortened

Say hello to my little friend

the #

Page 65: Tin — a Database Engine so Tiny, its Name had to be Shortened

Triage mode.

Page 66: Tin — a Database Engine so Tiny, its Name had to be Shortened

twitter.com/timanglade#20091029-1520

twitter.com/timanglade-20091029.txt

—› Serve with 302 redirect

Let the client parse.
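
A sketch of the client half of this, under assumptions: the part after the # names a day and a time, the day picks the file to fetch, and the time is what the client looks for once it has the file. How the work is split between the redirect and the client is not spelled out on the slide, so this version does the whole mapping on the client. `triage` is a hypothetical helper.

```ruby
# Split "twitter.com/timanglade#20091029-1520" into the day file to fetch
# and the HHMM time the client must locate inside that file.
def triage(url)
  base, fragment = url.split("#", 2)
  date, time = fragment.to_s.split("-", 2)
  unless /\A\d{8}\z/.match?(date.to_s) && /\A\d{4}\z/.match?(time.to_s)
    raise ArgumentError, "expected a #YYYYMMDD-HHMM fragment"
  end
  ["#{base}-#{date}.txt", time]
end

file, time = triage("twitter.com/timanglade#20091029-1520")
puts file
puts time
```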

Page 67: Tin — a Database Engine so Tiny, its Name had to be Shortened

Offsetting.

Page 68: Tin — a Database Engine so Tiny, its Name had to be Shortened

People tend to… request even times

twitter.com/timanglade_0900-0930.txt

stocks.com/GOOG/…_0900-0930.txt
vs.
stocks.com/GOOG/…_0850-0920.txt

Page 69: Tin — a Database Engine so Tiny, its Name had to be Shortened

Ops.

Page 70: Tin — a Database Engine so Tiny, its Name had to be Shortened

REST is not plug & play!

Page 71: Tin — a Database Engine so Tiny, its Name had to be Shortened

XML Parsing, Anyone?

Page 72: Tin — a Database Engine so Tiny, its Name had to be Shortened

what I did initially

<%# Average of all readings (integer division) %>
- origin: "sonde 001"
  value: <%= collection.inject(0) { |sum, t| sum + t["reading"].to_i } / collection.size %>
<% collection.each do |temp| %>
<%# One entry per reading, converted from Fahrenheit to Celsius and rounded down %>
- origin: "sonde 001"
  location: <%= temp["place"] %>
  value: <%= ((temp["reading"].to_i - 32) / 1.8).floor %>
<% end %>

Page 73: Tin — a Database Engine so Tiny, its Name had to be Shortened

—› bx

Page 74: Tin — a Database Engine so Tiny, its Name had to be Shortened

current state of research

different syntaxes have been proposed
inversion formulae: write once, works both ways
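
To give a feel for "write once, works both ways", here is a toy sketch and not any of the proposed syntaxes. Each primitive step carries its own inverse, so a transformation written once as a list of steps can be run forwards or backwards. `Step`, `add`, `scale`, `forward` and `backward` are all made up for this illustration. The example reuses the Fahrenheit to Celsius conversion from the template.

```ruby
# A primitive step that knows its own inverse.
Step = Struct.new(:apply, :unapply)

def add(n)
  Step.new(->(x) { x + n }, ->(x) { x - n })
end

def scale(k)
  Step.new(->(x) { x * k }, ->(x) { x / k })
end

# Run the steps left to right.
def forward(steps, x)
  steps.reduce(x) { |acc, step| step.apply.call(acc) }
end

# Run the inverses right to left, undoing the last step first.
def backward(steps, y)
  steps.reverse.reduce(y) { |acc, step| step.unapply.call(acc) }
end

# Written once: subtract 32, then divide by 1.8.
F_TO_C = [add(-32), scale(1 / 1.8)].freeze

puts forward(F_TO_C, 212)
puts backward(F_TO_C, 100)
```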

Page 75: Tin — a Database Engine so Tiny, its Name had to be Shortened

The road ahead

Page 76: Tin — a Database Engine so Tiny, its Name had to be Shortened

MORE Lenses

Page 77: Tin — a Database Engine so Tiny, its Name had to be Shortened

MORE Code

Page 78: Tin — a Database Engine so Tiny, its Name had to be Shortened

A&Q (hopefully)

Page 79: Tin — a Database Engine so Tiny, its Name had to be Shortened

twitter.com/timanglade

linkedin.com/in/timanglade

facebook.com/[email protected]

Page 80: Tin — a Database Engine so Tiny, its Name had to be Shortened

Tin

Page 81: Tin — a Database Engine so Tiny, its Name had to be Shortened

A database engine so tiny,

its name had to be shortened

Tin