NoSQL replacement for SQLite
(for Beatstream)
Antti-Jussi Kovalainen
Seminar OHJ-1860: NoSQL databases
Backend
• Very light
– Access & modify data
– Relay commands to Last.fm
• Currently Ruby on Rails
• So, we’ll focus on the backend
Users
• Few users
• Not many fields: username, email, password, lastfm_key
• Read when user logs in
• Write rarely
• Needed because users have their own playlists
Songs
• Huge list of objects (array)
• Read after user logs in
• Write rarely
• Basically just meta-data about a song: title, artist,
album, tracknum, path, etc…
Playlists
• Possibly huge lists of objects (arrays)
• Read a good amount during a day
• Write a lot (probably)
• Owned by a user
• Contains songs
– Also need to track song’s position on the playlist
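A minimal sketch of the data described on the Songs and Playlists slides, written in Python purely for illustration (the backend itself is Ruby on Rails). The class names, the use of the file path as a song id, and the idea of encoding a song's position as its list index are assumptions, not the Beatstream implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Song:
    # Fields come from the Songs slide; using "path" as a unique id is an assumption.
    path: str
    title: str
    artist: str
    album: str = ""
    tracknum: int = 0


@dataclass
class Playlist:
    owner: str   # username of the owning user
    name: str
    # The list index of a path is the song's position on the playlist.
    song_paths: list = field(default_factory=list)

    def add(self, song: Song) -> None:
        """Append a song to the end of the playlist."""
        self.song_paths.append(song.path)

    def move(self, old_pos: int, new_pos: int) -> None:
        """Move the song at old_pos so that it ends up at new_pos."""
        path = self.song_paths.pop(old_pos)
        self.song_paths.insert(new_pos, path)
```

Keeping the order implicit in an array is the simplest way to track position, and it maps directly onto a JSON array.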
Why I chose SQLite
• Easy-to-use, simple, familiar
• Ready-to-use on new Rails project
• A simple data model gets a simple DBMS?
• Can easily implement CRUD, playlist sorting, etc.
• Great for rapid prototyping
• Doesn’t require separate server installation
Why Replace?
1. Fear of Bad Concurrency
– Multiple users + SQLite = Bad memories
• Writing playlists takes time
• Writing the songs list takes time
• SQLite locks up or corrupts data
– Sadness
2. Try something new
– Schemaless == even better for prototyping?
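The locking concern in point 1 can be reproduced with a small sketch, here in Python with the standard `sqlite3` module. It shows only the "locks up" part: while one connection holds the write lock, a second writer with no busy timeout fails right away. The table name and helper name are made up for the example, and nothing here demonstrates data corruption.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked(db_path: str) -> bool:
    """Return True if a second connection cannot write while the first holds the write lock."""
    # isolation_level=None: autocommit mode, so BEGIN/ROLLBACK are issued explicitly.
    # timeout=0: fail immediately instead of waiting for the lock.
    first = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    second = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        first.execute("CREATE TABLE IF NOT EXISTS playlists (name TEXT)")
        first.execute("BEGIN IMMEDIATE")  # first connection takes the write lock
        first.execute("INSERT INTO playlists VALUES ('slow write')")
        try:
            second.execute("INSERT INTO playlists VALUES ('other user')")
        except sqlite3.OperationalError:
            return True  # typically reported as "database is locked"
        return False
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()


def demo() -> bool:
    """Run the check against a throwaway database file."""
    with tempfile.TemporaryDirectory() as tmp:
        return second_writer_is_blocked(os.path.join(tmp, "beatstream.db"))
```

In practice a non-zero busy timeout makes the second writer wait rather than fail, which is where the "writing takes time" worry comes from.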
At The Moment
• Moved to JSON files
– Songs and Playlists are in JSON files
– Users are in SQLite
• SQLite was just getting in the way
– SQLite <--> SQL result <--> JSON
– Keep It Simple
• But this feels kinda icky
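A rough sketch of what the JSON-file approach can look like, in Python for illustration. The function names and file layout are assumptions. Writing to a temporary file and renaming it over the target is one common way to avoid leaving a half-written file behind, which may remove some of the "icky" feeling.

```python
import json
import os
import tempfile


def save_json(path: str, data) -> None:
    """Write data as JSON to path, replacing any existing file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it over the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # do not leave the temporary file behind
        raise


def load_json(path: str, default=None):
    """Read JSON from path, or return default if the file does not exist."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def demo():
    """Round-trip a small song list through a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        songs_file = os.path.join(tmp, "songs.json")
        save_json(songs_file, [{"title": "A", "artist": "B", "path": "/music/a.mp3"}])
        return load_json(songs_file), load_json(os.path.join(tmp, "missing.json"), default=[])
```

This still gives no protection when two requests write the same file at once; the last writer wins.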
Features We Want
• Standalone / embeddable / portable
– Can embed into application, invisible to Beatstream server user or admin
– No separate server installation etc.
• Simple, easy-to-use
• SQLite-like performance or better
• Lightweight
• No availability, concurrency or consistency problems
– The database can’t be the reason a song won’t play
– When adding a song to a playlist, it should be there and be there always
– The frontend asks only once which songs are in the DB and they should be there always
Features We Want (2)
• Can store whole song library
– Also read it all fast (or we cache it)
– No need for sorting
• Can store sort information somehow
– Sorting of playlists & playlist songs
– Not as obvious as it sounds
• Can fetch only certain user’s playlists
– Relational data!
• (Please, work with Ruby or JRuby)
CAP (2)
• CA seems best for us
– Partition Tolerance is useless
• But it’s the core idea of NoSQL?
• Allows horizontal scaling and people like that
– Some say CP == CA (source: http://dbmsmusings.blogspot.fi/2010/04/problems-with-cap-and-yahoos-little.html)
• AP might be ok too with “eventual consistency”
– MongoDB is like this and people use it for the strangest things
Why NoSQL?
• Data is simple, just lists of objects
– Not relational data
– Songs, Playlists, Users
– Don’t need big queries, joins, analytics, versioning, or anything super-
– Document-oriented systems seem nice for this
• Maybe JSON-oriented?
Why NoSQL? (2)
“try to limit the work done over your data and just store it, then retrieve it and
show it to the user, do not over process the information. Manipulate JSON on
the user interface and send it to the database with few or even none
modification.”
– djondb (http://djondb.com/documentation.html)
I like this idea.
Why NoSQL? (3)
• Standalone / embeddable / portable
– Most SQLite replacement suggestions were
NoSQL systems
• Schemaless
– Think different
Why NoSQL? (4)
• NoSQL systems usually concentrate on
performance and scalability
– I’m not really concerned about those things right
now
– Maybe should not pick NoSQL then?
Why NoSQL? (5)
• Try new things
• Experiment
• It’s what the cool kids use
• And in the end…
– Tech doesn’t matter. Until it does.
Criteria
• Beatstream’s backend is small-scale with:
– Less than 100,000 rows
– Performance rarely a problem
– No horizontal scaling
And I’m stressing over database choice?
Criteria
• Standalone / embeddable / portable (no separate server software)
• Lightweight
• Keeps It Simple
• Can store and access our data easily
– Key-value, document-oriented, column-oriented, or something which fits
• Good performance
• No availability, concurrency or consistency problems
• (Works with Ruby, or with Java for use with JRuby)
Criteria (2)
In the future:
• Someone might create a Spotify competitor
using Beatstream with millions of users
• Scaling etc. becomes important then, but it’s not important
now
Kyoto Cabinet
• Key-value store
• Standalone file-based database (also in-memory)
• Support for many languages (Ruby, Java, C#, PHP, etc.)
• Popular (community)
• Hash table or B+ tree based
– Can’t decide which one would be better for Beatstream, have to test
– Hash table: keys come back in no particular order, which is not an issue because sorting happens in the frontend
Source: http://fallabs.com/kyotocabinet/
Kyoto Cabinet (2)
Notes:
• Replace SQLite on apps that store simple data
• How do I store song and playlist data in a key-value store?
– Need two collections/tables: songs and playlists
– Own database files for songs and playlists?
– key: filepath --> value: song meta-data as JSON?
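The "key: filepath --> value: JSON" idea can be tried out without installing Kyoto Cabinet. The sketch below uses Python's standard `dbm.dumb` module as a stand-in file-based key-value store, so it is not the Kyoto Cabinet API. The helper names and the choice of one database file per collection are assumptions.

```python
import dbm.dumb
import json
import os
import tempfile


def put_song(db, path: str, meta: dict) -> None:
    """Store song meta-data as a JSON string under the song's file path."""
    db[path.encode("utf-8")] = json.dumps(meta).encode("utf-8")


def get_song(db, path: str):
    """Return the meta-data dict for path, or None if the song is unknown."""
    raw = db.get(path.encode("utf-8"))
    return None if raw is None else json.loads(raw)


def demo():
    """Use a separate database file for the songs collection."""
    with tempfile.TemporaryDirectory() as tmp:
        songs = dbm.dumb.open(os.path.join(tmp, "songs"), "c")
        try:
            put_song(songs, "/music/a.mp3", {"title": "A", "artist": "B", "tracknum": 1})
            return get_song(songs, "/music/a.mp3"), get_song(songs, "/music/none.mp3")
        finally:
            songs.close()
```

Reading the whole library means iterating over every key, which fits the "read it all after login" access pattern.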
LevelDB
• Key-value store
• Standalone file-based database
• Support for many languages (C/C++, Ruby, Java)
• Built by Google for use in Google Chrome
• Sorting by key
• Fast write & read, slow if value is large
Sources: http://code.google.com/p/leveldb/, http://en.wikipedia.org/wiki/LevelDB
LevelDB (2)
Notes:
• Same as with Kyoto Cabinet: good for
simple use, but how do I use key-value
with Beatstream?
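One possible answer, sketched in Python: because keys are kept sorted, a key that starts with the user and playlist name and ends with a zero-padded position lets a prefix scan return one user's playlist in order. `SortedKV` is a tiny in-memory stand-in for a sorted key-value store, not the LevelDB API, and the key layout is an assumption.

```python
import bisect


class SortedKV:
    """Tiny in-memory stand-in for a store that keeps keys in sorted order."""

    def __init__(self):
        self._keys = []     # sorted list of keys
        self._values = {}   # key -> value

    def put(self, key: str, value) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, prefix: str):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._values[key]


def playlist_key(user: str, playlist: str, position: int) -> str:
    # Zero-padding makes string order match numeric order.
    return f"playlist/{user}/{playlist}/{position:06d}"
```

A usage example under the same assumptions:

```python
kv = SortedKV()
kv.put(playlist_key("alice", "mix", 1), "/music/b.mp3")
kv.put(playlist_key("alice", "mix", 0), "/music/a.mp3")
kv.put(playlist_key("bob", "mix", 0), "/music/z.mp3")
alice_mix = [path for _, path in kv.scan("playlist/alice/mix/")]
```

The drawback is that moving a song means rewriting the keys of every song after it.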
MongoDB
• Document-oriented
– JSON-style documents!
– Collections are logical and easy!
• For many languages (C/C++, Ruby, Java, C#, PHP, …)
• Easy to use, simple
– “Mongo is a schemaless relational database” – Some people
– Indexes instead of map/reduce functions
• Active community
– Lots of plugins, etc.
MongoDB (2)
Notes:
• Crash right after successful write: might lose data
• Embedding is not simple, need to build/find a
C++ wrapper & launch DB process in app.
source: http://stackoverflow.com/questions/6115637/can-mongodb-be-used-as-an-embedded-database
CouchDB (2)
Notes:
• I would have an HTTP API accessing an HTTP API?
• Embedding is hard
– Need to install Erlang somehow on the user’s
computer
RavenDB
• Document-oriented
• Standalone directory/file-based database
• For .NET and JavaScript (Node.js)
• Detailed info on how RavenDB works, listen:
http://herdingcode.com/wp-content/uploads/HerdingCode-0083-Ayende-Rahien-on-RavenDB.mp3
Terrastore
• Document-oriented
• For Java, maybe can attach to JRuby?
• Main feature is scalability without sacrificing
consistency
• Seems easy to use
Eliminated Choices
• Berkeley DB
– No time to investigate…
• Cassandra
– “the right choice when you need scalability and high availability”
• SimpleDB
– “Optimized to provide high availability”
– Not really standalone / embeddable / portable, but in the cloud and “invisible”
• djondb
– Not standalone / embeddable / portable
• Couchbase
– Could not find a way to run it embedded; maybe it’s the same as with CouchDB
Conclusions
• NoSQL systems promote their horizontal
scalability, replication, sharding, etc.
– Features I don’t really care about right now
• Feels like I’m looking at the wrong thing in the
wrong place (for Beatstream at least)
– Only time will tell
Conclusions (3)
For apps that only store simple key-value data in SQLite:
• Kyoto Cabinet and LevelDB seem like
excellent replacements
– Use cases: Queue, word dictionary, user
database, document database, session
management, CMS cache
Extra
• Later: Convert Users-table in SQLite to the
new NoSQL database
– Songs can be re-created
– Playlists are a new feature that hasn’t been released yet
Extra (2)
• Redis could be embeddable:
“[communicate over unix domain socket] you can fork your main
process, then run one of the exec*() functions in the child to start
Redis.”
source: http://code.google.com/p/redis/issues/detail?id=276
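The fork-then-exec pattern from the quote can be sketched as follows, in Python and for POSIX systems only. The helper name is made up, and the command to run is left to the caller; for Redis it might be something like `["redis-server", "/path/to/redis.conf"]` with a hypothetical config file that enables the unix domain socket.

```python
import os
import sys


def spawn_child(argv):
    """Fork, exec argv in the child, and return the child's pid to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; exit without running parent cleanup code.
            os._exit(127)
    return pid


def run_and_wait(argv) -> int:
    """Start argv as a child process and return its exit status."""
    pid = spawn_child(argv)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

A real embedding would keep the child running and stop it when the main application exits, instead of waiting for it as `run_and_wait` does.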
Links!
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
http://blog.nahurst.com/visual-guide-to-nosql-systems
http://www.cs.tut.fi/~tjm/seminars/nosql2012/NoSQL-Intro.pdf
https://speakerdeck.com/u/kplawver/p/nosql-an-introduction
http://fallabs.com/kyotocabinet/rubydoc/
http://blog.creapptives.com/post/8330476086/leveldb-vs-kyoto-cabinet-my-findings
http://herdingcode.com/wp-content/uploads/HerdingCode-0083-Ayende-Rahien-on-RavenDB.mp3