phpXperts seminar 2010 CodeMan! with noSQL!

PhpXperts seminar 2010, Work for Fun!! CodeMan! with noSQL! nhm tanveer hossain khan http://hasan.we4tech.com

description

An introductory presentation on noSQL (structured storage). It also covers why people might choose Cassandra over a relational database system.

Transcript of phpXperts seminar 2010 CodeMan! with noSQL!

Page 1: phpXperts seminar 2010 CodeMan! with noSQL!

PhpXperts seminar 2010, Work for Fun!!

CodeMan! with noSQL!

nhm tanveer hossain khan http://hasan.we4tech.com

Page 2: phpXperts seminar 2010 CodeMan! with noSQL!

Super heroes!

Page 3: phpXperts seminar 2010 CodeMan! with noSQL!

...So this is about You as a CodeMan!... CodeMan! CodeMan! help help!

Page 4: phpXperts seminar 2010 CodeMan! with noSQL!

Now let's get back to our discussion –

Database Stuff!

Page 5: phpXperts seminar 2010 CodeMan! with noSQL!

Database!

SELECT books.* FROM books
LEFT JOIN users ON books.user_id = users.id
WHERE users.age < 15

Relational Database System

Query language!
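
To try the query on this slide without installing a database server, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the sample rows and the function names are assumptions made up for the demo. Note that the WHERE filter on users.age drops the unmatched rows, so the LEFT JOIN behaves like an inner join here.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with made-up users and books."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, user_id INTEGER);
        INSERT INTO users VALUES (1, 'Rina', 12);
        INSERT INTO users VALUES (2, 'Karim', 30);
        INSERT INTO books VALUES (10, 'Comics', 1);
        INSERT INTO books VALUES (11, 'Databases', 2);
        INSERT INTO books VALUES (12, 'Puzzles', 1);
    """)
    return conn


def young_users_books(conn):
    """Run the slide's query: books that belong to users younger than 15."""
    query = """
        SELECT books.* FROM books
        LEFT JOIN users ON books.user_id = users.id
        WHERE users.age < 15
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in young_users_books(build_sample_db()):
        print(row)
```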

Page 6: phpXperts seminar 2010 CodeMan! with noSQL!

Known issues with existing database systems!

• Maintaining relations among the tables

• Table, Page, Row level Locking

• Huge data produces huge Fat indexes

• Transactional Operations

• Parsing SQL query syntax

• Multi-table join queries

Page 7: phpXperts seminar 2010 CodeMan! with noSQL!

So what about noSQL?!

(Structured Storage System)

Page 8: phpXperts seminar 2010 CodeMan! with noSQL!

About noSQL?

• NoSQL == Structured storage!

• An initiative to use alternatives to relational database systems

• Targeting the following goals

– Performance

– Autonomy

– Minimizing cost

Page 9: phpXperts seminar 2010 CodeMan! with noSQL!

Why structured storage over a relational database?

• Getting rid of fear

– Large-scale expansion is CHEAP!

– No SQL parsing overhead

– No table joining

– No relation consideration

– No single big chunk of data

Page 10: phpXperts seminar 2010 CodeMan! with noSQL!

Let's think about a much bigger system!

Page 11: phpXperts seminar 2010 CodeMan! with noSQL!

Our possible context!

• Loads of data

• Huge data growth

• Extensive database operations

• Extensive I/O traffic (read/write)

• Fault tolerance

• Data consistency

• Assured availability

Page 12: phpXperts seminar 2010 CodeMan! with noSQL!

Let's check the facts!

Page 13: phpXperts seminar 2010 CodeMan! with noSQL!

MySQL and noSQL systems performance comparison by Yahoo! Research

Page 16: phpXperts seminar 2010 CodeMan! with noSQL!

What are others talking about?

Or who has skipped the relational database?

Page 17: phpXperts seminar 2010 CodeMan! with noSQL!

At Facebook!

• Facebook! Around 140 nodes!

• Facebook open sourced Cassandra!

Page 18: phpXperts seminar 2010 CodeMan! with noSQL!

Digg.com announced its move to Cassandra!

"The fundamental problem is endemic to the relational database mindset, which places the burden of computation on reads rather than writes. This is completely wrong for large-scale web applications, where response time is critical. It’s made much worse by the serial nature of most applications. Each component of the page blocks on reads from the data store, as well as the completion of the operations that come before it. Non-relational data stores reverse this model completely, because they don’t have the complex read operations of SQL."

Read at home!
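
The idea in the quote, moving the burden of computation from reads to writes, can be sketched with two tiny in-memory stores. This is only an illustration under made-up assumptions: the class names, the "follow" and "timeline" example and the data are hypothetical and are not taken from Digg's system.

```python
from collections import defaultdict


class ReadHeavyStore:
    """Relational-style: writes are cheap, every read assembles the page."""

    def __init__(self):
        self.follows = defaultdict(set)  # reader -> authors they follow
        self.posts = []                  # (author, text) in write order

    def follow(self, reader, author):
        self.follows[reader].add(author)

    def write_post(self, author, text):
        self.posts.append((author, text))

    def read_timeline(self, reader):
        # The "join" runs here, on every single page view.
        followed = self.follows[reader]
        return [text for author, text in self.posts if author in followed]


class WriteHeavyStore:
    """Non-relational-style: the work is done once, at write time."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> readers
        self.timelines = defaultdict(list)  # reader -> ready-made timeline

    def follow(self, reader, author):
        self.followers[author].add(reader)

    def write_post(self, author, text):
        # Copy the post into each follower's precomputed timeline.
        for reader in self.followers[author]:
            self.timelines[reader].append(text)

    def read_timeline(self, reader):
        # A read is a single key lookup, no scanning or joining.
        return list(self.timelines[reader])
```

Both stores return the same timeline for a reader who followed before the posts were written; the difference is where the cost is paid. The second store trades extra writes and duplicated data for fast reads.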

Page 19: phpXperts seminar 2010 CodeMan! with noSQL!

Twitter moved their statuses to Cassandra!

"We have a lot of data, the growth factor in that data is huge and the rate of growth is accelerating. We have a system in place based on shared mysql + memcached but its quickly becoming prohibitively costly (in terms of manpower) to operate. We need a system that can grow in a more automated fashion and be highly available."

Read at home!

Page 20: phpXperts seminar 2010 CodeMan! with noSQL!

A few related references!

• Check out Yahoo! Research's “Cloud Serving Benchmark”

• MySQL and Memcached: End of An era?

• An article on how Yoshinori scaled MySQL as noSQL to serve 750,000 qps!

Page 21: phpXperts seminar 2010 CodeMan! with noSQL!

Time is almost running out!! Let's taste some noSQL curry!

Page 22: phpXperts seminar 2010 CodeMan! with noSQL!

List of available structured databases

• HBase – Yahoo!

• Voldemort – used in LinkedIn

• MongoDB

• MemcacheDB

• Riak

• Redis

• Cassandra – Facebook, Twitter, Digg, Rackspace, Reddit

• Hypertable

Page 23: phpXperts seminar 2010 CodeMan! with noSQL!

Let's go with Cassandra! (cassandra.apache.org)

Page 24: phpXperts seminar 2010 CodeMan! with noSQL!

Why Cassandra!

• Tested (Facebook, Twitter, Reddit, Digg, Rackspace etc.)

• Decentralized, with no single point of failure

• Flexible schema

• Elastic

• Durable, data center and disaster management aware

• Highly scalable writes
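
"Decentralized" and "elastic" are easier to picture with a toy hash ring. The sketch below is a simplification under stated assumptions: node names, the Ring class and its methods are hypothetical, and real Cassandra partitioning and replication are considerably more involved. It only shows the general idea that every node can work out where a key lives, and that a key's next replica takes over when a node disappears.

```python
import bisect
import hashlib


def token_for(key):
    """Map a key (or node name) to a position on the ring with an MD5 hash."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy ring: a key belongs to the first node at or after its token."""

    def __init__(self, nodes):
        self._tokens = sorted((token_for(name), name) for name in nodes)

    def nodes_for(self, key, replicas=1):
        """Return the owner of the key followed by the next nodes clockwise."""
        start = bisect.bisect_left(self._tokens, (token_for(key), ""))
        count = min(replicas, len(self._tokens))
        size = len(self._tokens)
        return [self._tokens[(start + i) % size][1] for i in range(count)]

    def without(self, node):
        """Return a new ring that no longer contains the given node."""
        return Ring([name for _, name in self._tokens if name != node])


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"])
    owners = ring.nodes_for("user:42", replicas=2)
    print("replicas:", owners)
    print("after losing", owners[0], "->", ring.without(owners[0]).nodes_for("user:42"))
```

There is no master node in this picture: any node that knows the ring can route a request, which is the property the "no single point of failure" bullet refers to.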

Page 25: phpXperts seminar 2010 CodeMan! with noSQL!

PHP and Cassandra!

Using the phpcassa library

Page 26: phpXperts seminar 2010 CodeMan! with noSQL!

Install Cassandra

• Download Cassandra from http://cassandra.apache.org/

• Ensure you have a Java runtime on your PC

Page 27: phpXperts seminar 2010 CodeMan! with noSQL!

Create a keyspace and column family

• Extract your downloaded Cassandra archive

• Open the “conf/storage-conf.xml” file in your text editor

• Go to the “Keyspaces” block

• Add the “AddressBook” keyspace

Page 28: phpXperts seminar 2010 CodeMan! with noSQL!

Configuration!

<Keyspace Name="AddressBook">
  <ColumnFamily Name="Addresses" CompareWith="TimeUUIDType"/>
  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
  <ReplicationFactor>1</ReplicationFactor>
  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>
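
CompareWith decides how the column names inside a row are sorted, and TimeUUIDType means the names are time-based (version 1) UUIDs ordered by their embedded timestamp. The sketch below shows why that differs from plain string sorting. It is an illustration in Python, not Cassandra code: the helper names, the node value and the two sample timestamps are made up.

```python
import uuid


def time_uuid(timestamp, node=0x0242AC110002):
    """Build a version 1 UUID from a 60-bit timestamp (node is a placeholder)."""
    return uuid.UUID(fields=(
        timestamp & 0xFFFFFFFF,                 # time_low
        (timestamp >> 32) & 0xFFFF,             # time_mid
        ((timestamp >> 48) & 0x0FFF) | 0x1000,  # time_hi with version 1 bits
        0x80,                                   # clock_seq_hi with RFC 4122 variant
        0x00,                                   # clock_seq_low
        node,
    ))


def sort_columns_by_time(row):
    """Order a row's columns the way a time-based comparator would."""
    return sorted(row.items(), key=lambda item: item[0].time)


# Two column names one tick apart; the later one sorts FIRST as a string,
# because the low bits of the timestamp come first in the text form.
early = time_uuid(0x0FFFFFFFF)
late = time_uuid(0x100000000)
row = {late: "second address", early: "first address"}

if __name__ == "__main__":
    print([value for _, value in sort_columns_by_time(row)])
    print([row[name] for name in sorted(row, key=str)])
```

In everyday use the column names would come from a generator such as Python's uuid.uuid1(); the hand-built values here only make the ordering easy to see.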

Page 29: phpXperts seminar 2010 CodeMan! with noSQL!

Kick start Cassandra!

• $ cd apache-cassandra-0.6.x

• $ ./bin/cassandra

• Get the PHP Thrift library – phpcassa

• https://github.com/hoan/phpcassa/

Page 30: phpXperts seminar 2010 CodeMan! with noSQL!

Show code!!

• Inserting data into Cassandra!

• Listing all added data

• Removing existing data
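
The live demo code is not part of this transcript. As a rough stand-in, here is an in-memory sketch of the three operations on an "Addresses" column family. It is written in Python and is NOT the phpcassa API: the ColumnFamily class, its method names and the sample address are all hypothetical, and nothing here talks to a real Cassandra node.

```python
class ColumnFamily:
    """In-memory stand-in for a column family: row key -> {column: value}."""

    def __init__(self, name):
        self.name = name
        self._rows = {}

    def insert(self, key, columns):
        # Columns are merged into the row; rows need not share a schema.
        self._rows.setdefault(key, {}).update(columns)

    def get(self, key):
        # Return a copy so callers cannot mutate the stored row.
        return dict(self._rows.get(key, {}))

    def list_all(self):
        # Return every row, ordered by row key.
        return {key: dict(cols) for key, cols in sorted(self._rows.items())}

    def remove(self, key):
        # Removing a missing key is a no-op.
        self._rows.pop(key, None)


if __name__ == "__main__":
    addresses = ColumnFamily("Addresses")
    addresses.insert("hasan", {"city": "Dhaka", "blog": "http://hasan.we4tech.com"})
    addresses.insert("hasan", {"twitter": "we4tech"})  # adds a column later
    print(addresses.list_all())
    addresses.remove("hasan")
    print(addresses.list_all())
```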

Page 31: phpXperts seminar 2010 CodeMan! with noSQL!

More about Cassandra!

• All nodes need to be connected over a low-latency fiber network

• Still in alpha version! Might have issues!

• High-RPM hard disks (e.g. 15,000 or 10,000 RPM)

Page 32: phpXperts seminar 2010 CodeMan! with noSQL!

Before you move, think about common sense!

Page 33: phpXperts seminar 2010 CodeMan! with noSQL!

Common sense!

• Avoid Big Design Up Front!

• Benchmark your existing system performance!

• Experiment and calculate cost!

• Structured databases and Dr. Eric Brewer's CAP theorem

– Consistency - Is the data I’m looking at now the same if I look at it somewhere else?

– Availability - What happens if my database goes down?

– Partitioning - What if my data is on different networks?
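
The trade-off behind the three CAP questions can be sketched with two replicas and a network split. This is a toy model under made-up assumptions (the TwoNodeStore class and its "prefer" setting are hypothetical), not how any particular database implements it: while the replicas cannot reach each other, a store either refuses writes to stay consistent or accepts them and lets one replica go stale.

```python
class TwoNodeStore:
    """Two replicas, A and B; clients write through A and may read from B."""

    def __init__(self, prefer):
        if prefer not in ("consistency", "availability"):
            raise ValueError("prefer must be 'consistency' or 'availability'")
        self.prefer = prefer
        self.a = {}
        self.b = {}
        self.partitioned = False  # True while A and B cannot talk

    def write(self, key, value):
        if not self.partitioned:
            # Healthy network: both replicas get the value.
            self.a[key] = value
            self.b[key] = value
        elif self.prefer == "consistency":
            # Refuse the write rather than let the replicas disagree.
            raise RuntimeError("write refused: replicas cannot agree")
        else:
            # Stay available: accept on A, B is stale until the split heals.
            self.a[key] = value

    def read_from_b(self, key):
        return self.b.get(key)
```

With prefer="availability" a write during the split succeeds but read_from_b still returns the old value; with prefer="consistency" the same write raises an error instead.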

Page 34: phpXperts seminar 2010 CodeMan! with noSQL!

That's it for today :) Thanks!

Now you guys should treat me ;)

Page 35: phpXperts seminar 2010 CodeMan! with noSQL!

Find out the best restaurant in town!

Passionate food reviewers' community!

http://restaurant.welltreat.us

Page 36: phpXperts seminar 2010 CodeMan! with noSQL!

Who am I?

- nhm tanveer hossain khan (hasan)

IT Director, Tasawr Interactive

Blog: http://hasan.we4tech.com

Twitter: http://twitter.com/we4tech

love programming, used to write code in Ruby, Java and PHP!

Page 37: phpXperts seminar 2010 CodeMan! with noSQL!

NOW Q/A