SCALABLE WEB PROGRAMMING · 2010. 2. 3. · Attribute Model Benefits Trivial to manage objects Easy...
SCALABLE WEB PROGRAMMING
CS193S - Jan Jannink - 2/02/10
Weekly Syllabus
1. Scalability: (Jan.)
2. Agile Practices
3. Ecology/Mashups
4. Browser/Client
5. Data/Server: (Feb.)
6. Security/Privacy
7. Analytics*
8. Cloud/Map-Reduce
9. Publish APIs: (Mar.)*
10. Future
* assignment due
Data is the Core
Maybe I should just go back and rename the course
Data Storage, Access, Transport, Presentation
keep it generic
design for incremental system growth
avoid unbounded growth at any layer
duplicate elimination, query filtering
Data Storage
Reliable Persistence
almost every other DB feature is overkill in web apps
Simplicity/Genericity
avoid a system that grows more complex over time
Recoverability
Backups are great but not the first line of defense
Data Scalability
Data access spreading
Balance reads to writes
Data set partitioning
Parallel Access
Hot Spots
Data Caching, Randomizing Keys
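A minimal sketch of key randomizing, assuming sequential numeric IDs and a fixed partition count (the names `partition_for`, `range_partition_for`, and `NUM_PARTITIONS` are hypothetical, not from the slides). Hashing the key before picking a partition spreads neighbouring IDs across servers, so a burst of recently created IDs does not land on a single hot partition.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Pick a partition from a stable hash of the key (randomized placement)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def range_partition_for(numeric_id: int, ids_per_partition: int = 1000) -> int:
    """Naive range partitioning: consecutive IDs share a partition."""
    return numeric_id // ids_per_partition


# The 200 most recently created IDs, which tend to be the most requested.
recent_ids = range(3000, 3200)
range_load = Counter(range_partition_for(i) for i in recent_ids)
hashed_load = Counter(partition_for(f"user:{i}") for i in recent_ids)
```

With range partitioning every recent ID falls in one partition, while the hashed placement distributes the same IDs over several partitions.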
Database Scalability
Keep schemas ultra generic
consider storing all data in a single table
Constraint management often works against availability
increases the number of query errors
Caching is key
commonly accessed data accounts for majority of requests
Flat File to Data Center
Single table, single server
Distributed in memory cache
Master-slave replication, single master
Table partitioning, multiple masters
Read partitioning, multiple locations
Attribute Data Store
Basic data tuples
(ID, Content)
equivalent to a hashtable
(ID1, ID2, Content)
complete for representation of semi structured data
similar to RDF data model
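The two tuple shapes can be sketched in a few lines, assuming an in-memory stand-in for the single table (the class `AttributeStore` and the sample IDs are hypothetical illustrations). An (ID, Content) store is a plain dictionary; the (ID1, ID2, Content) form adds a second key, here used as the attribute name, and a link is simply a tuple whose content is another ID.

```python
from collections import defaultdict


class AttributeStore:
    """In-memory stand-in for one table of (id1, id2, content) tuples."""

    def __init__(self):
        # id1 -> {id2 -> content}
        self._rows = defaultdict(dict)

    def put(self, id1, id2, content):
        """Insert or overwrite the tuple identified by (id1, id2)."""
        self._rows[id1][id2] = content

    def get(self, id1, id2):
        """Return the content for (id1, id2), or None if absent."""
        return self._rows.get(id1, {}).get(id2)

    def attributes(self, id1):
        """Return a copy of all (id2 -> content) pairs stored for id1."""
        return dict(self._rows.get(id1, {}))


store = AttributeStore()
store.put("user:1", "name", "Ada")
store.put("user:1", "follows", "user:2")  # a link: the content is another ID
store.put("user:2", "name", "Grace")

# Follow the link from user:1, then read an attribute of the target object.
followed = store.get("user:1", "follows")
followed_name = store.get(followed, "name")
```

Adding a new kind of attribute needs no schema change, only a new value for the second key.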
Attribute Graph Model
[Diagram: object ID #1 (attr 1, attr 2) and object ID #2 (attr 1, attr 3), with a link to ID #2; tuple labels: ID #1, Attribute ID, Value]
Attribute Model Benefits
Trivial to manage objects
Easy to repair broken constraints
Trivial to partition tables
Natural to support huge data graphs
Automatically support every new feature
future proof
Drawbacks
Schema does not guide query style
Semantics buried in object and attribute definitions
Need to encode these semantics in the server code
Some advance planning needed for data path design
Agile Data
Read/write ratio is near 10/1
80-20 access pattern
20% of data accounts for 80% of access
Construct pages from no more than 2 DB queries
reassess page or data design otherwise
Future proof your design by not locking into a complex schema
Agile App Design
Make the data path the core of the system
Design data access API to allow different backends
ease transition to different clouds
Centralize access methods into a few classes at most
simplify addition of an in memory cache
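One way this could look, as a sketch under assumptions: all names below (`Backend`, `InMemoryBackend`, `DataAccess`) are hypothetical, and the in-memory backend stands in for a real database or cloud store. Because every read and write goes through one class, a cache can be added there without touching the rest of the application, and a backend can be swapped by passing a different `Backend` implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class Backend(ABC):
    """Storage interface; each real backend would implement these two calls."""

    @abstractmethod
    def read(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def write(self, key: str, value: str) -> None: ...


class InMemoryBackend(Backend):
    """Stand-in backend that also counts reads, to show the cache's effect."""

    def __init__(self):
        self.rows: Dict[str, str] = {}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class DataAccess:
    """Single entry point for data access, with a simple in-process cache."""

    def __init__(self, backend: Backend):
        self._backend = backend
        self._cache: Dict[str, str] = {}

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._backend.read(key)
        if value is not None:
            self._cache[key] = value
        return value

    def put(self, key, value):
        self._backend.write(key, value)
        self._cache[key] = value  # write-through keeps the cache current
```

The cache here is a per-process dictionary with no eviction; a shared cache would replace `_cache` behind the same two methods.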
Rapid Prototyping vs. Scale
Most sites are built front to back, UI first, back end last
pressure from investors to demo
we know better what we can see in front of us
Ruby on Rails ‘magically’ generates DB schemas
gets apps out the door fast
difficult to start from data centered design
Extreme Programming Conundrum
Main Principle: don’t design more than immediate needs
Main Caveat: don’t make the same mistake twice
Main Compromise
don’t build more than what you need
learn how to design minimalist systems that don’t dead end
Twitter Example
Basic idea: put IM status on the web
extreme case of long tail data access
Largest Ruby on Rails system
scheduled downtime
limited feature growth
data access APIs are all throttled
Agile Cache Design
Store objects either as raw DB rows or as server objects (or both)
use ID as key
optimize for access pattern
read only => DB rows, frequent updates => server objects
Store entire query results too
use query string or hash as key
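The two cache levels can be sketched as follows, assuming rows are dictionaries with an `id` field and that the caller supplies the function that actually runs the query (`run_query`, `query_key`, and `cached_query` are hypothetical names). Objects are cached under their ID, and whole result sets are cached under a hash of the query string and its parameters.

```python
import hashlib

object_cache = {}  # object ID -> row
query_cache = {}   # hash of query string and parameters -> list of rows


def query_key(sql, params=()):
    """Build a cache key from the query string and its parameters."""
    raw = sql + "|" + repr(tuple(params))
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def cached_query(sql, params, run_query):
    """Return cached results if present; otherwise run the query and cache it."""
    key = query_key(sql, params)
    if key not in query_cache:
        rows = run_query(sql, params)
        query_cache[key] = rows
        for row in rows:
            # Also cache each returned object under its own ID.
            object_cache[row["id"]] = row
    return query_cache[key]
```

This sketch never invalidates entries; a real cache would need to drop or refresh cached query results when the underlying rows change.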
Agile Parallelization
Worth starting at the Webserver level
round robin routing is usually sufficient
lock users to a given server
associate closely linked content to closer web servers
extend CDN (Content Delivery Network) concept
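A small sketch of the two routing styles named above, assuming a fixed list of web servers (the server names and both function names are hypothetical). Round robin hands out servers in turn; locking a user to a server is done here by hashing the user ID, so the same user always reaches the same server.

```python
import itertools
import zlib

SERVERS = ["web1", "web2", "web3"]  # hypothetical server pool
_round_robin = itertools.cycle(SERVERS)


def next_server() -> str:
    """Round robin: return the next server in the pool, wrapping around."""
    return next(_round_robin)


def server_for_user(user_id: str) -> str:
    """Sticky routing: the same user ID always maps to the same server."""
    index = zlib.crc32(user_id.encode("utf-8")) % len(SERVERS)
    return SERVERS[index]
```

Sticky routing keeps a user's cached data on one server, at the cost of remapping most users whenever the pool size changes.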
Master Slave DB Concepts
Start with one DB server
about 10 reads per write
Add extra DBs
writes copied by log file
End with 10 identical DBs
1 read per write at full load
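The arithmetic behind these numbers can be checked with a short calculation, assuming reads are spread evenly over all servers while every write is replayed on every server (the function name and the sample traffic figures are hypothetical).

```python
def per_server_load(reads: float, writes: float, replicas: int):
    """Return (reads, writes) handled by each server in a replicated group.

    Reads are divided among the replicas; each replica applies every write.
    """
    return reads / replicas, writes


# Example traffic with the 10:1 read/write ratio from the slide.
for replicas in (1, 2, 5, 10):
    reads, writes = per_server_load(reads=1000, writes=100, replicas=replicas)
    print(replicas, "servers:", reads / writes, "reads per write on each server")
```

At 10 servers each one handles as many writes as reads, so adding further replicas mostly adds write work; this is the point where partitioning becomes the next step.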
New Frontier: Autopartition
Route queries to DB servers by key
When a server reaches an access or query speed threshold
Bring up standby DB servers
Copy tables
Split DB key space evenly
Update DB client routing table
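The routing and split steps above can be sketched as a client-side routing table, assuming integer keys in the range 0 to `key_space - 1` and contiguous key ranges per server (the class `RangeRouter` and the server names are hypothetical). Threshold detection and the table copy are left out; only the key space split and routing table update are shown.

```python
import bisect


class RangeRouter:
    """Routes integer keys to DB servers by contiguous key ranges."""

    def __init__(self, key_space: int, first_server: str):
        self._key_space = key_space
        self._starts = [0]            # sorted lower bound of each range
        self._servers = [first_server]

    def route(self, key: int) -> str:
        """Return the server whose range contains the key."""
        idx = bisect.bisect_right(self._starts, key) - 1
        return self._servers[idx]

    def split(self, server: str, standby: str) -> int:
        """Give the upper half of a server's range to a standby server.

        Returns the first key now owned by the standby server.
        """
        idx = self._servers.index(server)
        lo = self._starts[idx]
        if idx + 1 < len(self._starts):
            hi = self._starts[idx + 1]
        else:
            hi = self._key_space
        mid = (lo + hi) // 2
        self._starts.insert(idx + 1, mid)
        self._servers.insert(idx + 1, standby)
        return mid


router = RangeRouter(key_space=1000, first_server="db1")
first_split = router.split("db1", "db2")    # db1 keeps the lower half
second_split = router.split("db2", "db3")   # db2 keeps its lower half
```

In a real system the routing table would only be updated after the standby server has finished copying the tables for its range.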
Worth Checking Out
Memcached
http://memcached.org/
MySQL replication
http://dev.mysql.com/doc/refman/5.5/en/replication.html
RDF
http://www.w3.org/TR/rdf-concepts/
Q & A Topics
Data Loss, Downtime, Backups
Index and query optimizing
when to do it
Other architectures
document oriented DBs
column oriented DBs