Hbase +Fosdem+2010+Nosql 2
8/8/2019 Hbase +Fosdem+2010+Nosql 2
My Life with HBase
Lars George, CTO of WorldLingo
Apache Hadoop HBase Committer
www.worldlingo.com www.larsgeorge.com
-
WorldLingo
Co-founded 1999
Machine Translation Services
Professional Human Translations
Offices in US and UK
Microsoft Office Provider since 2001
Web based services
Customer Projects
Multilingual Archive
-
Multilingual Archive
SOAP API
Simple calls
putDocument()
getDocument()
search()
command()
putTransformation()
getTransformation()
-
Multilingual Archive (cont.)
Planned already, implemented as customer project
Scale:
500 million documents
Random Access
100% Uptime
Technologies?
Database
Zip-Archives on file system, or Hadoop
-
RDBMS Woes
Scaling MySQL hard, Oracle expensive (and hard)
Machine cost goes up faster than speed
Turn off all relational features to scale
Turn off secondary indexes too
Tables can be a problem at sizes as low as 500GB
Hard to read data quickly at these sizes
Write speed degrades with table size
Future growth uncertain
-
MySQL Limitations
Master becomes a problem
What if your write speed is greater than a single machine can handle?
All slaves must have the same write capacity as the master (can't cheap out on slaves)
Single point of failure, no easy failover
Can (sort of) solve this with sharding
-
Sharding
-
Sharding Problems
Requires either a hashing function or mapping table to determine shard
Data access code becomes complex
What if shard sizes become too large?
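A minimal sketch of the hashing-function approach, to make the resharding cost concrete. It assumes plain modulo hashing over MD5, which is only one possible scheme; `shard_for` and `moved_fraction` are hypothetical helper names, not part of MySQL or any sharding library.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (MD5, purely illustrative)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys
        if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)

keys = [f"doc-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 shards")
```

With modulo hashing, adding a single shard relocates most keys, which is why growing a sharded setup usually means a bulk data migration plus changes to the data access code.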
-
Resharding
-
Schema Changes
What about schema changes or migrations?
MySQL not your friend here
Only gets harder with more data
-
HBase to the Rescue
Clustered, commodity(-ish) hardware
Mostly schema-less
Dynamic distribution
Spreads writes out over the cluster
-
HBase
Distributed database modeled on Bigtable
Bigtable: A Distributed Storage System for Structured Data by Chang et al.
Runs on top of Hadoop Core
Layers on HDFS for storage
Native connections to MapReduce
Distributed, High Availability, High Performance, Strong Consistency
http://labs.google.com/papers/bigtable.html
-
HBase
Column-oriented store
Wide table costs only the data stored
NULLs in row are 'free'
Good compression: columns of similar type
Column name is arbitrary
Rows stored in sorted order
Can random read and write
Goal of billions of rows X millions of cells
Petabytes of data across thousands of servers
-
Tables
Table is split into roughly equal sized regions
Each region is a contiguous range of keys, from [start, end)
Regions split as they grow, thus dynamically adjusting to your data set
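The region idea can be sketched as a toy model: each region owns the keys in [start, end), lookups are a binary search over the region start keys, and a region that grows too large splits at its middle key. This is an illustration under simplifying assumptions, not HBase code: the `Table` class is hypothetical, and it splits on row count where HBase splits on stored size.

```python
import bisect

class Table:
    """Toy model: a table as sorted regions, each owning keys in [start, end)."""

    def __init__(self, max_region_rows: int = 4):
        self.starts = [""]      # start key of each region; "" sorts before every key
        self.regions = [[]]     # sorted row keys held by each region
        self.max_region_rows = max_region_rows

    def region_index(self, key: str) -> int:
        # The owning region is the last one whose start key is <= key.
        return bisect.bisect_right(self.starts, key) - 1

    def put(self, key: str) -> None:
        i = self.region_index(key)
        rows = self.regions[i]
        pos = bisect.bisect_left(rows, key)
        if pos == len(rows) or rows[pos] != key:
            rows.insert(pos, key)
        if len(rows) > self.max_region_rows:
            self._split(i)

    def _split(self, i: int) -> None:
        rows = self.regions[i]
        mid = len(rows) // 2
        # The middle key becomes the start key of the new upper region.
        self.starts.insert(i + 1, rows[mid])
        self.regions[i:i + 1] = [rows[:mid], rows[mid:]]

table = Table(max_region_rows=4)
for key in ["apple", "kiwi", "banana", "pear", "cherry", "fig", "mango"]:
    table.put(key)

print(table.starts)   # region boundaries after two splits
```

Because regions are key ranges rather than hash buckets, a split only touches the one region that grew; every other key stays where it was.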
-
Tables (cont.)
Tables are sorted by Row
Table schema defines column families
Families consist of any number of columns
Columns consist of any number of versions
Everything except table name is byte[]
(Table, Row, Family:Column, Timestamp) -> Value
-
Tables (cont.)
As a data structure
SortedMap(
  RowKey, List(
    SortedMap(
      Column, List(
        Value, Timestamp
      )
    )
  )
)
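The nested map above can be approximated in a few lines of Python. This is an in-memory sketch only: `ToyTable` and its methods are hypothetical names chosen for illustration, not the HBase client API, and sorting happens at read time rather than being maintained on write.

```python
import time
from collections import defaultdict

class ToyTable:
    """In-memory stand-in for the nested map: row -> column -> versions."""

    def __init__(self):
        # row key -> b"family:qualifier" -> {timestamp: value}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts=None):
        if ts is None:
            ts = int(time.time() * 1000)   # default to "now" in milliseconds
        self.rows[row][column][ts] = value

    def get(self, row, column, max_versions=1):
        """Return up to max_versions (timestamp, value) pairs, newest first."""
        versions = self.rows.get(row, {}).get(column, {})
        return sorted(versions.items(), reverse=True)[:max_versions]

    def scan(self, start_row=b"", stop_row=None):
        """Yield row keys in sorted order within [start_row, stop_row)."""
        for row in sorted(self.rows):
            if row < start_row:
                continue
            if stop_row is not None and row >= stop_row:
                break
            yield row

t = ToyTable()
t.put(b"row2", b"cf:a", b"v1", ts=1)
t.put(b"row2", b"cf:a", b"v2", ts=2)
t.put(b"row1", b"cf:b", b"x", ts=1)

print(t.get(b"row2", b"cf:a", max_versions=2))
print(list(t.scan()))
```

Absent columns take no space in this structure, which mirrors the point that NULLs in a row are 'free'.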
-
Server Architecture
Similar to HDFS
Master is like the Namenode
Regionserver is like the Datanode
Often run these alongside each other!
Difference: HBase stores state in HDFS
HDFS provides robust data storage across machines, insulating against failure
Master and Regionserver fairly stateless and machine independent
-
Region Assignment
Each region from every table is assigned to a Regionserver
Master Duties:
Responsible for assignment and handling regionserver problems (if any!)
When machines fail, move regions
When regions split, move regions to balance
Could move regions to respond to load
Can run multiple backup masters
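The "when machines fail, move regions" duty can be sketched as follows. The least-loaded-first policy, the server names, and the region names are all assumptions made for illustration; the slides do not describe the master's actual balancing algorithm.

```python
def reassign_regions(assignment, failed_server):
    """Return a new assignment with the failed server's regions moved to the
    least-loaded surviving servers. `assignment` maps server -> list of regions."""
    survivors = {s: list(r) for s, r in assignment.items() if s != failed_server}
    if not survivors:
        raise RuntimeError("no regionservers left to take over regions")
    for region in assignment.get(failed_server, []):
        # Pick the survivor currently holding the fewest regions (ties: by name).
        target = min(survivors, key=lambda s: (len(survivors[s]), s))
        survivors[target].append(region)
    return survivors

assignment = {
    "rs1": ["docs,a", "docs,m"],
    "rs2": ["docs,g"],
    "rs3": ["index,a", "index,k", "index,t"],
}
after = reassign_regions(assignment, "rs3")
for server, regions in sorted(after.items()):
    print(server, regions)
```

Only the assignment changes; the region data itself lives in HDFS, so the surviving servers can open the moved regions without copying anything from the failed machine.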
-
Master
The master does NOT
Handle any write requests (not a DB master!)
Handle location finding requests
Not involved in the read/write path
Generally does very little most of the time
-
Distributed Coordination
Zookeeper is used to manage master election and server availability
Set up as a cluster, provides distributed coordination primitives
An excellent tool for building cluster management systems
-
HBase Storage Architecture
-
HBase Public Timeline
November 2006: Google releases paper on Bigtable
February 2007: Initial HBase prototype created as Hadoop contrib
October 2007: First "useable" HBase (Hadoop 0.15.0)
December 2007: First HBase User Group
January 2008: Hadoop becomes TLP, HBase becomes subproject
October 2008: HBase 0.18.1 released
January 2009: HBase 0.19.0 released
September 2009: HBase 0.20.0 released
-
HBase WorldLingo Timeline
-
HBase - Example
Store web crawl data
Table crawl with family content
Row is URL with columns
content:data stores raw crawled data
content:language stores http language header
content:type stores http content-type header
If processing raw data for hyperlinks and images, add families links and images
links: column for each hyperlink
images: column for each image
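A small sketch of what one row of the crawl table might hold. Using the target URL as the column qualifier and an empty value for link and image columns are assumptions; the slide only says there is one column per hyperlink or image. `crawl_row` and `columns_in_family` are hypothetical helpers.

```python
def crawl_row(url, body, language, content_type, links=(), images=()):
    """Build the cells for one crawled page, keyed by 'family:qualifier'."""
    cells = {
        "content:data": body,
        "content:language": language,
        "content:type": content_type,
    }
    # One column per discovered hyperlink or image (qualifier = target URL, assumed).
    for link in links:
        cells[f"links:{link}"] = b""
    for image in images:
        cells[f"images:{image}"] = b""
    return url, cells

def columns_in_family(cells, family):
    """List the column names that belong to one family, in sorted order."""
    return sorted(c for c in cells if c.startswith(family + ":"))

row_key, cells = crawl_row(
    "http://example.com/",
    b"<html>...</html>",
    "en",
    "text/html",
    links=["http://example.com/a", "http://example.com/b"],
    images=["http://example.com/logo.png"],
)
print(row_key, columns_in_family(cells, "links"))
```

Each page can carry a different number of link and image columns without any schema change, since only the families are declared up front.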
-
HBase - Clients
Native Java client/API
get(Get get)
put(Put put)
Non-Java clients
Thrift server (Ruby, C++, Erlang, etc.)
REST server (Stargate)
TableInput/TableOutputFormat for MapReduce
HBase shell (JRuby)
-
Scaling HBase
Add more machines to scale
Automatic rebalancing
Base model (Bigtable) scales past 1000TB
No inherent reason why HBase couldn't
-
What to store in HBase
Maybe not your raw log data...
... but the results of processing it with Hadoop!
By storing the refined version in HBase, can keep up with huge data demands and serve to your website
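To illustrate the raw-versus-refined distinction, here is a sketch in which a plain Python loop stands in for the Hadoop job. The log line format, the `stats:hits` column name, and the `refine` helper are invented for the example.

```python
from collections import Counter

def refine(log_lines):
    """Reduce raw access-log lines to one hit count per URL path."""
    hits = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:
            hits[parts[1]] += 1   # assumed format: "<timestamp> <path> <status>"
    return hits

raw_log = [
    "2010-02-06T10:00:01 /index.html 200",
    "2010-02-06T10:00:02 /about.html 200",
    "2010-02-06T10:00:05 /index.html 200",
]

# The refined result is small and keyed for random access: one row per path.
rows = {path: {"stats:hits": count} for path, count in refine(raw_log).items()}
print(rows)
```

The website then reads one small row per page instead of scanning the raw log.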
-
!HBase
NoSQL Database!
No joins
No sophisticated query engine
No transactions (sort of)
No column typing
No SQL, no ODBC/JDBC, etc. (but there is HBql now!)
Not a replacement for your RDBMS...
Matching Impedance!
-
Why HBase?
Datasets are reaching Petabytes
Traditional databases are expensive to scale and difficult to distribute
Commodity hardware is cheap and powerful (but HBase can make use of powerful machines too!)
Need for random access and batch processing (which Hadoop does not offer)
-
Numbers
Single reads are 1-10ms depending on disk seeks and caching
Scans can return hundreds of rows in dozens of ms
Serial read speeds
-
Multilingual Archive (cont.)
44 Dell PE SC1435, 12GB RAM, 2 x 1TB SATA drives
Java 6
Tomcat 5.5
88 Xen domUs
Apache
Hadoop/HBase
Tomcat application servers
Currently split into two clusters
-
Lucene Search Server
43 fields indexed
166GB size
Automated merging/warm-up/swap
Looking into scalable solution
Katta
Hyper Estraier
DLucene
Sorting?
-
Multilingual Archive (cont.)
5 Tables
Up to 5 column families
XML Schemas
Automated table schema updates
Standard options tweaked over time
Garbage Collection!
MemCached(b) layer
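One common way to put a cache layer in front of a store is the cache-aside pattern, sketched below. The slides do not say which caching strategy the archive uses, so treat this as an assumption; plain dictionaries stand in for both Memcached and HBase, and `CachedStore` is a hypothetical name.

```python
class CachedStore:
    """Cache-aside reads: try the cache first, fall back to the backing store."""

    def __init__(self, store, cache=None):
        self.store = store                            # stand-in for HBase
        self.cache = {} if cache is None else cache   # stand-in for Memcached
        self.store_reads = 0                          # counts cache misses

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.store_reads += 1
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value    # fill the cache for later reads
        return value

    def put(self, key, value):
        self.store[key] = value
        self.cache.pop(key, None)      # invalidate so the next read is fresh

docs = CachedStore({"doc1": b"hello"})
docs.get("doc1")
docs.get("doc1")
print(docs.store_reads)   # only the first read reached the store
```

Repeated reads of popular documents never reach the data layer, which leaves the store's capacity for the long tail and for batch jobs.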
-
Layers
[Diagram: layers Network, LWS, Web, App, Cache, Data; components Firewall, Director 1..n, Apache 1..n, Tomcat 1..n, MemCached 1..n, HBase]
-
Map/Reduce
Backup/Restore
Index building
Cache filling
Mapping
Updates
Translation
-
HBase - Problems
Early versions (before HBase 0.19.0!)
Data loss
Migration nightmares
Slow performance
Current version
Read HBaseWiki!!!
Single point of failure (name node only!)
-
HBase - Notes
RTF(ine)M
HBaseWiki, IRC Channel
Personal Experience:
Max. file handles (32k+)
Hadoop xceiver limits (NIO?)
Redundant meta data (on name node)
RAM (4GB+)
Deployment strategy
Garbage collection (use CMS, G1?)
Maybe not mix batch and interactive?
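The file handle note is easy to check on a Unix machine. The sketch below reads the open-file limit of the current Python process; it is only meaningful for the Hadoop and HBase daemons if run as the same user under the same limits, and the 32k threshold is the figure from the slide, not an official requirement. `nofile_limit_ok` is a hypothetical helper.

```python
import resource

RECOMMENDED_NOFILE = 32 * 1024   # the "32k+" figure from the slide

def nofile_limit_ok(minimum=RECOMMENDED_NOFILE):
    """Return (ok, soft_limit) for this process's open-file limit (Unix only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    ok = soft == resource.RLIM_INFINITY or soft >= minimum
    return ok, soft

ok, soft = nofile_limit_ok()
print(f"open-file soft limit: {soft} ({'ok' if ok else 'below 32k'})")
```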
-
Graphing
Use supplied Ganglia context or JMX bridge to enable Nagios and Cacti
JMXToolkit: swiss army knife for JMX enabled servers: http://github.com/larsgeorge/jmxtoolkit
-
HBase - Roadmap
HBase 0.20.x: Performance
New Key Format: KeyValue
New File Format: HFile
New Block Cache: Concurrent LRU
New Query and Result API
New Scanners
Zookeeper Integration: No SPOF in HBase
New REST Interface
Contrib
Transactional Tables
Secondary Indexes
Stargate
-
HBase - Roadmap (cont.)
HBase 0.21.x: Advanced Concepts
Master Rewrite: More Zookeeper
New RPC Protocol (Avro)
Multi-DC Replication
Intra Row Scanning
Further optimizations on algorithms and data structures
Discretionary Access Control
Coprocessors
-
Questions?
Email: [email protected]@apache.org
Blog: www.larsgeorge.com
Twitter: larsgeorge