HBase User Group #9: HBase and HDFS


HBase and HDFS

Todd Lipcon
todd@cloudera.com
Twitter: @tlipcon

#hbase IRC: tlipcon

March 10, 2010

Outline

HDFS Overview

HDFS meets HBase

Solving the HDFS-HBase problems
- Small Random Reads
- Single-Client Fault Tolerance
- Durable Record Appends

Summary

HDFS Overview: What is HDFS?

- Hadoop’s Distributed File System
- Modeled after Google’s GFS
- Scalable, reliable data storage
- All persistent HBase storage is on HDFS
- HDFS reliability and performance are key to HBase reliability and performance

HDFS Architecture

HDFS Design Goals
- Store large amounts of data
- Data should be reliable
- Storage and performance should scale with the number of nodes
- Primary use: bulk processing with MapReduce

Requirements for MapReduce
- MR task outputs
  - Large streaming writes of entire files
- MR task inputs
  - Medium-size partial reads
- Each task usually has 1 reader and 1 writer; 8-16 tasks per node
- DataNodes usually service only a few concurrent clients
- MapReduce can restart tasks with ease (they are idempotent)

Requirements for HBase
All of the requirements of MapReduce, plus:
- Constantly append small records to an edit log (the WAL)
- Small-size random reads
- Many concurrent readers
- Clients cannot restart → single-client fault tolerance is necessary

HDFS Requirements Matrix

Requirement                   | MR | HBase
------------------------------|----|------
Scalable storage              | X  | X
System fault tolerance        | X  | X
Large streaming writes        | X  | X
Large streaming reads         | X  | X
Small random reads            | -  | X
Single-client fault tolerance | -  | X
Durable record appends        | -  | X

HDFS Requirements Matrix, revisited: how well stock HDFS serves each requirement today. The first four work well; the last three are the problem areas.

Requirement                   | MR | HBase | HDFS today
------------------------------|----|-------|------------
Scalable storage              | X  | X     | works well
System fault tolerance        | X  | X     | works well
Large streaming writes        | X  | X     | works well
Large streaming reads         | X  | X     | works well
Small random reads            | -  | X     | problematic
Single-client fault tolerance | -  | X     | problematic
Durable record appends        | -  | X     | problematic

Solutions: ...turn that frown upside-down
Three approaches, from easiest to hardest:
- Configuration Tuning
- HBase-side workarounds
- HDFS Development/Patching

Small Random Reads: Configuration Tuning
- HBase often has more concurrent clients than MapReduce.
- Typical problems (fixes for both are sketched below):
  - "xceiverCount 257 exceeds the limit of concurrent xcievers 256"
    - Increase dfs.datanode.max.xcievers → 1024 (or greater)
  - "Too many open files"
    - Edit /etc/security/limits.conf to increase nofile → 32768
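
For concreteness, a minimal sketch of both changes. The property name really is spelled "xcievers" in Hadoop; the user name in limits.conf is an assumption, so use whatever account runs your DataNode and RegionServer daemons.

```xml
<!-- hdfs-site.xml on each DataNode: raise the data transceiver cap -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>1024</value>
</property>
```

```
# /etc/security/limits.conf: raise the per-user open-file limit
# ("hadoop" is an example user, not something the slides specify)
hadoop  -  nofile  32768
```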

Small Random Reads: HBase Features
- HBase block cache
  - Avoids the need to hit HDFS for many reads
- Finer-grained synchronization in HFile reads (HBASE-2180)
  - Allows parallel clients to read data in parallel for higher throughput
- Seek-and-read vs. pread API (HBASE-1505)
  - In current HDFS, these have different performance characteristics (see the sketch below)
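
The difference matters for concurrency. Here is a minimal sketch of the two read styles against the standard FSDataInputStream API; the path, offset, and buffer size are hypothetical. A positional read (pread) does not move the shared stream cursor, so parallel clients need not serialize on the stream the way seek-and-read forces them to.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadStyles {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataInputStream in = fs.open(new Path("/hbase/example/hfile")); // hypothetical file
    byte[] buf = new byte[4096];

    // Seek-and-read: stateful. The seek moves the stream's position,
    // so concurrent readers of the same stream must synchronize.
    in.seek(123456L);
    in.readFully(buf, 0, buf.length);

    // pread: positional, stateless with respect to the stream cursor,
    // so independent readers can proceed in parallel.
    in.readFully(123456L, buf, 0, buf.length);

    in.close();
  }
}
```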

Small Random Reads: HDFS Development in Progress
- Client↔DN connection reuse (HDFS-941, HDFS-380)
  - Eliminates TCP handshake latency
  - Avoids restarting the TCP slow-start algorithm for each read
- Multiplexed BlockSender (HDFS-918)
  - Reduces the number of threads and open files in the DN
- Netty DataNode (hack in progress)
  - Non-blocking I/O may be more efficient for high concurrency

Single-Client Fault Tolerance: What exactly do I mean?
- If a MapReduce task fails to write, the MR framework will restart the task.
- MR relies on idempotence → task failures are not a big deal.
- Thus, fault tolerance of a single client is not as important to MR.
- If an HBase region server fails to write, it cannot recreate the data easily.
- HBase may access a single file for a day at a time → it must ride over transient errors.

Single-Client Fault Tolerance: HDFS Patches
- HDFS-127 / HDFS-927
  - Clients used to give up after N read failures on a file, with no regard for time. This patch resets the failure count after successful reads.
- HDFS-630
  - Fixes block allocation to exclude nodes the client knows to be bad
  - Important for small clusters!
  - Backported to 0.20 in CDH2
- Various other write pipeline recovery fixes in 0.20.2 (HDFS-101, HDFS-793)

Durable Record Appends: What exactly is the infamous sync()/append()?
- Well, it’s really hflush()
- HBase accepts writes into memory (the MemStore)
- It also logs them to disk (the HLog / WAL)
- Each write needs to be on disk before claiming durability.
- hflush() provides this guarantee (almost); the pattern is sketched below
- Unfortunately, it doesn’t work in Apache Hadoop 0.20.x
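
To make the intended guarantee concrete, a minimal sketch of the write-then-flush pattern, assuming a Hadoop version where hflush() exists (the 0.20 API called it sync(), and as noted above it is broken there). The path and record format are illustrative, not HBase’s actual HLog format.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WalAppend {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/hbase/example/wal")); // illustrative path

    // Append an edit record to the log file (the format here is made up).
    out.writeBytes("row1,family:qualifier,value\n");

    // hflush() returns only once every DataNode in the write pipeline
    // has received the bytes; only after it returns may the write be
    // acknowledged to the client as durable.
    out.hflush();

    out.close();
  }
}
```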

Durable Record Appends: HBase Workarounds
- HDFS files are durable once closed
- Currently, HBase rolls the edit log periodically (the roll-period knob is sketched below)
- After a roll, previous edits are safe
- Not much of a workaround, unfortunately:
  - A crash will lose any edits since the last roll.
  - Rolling constantly results in small files
    - Bad for NN metadata efficiency
    - Triggers frequent flushes → bad for region server efficiency
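
To illustrate the trade-off knob: assuming your HBase version exposes hbase.regionserver.logroll.period (the interval, in milliseconds, after which the region server forces a log roll), shortening it shrinks the window of edits a crash can lose, at the cost of more small files and more NN metadata.

```xml
<!-- hbase-site.xml (sketch): roll the WAL every 10 minutes instead
     of hourly; a smaller loss window, but more small HLog files -->
<property>
  <name>hbase.regionserver.logroll.period</name>
  <value>600000</value>
</property>
```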

Durable Record Appends: HDFS Development
- On Apache trunk: HDFS-265
  - A new append re-implementation for 0.21/0.22
  - Will work great, but is essentially a very large set of patches
  - Not released yet, and running unreleased Hadoop is "daring"
- In 0.20.x distributions: the HDFS-200 patch
  - Fixes bugs in the old hflush() implementation
  - Not quite as efficient as HDFS-265, but good enough and simpler
  - Dhruba Borthakur from Facebook is testing and improving it
  - Cloudera will test and merge this into CDH3

Summary
- HDFS’s original target workload was MapReduce, and HBase has different (harder) requirements.
- Engineers from the HBase team plus Facebook, Cloudera, and Yahoo are working together to improve things.
- Cloudera will integrate all necessary HDFS patches in CDH3, available for testing soon.
- Contact me if you’d like to help test in April.

todd@cloudera.com
Twitter: @tlipcon

#hbase IRC: tlipcon

P.S. we’re hiring!