Toward Better Multi-Tenancy Support from HDFS
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Xiaoyu Yao
Email: [email protected]
About myself
⬢Member of Technical Staff at Hortonworks since 2014
⬢Apache Hadoop Committer and PMC member.
⬢Currently working on HDFS.
⬢This talk explains HDFS multi-tenancy support and the ongoing work on better resource management.
Agenda
⬢Overview
⬢Hadoop multi-tenancy features
⬢HDFS resources and multi-tenancy offerings
⬢HDFS resource management via resource coupon
⬢Q&A
Overview
⬢Centrally managed infrastructure
– Consolidate to simplify management and lower TCO
– Better utilization and efficiency
⬢Requirements
– Resource Sharing
– Resource Isolation
– Resource Control
Multi-Tenancy Support from Hadoop
Component   Resource Sharing   Resource Isolation               Resource Management
HBASE       Y                  Namespace, Region Server Group   Quota
YARN        Y                  Queue, Node Label...             Capacity Scheduler,...
HDFS        Y                  Federation                       Quota, FairCallQueue, Backoff
HDFS Resources
⬢Capacity
– Namespace
– Storage Space
– Storage Type
⬢Operational Resources
– Namenode
•RPC
– Datanode
•Disk & Network
HDFS Resource Sharing/Isolation – Federation
HDFS Capacity Management – Quota
⬢Quota
– Namespace
– Storage Space
– HDFS-7584 Quota by Storage Types
⬢Limitations
– Static
– Per directory
– No per user/job control
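The limitations listed above follow from the quota being attached to a directory rather than to a user or job. A minimal Python sketch of that accounting is shown here. The `DirQuota` class and its field names are hypothetical (not an HDFS API); the only HDFS behavior assumed is that the storage space quota is charged once per replica.

```python
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when an operation would exceed a directory quota."""


@dataclass
class DirQuota:
    """Hypothetical per-directory quota record (illustration only)."""
    ns_quota: int        # max number of names (files/directories) under the directory
    space_quota: int     # max raw bytes, counting every replica
    ns_used: int = 0
    space_used: int = 0

    def add_file(self, size_bytes: int, replication: int = 3) -> None:
        # Storage space is charged per replica, so a 100-byte file with
        # replication 3 consumes 300 bytes of the space quota.
        needed = size_bytes * replication
        if self.ns_used + 1 > self.ns_quota:
            raise QuotaExceeded("namespace quota exceeded")
        if self.space_used + needed > self.space_quota:
            raise QuotaExceeded("storage space quota exceeded")
        # Nothing here knows which user or job created the file: the limit
        # is static and applies to the directory as a whole.
        self.ns_used += 1
        self.space_used += needed
```

Because the check only sees the directory totals, one heavy job can consume the whole quota that other jobs writing to the same directory rely on.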
HDFS Operational Resource Management – Namenode RPC Isolation (1)
⬢Internal RPC
– DN->NN block report, heartbeat, etc.
– ZKFC->NN liveness check
⬢External RPC
– Client RPCs from HDFS clients such as MR jobs, Hive queries, and HBase
[Figure: Client -> Listener -> Readers -> Call Queue -> Handlers -> FSN]
HDFS Operational Resource Management – Namenode RPC Isolation (2)
⬢Use case:
– HDFS access from normal jobs is impacted by offending jobs
– Internal RPCs are impacted by external RPCs
– One blocked RPC method can affect others
⬢Protect HDFS internal RPCs:
– Dedicated service RPC server/port
•Isolate DN->NN block report, heartbeat, etc.
– Dedicated lifeline RPC server/port
•Protect ZKFC->NN liveness check
⬢All external RPCs go to the default port (e.g., 8020)
HDFS Resource Management – Name Node RPC Call Queue
⬢In a multi-tenancy scenario, the call queue acts as a shock absorber that accommodates different workloads, converting bursty arrivals into smooth, steady departures.
⬢Good call queue
– Queue without call bloat
– Catches and handles bursts with no more than a temporary increase in queue delay
– Maximum server utilization
⬢Bad call queue
– Queue that exhibits call bloat
– Queue fills up and stays full after bursts
– Low utilization and high queue latency
HDFS Resource Management - Fair Call Queue
⬢Before HADOOP-9640: LinkedBlockingQueue
– Single queue
– Client blocks and then times out or fails when the queue is full
⬢HADOOP-9640 - Fair Call Queue
– Multiple priority levels and call queues with different processing priority
– Each RPC is assigned a priority by the scheduler
– High-priority RPC calls are put into a call queue with a higher probability of being executed
[Figure: Scheduler -> Queue 0 ... Queue 2 -> Multiplexer (WRR)]
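The structure on this slide (a scheduler that assigns a priority, one queue per priority level, and a weighted round-robin multiplexer that drains them) can be sketched in a few lines of Python. This is an illustration under assumptions, not the Hadoop implementation: the class name `FairCallQueueSketch` and the weights are hypothetical, and the priority is passed in by the caller rather than computed by a real scheduler.

```python
from collections import deque


class FairCallQueueSketch:
    """Toy multi-level call queue drained by weighted round-robin (WRR)."""

    def __init__(self, weights=(4, 2, 1)):
        # weights[i] = how many calls to take from level i per WRR turn.
        # Level 0 is the highest priority. Values are illustrative.
        self.weights = weights
        self.queues = [deque() for _ in weights]
        self._level = 0               # level the multiplexer is currently serving
        self._left = weights[0]       # calls still allowed from that level this turn

    def put(self, call, priority):
        """Enqueue a call at the priority chosen by the scheduler."""
        self.queues[priority].append(call)

    def _advance(self):
        self._level = (self._level + 1) % len(self.queues)
        self._left = self.weights[self._level]

    def take(self):
        """Return the next call for a handler, or None if all queues are empty."""
        for _ in range(len(self.queues)):
            queue = self.queues[self._level]
            if queue:
                self._left -= 1
                call = queue.popleft()
                if self._left == 0:
                    self._advance()   # this level used up its share of the turn
                return call
            self._advance()           # empty level: give its share to the next one
        return None
```

With weights `(2, 1)`, three calls queued at level 0 and two at level 1 come out as two from level 0, one from level 1, and so on. Low-priority calls are slowed down but never starved, which is the property the multiplexer is there to provide.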
HDFS Resource Management – Namenode RPC Throttling <1>
⬢HADOOP-10597 Backoff when the call queue is full
– Send back a retriable exception
– Let the client wait with exponential backoff and retry, instead of blocking, timing out, or failing the call
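The two halves of this behavior can be illustrated together: a server-side queue that rejects instead of blocking when full, and a client that retries with exponentially growing waits. The sketch below is a Python illustration under assumptions. `BoundedCallQueue`, `call_with_backoff`, the delay values, and the random jitter are hypothetical choices and do not reproduce Hadoop's actual retry policy.

```python
import random
import time
from collections import deque


class RetriableException(Exception):
    """Signals that the server is overloaded and the call may be retried."""


class BoundedCallQueue:
    """Server side: reject with a retriable error instead of blocking when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.calls = deque()

    def offer(self, call):
        if len(self.calls) >= self.capacity:
            raise RetriableException("call queue is full, retry later")
        self.calls.append(call)


def call_with_backoff(rpc, max_retries=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Client side: retry `rpc` with exponentially growing, jittered waits."""
    for attempt in range(max_retries + 1):
        try:
            return rpc()
        except RetriableException:
            if attempt == max_retries:
                raise                      # give up after the last retry
            # Upper bound doubles on every attempt: 0.1s, 0.2s, 0.4s, ...
            bound = min(max_delay, base_delay * (2 ** attempt))
            # Jitter (an assumption here) spreads out retries from many clients.
            sleep(random.uniform(0, bound))
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting; in normal use the default `time.sleep` applies.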
HDFS Resource Management – Namenode RPC Throttling <2>
⬢HADOOP-12916 Backoff based on response time
– The basic idea: back off earlier to avoid call queue overload so that the namenode can recover quickly.
– Low-priority calls get backed off if the response time of high-priority calls is over a predefined threshold.
– More per-user/queue metrics added for troubleshooting.
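The decision rule described here can be written as a small predicate. The Python sketch below is one reading of the slide, under assumptions: the function name `should_back_off` and the example thresholds are hypothetical, and the rule also checks the call's own priority level, which the slide does not state explicitly.

```python
def should_back_off(priority, avg_response_ms, thresholds_ms):
    """Decide whether a call at `priority` (0 = highest) should be backed off.

    avg_response_ms[i] is the recent average response time of level i and
    thresholds_ms[i] is the limit configured for that level. A call is backed
    off when its own level or any higher-priority level is over its limit.
    """
    return any(
        avg_response_ms[level] > thresholds_ms[level]
        for level in range(priority + 1)
    )
```

For example, with thresholds `[100, 200, 300]` and averages `[50, 250, 10]`, calls at level 0 proceed while calls at levels 1 and 2 are asked to back off, so the overloaded middle level sheds load before the queue fills up.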
HDFS Resource Management – Namenode RPC Throttling <3>
⬢Abstract the scheduler interface from the call queue for pluggable RPC priority assignment
– DefaultRpcScheduler: all RPC calls have the same priority
– DecayRpcScheduler: taken from the original FairCallQueue; priority is assigned based on each user's previous call volume
– Other experimental schedulers: a configurable list of high-priority users/groups for low-latency jobs, medium-priority users/groups for normal jobs, and low-priority users/groups for batch jobs
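The idea behind a decay-based scheduler, where users with a large share of recent calls get a lower priority and old activity gradually stops counting, can be sketched as follows. This is a simplified Python illustration, not the DecayRpcScheduler code: the class name, the threshold values, and the decay factor are assumptions chosen for the example.

```python
class DecaySchedulerSketch:
    """Toy scheduler: priority depends on a user's share of recent calls."""

    def __init__(self, thresholds=(0.125, 0.25, 0.5), decay_factor=0.5):
        # A user whose share of recent calls reaches thresholds[i] is pushed
        # at least to priority level i + 1 (0 = highest priority).
        self.thresholds = thresholds
        self.decay_factor = decay_factor
        self.counts = {}                   # user -> decayed call count

    def record_call(self, user):
        self.counts[user] = self.counts.get(user, 0.0) + 1.0

    def decay(self):
        """Called periodically so that old calls count less and less."""
        self.counts = {u: c * self.decay_factor for u, c in self.counts.items()}

    def priority(self, user):
        total = sum(self.counts.values())
        if total == 0:
            return 0
        share = self.counts.get(user, 0.0) / total
        # Count how many thresholds the user's share has reached.
        return sum(1 for t in self.thresholds if share >= t)
```

A user issuing 90 of the last 100 calls lands in the lowest-priority level, while a user issuing 10 stays at level 0. After a decay step the ratios are unchanged, but new calls weigh more against the halved history.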
HDFS resource management - QoS
⬢Use case:
– Allow a high-performance QoS mechanism with minimum decoding effort on the server side
⬢HADOOP-9194 QoS support for Hadoop RPC
– One byte in the RPC header to facilitate the QoS mechanism
– E.g., differentiate OLTP/OLAP, batch/streaming against the same HDFS
⬢Limitation
– No mechanism-level implementation yet
HDFS resource management with YARN
⬢Use case
– Priority inversion without centralized resource management (e.g., RPC calls from high-priority YARN jobs may be put into a low-priority HDFS namenode call queue)
– Identify and manage “bad” callers effectively
⬢Namenode – RPC handler
– FairCallQueue offers fair use of namenode RPC handlers
– No guarantee of differentiation
⬢Datanode – I/O bandwidth
– No differentiation of writers/readers and bandwidth usage
– Datanode allows static throttling of balancer I/O
HDFS Namenode Resource Reservation
⬢HADOOP-13128 proposes HDFS namenode resource reservation via resource coupons
– From throttling to management
– Similar to delegation tokens in many respects
– Works for both Kerberos and non-Kerberos clusters
– Allows only privileged service users to request resource coupons from the namenode
– Coupons can be serialized/deserialized for use within a container
– Coupons can be renewed for long-running jobs or canceled after the intended job is finished
HDFS Namenode Resource Coupon
⬢Coupon Identifier
– Finer-grained owner (MR job ID, Hive query ID) to help identify and manage “good” and “bad” callers
– Resource type (Namenode RPC or Datanode I/O bandwidth)
– Flexible management unit for different resources
•Min/max percentage (e.g., Namenode RPC)
•Absolute value (Datanode I/O bandwidth)
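Since the coupon is a proposal rather than a shipped API, the following Python sketch only illustrates what an identifier with these fields might look like. Every name in it (`ResourceCoupon`, `ResourceType`, the field names, the JSON encoding) is hypothetical and chosen for the example. It shows the two management units from the slide and a serialize/deserialize round trip of the kind needed to hand a coupon to a container.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class ResourceType(Enum):
    NAMENODE_RPC = "namenode_rpc"
    DATANODE_IO = "datanode_io"


@dataclass(frozen=True)
class ResourceCoupon:
    """Hypothetical coupon identifier (illustration only)."""
    owner: str                                   # e.g. an MR job ID or Hive query ID
    resource_type: ResourceType
    min_percent: Optional[float] = None          # unit for Namenode RPC
    max_percent: Optional[float] = None
    bytes_per_sec: Optional[int] = None          # unit for Datanode I/O bandwidth

    def __post_init__(self):
        if self.resource_type is ResourceType.NAMENODE_RPC:
            if self.min_percent is None or self.max_percent is None:
                raise ValueError("RPC coupons need min/max percentages")
            if not 0 <= self.min_percent <= self.max_percent <= 100:
                raise ValueError("percentages must satisfy 0 <= min <= max <= 100")
        elif self.bytes_per_sec is None or self.bytes_per_sec <= 0:
            raise ValueError("I/O coupons need a positive absolute bandwidth")

    def serialize(self) -> str:
        fields = asdict(self)
        fields["resource_type"] = self.resource_type.value   # enum -> plain string
        return json.dumps(fields)

    @classmethod
    def deserialize(cls, text: str) -> "ResourceCoupon":
        fields = json.loads(text)
        fields["resource_type"] = ResourceType(fields["resource_type"])
        return cls(**fields)
```

Tying the coupon to a job or query ID rather than to a user name is what lets the namenode tell a well-behaved job from a misbehaving one run by the same user.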
HDFS Namenode Resource Coupon Manager (RCM)
⬢Grant/Renew/Cancel resource coupon
⬢Monitor and report resource usage
⬢Check and validate resource use requests
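The three responsibilities listed for the RCM map naturally onto a small lifecycle API. The Python sketch below is a hypothetical illustration of that lifecycle, loosely modeled on how delegation tokens are granted, renewed, and canceled. The class `ResourceCouponManager`, its method names, and the fixed lifetime are assumptions, since the proposal does not define an interface here.

```python
import itertools
import time


class ResourceCouponManager:
    """Toy RCM: grant/renew/cancel coupons, validate use, and report usage."""

    def __init__(self, privileged_users, lifetime_sec=60.0, clock=time.monotonic):
        self.privileged_users = set(privileged_users)
        self.lifetime_sec = lifetime_sec
        self.clock = clock
        self._ids = itertools.count(1)
        self._coupons = {}     # coupon id -> {"owner", "expires", "calls"}

    def grant(self, requester, owner):
        # Only privileged service users may request coupons.
        if requester not in self.privileged_users:
            raise PermissionError(f"{requester} may not request coupons")
        coupon_id = next(self._ids)
        self._coupons[coupon_id] = {
            "owner": owner,
            "expires": self.clock() + self.lifetime_sec,
            "calls": 0,
        }
        return coupon_id

    def renew(self, coupon_id):
        # Extends the lifetime for long-running jobs; KeyError if unknown.
        self._coupons[coupon_id]["expires"] = self.clock() + self.lifetime_sec

    def cancel(self, coupon_id):
        self._coupons.pop(coupon_id, None)

    def validate(self, coupon_id):
        """Check a resource use request and record it for usage reporting."""
        coupon = self._coupons.get(coupon_id)
        if coupon is None or coupon["expires"] <= self.clock():
            return False
        coupon["calls"] += 1
        return True

    def usage(self):
        return {c["owner"]: c["calls"] for c in self._coupons.values()}
```

The injectable `clock` is only there so expiry can be exercised without waiting; a real service would also need persistence and authentication, which this sketch leaves out.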
HDFS Namenode Resource Pool
[Figure labels: Fairness Pool; Managed Pool; Applications supporting Resource Coupon (YARN/HBASE); Legacy Applications without Resource Coupon]
HDFS Namenode Resource Coupon Manager (RCM)
[Figure components: New Client; YARN ResourceManager; HDFS Namenode; RCM; HDFS Datanode; YARN Node Manager; YARN Container]
HDFS Resource Management – Datanode
⬢Use case:
– When a client writes to HDFS faster than the disk bandwidth of the DNs, it saturates the disk bandwidth and puts the DNs into an unresponsive state.
– The client only backs off by aborting/recovering the pipeline, which causes failed writes and unnecessary pipeline recovery.
⬢Static I/O Throttling
– HDFS-7265 Support HDFS IO throttling
– HDFS-9796 Use a throttler for replica write in datanode
– HDFS-4412 Add throttler for datanode bandwidth
– HADOOP-10410 datanode QoS via ioprio_set on DataXceiver thread
⬢Dynamic I/O Throttling
– HDFS-7270 Add congestion signaling capability to DataNode write pipeline (ECN)
⬢Future work: I/O bandwidth reservation with resource coupon
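To make the static throttling idea concrete, here is a minimal token-bucket throttler in Python. A token bucket is used because it is a common way to cap bandwidth; it is not claimed to be how the datanode throttlers in the JIRAs above are implemented. The class name and parameters are hypothetical.

```python
import time


class TokenBucketThrottler:
    """Caps throughput at `bytes_per_sec`, allowing bursts up to `burst_bytes`."""

    def __init__(self, bytes_per_sec, burst_bytes, clock=time.monotonic):
        self.rate = bytes_per_sec
        self.capacity = burst_bytes
        self.tokens = float(burst_bytes)   # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self):
        now = self.clock()
        earned = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def delay_for(self, nbytes):
        """Reserve `nbytes` and return how long the caller should wait first."""
        self._refill()
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # Negative balance: wait until the deficit has been earned back.
        return -self.tokens / self.rate
```

A writer would call `delay_for(len(packet))` before each write and sleep for the returned time. The limit is static: it does not react to how congested the disks actually are, which is the gap the dynamic (congestion-signaling) approach targets.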
Thank you!
Q&A