Apache Ranger meetup

Securing Hadoop using Ranger

Raj Nadipalli

Director Professional Services, Zaloni [email protected]

09.22.2016


Agenda

•  Security Landscape in Hadoop

•  Role of Ranger

•  Ranger Key Features

•  Demo

•  Q&A

Overview

Security Landscape in Hadoop (open source)

Authentication (Who am I?): AD/LDAP, Kerberos, Apache Knox

Authorization (What can I do?): Apache Ranger, Apache Sentry

Audit (What happened?): Apache Ranger

Data Protection: SSL, KMS

Ranger in a slide


•  Centralized security framework: authentication, auditing, data encryption and security

•  Fine-grained access control over Hadoop

•  Components supported: HDFS, Hive, HBase, Storm, YARN, Knox, Kafka, Solr

•  Manage/create policies using a browser

•  Manage audit tracking and policy analytics in HDFS, RDBMS or Solr

•  Supports governance with tag-based policies

•  REST APIs for policy management to automate, integrate and extend

Key Components of Ranger

Source: http://www.slideshare.net/RommelGarcia2/apache-ranger?qid=1150145e-a144-4603-9165-a09b2ae5ece0&v=&b=&from_search=4

Securing HDFS

Ranger in Action - HDFS

Source: http://www.slideshare.net/RommelGarcia2/apache-ranger?qid=1150145e-a144-4603-9165-a09b2ae5ece0&v=&b=&from_search=4

Ranger administration portal


List HDFS policies

Under HDFS policies we can view all the HDFS policies created and which user(s)/group(s) have access to which policies.

Actions delete / edit

Policy Name

Groups/users assigned to policies

Create HDFS policy

Under HDFS policy we can edit/create HDFS policies. This page shows how to create a policy at user level and provide appropriate permissions.
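A user-level policy like the one created in the portal can also be expressed as JSON for Ranger's REST API. The sketch below only builds such a payload and sends nothing. The field names follow the public v2 policy model as commonly documented and are assumptions to verify against the Ranger version in use; the service name `cl1_hadoop`, the policy name, the user `analyst1` and the helper `build_hdfs_policy` are hypothetical.

```python
import json


def build_hdfs_policy(service, name, path, users=(), groups=(),
                      accesses=("read",), recursive=True):
    """Build a Ranger-style HDFS policy payload (illustrative only)."""
    return {
        "service": service,          # name of the HDFS service in Ranger (assumed)
        "name": name,
        "isEnabled": True,
        "isAuditEnabled": True,
        "resources": {
            "path": {"values": [path], "isRecursive": recursive},
        },
        "policyItems": [
            {
                "users": list(users),
                "groups": list(groups),
                "accesses": [{"type": a, "isAllowed": True} for a in accesses],
                "delegateAdmin": False,
            }
        ],
    }


# Hypothetical example: read/execute on /testRanger for one user.
policy = build_hdfs_policy(
    "cl1_hadoop", "testRanger-read", "/testRanger",
    users=["analyst1"], accesses=["read", "execute"],
)
print(json.dumps(policy, indent=2))
```

Passing `groups=[...]` instead of `users=[...]` gives the group-level variant. Posting the payload to the Ranger Admin policy endpoint (typically `/service/public/v2/api/policy`, which should be confirmed for your installation) would create the policy.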

Access error in Audit

Under the Audit tab the admin can view which user tried to access which directory. Here user mukesh got access denied because the user did not have permission to access the /testRanger directory.

Access Denied to user mukesh
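Why the request is denied can be illustrated with a deliberately simplified model of path-based policy evaluation. This is not Ranger's actual engine: the `PathPolicy` class, the `is_allowed` function and the user `alice` are hypothetical, and features such as deny rules, wildcards and fallback to native HDFS permissions are left out.

```python
from dataclasses import dataclass, field


@dataclass
class PathPolicy:
    path: str
    users: set = field(default_factory=set)
    groups: set = field(default_factory=set)
    accesses: set = field(default_factory=set)
    recursive: bool = True

    def covers(self, resource):
        """True if the policy path matches the resource (or a parent, if recursive)."""
        base = self.path.rstrip("/")
        if resource == self.path or resource == base:
            return True
        return self.recursive and resource.startswith(base + "/")


def is_allowed(policies, user, user_groups, resource, access):
    """Allow only if some policy covers the resource and grants the access."""
    for p in policies:
        grants_identity = user in p.users or bool(p.groups & set(user_groups))
        if p.covers(resource) and access in p.accesses and grants_identity:
            return True
    return False


# Hypothetical setup: only alice may read /testRanger.
policies = [PathPolicy("/testRanger", users={"alice"}, accesses={"read", "execute"})]
print(is_allowed(policies, "mukesh", [], "/testRanger", "read"))
print(is_allowed(policies, "alice", [], "/testRanger/data.csv", "read"))
```

In this model mukesh is denied because no policy names that user or one of the user's groups for `/testRanger`, which is the kind of decision the Audit tab records.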

List HDFS policies for group

Under HDFS policies we can view all the HDFS policies created and which user(s)/group(s) have access to which policies.

Create HDFS policy for group

Under HDFS policy we can edit/create HDFS policies. This page shows how to create a policy at group level and provide appropriate permissions.

Access given to a group

Securing Hive

List policies of Hive

Under Hive policies we can view all the Hive policies created and which user(s)/group(s) have access to which policies.

Hive policy for database

User assigned to a policy

Create policy for Hive

Under Hive policy we can edit/create Hive policies. This page shows how to create a policy at user level and provide appropriate permissions.

Access error in Audit

Under the Audit tab the admin can view which user tried to access which table/database. Here user mukesh got access denied because the user did not have permission to create a table under the testranger database.

Securing HBase

Create HBase policy

Under HBase policy we can edit/create HBase policies. This page shows how to create a policy at user level and provide appropriate permissions.

Access error in Audit

Under the Audit tab the admin can view which user tried to access which table. Here user nabadeep got access denied because the user did not have permission to put data in table testranger.

Audit Logs

Audit logs in JSON format

For each service, such as HDFS or Hive, audit logs are generated if enabled in Ambari.
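Because the audit logs are written one JSON record per line, they are easy to post-process. The sketch below filters denied events from a couple of sample records. The records are invented for illustration, and the field names (`reqUser`, `resource`, `access`, `result` with 0 meaning denied) are assumptions based on commonly seen Ranger audit output that should be checked against your own log files.

```python
import json

# Invented sample records; real audit lines carry more fields.
SAMPLE_AUDIT_LINES = [
    '{"repoType":1,"repo":"cl1_hadoop","reqUser":"mukesh",'
    '"access":"READ_EXECUTE","resource":"/testRanger","resType":"path","result":0}',
    '{"repoType":1,"repo":"cl1_hadoop","reqUser":"alice",'
    '"access":"READ","resource":"/testRanger/data.csv","resType":"path","result":1}',
]


def denied_events(lines):
    """Return (user, resource, access) for every record whose result is 0."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("result") == 0:
            events.append(
                (record.get("reqUser"), record.get("resource"), record.get("access"))
            )
    return events


print(denied_events(SAMPLE_AUDIT_LINES))
```

The same function could be applied to lines read from an audit file copied out of HDFS.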


Audit logs in JSON format


HDFS Audit File structure


Audit Log Storage Options

HDFS

Long-term storage that can be used to understand user event trends and predict anomalies

RDBMS

MySQL, Oracle, Postgres, SQL Server

Solr

Good for quick reporting metrics to understand user event trends

Log4j Appenders

Best practices to use HDFS in Ranger

•  Change HDFS umask to 077

fs.permissions.umask.mode=077

•  Identify directories which can be managed by Ranger policies: /apps/hive, /apps/Hbase

•  Identify directories which need to be managed by HDFS native permissions: /tmp and /user to 700

•  Enable Ranger policy to audit all records
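The effect of the 077 umask can be checked with a little bit arithmetic. The sketch assumes HDFS's usual requested creation modes of 666 for files and 777 for directories; `apply_umask` is a hypothetical helper name.

```python
def apply_umask(requested_mode, umask):
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask


umask = 0o077
file_mode = apply_umask(0o666, umask)  # new files
dir_mode = apply_umask(0o777, umask)   # new directories
print(oct(file_mode), oct(dir_mode))   # 0o600 0o700
```

New files and directories therefore end up accessible to their owner only, so any other user needs a Ranger policy to reach them.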

Best practices to use Hive in Ranger

•  HiveServer2 access with limited HDFS access

   Column-level access control over Hive data

•  HiveServer2, and HDFS files through Pig/MR jobs

   hive.server2.enable.doAs is set to "true"

•  Hive CLI access

Atlas & Ranger

Tag Based Policies in Atlas

•  Atlas and Ranger combination supports automation for governance and policies

•  Atlas is where tags get set on metadata; for example, a Customer table in Hive can be tagged with value "PII"

•  Ranger policies can be created on these tags to enforce access

•  Ranger shows audit logs on access

Source: https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies
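The tag-based flow can be sketched as two lookups: resource to tags (the part Atlas provides), then tag to policy (the part Ranger enforces). The model below is a simplified illustration, not Ranger's implementation; the database name `sales`, the groups `compliance` and `marketing`, and the function `tag_allows` are hypothetical.

```python
# Tags attached to metadata, as Atlas would supply them (hypothetical data).
RESOURCE_TAGS = {
    ("hive", "sales", "customer"): {"PII"},
}

# One policy per tag: which groups get which accesses (hypothetical data).
TAG_POLICIES = {
    "PII": {"groups": {"compliance"}, "accesses": {"select"}},
}


def tag_allows(resource, user_groups, access):
    """Allow if any tag on the resource has a policy granting the access."""
    for tag in RESOURCE_TAGS.get(resource, set()):
        policy = TAG_POLICIES.get(tag)
        if policy is None:
            continue
        if access in policy["accesses"] and policy["groups"] & set(user_groups):
            return True
    return False


customer = ("hive", "sales", "customer")
print(tag_allows(customer, ["compliance"], "select"))
print(tag_allows(customer, ["marketing"], "select"))
```

The point of the design is that tagging a new table as "PII" in Atlas brings it under the existing policy without writing a new per-table rule. Ranger's resource-based policies remain in play alongside tag policies, which this sketch leaves out.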

Ranger Tag based policy flow

Tag Service Setup – Ranger Admin

Source: https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies

Tag Policy Setup

Source: https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies

Tag Policy Expiry

Backup

References

http://www.slideshare.net/trihug/trihug-october-apache-ranger

http://www.slideshare.net/RommelGarcia2/apache-ranger

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741207

http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2

https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies

Q&A

[email protected] @ranadipa