Like what you hear? Tweet it using: #Sec360

secure360.org/.../05/Hadoop-Security-S360-2015v8.pdf

HADOOP SECURITY

HADOOP SECURITY

About Robert:

School: UW Madison, U St. Thomas

Programming: 15 years, C, C++, Java

Security Work:
-  Surescripts, Minneapolis (present)
-  Big Retail Company, Minneapolis
-  Big Healthcare Company, Minnetonka

OWASP Local Volunteer

CISSP, CISM, CISA, CHPS

Email: [email protected]

Twitter: @msp_sullivan

HADOOP SECURITY

History

What is new?

Common Applications

Threats

Security Architecture

Secure Baseline and Testing

Policy Impact

HADOOP HISTORY

•  2002: Doug Cutting & Mike Cafarella: Nutch
   •  Crawl and index hundreds of millions of pages
•  2003: Google File System paper released
•  2004: Google MapReduce paper released
•  2006: Yahoo formed Hadoop; 5 to 20 nodes
•  2008: Yahoo, Hadoop “behind every click”
•  2008: Cloudera founded by former Google, Yahoo, Facebook and Oracle engineers; 2,000 Hadoop nodes
•  2008: Facebook open sourced Hive for Hadoop
•  2011: Yahoo spins out Hortonworks
   •  Hortonworks Hadoop 42,000 nodes, hundreds of petabytes

Source: Derrick Harris, “The history of Hadoop: From 4 nodes to the future of data”, gigaom.com

HADOOP IS

The Apache Hadoop software library is a framework that allows for the distributed processing of large …

-  Software Framework

-  Distributed Processing

-  Large Data Sets

-  Clusters of Computers

-  High Availability

-  Scale to Thousands of Machines

Link:

https://developer.yahoo.com/hadoop/tutorial

MAPREDUCE IS NEW

REDUCE

MAP
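The MAP and REDUCE labels above can be illustrated with a word count, the usual teaching example. This is a minimal single-process sketch in plain Python, not Hadoop's API: the function names are made up for illustration, and the `shuffle` step stands in for the grouping that the framework performs between the two phases on a real cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """MAP: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key (done by the framework between map and reduce)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """REDUCE: combine all values seen for one key into a single result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over a small set of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

On a cluster, each map call would run next to the data block it reads and each reduce call would run on whichever node receives that key, which is why the threat model later in the deck worries about tasks reading each other's data.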

HADOOP COMMON APPLICATIONS

1. Web Search
2. Advertising & recommendations
3. Security Threat Identification
4. Fraud Detection
5. Patient Record Search

Source: Yahoo: https://developer.yahoo.com/blogs/ydn/hadoop-yahoo-more-ever-54421.html

PATIENT MATCHING AT SURESCRIPTS

-  Surescripts provides a Patient Matching service
-  230 Million Patients
-  Over 1 Billion matches last year
-  Requirements:
   -  Reliability and performance
   -  Data Protection at rest is required
   -  Data Protection in transit is required
   -  Comprehensive security logging is needed
   -  ISO 27001 & EHNAC Audit Accreditation status must be maintained

NOW WHAT?

SECURE THE BEES

HADOOP THREAT MODEL

1)   Unauthorized data access (protected health information access)

2)   Unauthorized data change

3)   Unauthorized job submission, deletion or change

4)   A task may access other tasks or local data

5)   Rogue DataNode, NameNode or JobTracker

6)   User spoofing to submit workflow as another user

From: “Adding Security to Apache Hadoop”, Das, O’Malley, Radia, Zhang, 2011, http://hortonworks.com/wp-content/uploads/2011/10/security-design_withCover-1.pdf

HADOOP SECURITY

-  Network Security

-  Authentication

-  Authorization

-  Auditing

-  Data Protection

[Diagram: Admins, Application Users, Applications, Management Nodes, Data Nodes; Enterprise Identity, Logging, Encryption, Key Management]

DATA PROTECTION

-  Network Security
-  Authentication
-  Authorization
-  Auditing
-  Data Protection
   -  Encryption at rest:
      -  Volume, file
   -  Encryption in transit:
      -  HTTPS

[Diagram: Admins, Application Users, Applications, Management Nodes, Data Nodes; Enterprise Identity, Logging, Encryption, Key Management; with HTTPS labels]

SECURITY AUDITING

-  Network Security
-  Authentication
-  Authorization
-  Auditing
   -  Failed/Successful Authn.
   -  System changes
   -  Access to PHI
   -  Application logs: HDFS, YARN, MapReduce…
-  Data Protection
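As one way to act on the "Access to PHI" and application-log items above, the sketch below parses a single HDFS audit log entry and flags it for review. It assumes the tab-separated `key=value` layout that the HDFS audit logger writes after `FSNamesystem.audit:`. The sample line, the user name, and the `/data/phi/` directory are made up for illustration.

```python
from typing import Dict, Optional

PHI_PREFIX = "/data/phi/"  # hypothetical HDFS directory that holds PHI

# Illustrative entry in the assumed audit log layout (values are invented).
SAMPLE_LINE = (
    "2015-05-12 10:15:30,123 INFO FSNamesystem.audit: "
    "allowed=true\tugi=alice (auth:KERBEROS)\tip=/10.0.0.5\t"
    "cmd=open\tsrc=/data/phi/claims.csv\tdst=null\tperm=null"
)


def parse_audit_line(line: str) -> Optional[Dict[str, str]]:
    """Return the key=value fields of an audit entry, or None for other lines."""
    _, found, payload = line.partition("FSNamesystem.audit: ")
    if not found:
        return None
    fields: Dict[str, str] = {}
    for item in payload.strip().split("\t"):
        key, sep, value = item.partition("=")
        if sep:
            fields[key] = value
    return fields


def is_reportable(fields: Dict[str, str]) -> bool:
    """Flag denied requests and any request that touches the PHI directory."""
    denied = fields.get("allowed") == "false"
    touches_phi = fields.get("src", "").startswith(PHI_PREFIX)
    return denied or touches_phi
```

In practice the flagged entries would be forwarded to the enterprise logging system rather than evaluated on the node that produced them.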

AUTHORIZATION

-  Network Security
-  Authentication
-  Authorization
   -  Limit user access to function
   -  Limit user access to objects
   -  Manage delegation of access
-  Auditing
-  Data Protection

AUTHENTICATION

-  Network Security
-  Authentication
   -  All users, all applications, all access paths
   -  Apache Knox Gateway
-  Authorization
-  Auditing
-  Data Protection

[Diagram: Admins, Application Users, Applications, Management Nodes, Data Nodes; Enterprise Identity, Logging, Encryption, Key Management; with an HTTPS label]

NETWORK SECURITY

-  Network Security

-  Authentication

-  Authorization

-  Auditing

-  Data Protection

HADOOP SECURE MODE

Apache Hadoop Secure Mode: 2.6.0 (March 14’)

-  Authentication
   -  Covers HDFS, YARN, MapReduce & Web Console
   -  Uses central LDAP Server or Active Directory
   -  Requires Kerberos keytabs for each application
-  Authorization
   -  Each Hadoop service has a list of users and groups
   -  Group permissions on HDFS filesystem components
-  Audit
   -  Hadoop log, YARN log, other logs
-  Data Protection
   -  Encryption in transit between Hadoop services & clients
   -  Encryption in transit between DataNodes
   -  Encryption in transit between web console & clients (HTTPS)
   -  Encryption at rest for HDFS columns
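Several of the secure-mode items above correspond to properties in `core-site.xml` and `hdfs-site.xml`. The sketch below reads those files and reports which settings differ from a hardened value. The property names are standard Hadoop settings, but the choice of properties and the expected values are an illustrative assumption, not the contents of any official benchmark, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Illustrative expectations only; adjust to your own baseline.
EXPECTED = {
    "hadoop.security.authentication": "kerberos",  # core-site.xml
    "hadoop.security.authorization": "true",       # core-site.xml
    "hadoop.rpc.protection": "privacy",            # core-site.xml
    "dfs.encrypt.data.transfer": "true",           # hdfs-site.xml
    "dfs.http.policy": "HTTPS_ONLY",               # hdfs-site.xml
}


def read_properties(xml_text: str) -> Dict[str, str]:
    """Extract name/value pairs from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props: Dict[str, str] = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def find_gaps(*xml_texts: str) -> List[str]:
    """Return the expected properties that are missing or set differently."""
    merged: Dict[str, str] = {}
    for text in xml_texts:
        merged.update(read_properties(text))
    return sorted(
        name
        for name, want in EXPECTED.items()
        if merged.get(name, "").lower() != want.lower()
    )
```

A check like this only confirms what the files say. It does not prove that Kerberos keytabs exist or that services were restarted with the new settings.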

HADOOP SECURE MODE

Apache Hadoop Secure Mode: 2.6.0 (March 14’)

Threats: Data Access, Data Change, Job Submission, Task Access, Rogue Node, User Spoofing

Controls: Network Security, Authentication, Authorization, Audit, Data Protection

APACHE KNOX

The Apache Knox Gateway is a REST API Gateway for interacting with Hadoop clusters. The Knox Gateway provides a single access point for all REST interactions with Hadoop clusters.

Knox can provide:

•  Authentication (LDAP and Active Directory Authentication Provider)

•  Federation/SSO (HTTP Header Based Identity Federation)

•  Authorization (Service Level Authorization)

•  Auditing

Integrations:

- WebHDFS (HDFS), Templeton (HCatalog), Stargate (HBase), Oozie, Hive/JDBC

Status: Incubating
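To make the "single access point" idea concrete, the sketch below builds (but does not send) a WebHDFS directory-listing request routed through a Knox gateway. It assumes Knox's default URL layout of `https://host:8443/gateway/<topology>/webhdfs/v1/<path>` and HTTP Basic authentication. The host name, topology name, and credentials in any call are placeholders, and `knox_webhdfs_request` is a hypothetical helper.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def knox_webhdfs_request(
    host: str,
    topology: str,
    hdfs_path: str,
    user: str,
    password: str,
    op: str = "LISTSTATUS",
    port: int = 8443,  # assumed Knox default port
) -> Request:
    """Build an HTTPS request for a WebHDFS operation behind a Knox gateway."""
    path = quote(hdfs_path.lstrip("/"))
    query = urlencode({"op": op})
    url = f"https://{host}:{port}/gateway/{topology}/webhdfs/v1/{path}?{query}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})
```

Because every client uses the gateway URL, cluster node addresses need not be exposed to application users, and authentication against LDAP or Active Directory happens at one place.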

APACHE RANGER

A centralized security framework to manage fine-grained access control.

Status: Incubating

Authentication

•  Kerberos in native Apache Hadoop

•  Secured by the Apache Knox Gateway via the HTTP/REST API

Authorization

•  on the folder and file level, via HDFS
•  on the database, table and column level, via Hive
•  on the table, column family and column level, via HBase

Audit

User access auditing in HDFS, Hive and HBase, recording IP address, resource and resource type, timestamp, and whether access was granted or denied

Data Protection

•  Wire, volume and file/column encryption

•  HDFS Transparent Data Encryption (TDE)

•  Third-Party Partners (Hortonworks)

Administration

•  Policy management, administration and delegation

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/Ranger_U_Guide_v22/index.html#Item1.1

HADOOP SECURITY POLICY

Authentication of processes:

-  May go into existing application security policy

Security Logging requirements:

-  Which applications must be logged?

-  Add node identifier to standard log records
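One way to add a node identifier to standard log records, sketched with Python's `logging` module. The field name `node`, the logger name, and the log format are assumptions for illustration. The identifier defaults to the local host name.

```python
import logging
import socket
from typing import Optional


class NodeIdFilter(logging.Filter):
    """Attach the node identifier to every record passing through the handler."""

    def __init__(self, node_id: Optional[str] = None) -> None:
        super().__init__()
        self.node_id = node_id or socket.gethostname()

    def filter(self, record: logging.LogRecord) -> bool:
        record.node = self.node_id  # becomes available as %(node)s
        return True  # never drop the record, only annotate it


def build_logger(name: str = "security", node_id: Optional[str] = None) -> logging.Logger:
    """Create a logger whose output lines carry the node identifier."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(node)s %(levelname)s %(message)s")
    )
    handler.addFilter(NodeIdFilter(node_id))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With hundreds or thousands of nodes writing similar records, the node field lets a central log store tie an event back to the machine that produced it.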

De-anonymization Issues

-  Sparse data can be de-anonymized through matching to public sources

-  Could 200 days of tweets be matched to any of my de-identified data?
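The matching risk described above can be shown with a toy linkage on quasi-identifiers. The field names and the choice of ZIP code, birth year, and sex as quasi-identifiers are assumptions made for illustration, and any records passed in are expected to be sample data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed quasi-identifiers shared by both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(
    deidentified: Iterable[Dict[str, str]],
    public: Iterable[Dict[str, str]],
) -> Dict[str, str]:
    """Map record_id -> name where the quasi-identifiers match exactly one person."""
    index: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for person in public:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index[key].append(person["name"])

    matches: Dict[str, str] = {}
    for row in deidentified:
        key = tuple(row[q] for q in QUASI_IDENTIFIERS)
        names = index.get(key, [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches[row["record_id"]] = names[0]
    return matches
```

A policy review could run this kind of check against candidate public sources before a de-identified data set is loaded into a shared cluster. The sparser the data, the more records tend to match uniquely.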

Key Management & Business Continuity

BUILD A SECURITY BASELINE

-  Start with your Vendor’s distribution
-  Add your company’s sauce
-  Review Hadoop Security Benchmark project at the Center For Internet Security:
   -  Apache Hadoop 2.6.0 Benchmark
   -  Community Discussion
   -  Editors and members get free access to validation tools
   -  Everyone gets free access to baselines
   -  Registration is moderated. That means human registrants are approved and receive a welcome email.
   -  Link: http://tinyurl.com/HadoopSecurityBenchmark

HADOOP SECURITY REVIEW

1.  Start with the threats
2.  Choose your diagram
3.  Ask the standard security questions:
    -  Network Security
    -  Authentication
    -  Authorization
    -  Security Audit
    -  Data Protection
4.  Update your policy
5.  Build a Security Baseline

HADOOP SECURITY RESOURCES

1.  Apache: “Hadoop in Secure Mode”
    http://tinyurl.com/hadoopSecureMode
2.  Yahoo Hadoop Tutorial
    https://developer.yahoo.com/hadoop/tutorial
3.  Securosis: “Securing Big Data: Security Recommendations for Hadoop and NoSQL Environments”, 10/12/2012, Adrian Lane
    https://securosis.com/assets/library/reports/SecuringBigData_FINAL.pdf
4.  Cloudera: “Introduction to Hadoop Security”
    http://tinyurl.com/cloudera50security
5.  Hortonworks: “Security for Enterprise Hadoop”
    http://hortonworks.com/innovation/security/
6.  Center for Internet Security: Hadoop Security Baseline
    http://tinyurl.com/HadoopSecurityBenchmark

QUESTIONS

?

Updates at http://www.confidentialsoftware.com