Keynote: "Security and Governance Requirements for Big Data" #cwt2015


Clarke Patterson

# Cloudera, Inc. All rights reserved.


Cloudera 1




TB
*security optimized for Intel chips
Apache Hadoop

SQL

Discover: Impala, Solr
Model: SAS, R, Spark, Mahout
Cloudera

Cloudera 1

IBM Research: Single User (solid bar) vs 10 User Response Time (striped bar) (lower bars = better)



Applications
System Integration

1,900 Cloudera
Operational Tools

Enterprise Data Hub
Security and Administration
Unlimited Storage
Process, Discover, Model, Serve

Cloudera partners more broadly and deeply across the Hadoop ecosystem than any other vendor. With over 1,200 partners and counting, our partnerships offer:
- Compatibility with your existing tools and skills: 160+ certified on Cloudera 5, including all 12 of the 12 Gartner Business Intelligence Magic Quadrant leaders
- Flexible deployment options: on-premises; public, private, or hybrid cloud; appliances and engineered systems
- Partnerships you can trust: deep engineering relationships and a comprehensive certification program



Dell SecureWorks 1,000 20,000

Link to account record in SFDC (valid for Cloudera employees only): https://na6.salesforce.com/0018000000rBJSQ?srPos=0&srKp=001

Dell SecureWorks designs better defenses through complex analytics on historical security breaches.

Background: Dell SecureWorks is on deck 24 hours a day, 365 days a year, to help protect the security of its customers' assets in real time. To meet its enormous data processing challenges, Dell SecureWorks requires leading-edge technologies for cost-effectively storing, scaling, and analyzing massive amounts of information. SecureWorks processes ~20 billion events per day in real time. An event is a single transaction for a security device, such as a firewall or an intrusion prevention system (IPS), or a log event from an application or operating system.

Challenge: Each day, the SecureWorks organization investigates ~1,000 potential security incidents and analyzes ~20,000 pieces of malicious software. Leveraging this global intelligence enables SecureWorks to offer improved threat protection on behalf of its customers. To deal with the massive amounts of information it collects, SecureWorks requires highly scalable solutions for processing, storing, and analyzing big data in a cost-conscious manner.

Historically, the SecureWorks organization met this need with proprietary technologies. But a few years ago, it started looking for next-generation solutions that would deliver greater scalability at lower cost, including those based on open-source software and industry standard hardware.

Solution: This search culminated in the deployment of the Dell | Cloudera Solution, comprising Apache Hadoop open source software, Dell PowerEdge C servers, Force10 switches, and services from Cloudera and Dell. Hadoop offered a highly scalable system with built-in availability and the capacity to run highly analytics-intensive workloads.

Results: By migrating to the Dell | Cloudera solution, SecureWorks reduced storage costs from $17 per GB to only 21 cents per GB. The Hadoop environment paid for itself in one year, and the solution will continue to deliver substantial savings as SecureWorks doubles its business and associated data volumes every 18 months.
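The figures in these notes can be checked with a little arithmetic. The sketch below is only an illustration: the two per-GB prices and the 18-month doubling period come from the notes, while the function names and the idea of projecting volume from a normalized starting value of 1.0 are assumptions made for the example.

```python
# Figures quoted in the speaker notes (USD per GB of storage).
OLD_COST_PER_GB = 17.00
NEW_COST_PER_GB = 0.21

# Doubling period for business and data volume, also from the notes.
DOUBLING_PERIOD_MONTHS = 18.0


def cost_reduction_factor(old: float, new: float) -> float:
    """How many times cheaper the new per-GB price is."""
    return old / new


def percent_saved(old: float, new: float) -> float:
    """Share of the old per-GB price that is saved, in percent."""
    return (old - new) / old * 100.0


def volume_after(months: float, start_volume: float = 1.0) -> float:
    """Projected data volume, assuming steady doubling every 18 months.

    start_volume is a hypothetical normalized value, not a figure
    from the notes.
    """
    return start_volume * 2 ** (months / DOUBLING_PERIOD_MONTHS)


if __name__ == "__main__":
    print(f"reduction factor: {cost_reduction_factor(OLD_COST_PER_GB, NEW_COST_PER_GB):.1f}x")
    print(f"percent saved:    {percent_saved(OLD_COST_PER_GB, NEW_COST_PER_GB):.2f}%")
    print(f"volume in 3 yrs:  {volume_after(36):.1f}x today's")
```

Under these assumptions the per-GB price drops by a factor of roughly 81, and data volume quadruples every three years, so the absolute savings grow with the data even though the percentage saved per GB stays the same.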


FINRA 300 1,000~2,000

Link to account record in SFDC (valid for Cloudera employees only): https://na6.salesforce.com/0018000000pvwoy?srPos=0&srKp=001

A financial regulatory body is building a holistic picture of US market activity by looking at 30 billion events per day.

Background: The Financial Industry Regulatory Authority, or FINRA, is an independent regulator authorized by the U.S. Congress to promote investor protection and market integrity. FINRA oversees every brokerage firm and broker doing business with the U.S. public and monitors trading on the U.S. stock markets.

Challenge: Protecting market integrity while keeping pace with the billions of events that take place every day presents unique challenges:
- Market volumes are volatile and steadily increasing.
- Exchanges are dynamically evolving.
- Regulatory rules are continuously being created and enhanced.
- New securities products are regularly introduced.
- Market manipulators are constantly innovating.

Solution: Cloudera was selected as the Hadoop distribution for the critical process of building the market event graph database, and for providing rapid access to that data for regulatory analysts, based on a few key factors:
- Cloudera's software platform delivers a pervasive, enterprise-ready Apache Hadoop distribution.
- Cloudera Manager provides a sophisticated and user-friendly management console, simplifying and streamlining Hadoop cluster administration.
- Cloudera's deep expertise, available through Professional Services and Support, would be critical to solving technical challenges inherent in a Hadoop-based architecture of this scale.

Using Hadoop to build the market event graph database was the first major challenge in the redeployment program. Every day, FINRA receives data feeds containing orders, quotes, and trades from securities firms and various exchanges. Since transactions may span several data providers over days or weeks, understanding order lifecycles requires that huge volumes of data be linked across these different feeds and updated as new information is received. This requires very rapid data access, which was provided by implementing Apache HBase as part of FINRA's CDH deployment.
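The linking step described above, where events from different feeds are tied to one order lifecycle and updated as new data arrives, can be illustrated with a small in-memory sketch. This is not FINRA's design and does not use the HBase API: a plain dictionary keyed by order ID stands in for a key-value store, and every name here (`MarketEvent`, `LifecycleStore`, the field names, the feed labels) is hypothetical.

```python
from bisect import insort
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class MarketEvent:
    """One order, quote, or trade record from a feed (hypothetical schema)."""
    timestamp: int                         # only field used for ordering
    order_id: str = field(compare=False)   # key that links events together
    feed: str = field(compare=False)       # which provider sent the record
    kind: str = field(compare=False)       # "order", "quote", or "trade"


class LifecycleStore:
    """In-memory stand-in for a key-value store keyed by order ID."""

    def __init__(self) -> None:
        self._by_order = defaultdict(list)

    def ingest(self, event: MarketEvent) -> None:
        # Insert in timestamp order so late-arriving records from
        # another feed still land in the right place in the lifecycle.
        insort(self._by_order[event.order_id], event)

    def lifecycle(self, order_id: str) -> list:
        # Single keyed lookup; returns a copy so callers cannot
        # mutate the stored lifecycle.
        return list(self._by_order.get(order_id, []))


if __name__ == "__main__":
    store = LifecycleStore()
    store.ingest(MarketEvent(30, "A1", "exchange_x", "trade"))
    store.ingest(MarketEvent(10, "A1", "firm_feed", "order"))
    store.ingest(MarketEvent(20, "A1", "exchange_y", "quote"))
    for event in store.lifecycle("A1"):
        print(event.timestamp, event.feed, event.kind)
```

The point of the sketch is the access pattern: once events are grouped under one key, reading a whole lifecycle is a single lookup rather than a scan across feeds, which is the kind of rapid keyed access the notes attribute to HBase.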

Results: The new system must be able to handle increases in market volumes. The Cloudera Hadoop platform will assist FINRA in keeping pace with this growth in the following ways:
- Because Hadoop scales horizontally rather than vertically, FINRA is no longer obliged to buy ever larger, more expensive data appliances to support increasing volumes.
- Similarly, the Hadoop architecture doesn't impose "largest available appliance" limitations.
- The use of industry-standard hardware allows for more flexibility in selecting suppliers and architectures to cope with future growth.

FINRA's Big Data platform supports its mission of investor protection and market integrity by helping the organization increase performance and accelerate innovations in the following ways:
- The use of HBase to access the order lifecycle graph database has reduced response times for certain complex queries by orders of magnitude. Queries that took hours now take seconds. One complex query in particular that took ninety minutes to run was reduced to ten seconds. This creates a far more interactive experience, allowing analysts to rapidly iterate and quickly converge on answers that would have been prohibitive in the prior system.
- With its open source core for data processing, FINRA is able to leverage the rapid innovation cycles in the Hadoop and Big Data community.

FINRA's program to implement market regulation applications using Big Data and cloud technologies, including Cloudera, is expected to provide a net annual benefit of USD 10 to 20 million. These savings are derived from reducing costs of the operational infrastructure in a few key ways:
- The specialized nature of its legacy data appliances resulted in significant hardware costs and ongoing support requirements. FINRA's Hadoop-based platform allows the use of less expensive, industry-standard hardware.
- Unlike proprietary data appliances, a Hadoop-based architecture lends itself to deployment on a public cloud such as Amazon Web Services. This allows FINRA to benefit from the operational economies of scale of such a vendor, and provides the elasticity to avoid overprovisioning to handle peak loads.
- The legacy environment was also challenged with capacity limitations, causing FINRA to spend much effort and analysis determining where and when available capacity could be used for various types of analytics. By removing the capacity limitations, the Hadoop/cloud combination virtually eliminates this effort.
- Cloudera Manager automatically manages extremely complex operational tasks within Hadoop, allowing FINRA to focus resources and expertise on areas of strategic regulatory importance rather than fundamental IT tasks. This saves costs while increasing FINRA's ability to accomplish core objectives.


Hadoop PCI

KEY MESSAGE ON THIS SLIDE: Our relationship began with MasterCard, which chose Cloudera Enterprise for fraud detection and to optimize its DW infrastructure. The relationship then expanded into a partnership with MC Advisors, the consulting arm of MasterCard.

MasterCard requires that any technology handling its applications or payment card data files must have full PCI certification. Receiving this important certification allows MasterCard the opportunity to integrate Hadoop datasets with other environments that are already PCI-certified. "Data privacy and protection is a top priority for MasterCard," said Gary VonderHaar, Chief Technology Officer, Architecture at MasterCard. "As we maximize the most advanced technologies from partners and vendors, they must meet the rigorous security standards we've set. With Cloudera's commitment to the same standards, we now have additional options in how we manage our data center."