©2017 Cambridge Semantics Inc. All rights reserved. Company Confidential
Accelerating InsightSmart Data Lake Customer Success Stories
Peter HorowitzPrincipal - AdvisoryPwC
Alok Prasad, PresidentBen Szekely, VP – Solution EngineeringCambridge Semantics
©2017 Cambridge Semantics Inc. All rights reserved.
ACCELERATING INSIGHT
©2017 Cambridge Semantics Inc. All rights reserved.
Anzo Smart Data Lake: Accelerating InsightA Graph-based Platform to Solve Today’s Data and Analytics Challenges
Data Ingestion& Mapping
AutomatedETL Generation
CollaborativeMapping Text Processing
DataCataloging
Data & ModelGovernance
Active Metadata Management
Role-Based Security
Discovery & Analytics
Knowledge Discovery
Custom User Dashboards Self-Serve LiveExtracts
In-Memory MPP Query
Graph Marton Demand
ELT, Model BasedData Integration
Document Search
Anzo Smart Data LakeValue Proposition
Actionable Insights All
Data Sources
EnterpriseData Lakes
“Last Mile”Analytics
Allowing Enterprises to Rapidly Create Purpose-built Discovery & Analytics Solutions on the Cloud of Choice
Data Ingestion& Mapping
Load and Map
any data……. from anywhere… at any time!
DataCataloging
Organize, Integrate, Share and Govern
any data assets.. more efficiently greater security
In-Memory MPP Query
Graph Exploration
Interconnected … trillions of facts… 100x the speed…
Discovery & Analytics
Ask & Answer
any questions… any combination. any user…..
©2017 Cambridge Semantics Inc. All rights reserved.
Anzo Smart Data Lake: Accelerating InsightA Graph-based Platform to Solve Today’s Data and Analytics Challenges
Accelerated Insights – 100x faster than the nearest competitor
AutomatedETL Generation
CollaborativeMapping Text Processing
Enterprise Knowledge Graph
©2017 Cambridge Semantics Inc. All rights reserved.
Sample Customers
©2017 Cambridge Semantics Inc. All rights reserved.
Peter HorowitzPrincipal – Advisory Bankingand Capital ManagementPwC
Ben Szekely VP Solution EngineeringCambridge Semantics
Accelerating Insight – Customer Successs Story
Accelerating Insights: Using Smart Data Lakes Successfully
March 2017
PwC
Overview
$53 bn 1000%The Dilemma – Given current landscape within many financial service institutions, the standard data journey takes TOO LONG
Typical data journey
Winning Strategy - Attack every step in the process to reduce time to market.
8
Requirements Sourcing Quality
EvaluationQuality
RemediationAnalysis and
Reporting
8
PwC
Why is this so important?
Sources: Average OM% from Morningstar, Inc.; IAIG Index
For global top banks, comparing average operating margin percent for previous 5 years against index of information architecture, investment and governance demonstrates strong correlation.
Proper investment in information strategy, architecture and governance PAYS OFF!
Proper data architecture improving or in place.
Insufficient or lagging data architecture investment, execution, governance.
PwC
Old approach is Challenged
The existing landscape presents great challenges since it is based on relational databases and point-to-point mappings.
Operational Data Source
Operational Data Source
Staging DB Metadata
Aggregated Data
Detailed Data
Data Mart
Reporting
Analytics
Predictive modelingFiles
ETLData Mart
Data Mart
EDW
Integration Barriers
Restricted entry
Narrow payload
Data source mappingMulti-stage, brittle ETL
required to integrate diverse data sources
Standard model requiredEnforces a standard model, no
exceptions, leading to loss of valuable information context
Limited insightsRigidity undermines the agility required in data
analytics
Challenges
PwC
Original data and models maintainedData is loaded seamlessly without transformation or
mapping, irrespective of format
New approach
Unified model
Consolidated model unifies disparate source models into a
comprehensive and shared canonical model
The smart data lake provides a centralized data store with a flexible schema where data is first loaded and then transformed; a process known as schema-on-read.
Operational Data Source
Operational Data Source
Files
SmartData Lake
Model 1Model 2Model 3
Operational Data Source
Operational Data Source
Files
Richer insights
Cross region and domain view supports population-based analytics, hidden relationship discovery and reduces risk of siloed or inaccessible
Reporting
Analytics
Predictive modeling
Benefits
Graph Data Model(i.e. Ontology)
Graph store designs capture the nature of data relationships and supports a variety of data representations, known as “Shared Semantics”
PwC
Case Study: Next generation insider trading and fraud surveillance driven by structured and unstructured data
Requirements Sourcing Quality
EvaluationQuality
RemediationAnalysis and
Reporting
Accelerators:
Firm Wide Risk Dashboard
Employee Risk Dashboard
Analysis Overview• Ingested emails, IMs, web browser
logs, cookies, phone records, contacts, news feeds
• Ingested futures, options trades, market price data
Impact• Data was ingested and linked in
hours.• Profiles developed for employee
roles• Outlier behavior effortlessly
identified• Employees given risk ratings
allowing for prioritization of detailed compliance investigation.
PwC
Cast Study: Terabytes of data ingested and analyzed in a fraction of the time revealing complex relationships between clients and transactions
13
Analysis Overview• 5.5 billion transactions on behalf of
25 million customers for 35 million accounts.
• Based on reference data and public information, prepared “social” network of clients analyzing edges (i.e., connections) between them.
Impact• Data was ingested and linked in
hours.• Edges developed and weighted
based on Pointwise Mutual Information algorithms.
• Resulting graph database supported recursive analysis, which would have been dramatically longer using RDBS, allowing for identification of suspicious networks
Suspicious Account Bridge
Customer Network
Requirements Sourcing Quality
EvaluationQuality
RemediationAnalysis and
Reporting
Accelerators:
©2017 Cambridge Semantics Inc. All rights reserved.
INDEFINITE
Drug Discovery Preclinical Product Development
FDA Review Scale-Up to Mfg.Post-MarketingSurveillance
ONE FDA-APPROVED
DRUG
0.5 – 2 YEARS6 – 7 YEARS3 – 6 YEARS
NUMBER OF VOLUNTEERS
PHASE 1 PHASE 2 PHASE 3
5250~ 5,000 – 10,000
COMPOUNDS
PRE-
DISC
OVE
RY
20–100 100–500 1,000–5,000
IND
SUBM
ITTE
D
NDA
/ BL
A S
UBM
ITTE
D
R & D Intelligence Data Lake (CI)
Clinical Trial Operations Data Lake
Product Development Data Lake (SDP)
Anzo Smart Data Lakes Across Pharma
Clinical Trial Exploratory Analytics Data Lake Source of Influence
Project Evolution Data Lake
PV & Safety Data Lake
Medical Advisory Board Analytics Data Lake
Clinical Data Standard Management
Voice of the CustomerData Lake
Real World Evidence Patient Data Lake
©2017 Cambridge Semantics Inc. All rights reserved.
Anzo Smart Data Lake: Accelerating Insight
Disparate Sources
Insight
Exploratory AnalyticsKnowledge Discovery
Data on Demand
Automated Ingestion
Rich Models
Scalability
Security
Enterprise Knowledge Graph
Governance
©2017 Cambridge Semantics Inc. All rights reserved.
Rules, Security, Governance
The Smart Patient Data Lake
Connectors Models Rules Analytics & Tools
Patient Data Systems
R&D
Patient Safety
Clinical Trial Ops
Health Economics
Patient Data Lake
DataOn
Demand
Auto
mat
ed In
gesti
onInsight for Decision
MakersImproving patient outcomes, safety,and comfort
Reducing the time bring medicines to patients
Lowering the cost of healthcare
Insurance Claims
Clinical Trials
Rx Data
Health Records
Genetic Data
©2017 Cambridge Semantics Inc. All rights reserved.
IT B
uild
and
Dep
loym
ent
Anzo Smart Data Lake
Traditional BI
and Analytics
Tool Chains
Add
New
Dat
a
Add
New
Dat
a
Ad
d N
ew D
ata
A
dd N
ew D
ata
Disrupting the Time-to-Value Function
Tim
e an
d Re
sour
ce
Inve
stm
ents
Insights and Value
Traditional BI
and Analytics
Tool Chains
Anzo Smart Data Lake
©2017 Cambridge Semantics Inc. All rights reserved.
Anzo Smart Data LakeA Graph-based Platform to Disrupt Today’s Data and Analytics Challenges
Connectors Models Rules Analytics & Tools
ASDL Customer Fingerprint - Intellectual Property
Data Ingestion& Mapping
AutomatedETL Generation
CollaborativeMapping
Text Processing
DataCataloging
Data & ModelGovernance
Active Metadata Management
Role-Based Security
Discovery & Analytics
Automated Query Generation
User Dashboards and Custom UI/UX
Self-Serve Live
Extracts
In-Memory MPP Query
Graphmartson Demand
ELT, Model BasedData Integration
Document Search
Actionable Insights
Enterprise Data Sources
EnterpriseData Lakes
“Last Mile”Analytics
©2017 Cambridge Semantics Inc. All rights reserved.
Anzo Smart Data Lake: Accelerating InsightA Graph-based Platform to Solve Today’s Data and Analytics Challenges
Anzo Smart Data LakeSelect Capabilities
Actionable Insights All
Data Sources
EnterpriseData Lakes
“Last Mile”Analytics
Allowing Enterprises to Rapidly Create Purpose-built Discovery & Analytics Solutions on the Cloud of Choice
Data Ingestion& Mapping
AutomatedETL Generation
CollaborativeMapping Text Processing
DataCataloging
Data & ModelGovernance
Active Metadata Management
Role-Based Security
In-Memory MPP Query
Graphmartson Demand
ELT, Model BasedData Integration
Document Search
Discovery & Analytics
Knowledge Discovery
Custom User Dashboards Self-Serve LiveExtracts
©2017 Cambridge Semantics Inc. All rights reserved.
Click here to request a demo
Top Related