Vodafone xone fev142013v3 ext
Big Data and Security
Michel Burger
Gold mining is about dirt management
0.05 ounces/ton
LHC magnets: about 11,850 A to generate fields of around 8.4 T (about 150,000 times the Earth's magnetic field), but they operate at low voltage
A lot of what LHC is about is electricity flow management
How BIG?
BIG data is like the LHC combined with gold extraction
- Huge amount of data -> 6.6 Zettabytes/year by 2016 (Cisco Cloud Index)
- Big flow of data -> 400TB/day (Facebook)
- LHC generates 10-15 Petabytes/year of data for each experiment
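To put the three figures above on a common scale, here is a minimal arithmetic sketch. It assumes decimal (SI) units and a 365-day year, neither of which the slide states; the helper names are illustrative only.

```python
# Decimal (SI) byte units. Assumption: the slide does not say SI vs binary.
TB = 10**12
PB = 10**15
EB = 10**18
ZB = 10**21
DAYS_PER_YEAR = 365  # assumption: plain 365-day year


def per_day(bytes_per_year):
    """Convert a yearly volume to an average daily volume."""
    return bytes_per_year / DAYS_PER_YEAR


def per_year(bytes_per_day):
    """Convert a daily volume to a yearly volume."""
    return bytes_per_day * DAYS_PER_YEAR


# 6.6 ZB/year (Cisco figure) expressed as exabytes per day.
cloud_per_day_eb = per_day(6.6 * ZB) / EB

# 400 TB/day (Facebook figure) expressed as petabytes per year.
facebook_per_year_pb = per_year(400 * TB) / PB

# Facebook's yearly volume relative to one LHC experiment (10-15 PB/year).
ratio_low = facebook_per_year_pb / 15
ratio_high = facebook_per_year_pb / 10

print(f"Cloud traffic: about {cloud_per_day_eb:.1f} EB/day")
print(f"Facebook flow: about {facebook_per_year_pb:.0f} PB/year")
print(f"Facebook vs one LHC experiment: {ratio_low:.1f}x to {ratio_high:.1f}x")
```

Under these assumptions the 400 TB/day flow adds up to roughly ten times the yearly output of a single LHC experiment.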
The essence of new service providers
[Diagram: Service, Data; arrows labeled Consumes, Produces, Improves experience, Value-enriched data generates revenue]
• Many free services
• One data set and common semantic
• The more context, the more efficient and the more value
Example: Search/Information Management; Rated auction/Selling
[Diagram: User; Service, Service; Core Semantic Data Set; Mindmap; BI Based Revenue Models (e.g. Advertisement)]
• Revenue from existing services will shrink
• Additional revenue from new services
Classic Approach
• Structured Data
• Data in the range of Gigabytes to Terabytes
• Centralized (Data is imported in analytics)
• Batch based
• Data silos
[Diagram: Transaction, Relational Database, Data Warehouse, Analyse, linked by ETL steps]
Where is the data that answers my questions?
Big Data Approach
• Multi Structured Data
• Data in the range of Terabytes to Petabytes
• Distributed/Federated (Analytics grab the data)
• Streaming based
• Holistic Data
[Diagram: sources 1, 2, 3 ... n; Stream, Organize, Analyse; Clusters]
Here are the questions and the data for the answers
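The contrast between "batch based" and "streaming based" can be illustrated with a small sketch. The event shape `(source, bytes)` and both function names are hypothetical and chosen only for illustration; the slides do not prescribe an implementation.

```python
from collections import Counter
from typing import Iterable, Iterator

Event = tuple[str, int]  # hypothetical event shape: (source name, size in bytes)


def batch_totals(events: list[Event]) -> Counter:
    """Classic style: all data is imported first, then analysed in one pass."""
    totals = Counter()
    for source, size in events:
        totals[source] += size
    return totals


def streaming_totals(events: Iterable[Event]) -> Iterator[Counter]:
    """Streaming style: yield an up-to-date snapshot after every event."""
    totals = Counter()
    for source, size in events:
        totals[source] += size
        yield Counter(totals)  # copy, so earlier snapshots stay unchanged


sample = [("ran", 10), ("api", 5), ("ran", 7)]
snapshots = list(streaming_totals(iter(sample)))
```

Both styles end with the same totals; the difference is that the streaming version has an answer available after each event instead of only once the full data set has been loaded.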
A new pattern
[Diagram columns: Sources, Stream, Consumption]
API
Data card
SIM card
Content Services
Knowledge
References
Applications
Social Networks
Connected Things (Consumer, Enterprise)
Connected Devices
Network Core
IT Infrastructure
RAN
Premise Gateway
• Many different data structures
• Many different ways to extract the data
• Many different locations (even for the same type of data)
• Batch and Realtime based
• Buffered or stream
• Correlation parameters
• Buffering, Routing, Filtering
• Structured/Unstructured store
• Event Collector
• Batch Process/Multi Structure Stream
• Multi Stage Store/Process
[Diagram axes: Non real-time / Real-time; Low-level semantic / Rich semantic; Cheap storage / High-efficiency storage]
Report Statistics
Data as a Service
Graph Network/Analysis
Neural Network/Analysis
• Structured Buffering
• Proprietary
• Graph
• Neural Network
• Relational
• Unstructured Buffering
• Streaming
• Tapping at Source
• Tapping on Stream
• Consumption to Source
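The Stream column of this pattern (buffering, routing, filtering) could be sketched roughly as follows. This is a toy, in-memory illustration under stated assumptions: the class `StreamStage`, its parameters, and the event dictionaries are all hypothetical and are not taken from the presentation.

```python
from collections import defaultdict, deque


class StreamStage:
    """Toy stream stage: filter events, route them, buffer per consumer."""

    def __init__(self, keep, route, buffer_size=2):
        self.keep = keep                    # filtering: event -> bool
        self.route = route                  # routing: event -> consumer name
        self.buffer_size = buffer_size      # flush threshold per consumer
        self.buffers = defaultdict(deque)   # consumer -> pending events
        self.delivered = defaultdict(list)  # consumer -> delivered batches

    def push(self, event):
        """Filter, route and buffer one event; flush when the buffer is full."""
        if not self.keep(event):
            return
        name = self.route(event)
        self.buffers[name].append(event)
        if len(self.buffers[name]) >= self.buffer_size:
            self.flush(name)

    def flush(self, name):
        """Deliver whatever is buffered for one consumer as a single batch."""
        buf = self.buffers[name]
        if buf:
            self.delivered[name].append(list(buf))
            buf.clear()


stage = StreamStage(
    keep=lambda e: e["type"] != "heartbeat",
    route=lambda e: "realtime" if e["urgent"] else "batch",
)
for event in [
    {"type": "call", "urgent": True},
    {"type": "heartbeat", "urgent": False},
    {"type": "sms", "urgent": False},
    {"type": "call", "urgent": True},
]:
    stage.push(event)
```

In this run the heartbeat is filtered out, the two urgent events fill the "realtime" buffer and are delivered as one batch, and the single non-urgent event waits in the "batch" buffer until `flush("batch")` is called.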
With added security
[Diagram repeated from the previous slide: Sources, Stream, Consumption]
• Securing the infrastructure (public, private)
- Policy (internal/external)
- On-going assessment (DDoS, Penetration …)
- Data leakage
- Migration
• Securing the identity
- Validating ID
- Anonymization
• Securing the access
- Distributed permission/preference
- 3rd party permission
• Strong access control based on industry standard (user, dev, application)
• Strong authorization control based on open standard
• Analytics applied to Analytics
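One common building block for the "Anonymization" item is replacing raw identifiers with keyed hashes before data reaches analytics. The sketch below uses HMAC-SHA256 from the Python standard library; this is an assumed technique chosen for illustration, not one named in the slides, and strictly speaking it is pseudonymization (anyone holding the key can recompute the mapping), which is weaker than full anonymization. The function name, key, and phone numbers are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a stable keyed hash of an identifier (hex-encoded HMAC-SHA256)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key for illustration; a real key would come from a secret store.
key = b"example-key-do-not-use"

a = pseudonymize("+15550100", key)
b = pseudonymize("+15550100", key)
c = pseudonymize("+15550101", key)
d = pseudonymize("+15550100", b"another-key")
```

The same identifier always maps to the same token under one key, so records can still be correlated across sources, while different identifiers or a different key produce unrelated tokens.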
Final thoughts
1. We need to eliminate the silos
   – Sources or Usage
2. Still very much a collection of technologies
   – The assembly is still very complex
3. Is everything about events?
4. We need to handle the CAP theorem more appropriately
5. What is the user experience (not just the end user but also the admin)?
Thank You