New Trends in Data Management in the Information Industries
© COPYRIGHT 2015 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
New Trends in Data Management in the Information Industries
Presented by: Matt Turner, CTO Media and Publishing
February 2015
Agenda
Introduction
Information Industries Trends
Top 5 Challenges in the Industry
New Approaches and Solutions
Data Drives the Need for a New Generation Database
Hierarchical Era: “For your application data!”
• Application- and hardware-specific
Relational Era: “For all your structured data!”
• Normalized, tabular model
• Application-independent query
• User control
Any Structure Era: “For all your data!”
• Schema-agnostic
• Massive scale
• Query and search
• Analytics
• Heterogeneous data
• Faster time-to-results
Harnessing Data & Reimagining Applications
Reduce Risk
Manage Compliance
Create New Value from Data
Optimize Operations
Lower TCO / Better IT Economics
Better Decision-making
MarkLogic: Enterprise NoSQL Database Platform
Best Operational Data Warehouse (Aug 2014)
• Flexible Data Model: Store and manage JSON, XML, RDF, and Geospatial data with a document-centric, schema-agnostic database
• Scalability and Elasticity: Scale to petabytes of data without over-provisioning or over-spending
• ACID Transactions: Avoid data loss, data corruption, and stale reads, even at speed and scale
• Search and Query: Lightning fast, sophisticated, sub-second search and query across all of your data
• Semantics: Store and query linked data as RDF and SPARQL
• Hadoop Integration: Make your Hadoop better by connecting it to MarkLogic
• Certified Security: Government-grade, granular, role-based security
TRADITIONAL PUBLISHING
FORM BASED PRODUCTS
DEDICATED PRODUCT INFRASTRUCTURE
Product A Dedicated Infrastructure
(database + search engine)
Product B
Product C
Company Data
Industry Data
Filings
Reports
INFORMATION DELIVERY PLATFORM
FORMAT INDEPENDENT
INFORMATION CENTRIC
DYNAMIC DELIVERY
Company Data
Industry Data
Filings
Reports
Deliver the right content, to the right user, in the right format, in real time
Top 5 Requirements for Information Providers
1. Getting data IN fast isn’t the problem; it’s getting insights OUT faster!
2. Data is complex, but users want complexity hidden!
3. Not everyone has permission to access all the data…
4. Repurpose, repurpose, repurpose. Repeat.
5. Once you attract them, you must be reliable
Traditional Technology
Rows and columns for content strip information
Title               Publication Date  Category   Abstract             Section       Section 2?
Science Article 1   3/1/14            Biology    Abstract text . . .  Section text  Section text
Research Book       6/4/13            Surgery    Abstract text . . .  Section text  Section text
Science Article 2   6/4/05            Chemistry  Abstract text . . .  Section text  Section text
?
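The point of this slide can be sketched in a few lines of Python. The record layout, the column names, and the `flatten_to_row` helper are hypothetical, chosen only to mirror the columns on the slide; the sketch shows how a fixed set of columns drops any section that has no column to hold it.

```python
# Hypothetical fixed column set mirroring the columns shown on the slide.
COLUMNS = ["title", "publication_date", "category", "abstract", "section_1", "section_2"]


def flatten_to_row(article):
    """Force a nested article into the fixed columns.

    Returns (row, dropped_sections) so the information loss is visible.
    """
    sections = article.get("sections", [])
    row = {
        "title": article["title"],
        "publication_date": article["publication_date"],
        "category": article["category"],
        "abstract": article["abstract"],
        # Only two section columns exist, so only two sections fit.
        "section_1": sections[0] if len(sections) > 0 else None,
        "section_2": sections[1] if len(sections) > 1 else None,
    }
    dropped = sections[2:]
    return row, dropped


# Illustrative article with one more section than the table has columns for.
article = {
    "title": "Science Article 1",
    "publication_date": "3/1/14",
    "category": "Biology",
    "abstract": "Abstract text",
    "sections": ["Section text A", "Section text B", "Section text C"],
}

row, dropped = flatten_to_row(article)
print(row)
print("Dropped:", dropped)
```

Adding a third section column would fix this one article but not the next one with four sections, which is the open question the slide marks with "?".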
Hierarchical taxonomies overlap and don’t capture the complexity
[Diagram: three overlapping taxonomies]
Research, Medicine, Science, Surgery, Orthopedics, Cell Biology, Biochemistry, …
Life Sciences, Biomedical Sciences, Cell Biology, Biology, Biochemistry, …
Chemistry, Microbiology, Biochemistry, …
?
The Functional Solution Silos & Treadmill
1. Design infrastructure, services & applications
2. Analyze Data Formats (Articles, Books, Industry Data, Reports)
3. Define queries & Service APIs
4. Define schemas, indexes and services
5. Build databases, middleware and services infrastructure
6. Define & implement ETL processes
7. Load and normalize data
8. Develop, integrate and test infrastructure & applications
Flexible Data Model
Schema-agnostic, structure-aware
• No need to define a schema up front
• Matched to complex content and metadata modeling
• Data is managed in its most accessible, natural form
• XML, JSON, RDF, geospatial
Result: Product content and data from multiple sources available to be tailored to any purpose and product
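As a rough illustration of what "schema-agnostic, structure-aware" can mean in practice, here is a minimal in-memory sketch in Python. It is not MarkLogic's API; the `DocumentStore` class, its methods, and the sample URIs and fields are all hypothetical. Documents of different shapes are stored as-is, and a query follows whatever structure each document happens to have.

```python
_MISSING = object()  # sentinel meaning "this path is not present in the document"


class DocumentStore:
    """Toy store: no schema is declared before documents are inserted."""

    def __init__(self):
        self._docs = {}

    def insert(self, uri, doc):
        self._docs[uri] = doc

    def find(self, path, value):
        """Return sorted URIs of documents whose value at a dotted path equals `value`."""
        matches = []
        for uri, doc in self._docs.items():
            node = doc
            for key in path.split("."):
                node = node.get(key, _MISSING) if isinstance(node, dict) else _MISSING
                if node is _MISSING:
                    break
            if node is not _MISSING and node == value:
                matches.append(uri)
        return sorted(matches)


store = DocumentStore()
# Two documents with different shapes; neither shape was defined up front.
store.insert("/articles/1.json", {
    "title": "Science Article 1",
    "category": "Biology",
    "sections": ["Section text", "Section text", "Section text"],
})
store.insert("/books/1.json", {
    "title": "Research Book",
    "category": "Surgery",
    "publisher": {"name": "Example Press"},
})

print(store.find("category", "Biology"))
print(store.find("publisher.name", "Example Press"))
```

A document that lacks the queried path simply does not match, so a new source with new fields can be loaded without changing anything that already exists.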
Search and Query
Search to find answers in documents, relationships, and metadata
• Automatic indexing of every data value, text and data structure
• Specialized indexes for data values (analytics, facets, sorting), geospatial and triples
• All updated in the context of ACID transactions to ensure data integrity and real-time access
• Accessible via fully programmable search API with full-text search, type-ahead suggestions, facets, snippeting, highlighted search terms, proximity boosting, relevance ranking, and language support
JavaScript, XQuery, SPARQL
Rich Query Capability, In-database MapReduce, Full-text Search, Semantic Search, Geospatial Search
Result: simplified architecture with a single component for search and database
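The idea of indexing every word and every data value at load time, then answering combined text-and-value queries from one component, can be sketched as follows. This is a simplified Python illustration under stated assumptions, not MarkLogic's implementation; `ToyIndex`, `iter_leaves`, and the sample documents are hypothetical.

```python
from collections import defaultdict


def iter_leaves(node, path=()):
    """Yield (path, value) for every leaf value in a nested dict/list document."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_leaves(child, path + (key,))
    elif isinstance(node, list):
        for child in node:
            yield from iter_leaves(child, path)
    else:
        yield path, node


class ToyIndex:
    def __init__(self):
        self.words = defaultdict(set)   # word -> URIs containing it
        self.values = defaultdict(set)  # (dotted path, value) -> URIs

    def add(self, uri, doc):
        # Index every leaf value, and every word of every string value.
        for path, value in iter_leaves(doc):
            self.values[(".".join(path), value)].add(uri)
            if isinstance(value, str):
                for word in value.lower().split():
                    self.words[word].add(uri)

    def search(self, text=None, **fields):
        """Intersect the URI sets for each word in `text` and each field=value pair."""
        sets = []
        if text:
            sets += [self.words.get(w, set()) for w in text.lower().split()]
        sets += [self.values.get((k, v), set()) for k, v in fields.items()]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


index = ToyIndex()
index.add("/articles/1.json", {"title": "Cell signalling in yeast", "category": "Biology"})
index.add("/articles/2.json", {"title": "Signalling molecules", "category": "Chemistry"})

print(index.search(text="signalling"))
print(index.search(text="signalling", category="Biology"))
```

Because both lookups resolve to sets of URIs, a full-text term and a structured value constraint combine with a single set intersection, which is the "single component for search and database" idea in miniature.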
Semantics
Enterprise triple store, document store, and database combined
• Store and query billions of facts and relationships
• Leverage ontologies for domain- and role-specific context access to data and documents
• Efficient metadata management with relationships to ontologies
• Standards-based for ease of use and integration: RDF, SPARQL, and standard REST interfaces
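To make the triple idea concrete, here is a minimal Python sketch of facts stored as (subject, predicate, object) and queried by pattern. A real deployment would use RDF and SPARQL; the `TripleStore` class, the `broader` and `about` predicates, and the sample data are hypothetical stand-ins. The example also shows a concept sitting under two parents at once, which a single hierarchy cannot express.

```python
class TripleStore:
    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


def docs_under(store, broad_topic):
    """Documents about any concept whose broader concept is `broad_topic`."""
    narrower = {s for s, _, _ in store.match(predicate="broader", obj=broad_topic)}
    return sorted(s for s, _, o in store.match(predicate="about") if o in narrower)


store = TripleStore()
# Biochemistry belongs under two parents at the same time.
store.add("Biochemistry", "broader", "Life Sciences")
store.add("Biochemistry", "broader", "Chemistry")
store.add("Orthopedics", "broader", "Surgery")
# Documents are linked to concepts rather than filed in one category column.
store.add("/articles/2.json", "about", "Biochemistry")
store.add("/books/1.json", "about", "Orthopedics")

print(docs_under(store, "Chemistry"))
print(docs_under(store, "Life Sciences"))
print(docs_under(store, "Surgery"))
```

The same article is reachable from both Chemistry and Life Sciences without being duplicated or re-tagged.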
Semantics
Documents, data and triples provide complete picture of content
Result: context to tailor information to your user’s role, activity and location
Scalability, Elasticity and Cloud
Massive enterprise scalability and elasticity
• Scale horizontally in clusters on commodity hardware to hundreds of nodes, petabytes of data, and billions of documents
• Process thousands of multi-document multi-statement transactions per second
• Start small and scale up or down to meet capacity and performance demands without over-provisioning or over-spending
• Fully cloud enabled for automated deployment and management on EC2
• Leverage dynamic configurations with Tiered Storage
[Diagram: cluster of E-NODEs and D-NODEs]
Result: Enterprise-ready to power mission critical products
With MarkLogic…
1. Design infrastructure, services & applications
3. Define queries & Service APIs
8. Develop, integrate and test infrastructure & applications
When something changes... It’s no big deal
INFORMATION DELIVERY PLATFORM EXTENDED
Content and Customers
Complete Picture of Business
Metrics Driving Product Development and Sales
Company Data
Industry Data
Filings
Reports
Catalogs Lists
Authors Institutions Social Media + Usage
Use Case: Master Data
• Foundational data for digital products
• Industry topology and trends to drive innovation
• User and content metrics to drive product development
Use Case: Enhance Digital Products
• Present information based on relationships
• Go beyond traditional technology with depth of content
• Drive efficiency using semantic approach to tagging
Use Case: Go Beyond Search
• Concept instead of keyword search
• Related content and information drive content discovery and new interactions
  – SNL40 continuous viewing
• Dynamically tailored to the user’s specific attributes or activity
Use Case: ‘Everything Else’
• Tailor views and access to information with multiple ontologies
• Example: follow a scientist from research to the workbench to conferences to publishing
• Content delivery tailored to the user’s role, activity and location