CONTENT ANALYTICS UNLOCKS BUSINESS VALUE
FROM UNSTRUCTURED CONTENT
DELIVERING ANSWERS TO IMPORTANT QUESTIONS
VIA SEMANTIC TECHNOLOGIES
Business Need
A large percentage (estimated at 80% or more) of the information in a company
is maintained as unstructured content, which includes valuable assets such as
emails, customer correspondence, free-form fields on applications, wikis, blobs
of text in a database, content in enterprise content repositories, social media
posts, and messages of all kinds. Because this content lacks structure, it is
difficult to search and analyze without extensive effort and automation.
Structured vs Unstructured Data
Column          Value
Patient         Joe Brown
Date of Birth   02/13/1972
Date Admitted   02/05/2014
Structured Data
High Degree of organization,
such as a relational database
Unstructured Data
Information that is difficult to organize
using traditional mechanisms
“The patient came in complaining of
chest pain, shortness of breath, and
lingering headaches…smokes 2
packs a day… family history of heart
disease…has been experiencing
similar symptoms for the past
12 hours….”
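The contrast above can be sketched in code: the structured record is addressable by column name, while the free-text note needs a processing step before it can be queried. This is only a minimal sketch, assuming a hand-written term list (`SYMPTOM_TERMS`) and a helper (`extract_facts`) that are both hypothetical; a real solution would rely on a clinical vocabulary and proper language processing rather than simple matching.

```python
import re

# Structured data: each value is addressable by its column name.
record = {
    "Patient": "Joe Brown",
    "Date of Birth": "02/13/1972",
    "Date Admitted": "02/05/2014",
}

# Unstructured data: the same visit described as free text.
note = (
    "The patient came in complaining of chest pain, shortness of breath, "
    "and lingering headaches... smokes 2 packs a day... family history of "
    "heart disease... has been experiencing similar symptoms for the past "
    "12 hours..."
)

# Hypothetical term list, for illustration only.
SYMPTOM_TERMS = ["chest pain", "shortness of breath", "headaches"]


def extract_facts(text):
    """Pull a few structured fields out of a free-text note."""
    lowered = text.lower()
    symptoms = [term for term in SYMPTOM_TERMS if term in lowered]
    packs = re.search(r"smokes (\d+) packs? a day", lowered)
    hours = re.search(r"past\s+(\d+) hours", lowered)
    return {
        "symptoms": symptoms,
        "packs_per_day": int(packs.group(1)) if packs else None,
        "symptom_duration_hours": int(hours.group(1)) if hours else None,
        "family_history_heart_disease": "family history of heart disease" in lowered,
    }


facts = extract_facts(note)
```

Once the note is reduced to fields like these, it can be filtered and aggregated in the same way as the structured record.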
Big Content
• Beyond conventional Big Data, there exists a
tsunami of information in the big data
universe that has largely remained untapped.
• Big Data has morphed into a world of
unstructured machine-generated data and
human-generated content that is referred to
as ‘Big Content’: for example, chat logs,
emails, documents, sales and service notes,
CRM case notes, support tickets, weblogs,
social media feeds, and more.
Content Analytics
Content analytics is the act of applying
business intelligence and business analytics
practices to this Big Content.
Companies use content analytics software to
provide visibility into the amount of content
that is being created, the nature of that
content, and how it is used. This contextual,
value-adding information has remained
under-used due to lack of recognition and
inadequate technologies.
Content Analytics approach leverages multiple algorithms to draw patterns and
identify insights from unstructured data
1. Analyze unstructured content
A content analytics solution processes textual data in ways that help users search, discover, and perform the same analytics on textual data that are currently performed on structured data in business-intelligence-style applications.
With content analytics solutions, unstructured data can be used in ways that were previously attainable only from structured data sets.
2. Better business understanding & visibility
Content Analytics delivers new business understanding and visibility from the content and context of textual information. For example, it can identify patterns, view trends and deviations over time, and reveal unusual correlations or anomalies. It can explain why events are occurring and find new opportunities by aggregating the voices of customers, suppliers, and the market.
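Viewing "trends and deviations over time" can be pictured as counting how often a topic is mentioned per period and flagging periods that stand out. The sketch below is a minimal illustration under stated assumptions: the monthly counts are made-up data, `flag_deviations` is a hypothetical helper, and the leave-one-out threshold rule is just one simple choice among many.

```python
from statistics import mean, stdev

# Hypothetical monthly counts of customer messages mentioning one topic.
monthly_mentions = {
    "2014-01": 20, "2014-02": 22, "2014-03": 19,
    "2014-04": 21, "2014-05": 23, "2014-06": 58,
}


def flag_deviations(series, threshold=3.0):
    """Flag periods whose count lies far from the mean of all other periods.

    A period is flagged when its distance from the mean of the remaining
    periods exceeds `threshold` times their sample standard deviation.
    """
    flagged = []
    for period, count in series.items():
        others = [c for p, c in series.items() if p != period]
        baseline, spread = mean(others), stdev(others)
        if spread > 0 and abs(count - baseline) > threshold * spread:
            flagged.append(period)
    return flagged


unusual = flag_deviations(monthly_mentions)
```

A flagged period is only a prompt for further exploration of the underlying text; it does not by itself explain why the deviation occurred.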
3. Tool for reporting statistics and deriving actionable insights
With content analytics solutions, you can define many facets (or aspects) of your data, with each facet potentially leading to valuable insights for various users.
Content Analytics brings the power of business intelligence to all enterprise information, not just structured information (which is less than 20% of the entire enterprise repository).
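A facet can be pictured as a named attribute attached to each document, with counts aggregated per value and used to drill down. The following is a minimal sketch, assuming documents that have already been annotated; the facet names (`product`, `sentiment`), the sample documents, and the helper functions are all hypothetical.

```python
from collections import Counter

# Hypothetical annotated documents; in practice the facet values would be
# produced by text-analytics annotators, not typed in by hand.
documents = [
    {"text": "Card was declined abroad", "product": "credit card", "sentiment": "negative"},
    {"text": "Mobile app is easy to use", "product": "mobile app", "sentiment": "positive"},
    {"text": "App crashes at login", "product": "mobile app", "sentiment": "negative"},
    {"text": "Great rates on my card", "product": "credit card", "sentiment": "positive"},
    {"text": "App update fixed my issue", "product": "mobile app", "sentiment": "positive"},
]


def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    return Counter(doc[facet] for doc in docs if facet in doc)


def drill_down(docs, **filters):
    """Keep only documents whose facets match every given filter."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]


by_product = facet_counts(documents, "product")
app_sentiment = facet_counts(drill_down(documents, product="mobile app"), "sentiment")
```

Different users can start from different facets (product, sentiment, time period) and narrow the same document set in whichever direction matters to them.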
Content Analytics Solutions
Text Analytics, or Natural Language Processing, was a set of linguistic, statistical,
and machine learning techniques that allowed text to be analysed and key information
to be extracted for business integration.
However, it answered only the who, what, where, and when of a subject. The why was
left to subjective assessment.
Traditional Approach – Text Analytics
Evolution of Content Analytics
Contemporary Solution – Content Analytics
• Content Analytics (Text Analytics + Mining) refers to
the text analytics process plus the ability to visually
identify and explore trends, patterns, and
statistically relevant facts found in various types of
content spread across internal and external content sources.
• Content analytics distinctively adds the why and
the how and provides a comprehensive
understanding of the world around the subject
• Identify meaning, trends, patterns, preferences, and tastes from text for better business decision making
• Understand customers on a granular level, primarily through semantic and sentiment analysis
• Extract more value from your social media community by building a richer profile of each person in the customer database
• Quickly identify trends among the customer base by filtering and giving structure to the data
• Reuse and curate content by analysing content from partner organisations and external sources that is pertinent to the target market
• Customer-centric marketing: because content analytics can determine the interests of individual customers and prospects, the most relevant content can be customized for each person and personalised propositions can be delivered
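The sentiment analysis mentioned in the benefits above can be illustrated with a very small lexicon-based scorer. This is a sketch under stated assumptions, not how any particular product works: the word lists, the sample comments, and the function names are hypothetical, and real solutions typically handle negation, context, and domain vocabulary.

```python
import re

# Tiny hypothetical sentiment lexicon, for illustration only.
POSITIVE = {"love", "great", "helpful", "fast", "easy"}
NEGATIVE = {"hate", "slow", "confusing", "rude", "broken"}


def sentiment_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "I love the new app, it is fast and easy",
    "Support was rude and the website is slow",
    "I opened an account last week",
]
labels = [label(sentiment_score(c)) for c in comments]
```

Attaching a label like this to each customer comment is one way of "giving structure to the data" so that it can be filtered and trended.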
Content Analytics complements business intelligence to provide a more detailed
and accurate understanding of market and customer needs
Key Benefits of Content Analytics
Key Challenges of Content Analytics
Beyond Volume, Variety and Velocity is the Issue of Big Data Veracity
Velocity
• 90% of the world’s data was created in the last two years
• 5 million trade events per second
Volume
• 1 trillion connected devices generate 2.5 quintillion bytes of data per day
• 12 terabytes of Tweets created daily
Veracity
• With big data there is a tendency for errors to snowball: user entry errors, redundancy, and corruption all add uncertainty and ambiguity to the quality of data
Variety
• Structured, unstructured, multimedia, text; varied content creation
• 80% of the world’s data today is unstructured
Content Analytics is used in many verticals and for various applications solving
varied business needs
Industry Solutions and Usage of Content Analytics Solutions*
• Financial Services: market intelligence, case management, compliance, risk scoring
• Healthcare and Life Sciences: scientific discovery, bio-surveillance, clinical trials
• Media and Advertising: digital asset management, content mining, contextual advertising
• Education and Govt.: security, intelligence, digital library services, e-learning
Examples of Business Problems that can be addressed
• “What features of our Banking Services are most liked/hated by our customers?”
• “What caused this recent drop in sales for Product X?”
• “Give me a media profile of Mr. X including Trends, Quotes, Roles, Contacts etc.”
• “Which regulatory causes and sentences from the past have hindered the objective of universal education?”
Note: *This is just a representative list to showcase the capabilities of content analytics and not exhaustive
Industry Overview
› Content Analytics solutions are usually evolutionary products
of Enterprise Content Management (ECM) solution providers. ECM
solutions enable the management of business information
throughout the content lifecycle, from creation to disposition.
As a technical architecture, ECM consists of a platform or a set
of applications that interoperate but that can be sold and
used separately.
› Content Analytics and ECM market will grow from $5.1 billion
in 2013 to over $9.3 billion in 2017, at a CAGR of 16% over the
period.
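As a quick consistency check on the figures above, the compound annual growth rate can be recomputed from the start and end values. The sketch assumes four compounding periods (2013 to 2017) and uses the standard CAGR formula; the function name is illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of US dollars, as quoted above.
growth = cagr(5.1, 9.3, 2017 - 2013)
growth_percent = round(growth * 100, 1)
```

Under these assumptions the result rounds to about 16%, in line with the quoted CAGR.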
› Leading providers of content analytics solutions are IBM,
Open Text, EMC, Perceptive Software, Hyland, Microsoft and
Oracle. Several newer entrants such as Xerox, Alfresco and
Newgen Software have also developed solutions that are
rated highly by industry experts and labeled as visionaries by
IT research firms such as Gartner.
• The Content Analytics market includes key players that provide purpose-built and
job-aligned offerings, including case management, composite content applications
and customer communications management. Key assessments of leading players
in the Content Analytics market are detailed below.
Key Players
IBM
Strengths
• Wide variety of content management and related capabilities, from content ingestion to archiving
• Deep analytics and business intelligence tools
Weaknesses
• IBM's greatest strength also poses its greatest challenge: the breadth of its products may make it hard for customers to understand where to start or how to extend their current offerings
Open Text
Strengths
• Open Text's relationship with SAP provides a firm foundation for expansion and has enabled it to command a strong position in markets where SAP is strong
Weaknesses
• Complicated architecture
• High pricing
• Poor after-sales support
EMC
Strengths
• Extensive content management stack that includes most ECM elements
• Customized industry solutions, specifically for the healthcare, life sciences, energy and engineering sectors
Weaknesses
• Only a limited and tactical solution in applicability
Perceptive Software
Strengths
• Strong product and solution capabilities
• Deep focus on vertical markets, with specialized solutions for the healthcare and higher education sectors
Weaknesses
• Increasing fragmentation of its product architecture and a lack of clarity about its road map
• Lack of interoperability
Hyland
Strengths
• Long and extensive experience in developing content-enabled applications
• Solution capability for Mobile and Cloud deployment
Weaknesses
• Limited global footprint, with 85% of sales coming from North America
• Limited capabilities to manage sophisticated digital asset management requirements
Major Trends in Content Analytics
Trends
• Increased focus on social media text analytics, as social media is creating huge amounts of unstructured data.
• Large-scale changes in system architecture as new data-centric models and solutions emerge. Large data will live in persistent memory, and many CPUs/clients will use a shallow hierarchy.
• Significant benefits from Content Analytics are likely to continue for at least 5-10 more years before it reaches the “Plateau of Productivity”.
Implications
The future outlook for growth in the Content Analytics space remains bright as businesses continue to search for these solutions to enhance their operational efficiency and better understand their current and prospective customers.
Analyzing Unstructured Content – Text Analytics
Answering complex natural language questions requires more than keyword evidence
This evidence suggests “Gary” is the answer, BUT the system must learn that keyword matching may be weak relative to other types of evidence.
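The weakness of keyword evidence can be shown with a simple overlap score. In this sketch only the name “Gary” comes from the slide; the question, the two passages, the stopword list, and the scoring function are made-up illustrations. The passage that shares the most words with the question is not the one that actually answers it.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "in", "of", "to", "at", "his", "this", "after", "he"}


def keywords(text):
    """Lower-case the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z0-9']+", text.lower()) if w not in STOPWORDS}


def keyword_overlap(question, passage):
    """Fraction of the question's keywords that also appear in the passage."""
    q = keywords(question)
    return len(q & keywords(passage)) / len(q)


# Illustrative texts, not taken from the source.
question = (
    "In May 1898 Portugal celebrated the 400th anniversary of "
    "this explorer's arrival in India."
)
passage_gary = (
    "In May, Gary arrived in India after he celebrated his anniversary in Portugal."
)
passage_gama = "Vasco da Gama landed at Calicut in 1498."

score_gary = keyword_overlap(question, passage_gary)
score_gama = keyword_overlap(question, passage_gama)
```

Here the “Gary” passage scores well on keyword overlap while the passage about Vasco da Gama shares no keywords at all, even though it holds the real answer; recognizing that requires other kinds of evidence, such as date arithmetic and knowing that Calicut is in India.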
Analyzing Unstructured Content – Content Analytics
CA approach leverages multiple algorithms to draw patterns and identify insights
Stronger evidence can be much harder to find and score … and the evidence is still not 100% certain.
• Search far and wide
• Explore many hypotheses
• Find and judge evidence
• Many inference algorithms
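One way to picture "many inference algorithms" working together is to let each algorithm score every hypothesis and then combine the scores into a single confidence. The sketch below assumes a weighted logistic combination; the candidate answers other than “Gary”, the scorer names, the scores, the weights, and the bias are all hypothetical, and a real system would learn the weights from training data.

```python
import math

# Hypothetical evidence scores in [0, 1] for each candidate answer,
# as might be produced by independent scoring algorithms.
candidates = {
    "Gary": {"keyword_match": 0.56, "date_match": 0.0, "type_match": 0.1},
    "Vasco da Gama": {"keyword_match": 0.0, "date_match": 0.9, "type_match": 0.9},
}

# Hypothetical weights and bias; keyword matching is deliberately weighted
# lower than the other evidence types.
WEIGHTS = {"keyword_match": 0.5, "date_match": 2.0, "type_match": 2.5}
BIAS = -2.0


def confidence(evidence):
    """Combine weighted evidence scores into a confidence between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * score for name, score in evidence.items())
    return 1.0 / (1.0 + math.exp(-z))


# Rank hypotheses from most to least confident.
ranked = sorted(candidates, key=lambda c: confidence(candidates[c]), reverse=True)
best = ranked[0]
best_confidence = confidence(candidates[best])
```

Even the top-ranked hypothesis ends up with a confidence below 1, which mirrors the point that the evidence is still not 100% certain.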