REALIZING BUSINESS VALUE FROM OPEN SOURCE DATA
AND OPEN SOURCE INTELLIGENCE
Presented by: Chris Morgan
http://bit.ly/data-vending
DATA AND ART (PRIMER)
Providing value on the potential of bad news to serve out a bag of salty potato chips
harnessing the power of open data and sentiment
Data Intelligence
Operational Lens
Intelligence is information that has been transformed to meet an operational need
Intelligence
Intelligence Cycle
No matter what methodology you use…
intelligence analysis is an iterative process.
Open Source Analysis Goals
• Provide value to the organization – turn data into intelligence using an “operational lens”
• Ensure cyclical feedback occurs during collection, processing, analysis, and consumption
• Validate that a particular network is the right source of data for the questions you need answered
Common Pitfalls
Analyzing What Instead of Why
The important thing is often not what people are saying… but why they
are saying it.
Common Pitfalls
Using the Wrong Analysis Tools
Reporting tools rarely help dig into the why. Many common tools, reports, and metrics are misleading:
– Word clouds atomize message context
– Sentiment metrics are often highly inaccurate
– Information in aggregate hides more than it reveals
Use Case
Sentiment Analysis
http://bit.ly/ikanow-and-r
Enron Sentiment Analysis
Data source
~500,000 publicly available Enron emails
Enron Sentiment Analysis
Hypothesis
Use sentiment analysis as a first-order process to prioritize and streamline the
overall analysis process
Enron Sentiment Analysis
Caveats
• Sentiment was only attributed to the sender
• Not a complete representation of an organization's email corpus
• Counteraction of uneven coverage was estimated
• Not a full analysis of the set of information (objective was to use sentiment analysis as a reduction technique)
Workflow
• Data Ingestion Process
– Extraction of entities, events, facts and some basic statistics
• Aggregation and Reduction
– Aggregation of keywords with sentiment from each email
– Average sentiment score
– Follow-on aggregation by email address of the sender over a given week (average sentiment score)
• Visualize and Analyze
– Imported into Infinit.e and R for visualization
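The aggregation step above (average sentiment per sender per week) can be sketched as follows. This is a minimal illustration in Python, not the Infinit.e and R pipeline used in the talk; the record layout, the addresses, the scores, and the function name `weekly_average_sentiment` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical per-email records: (sender, date sent, sentiment score).
# The values are invented for illustration, not taken from the Enron corpus.
emails = [
    ("alice@example.com", date(2001, 10, 1), 0.4),
    ("alice@example.com", date(2001, 10, 3), 0.2),
    ("alice@example.com", date(2001, 10, 9), -0.6),
    ("bob@example.com", date(2001, 10, 2), -0.1),
]

def weekly_average_sentiment(records):
    """Average sentiment per (sender, ISO year, ISO week)."""
    buckets = defaultdict(list)
    for sender, sent_on, score in records:
        # isocalendar() yields (ISO year, ISO week number, ISO weekday).
        iso_year, iso_week, _ = sent_on.isocalendar()
        buckets[(sender, iso_year, iso_week)].append(score)
    return {key: sum(scores) / len(scores) for key, scores in buckets.items()}

weekly = weekly_average_sentiment(emails)
for key in sorted(weekly):
    print(key, round(weekly[key], 2))
```

Keying on the ISO year as well as the week number keeps weeks from different years apart, which matters for a corpus spanning several years.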
• Horizontal Bar
– Positive sentiment = Green
– Negative sentiment = Red
• Chart on Left
– Positive sentiment = Green
– Negative sentiment = Red
• Chart on Right
– Heuristic: weeks with abrupt negative shifts indicated problems in the organization
– Positive sentiment = Blue
– Negative sentiment = Red
One email sender's weekly average sentiment across time
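The heuristic of flagging weeks with abrupt negative shifts could look roughly like the sketch below. The slides do not say how "abrupt" was defined, so the week-over-week drop threshold of 0.5, the function name `flag_negative_shifts`, and the sample series are assumptions chosen only to illustrate the idea.

```python
def flag_negative_shifts(weekly_scores, drop_threshold=0.5):
    """Return labels of weeks whose average sentiment fell by at least
    drop_threshold compared with the previous week in the series.

    weekly_scores: list of (week_label, average_sentiment), in time order.
    drop_threshold: assumed cutoff; the talk does not specify one.
    """
    flagged = []
    for (_, previous), (week, current) in zip(weekly_scores, weekly_scores[1:]):
        if previous - current >= drop_threshold:
            flagged.append(week)
    return flagged

# Hypothetical weekly averages for one sender.
series = [
    ("2001-W38", 0.30),
    ("2001-W39", 0.25),
    ("2001-W40", -0.40),
    ("2001-W41", -0.35),
    ("2001-W42", 0.10),
]

flagged = flag_negative_shifts(series)
print(flagged)
```

Only the week where the score drops sharply is flagged; a week that stays negative without a further drop is not, which matches the emphasis on shifts rather than absolute sentiment.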
Workflow
Close-up snapshot of a subset of 20 individuals' average email sentiment scores over time
Individual analysis is based on the reduction of the information by the sentiment analysis process
Workflow
Findings
• Indicators and Additional Analysis
– 801 weeks highlighted out of 11,500 weeks as important for further investigation
– Keywords found could further be used to statistically investigate the 801 weeks highlighted for manual review
– Individual evaluation of emails highlighted through a reduction process (case construction)
– Pipeline created for further analysis
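To put the reduction in proportion, the two figures above (801 highlighted weeks out of 11,500) can be turned into a share. This is simple arithmetic on the numbers from the slide; the variable names are illustrative.

```python
flagged_weeks = 801      # weeks highlighted for further investigation
total_weeks = 11_500     # weeks in the aggregated data set

share_flagged = flagged_weeks / total_weeks
share_set_aside = 1 - share_flagged

print(f"Flagged for review: {share_flagged:.1%}")
print(f"Set aside by the reduction: {share_set_aside:.1%}")
```

Roughly 7% of the weeks remain for manual review after the sentiment-based reduction.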
Lessons Learned
1. Drastically reduced the timeline necessary for case
construction
Lessons Learned
2. Multiple contexts for this type of technique
– Intelligence Analysis
– E-Discovery
– Brand Management
– Social Media Analysis
Lessons Learned
3. Only negative shifts were investigated; analysis of positive shifts could easily be applied to other questions and use cases
Lessons Learned
4. R and Infinit.e provide an interesting technology
integration for evaluating and reducing unstructured data