T H O M S O N S C I E N T I F I C
Data Visualization Tools – An Update
Presented at the 2007 PIUG Meeting, Symposium for Richard Kurt
Holly Chong-Williams
Manager, Patent Offices and Specialty Accounts
Thomson Scientific Corporate Markets
Copyright 2007 The Thomson Corporation
Agenda
• Definitions of Data Visualization
• Benefits of Data Visualization and Applications
• Converting Information to Intelligence
• Commonly available software tools
• Value-add tools provided by commercial search engines
• Text-mining & Data-mining tools
• Fun Data Visualization Sites
• Conclusions
Holly’s First Data Visualization Experience, circa 1992
• SLA and Orbit User Days Presentation – “The Way of the
Warrior”, an analysis of Japanese technology trends.
• Downloaded Japio database IPC and publication year
data using the Orbit GET command.
• Uploaded the data into Lotus 1-2-3.
• “Eyeballed” the lists to generate rankings.
• Generated graphs as “OHPs” (overhead projector slides) and
colored them using translucent color sheets.
Definitions of Data Visualization (aka scientific visualization, information visualization or visual analytics)
• “Visualization is the graphical presentation of information,
with the goal of providing the viewer with a qualitative
understanding of the information contents.”
Matthew Ward, Worcester Polytechnic Institute,
http://web.cs.wpi.edu/~matt/courses/cs563/talks/datavis.html
• “The process of finding meaningful patterns in data and
creating a visual representation.”
Richard Kurt, SLA Annual Meeting in Los Angeles, 2002
Data visualization per a four-year-old…
Data Visualization - Benefits
• Map/representation is inherently easier to understand
• “A picture is worth a thousand words”
• Provides organization to data
• Condenses data into knowledge
• Complex and large sets of information can be presented more
easily to decision makers, in an actionable form
• Large data sets can be explored quickly
• New insights can be gained
Data Visualization Applications
• IP assessments
• Competitive intelligence
• Technology assessments
• R&D planning
• Benchmarking
• Trending and Forecasting
• Profiling
Information vs. Intelligence
• Information
  • Who
    • Inventor
    • Patent Assignee
  • What
    • Abstract
    • IPC
  • When
    • Application Date
    • Publication Date
  • Where
    • Country
• Intelligence
  • How does it work?
  • Why are they doing it?
  • What does it mean?
  • How does it fit into the “big picture”?
Information vs. Intelligence
Data -> (Organize) -> Information -> (Analyze) -> Intelligence
Organize – Online Tools
• Simple tools available from commercial search services
  • Dialog RANK
  • Questel GET
  • STN ANALYZE
• Microsoft Excel
  • Create charts and summaries
Organize Using RANK
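A RANK-style frequency ranking can also be approximated offline once records have been downloaded. The following is a minimal Python sketch, not the Dialog RANK command itself: the `rank` function, the field names, and the sample records are all hypothetical and serve only to illustrate the idea of counting how often each field value occurs.

```python
from collections import Counter

def rank(records, field):
    """Count how often each value of `field` occurs, most frequent first."""
    counts = Counter()
    for record in records:
        value = record.get(field)
        if value:  # skip records where the field is missing or empty
            counts[value] += 1
    return counts.most_common()

# Hypothetical records; real ones would come from a downloaded result set.
records = [
    {"assignee": "Company A", "ipc": "C07D"},
    {"assignee": "Company B", "ipc": "A61K"},
    {"assignee": "Company A", "ipc": "A61K"},
    {"assignee": "Company A", "ipc": "C07D"},
]

for value, count in rank(records, "assignee"):
    print(f"{count:3d}  {value}")
```

The same function ranks any other field (for example `rank(records, "ipc")`), which is the step that was once done by “eyeballing” lists.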
Advantages of Online Tools
• No additional software
• Relatively easy to use
• Inexpensive
  • Usually a few cents (or less) per document RANKed
• Results can be easily imported into more sophisticated
tools
  • Microsoft Excel
  • Microsoft Access
XML
• eXtensible Markup Language
• Collect information once and reuse it in a variety of ways
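As an illustration of “collect once, reuse”, the sketch below parses a small XML download and re-emits it as CSV that a spreadsheet could open. It is a sketch under stated assumptions: the element names (`records`, `record`, `assignee`, `country`, `year`) and the sample data are hypothetical and do not reflect any vendor's actual XML schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout; a real download will use the vendor's own schema.
SAMPLE_XML = """\
<records>
  <record>
    <assignee>Company A</assignee>
    <country>US</country>
    <year>2005</year>
  </record>
  <record>
    <assignee>Company B</assignee>
    <country>JP</country>
    <year>2006</year>
  </record>
</records>
"""

FIELDS = ["assignee", "country", "year"]

def xml_to_rows(xml_text):
    """Turn each <record> element into a dict of field name -> text."""
    root = ET.fromstring(xml_text)
    return [
        {field: record.findtext(field, default="") for field in FIELDS}
        for record in root.iter("record")
    ]

def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

rows = xml_to_rows(SAMPLE_XML)
print(rows_to_csv(rows))
```

The parsed `rows` list is the single collected copy; the CSV writer is just one consumer, and other outputs could be generated from the same rows.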
XML
XML Data feeds:
• Web Page
• Excel Spreadsheet
• Portal
• Access Database
• RSS
• Word Document
• Text/Data Mining Application
Information Derived from XML Output
Alternate View of Publication Country for Top Assignees
Advantages of XML Download plus Microsoft Excel & Word
• Both Excel & Word are readily available
• Organize & Analyze
• Relatively easy to use
• Pay once for data download in XML, then analyze as
necessary at no additional cost
• Sophisticated graphing of results in Excel
• Enhanced readability in Word
Analyzing
• Data Mining
  • Process of analyzing data from computers and other large relational databases (i.e., structured data)
• Text Mining
  • Process by computer of extracting information from unstructured (i.e., natural language) text and making links to form new facts or hypotheses
• Information Visualization
  • A visual approach to data analysis to reveal insights or unexpected relationships through lists, co-occurrence matrices, maps, etc.
Stand Alone Products
• Aureka
• ClearForest
• Leximappe
• Matheo
• STN AnaVist
• Temis
• Thomson Data Analyzer
• VantagePoint
• VX Insight
Typical business questions
• What is new in this technology space?
• What is my IP position relative to my competitors?
• What new competitors are there?
• What collaborations are taking place?
• How are my competitors organized?
• Is the technology easy to penetrate?
Who is leading the field?
• See relative year trends of companies
• Spot entrants / leavers
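The year-trend view can be approximated with a simple company-by-year count. The sketch below is illustrative only: the sample data is invented, and the definitions of “entrant” (no records before a cutoff year) and “leaver” (no records from the cutoff year onward) are assumptions chosen for the example, not definitions taken from any of the tools named in this presentation.

```python
from collections import defaultdict

def year_trends(records):
    """Build a nested count table: company -> year -> number of records."""
    table = defaultdict(lambda: defaultdict(int))
    for assignee, year in records:
        table[assignee][year] += 1
    return table

def entrants_and_leavers(table, cutoff):
    """Assumed definitions: entrants first appear in or after `cutoff`;
    leavers have no records in or after `cutoff`."""
    entrants = sorted(a for a, years in table.items() if min(years) >= cutoff)
    leavers = sorted(a for a, years in table.items() if max(years) < cutoff)
    return entrants, leavers

# Hypothetical (assignee, publication year) pairs.
records = [
    ("Company A", 2003), ("Company A", 2004), ("Company A", 2005),
    ("Company B", 2003),
    ("Company C", 2005), ("Company C", 2006),
]

table = year_trends(records)
entrants, leavers = entrants_and_leavers(table, cutoff=2005)
print("Entrants:", entrants)
print("Leavers:", leavers)
```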
Who are potential targets?
• Map shows similarity of companies based on technology
• Spot companies who find new applications
• Avoid areas of high concentration
Who do they work with?
• Matrix of companies vs companies
• Spot collaborations
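A company-versus-company matrix of this kind can be built by counting how often two assignees appear on the same record. The sketch below is a minimal illustration with invented data; the function names are hypothetical, and treating co-assignment as a proxy for collaboration is an assumption of the example.

```python
from collections import Counter
from itertools import combinations

def collaboration_counts(patents):
    """Count records shared by each unordered pair of assignees."""
    pairs = Counter()
    for assignees in patents:
        # Sort so each pair is stored under one canonical key (a, b), a < b.
        for a, b in combinations(sorted(set(assignees)), 2):
            pairs[(a, b)] += 1
    return pairs

def as_matrix(pairs, companies):
    """Expand pair counts into a symmetric matrix with a zero diagonal."""
    return [
        [0 if r == c else pairs[tuple(sorted((r, c)))] for c in companies]
        for r in companies
    ]

# Hypothetical assignee lists, one per patent record.
patents = [
    ["Company A", "Company B"],
    ["Company A"],
    ["Company B", "Company A", "Company C"],
    ["Company C"],
]

pairs = collaboration_counts(patents)
companies = sorted({name for assignees in patents for name in assignees})
matrix = as_matrix(pairs, companies)

for name, row in zip(companies, matrix):
    print(name, row)
```

Non-zero off-diagonal cells mark pairs of companies that share at least one record, which is where a collaboration would show up.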
Mining the Results of a Chemical Structure Search
Structure Search in DCR
MAP DCR Results to DWPI
102 individual compounds in DCR
416 patent records in DWPI
Import Results into Analysis Tool
Top IPCs
Most Frequently Cited (Key) Patents
Most Frequently Cited (Key) Assignees
It’s Not All About Patents
• Important information can be mined from
  • Other technical databases
    • Science Citation Index
    • Inspec
    • Biosis
    • Embase
  • News and business publications
    • Gale Group PROMT
    • Dialog NEWSROOM
Conclusions
• Simple data visualization tools already exist on your
desktop.
• You can decide how hard you want to work to get
the analysis you want or need.
• Keep it simple – in some ways it is still 1992 and there is
no “poof”.
Mmmm, data…
Special thanks to Bob Stewart (TS Content
Specialist) for his help and advice
For Richard…