Visualization and text mining of patent and non-patent data

27
Introduction Applications Patent Analytics Medline Analytics Conclusions and future development Visualization and text mining of patent and non-patent data Anton Heijs Treparel Information Solutions Delft, The Netherlands http://www.treparel.com/ ICIC conference , Nice , France , 2008 Visualization and text mining Treparel

Transcript of Visualization and text mining of patent and non-patent data

Page 1: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Visualization and text mining of patentand non-patent data

Anton Heijs

Treparel Information SolutionsDelft, The Netherlands

http://www.treparel.com/

ICIC conference , Nice , France , 2008

Visualization and text mining Treparel

Page 2: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Outline

Introduction

Applications on patent and non-patent literature

Patent mining and visualization

Mining and visualization Medline

Conclusions and future development

Visualization and text mining Treparel

Page 3: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Introduction

Mining + visualization = visual analytics

I More data - less insightI Overview needed

Text mining approaches controlling over/under-fit

Rule based : difficult to control under fitNatural language processing : difficult to control over fitMachine learning : controls over-fit and under-fit

Visualization and text mining Treparel

Page 4: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Document clustering : finding patterns unsupervised

Unsupervised text mining : clustering

I Input is a vector representation of all documentsI Separate the data in its natural groups based on a

similarity metricI Output is a similarity matrix

Visualization and text mining Treparel

Page 5: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Document classification : finding patterns supervised

Supervised text mining : classification

I Input is a vector representation of all documentsI Determine a hyper plane which separates all the data

I Binary classification : classify to 1 classI Multi class classification : classify to multiple classesI Compound classification : combine multiple binary

classifiersI Output is a matrix with classification scores

Visualization and text mining Treparel

Page 6: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Text analytics

Text analytics is text mining and visualization

I Combined use of mining and visualization for analysisI Uses text mining to determine trends in the dataI Use visualization to make complex data understandable

Visualization and text mining Treparel

Page 7: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Applications on patent and non-patent literature

Text mining and visualization steps

I management and preprocessing dataI text mining (filtering the data in the pipeline)I mapping filtered data to geometryI rendering and interaction

Visualization and text mining Treparel

Page 8: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Patent analytics applications

I Patent search using classification and clustering algorithmsI Patent landscaping, spotting new opportunitiesI Patent ranking, utilization optimization and valuation

Visualization and text mining Treparel

Page 9: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Search by querie by class

Visualization and text mining Treparel

Page 10: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Tree map visualization of the patent classification

Visualization and text mining Treparel

Page 11: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Clustering of patents of 7 classes in Optical Recording

Visualization and text mining Treparel

Page 12: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Clustering of large patent sets

Visualization and text mining Treparel

Page 13: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Building a classifier

Visualization and text mining Treparel

Page 14: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Testing a classifier

Visualization and text mining Treparel

Page 15: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Visualization of the classifier performance

Visualization and text mining Treparel

Page 16: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Patent analytics application combining mining and visualization

Visualization and text mining Treparel

Page 17: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Correlation between patents

Visualization and text mining Treparel

Page 18: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Patents over time

Visualization and text mining Treparel

Page 19: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Patents visualization hierarchically, supervised, unsupervised andover time

Visualization and text mining Treparel

Page 20: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Text mining and visualizations applications on Medline

I Search for facts and relationships (using classification andclustering)

I Disambiguation (using classification)I Named entity recognition (using classification)I Concept and relationships discoveryI Knowledge discovery by integrating text mining and

visualization withI Data mining : mining table data from experimentsI Image mining : micro array analysisI Graph mining : path way analysis

Visualization and text mining Treparel

Page 21: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Mining and visualization Medline

Visualization and text mining Treparel

Page 22: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Search Medline by querie

Visualization and text mining Treparel

Page 23: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Classification of Medline documents to multiple topics

Visualization and text mining Treparel

Page 24: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Visualization of trends in Medline documents over time

Visualization and text mining Treparel

Page 25: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Conclusions

I Combination of text mining and visualization algorithmsenables more application solutions

I Multiple application solutions are a combination of a smallset of text mining algorithms

I Multiple application solutions require multiple coupled viewvisualizations to provide complete

Visualization and text mining Treparel

Page 26: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Future development

I Patent analytics will include patent ranking and valuationI Text mining will integrate with data mining , for instance for

patent portfolio managementI Stand alone applications and applications with work flow

support will advance

Visualization and text mining Treparel

Page 27: Visualization and text mining of patent and non-patent data

Introduction Applications Patent Analytics Medline Analytics Conclusions and future development

Trends Patterns Relationships

http://www.treparel.com/

Enabling you to learn more and see more

Visualization and text mining Treparel