II-SDV 2014 Analysing Patent Full Text – Comparison against analysis of abstract and bibliographic...


Transcript of II-SDV 2014 Analysing Patent Full Text – Comparison against analysis of abstract and bibliographic...

Page 1: II-SDV 2014 Analysing Patent Full Text – Comparison against analysis of abstract and bibliographic data, and lessons learned (Richard Gynn - LexisNexis, UK)

Analysing Patent Full Text

Richard Gynn - LexisNexis

Analyzing Patent Full-Text: A Study, April 7, 2014

Page 2:

Agenda

1) Full Text Availability

2) Analyzing full text

- Discussion/considerations

- Big picture analysis

- Detailed analysis - Study

3) Conclusions

Full Text content available from vendors has evolved to a point where most of the top publishing authorities are readily available.

Page 3:

Analysing Patent Full Text. Availability

Page 4:

Full Text Availability – Top 10 Publishing Authorities (available from most big vendors)

China, Korea, Japan are not the big deal they used to be! Text can be available to analyse in English.

Page 5:

Full Text Availability – Authorities available from at least one vendor

Page 6:

Full Text Availability by volume - > 100k publications

[Bar chart: full-text publications per authority, in millions (axis 0 to 25). Authorities shown: JP, US, CN, DE, EP, KR, GB, FR, WO, CA, AU, TW, SU, ES, AT, SE, IT, RU, CH, NL, BE, FI, BR, DK, IN, NO, PL, IL, DD, ZA, MX, HU, PT, CS, AR, IE, NZ, CZ, GR]

Page 7:

Full Text Availability by volume - > 100k publications

[Bar chart repeated from the previous slide, with annotation]

31 of these 39 are currently available from vendors. They account for the vast majority of total volume.

Page 8:

Full Text Availability by volume - < 100k publications

[Bar chart: full-text publications per authority (axis 0 to 100,000). Authorities shown: HK, YU, RO, SG, TR, MY, LU, BG, PH, UA, TH, CL, EA, ID, HR, SK, CO, SI, VN, PE, UY, OA, EG, IS, EC]

Page 9:

Full Text Availability by volume - < 100k publications

[Bar chart repeated from the previous slide, with annotation]

Much smaller amounts currently available from vendors: ~300,000. If all were to become available, they would add about 1.5% to the full text that is currently available, e.g. equivalent to Spain or Taiwan.

Page 10:

Full Text Availability by volume - < 10k publications

[Bar chart: full-text publications per authority (axis 0 to 10,000). Authorities shown: MA, AP, VE, EE, LV, GT, CU, LT, MD, CR, PA, CY, DO, MC, ZM, ZW, SV, SM, JO, PY, GE, DZ, KE, MT, HN, MW, NI, ME, TJ, GC, BO, MN, BA, KZ, BY, TT]

Page 11:

Full Text Availability by volume - < 10k publications

[Bar chart repeated from the previous slide, with annotation]

One currently available from vendors. In total these would add about 0.1% to the full text that is currently available.

Page 12:

Full Text Availability

Bringing You The World

• Are we nearly there yet?
• There’s a lot of full text available to make use of
• Most vendors have significant volumes
• Rapidly diminishing returns for each authority added
• We are already in a good place
• In terms of % availability at least

Page 13:

Analysing Patent Full Text. Discussion/considerations

Page 14:

Full Text – What Is It?

• Everything of course?!
― …will concentrate on:

Page 15:

Considerations

There’s clearly a lot out there, so why don’t we see so much analysis of patent full text?

Page 16:

Considerations - Language

• Can only compare like for like in same language
  …non-Latin character issues too
• Noise – Patent full-text likes to state things like
  …the complete opposite of what it’s about!

How I might introduce myself… if I were a patent!

나는사람들이밥, 앤드류, 데이브앨런같은이름이, 이름이. 나는밥, 앤드류, 데이브나앨런아니에요. 내이름은리처드입니다

I have a name, people have names like Bob, Andrew, Dave and Alan. I’m not Bob, Andrew, Dave or Alan. My name is Richard

私は人々がボブ、アンドリュー、デイブとアランのような名前を持っている、名前を持っています。私はボブ、アンドリュー、デイブかアランないよ。私の名前はリチャードです

Page 17:

Considerations

Other Considerations:

• Massive amounts of data
  – Time?
  – How to deal with it?
• Will it contain anything useful? Does the benefit outweigh the effort?
• Tools
  – Big picture?
  – Details?

Page 18:

Big Picture - Landscape Analysis

Big picture, topographic mapping (Discussion)

Here more full text could provide:
• Broader country analysis (often full-text not available)
• More consistency across authorities – e.g. more claims
― Compare like for like, e.g. not claims, title & abstract against title

• Full text more useful for details
• Themes/commonalities easier to find using claims, title, abstract
• Whilst useful, vast majority of landscape analysis done elsewhere, …i.e. details rather than big picture

Page 19:

Analysing Patent Full Text. Study

Page 20:

The Details - Study

Detailed analysis – looking for what?
• New/emerging, different
• Competitive/market comparisons
• Strength, weakness, opportunity, threat

What can I find using the full text that I couldn’t using title, abstract and bibliography?

Page 21:

The Details - The Technology

Terahertz analysis, e.g. imaging, spectroscopy?

Terahertz radiation - between infra-red and microwave

Page 22:

The Details - The Search

• Broad Strategy
― Analysis IPCs + Terahertz Radiation Synonyms
― Keyword Terahertz Imaging & Spectroscopy

5,955 documents / 3,365 families

Page 23:

Study - PatentOptimizer

PatentOptimizer™ Analysis of EP, PCT & US results
• English Translations

Analysis Details:
• Small/emerging areas of 6-7 families
• Look at terms & phrases, parts, claim elements (all numbers represent families)

Page 24:

PatentOptimizer – Terms & Phrases

Diagnosis - General

Page 25:

PatentOptimizer – Terms & Phrases

Not found in Title, Abstract (or claims) – All from Spectral Image Inc
Learned – Something seemingly unique to them

SAME DOCUMENTS

Page 26:

PatentOptimizer – Terms & Phrases

Not found in Title, Abstract (or claims) – All monitoring vitamin K concentration in blood
Learned – A more recent (emerging?) use

Diagnosis - General

Page 27:

PatentOptimizer – Parts

Remote monitoring, e.g. of Bluetooth® headset user
Learned – Interesting, but not massively relevant result, would like to investigate applications further

Diagnosis - general

Page 28:

PatentOptimizer – Claim Elements

Looking for infiltration or extravasation during intravenous infusion
Learned – New, possibly interesting area, seemingly dominated by one organisation

Diagnosis – general
A61M – introducing remedies

Page 29:

Study - VantagePoint

Vantage Point Analysis of TotalPatent full text results
• English Translations

Analysis Details:
• Data Statistics
• Terms uniquely appearing in full text
• Highly occurring terms used in small numbers of documents
• Investigate terms unique to 2013 priority onward
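The idea behind "terms uniquely appearing in full text" in small numbers of families can be sketched in a few lines of Python. This is only an illustration of the idea, not how VantagePoint or PatentOptimizer work. The record fields (`family_id`, `title`, `abstract`, `claims`, `description`), the crude tokenizer, the family-count thresholds and the toy records are all assumptions made for the example.

```python
import re
from collections import defaultdict

def tokens(text):
    """Lowercased word tokens; a deliberately crude stand-in for real term extraction."""
    return set(re.findall(r"[a-z][a-z\-]+", text.lower()))

def full_text_only_terms(records, min_families=1, max_families=7):
    """Map each term seen in a description, but in no title, abstract or claims,
    to the set of family ids using it, keeping only small groupings."""
    front_terms = set()                    # vocabulary of title + abstract + claims
    families_by_term = defaultdict(set)    # description term -> family ids
    for rec in records:
        front_terms |= tokens(rec["title"]) | tokens(rec["abstract"]) | tokens(rec["claims"])
        for term in tokens(rec["description"]):
            families_by_term[term].add(rec["family_id"])
    return {
        term: fams
        for term, fams in families_by_term.items()
        if term not in front_terms and min_families <= len(fams) <= max_families
    }

# Toy records, invented purely for illustration (hypothetical field names).
records = [
    {"family_id": "F1", "title": "Terahertz imaging device",
     "abstract": "A device for terahertz imaging.", "claims": "An imaging device.",
     "description": "The device may monitor tattoo pigments using terahertz radiation."},
    {"family_id": "F2", "title": "Terahertz spectroscopy",
     "abstract": "A spectroscopy method.", "claims": "A method.",
     "description": "Spectroscopy of tattoo ink."},
]
hits = full_text_only_terms(records)
print(sorted(hits["tattoo"]))  # ['F1', 'F2']
```

A real run would also need stop-word removal and phrase extraction; without them, common words such as "the" show up as full-text-only terms.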

Page 30:

Vantage Point - Statistics

A very low percentage of the terms and words available for analysis are actually in the title and abstract

Title & Abstract
• 42,614 words & phrases
• 16,251 words

Claims
• ~132k words and phrases not in Title or Abstract
• ~44k words in Title or Abstract

Full-text
• ~1.3M unique words & phrases
• ~650k unique words
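As a rough check on the "very low percent" statement, the counts on this slide can be turned into percentages. The sketch below treats the rounded "~1.3M" and "~650k" figures as exact and assumes that the title and abstract vocabulary is contained in the full-text vocabulary. Both are assumptions, so the results are approximate.

```python
# Counts from the slide; the full-text figures are rounded ("~1.3M", "~650k").
title_abstract_terms = 42_614     # words & phrases in title and abstract
title_abstract_words = 16_251     # single words in title and abstract
full_text_terms = 1_300_000       # unique words & phrases in full text (approx.)
full_text_words = 650_000         # unique words in full text (approx.)

def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole

terms_pct = share(title_abstract_terms, full_text_terms)
words_pct = share(title_abstract_words, full_text_words)
print(f"words & phrases: {terms_pct:.1f}%")  # words & phrases: 3.3%
print(f"words: {words_pct:.1f}%")            # words: 2.5%
```

On these figures, only around 3% of the vocabulary available in the full text is visible to an analysis limited to title and abstract.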

Page 31:

Vantage Point – Terms only appearing in full text 2013 onwards

Page 32:

Vantage Point – Terms only appearing in full text 2013 onwards

Detection of tetracycline drug – concern in resistance to antibiotics
Learned – New area (clearer language in full-text)

optical investigation

Page 33:

Vantage Point – Terms only appearing in full text 2013 onwards

Looking for gas hydrates (fracking)
Learned – New area (uncovered by more consistent repetition in full text)

general investigation, sampling

Page 34:

Analysing Patent Full Text. Conclusions

Page 35:

Findings

• Full text useful
• Claims less so (in this case)

Most words and phrases in the “full text” did not appear in Abstract & Title
• Text mined wasn’t necessarily applications, but pointed towards them
• More consistent repetition in full text

Helped mainly find new/niche applications
• Probably wouldn’t have found them other ways

Interesting companies & technologies to look at further

Page 36:

Conclusions

Conclusions (noise and huge amounts of info):
• Background did not really come in as an issue
• Used English translations to avoid language issues
• Most noise was from search results
• My judgement – about 50% proved somewhat interesting upon further investigation
• Can this be automated/put into a process?
• 4/5+ family groupings seems to be about the sweet spot
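One way the "can this be automated?" question might be approached is sketched below: group terms that occur in exactly the same set of families (compare the "SAME DOCUMENTS" observation in the study) and keep the groupings with at least four families. The input mapping, the function name `candidate_groupings` and the default threshold of 4 are assumptions for illustration, not a process described in the talk.

```python
from collections import defaultdict

def candidate_groupings(families_by_term, min_families=4):
    """Group terms that occur in exactly the same set of families and keep
    groupings covering at least `min_families` families."""
    grouped = defaultdict(list)            # frozenset of family ids -> terms
    for term, fams in families_by_term.items():
        grouped[frozenset(fams)].append(term)
    keep = [
        (sorted(fams), sorted(terms))
        for fams, terms in grouped.items()
        if len(fams) >= min_families
    ]
    keep.sort(key=lambda g: (len(g[0]), g[0]))   # smallest groupings first
    return keep

# Hypothetical input: term -> families in which it appears only in the full text.
example = {
    "tattoo": {"F1", "F2", "F3", "F4"},
    "pigment": {"F1", "F2", "F3", "F4"},
    "laser": {"F1", "F2"},
}
groupings = candidate_groupings(example)
print(groupings)  # [(['F1', 'F2', 'F3', 'F4'], ['pigment', 'tattoo'])]
```

Each surviving grouping would still need the manual judgement described above, since only about half proved interesting on further investigation.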

Page 37:

What More?

Further this:
• Life Sciences
• Define processes
  Dedicated machine?
• Detailed full-text analysis
  Study analysis of parts
• Sellers, inventors, manufacturers etc.

Easier than expected
More possible & better timescales

Page 38:

Questions

Page 39:

Analysing Patent Full Text. Study – Additional Examples

Page 40:

PatentOptimizer – Terms & Phrases

2 of 6 have tattoo in Abstract OR Title (same if include claims)
Learned – THz radiation can be used for tattoo removal

Diagnosis, surgery - General

Page 41:

PatentOptimizer – Terms & Phrases

Not found in Abstract & Title (One claimed - Optical Diagnostics)

Determining microorganism presence/kind

Page 42:

PatentOptimizer – Claim Elements

SAME DOCUMENTS

Identifying/determining antimicrobial resistance of Burkholderia cepacia
Learned – Smaller more niche areas?

Page 43:

PatentOptimizer – Terms & Phrases

Not found in Title, Abstract (or claims) – All
Some detectors, some looking for heavy metal contamination
Learned – Some areas to investigate further?

Page 44:

PatentOptimizer – Claim Elements

Glucose Monitoring – Far-IR (5/7 have in Abstract & Title)
Learned – Not much more than from Title & Abstract

Blood measurement