Language Recognition… Searching with Precision

Santa Clara, CA
October 31, 2001

Julian Henkin
Vice President, Worldwide Customer Services
LexiQuest, Inc.

Booth # 523

Topics for Discussion

Critical Nature of Search

Importance of Linguistics

Language Recognition

Case Studies

Critical Nature of Search

“At least one-third of your visitors are going to use the search function as soon as they enter your site.”

- Improving Your Site’s Search, The Information Standard, August 11, 2000

“On average, professional users spend 11 hours per week looking for information. 71% said they could not find what they were looking for.”

- “Information Management Software,” Lazard Freres & Co. LLC, February 2001

“Ultimately, the return on investment (ROI) of corporate information systems cannot be solely derived from the cost of building populating and maintaining these systems. True ROI also reflects the ability of all classes of users to effectively use the information.”

- “Looking for a Lifesaver?”, KM Magazine, August 1999

Challenges with Today’s Search

Traditional and advanced methods (keyword, Boolean searches, statistical and probability algorithms, concept agents, neural networks and pattern recognition) are limited in their ability to retrieve accurate results:

Not intuitive for the typical user, so the full breadth of capability is rarely utilized.

Do not provide any level of “understanding” of the text or of the concepts represented by the queries.

Search is based solely or largely on the comparison of the character strings in both queries and text.

Results often include a lot of “noise” (irrelevant results) and “silence” (accurate results are not found).

What if you don’t know what you are looking for?
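In information-retrieval terms, “noise” and “silence” are the complements of precision and recall. A minimal sketch, assuming hypothetical document IDs and a hand-labeled set of relevant documents (neither comes from the talk):

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise, silence) as fractions between 0 and 1.

    noise   = share of retrieved documents that are irrelevant (1 - precision)
    silence = share of relevant documents that were not retrieved (1 - recall)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    noise = 1 - len(hits) / len(retrieved) if retrieved else 0.0
    silence = 1 - len(hits) / len(relevant) if relevant else 0.0
    return noise, silence


# Hypothetical example: 4 documents returned, 5 documents actually relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d2", "d5", "d6", "d7"]
noise, silence = noise_and_silence(retrieved, relevant)
print(f"noise={noise:.2f} silence={silence:.2f}")
```

A string-matching engine can score badly on both at once: it returns documents that merely share characters with the query and misses documents that express the same idea in different words.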

Importance of Linguistics

Linguistic-based systems are knowledge-sensitive: the more information there is in their “dictionaries”, the better the quality:

Natural Language interface is very intuitive for users, lets the system do the work

Up to a 400% improvement in performance over traditional search engines (greater relevance and precision)

Can deliver multilingual and cross-lingual access

How Does Language Recognition Work?

CONCEPTS: Organizes concepts regardless of their language, e.g., Table (Fr), Table (Eng), Mesa (Sp), Tavola (It)

SEMANTIC: Understands the meanings of words, e.g., book = to register for a future activity vs. book = set of bound sheets of paper

SYNTAX: Understands a sentence’s or phrase’s structure and the “roles” of words, i.e., subjects, verbs, objects; “to book” vs. “a book”

MORPHOLOGY: Word structure. Recognizes words (simple and compound), e.g., “to buy”, “bought”

The Ladder of Language
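The four rungs can be illustrated with a toy sketch built only from the slide's own examples. The dictionaries, concept ID, and function names below are hypothetical stand-ins, not LexiQuest's implementation; a real linguistic engine would rely on much larger lexical resources and a proper parser.

```python
# Toy dictionaries for illustration only.
LEMMAS = {"bought": "buy", "buys": "buy", "buying": "buy"}
CONCEPTS = {  # surface word -> language-independent concept ID (hypothetical)
    "table": "CONCEPT_TABLE",   # English and French
    "mesa": "CONCEPT_TABLE",    # Spanish
    "tavola": "CONCEPT_TABLE",  # Italian
}


def lemmatize(word):
    """Morphology: map an inflected form to its base form."""
    word = word.lower()
    return LEMMAS.get(word, word)


def book_sense(tokens, i):
    """Syntax and semantics: pick a sense for 'book' at position i
    from the preceding word ('to book' vs. 'a book')."""
    prev = tokens[i - 1].lower() if i > 0 else ""
    if prev == "to":
        return "verb: register for a future activity"
    if prev in {"a", "the"}:
        return "noun: set of bound sheets of paper"
    return "unknown"


def concept_of(word):
    """Concepts: map words from different languages to one concept ID."""
    return CONCEPTS.get(word.lower())


print(lemmatize("bought"))
print(book_sense("I want to book a flight".split(), 3))
print(concept_of("Mesa") == concept_of("tavola"))
```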

Five KM Activities

1. Personalization (Sharing)

2. Codification (Capture, Structured Storage)

3. Discovery (Search, Retrieval)

4. Creation, Innovation

5. Capture, Monitor

LexiQuest Mine

LexiQuest Categorize

LexiQuest Guide

LexiQuest Respond

“Knowledge Management is the collection of processes that govern the creation, dissemination, and utilization of knowledge.”

“Knowledge is one, if not THE, principal factor that makes personal, organizational, and societal intelligent behavior possible.”

“Organizations that have adopted this position (Chief Knowledge Officer) include Hoffman-LaRoche, GE Lighting, Xerox PARC, and several consultancies, including Ernst & Young, Gemini, and McKinsey”

Suite of Capabilities

Enterprise Document Databases, Web sites or Repositories

Domain 1: Limited amount of content

Domain 2: Limited amount of content

Domain 3: Significant amount and depth of content

Users who browse via a directory structure/taxonomy. Many Search Engines now leverage a taxonomy: improved accuracy

Users who know what they are looking for and prefer using a search engine.

Users who collectively ask the same narrow set of questions over and over again

Users who don’t know what they are looking for and need concepts illuminated. (Research)

LexiQuest Mine

LexiQuest Respond

LexiQuest Categorize

LexiQuest Guide

User Experience

“Who are the main ISP’s in the Far East?”

Linguistic Analysis

Accurate Results: Taiwanese Access Service Provider

Mine: A Research Tool

Electronic Commerce NEAR Fraud

Mine’s Native Search

89 Documents

Guide’s Linguistic Expansion

“consumers’ fraud protection online” – 21 documents

“Swindle” returns this relevant document
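The NEAR operator in the query above is a proximity search: both terms must appear close together in a document. The slide does not say how Mine defines "near", so the sketch below assumes a simple token window; the function name and the default window size are hypothetical.

```python
import re


def near(text, term_a, term_b, window=10):
    """Return True if term_a and term_b start within `window` tokens of each other.

    Terms may be multi-word phrases; matching is case-insensitive on whole tokens.
    """
    tokens = re.findall(r"\w+", text.lower())
    a, b = term_a.lower().split(), term_b.lower().split()

    def positions(phrase):
        # Start indexes at which the phrase occurs as consecutive tokens.
        n = len(phrase)
        return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]

    return any(abs(i - j) <= window for i in positions(a) for j in positions(b))


doc = "Electronic commerce sites are a frequent target of fraud."
print(near(doc, "electronic commerce", "fraud"))
print(near(doc, "electronic commerce", "fraud", window=3))
```

A proximity match of this kind still compares character strings, which is why a document that only says "swindle" is found through linguistic expansion rather than through NEAR.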

Pharmaceutical Example

“Cipro” expands to ciprofloxacin hydrochloride

Pharmaceutical Example

“What antibiotic treats anthrax”

Antibiotic expansions include ciprofloxacin, Cipro, ciprofloxacin hydrochloride
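Both examples amount to dictionary-driven query expansion. A minimal sketch, assuming a hypothetical expansion table that contains only the terms shown on these two slides (how LexiQuest actually stores or applies expansions is not described in the talk):

```python
# Hypothetical expansion dictionary built from the two slide examples.
EXPANSIONS = {
    "cipro": ["ciprofloxacin hydrochloride"],
    "antibiotic": ["ciprofloxacin", "Cipro", "ciprofloxacin hydrochloride"],
}


def expand_query(query):
    """Return the query terms plus their dictionary expansions, without duplicates."""
    terms = []
    for word in query.lower().replace("?", "").split():
        for candidate in [word] + EXPANSIONS.get(word, []):
            if candidate.lower() not in (t.lower() for t in terms):
                terms.append(candidate)
    return terms


print(expand_query("What antibiotic treats anthrax"))
print(expand_query("Cipro"))
```

The expanded term list can then be handed to an ordinary search engine, so documents that mention only the drug name are still retrieved for a question phrased in terms of "antibiotic".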

Quantitative Results

400% more Accurate than Current Solutions

[Chart: % of correct answers of all answers retrieved (0% to 60%) vs. number of answers retrieved (5, 10, 15, 20, 30); series: Custom LexiQuest Guide, Search Engine]

Ensures all relevant information is retrieved

Reduces “noise” from irrelevant results
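The chart's measure, "% of correct answers of all answers retrieved" plotted against the number of answers retrieved, is what is usually called precision at a cutoff. A short sketch of that calculation follows; the relevance judgments are made up for illustration and are not the data behind the chart.

```python
def precision_at_k(ranked_relevance, k):
    """Share of the first k retrieved answers that are correct.

    ranked_relevance: list of booleans, one per retrieved answer, in rank order.
    """
    top = ranked_relevance[:k]
    return sum(top) / len(top) if top else 0.0


# Made-up judgments for one query (True = correct answer), in rank order.
judged = [True, False, True, True, False, False, True, False, False, False]
for k in (5, 10):
    print(k, f"{precision_at_k(judged, k):.0%}")
```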

Julian Henkin
Vice President, Worldwide Customer Services
641 Lexington Ave, 30th Floor
New York, NY 10022
212-752-2750
[email protected]

Booth # 523

www.lexiquest.com