IBM Content Analytics Helping to Drive Down Cost in Pharmaceutical R&D Session Number 1413A John Kamins, CEO Veteris Technologies Keyur Dalal, Solution Architect IBM jStart Emerging Technologies
  • date post

    23-Jan-2016

IBM Content Analytics Helping to Drive Down Cost in Pharmaceutical R&D

Session Number 1413A

John Kamins, CEO Veteris Technologies

Keyur Dalal, Solution Architect IBM jStart Emerging Technologies


Topics to Address

• Today’s Life Sciences Industry Pain Points

• Veteris Technology Addressing Industry Needs

• Why IBM Technology/Expertise?

• Veteris Approach for Success

– Research Profile Automation Proof Of Concept (POC)

• IBM jStart Methodology

• Veteris POC Goals

• Solution Architecture

• POC Measurements Goals and Results

• POC Lessons Learned

• Veteris Next Steps



Life Science Industry Need

Veteris Technologies was founded to address a significant unmet need, in the Life Science industry and others, for data-driven decision-making through real-time access to business and technical knowledge.



Interface between groups is HARD due to:
• Constantly growing and changing supplier technology portfolios
• Organizational changes in pharmaceutical research

Addressing Industry Pain Points

Researchers

Supplier Sales & Marketing

Pharma, Biotech, Academic, & Government Procurement

Veteris’ web-based software turns unstructured market data into predictive tools that help reveal relationships between the people, companies, and technologies that use, buy, and sell Life Science research products, allowing for better purchasing decisions.



Veteris Technologies Solution: Predictive Software for Vendor Consolidation

Researcher Profile

Consumable Technology Product Profile

Optimal Product Sourcing

Overlap Analysis of Researcher Profile with Vendor Product Profile


Veteris Technologies Solution: Predictive Software for Vendor Consolidation

1,000s of Researcher Workflow Profiles

1,000s of Research Products from 1,000s of Vendors

Optimal Vendor Sourcing

Overlap Analysis of Researcher Profile with Vendor Product Portfolios

Vendor A, Vendor B, Vendor C

Predict and leverage volume through vendors with best product match to researcher needs
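The overlap analysis on this slide can be illustrated with a small sketch. The slides do not say how Veteris scores overlap, so the Jaccard measure, the function names, and the sample term sets below are all assumptions chosen for illustration.

```python
def overlap_score(profile_terms, product_terms):
    """Jaccard overlap between two term sets (0.0 when both are empty)."""
    a, b = set(profile_terms), set(product_terms)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_vendors(profile_terms, vendor_portfolios):
    """Return (vendor, score) pairs, best match first; ties break by name."""
    scores = {
        vendor: overlap_score(profile_terms, terms)
        for vendor, terms in vendor_portfolios.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical data: the term sets are invented for this example.
researcher = {"gel electrophoresis", "polyclonal antibody", "biosensor"}
portfolios = {
    "Vendor A": {"gel electrophoresis", "polyclonal antibody", "microchip"},
    "Vendor B": {"biosensor"},
    "Vendor C": {"mass spectrometry"},
}

ranking = rank_vendors(researcher, portfolios)
for vendor, score in ranking:
    print(f"{vendor}: {score:.2f}")
```

A set-based score keeps the example short; a production system would more likely weight terms by tier or purchase volume.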


Veteris Technologies Solution: Maintain Researcher Profiles

• Profile Type

• Job Role
• Science Area
• Applications
• Techniques
• Stage
• Therapeutic Area
• Target Class
• Drug Class
• Bioprocessing
• Species
• Cell Type

Key Profile Terms

• Discipline

• Translational/Clinical Research
• Drug Discovery
• Drug Development
• Diagnostics
• Public Health
• Testing
• Forensics/Human ID
• Manufacturing, Production & Process Development
• QA/QC

Domain Specific Descriptors

Tier 1, Tier 2, Tier 3


Veteris Technologies Solution: Automated Discovery & Validation of Researcher Profiles

Dr. Jane Researcher
Auburn University, Department of Sciences
Tel: (333) 222-1111 [email protected]

• Biosensor, Polyclonal antibody, Molecular biology, …
• Respiratory disease, Microchip, Gel electrophoresis, …
• Veterinary medicine, Pathobiology, Infectious disease, …

Discovery, Verification, Validation


Veteris Technologies Solution: Validation of Researcher Profiles (Multi-Source, Iterative Process)

• PubMed: profile data from multiple papers
• Institution/Lab Websites: profile data & multiple journal references
• Social Media: profile data from public profiles
• Client Documents: profile data from e-mails, tech notes, etc.

Validate I.D. through overlap of profile data from multiple sources
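One way to picture "validate I.D. through overlap of profile data from multiple sources" is a simple corroboration count. This is a minimal sketch under assumptions: the thresholds (`min_shared`, `min_sources`), the function names, and the per-source term sets are hypothetical, and the slides do not describe the actual validation rule.

```python
def corroborating_sources(base_terms, candidates_by_source, min_shared=2):
    """Sources whose candidate record shares at least `min_shared` terms
    (case-insensitive) with the base profile."""
    base = {term.lower() for term in base_terms}
    return sorted(
        source
        for source, terms in candidates_by_source.items()
        if len(base & {term.lower() for term in terms}) >= min_shared
    )


def is_validated(base_terms, candidates_by_source, min_sources=2, min_shared=2):
    """Treat an identity as validated when enough independent sources agree."""
    agreeing = corroborating_sources(base_terms, candidates_by_source, min_shared)
    return len(agreeing) >= min_sources


# Hypothetical example built from the sample terms on the earlier profile slide.
base_profile = ["Biosensor", "Polyclonal antibody", "Molecular biology", "Infectious disease"]
candidates = {
    "PubMed": ["biosensor", "polyclonal antibody", "gel electrophoresis"],
    "Institution website": ["Molecular biology", "Infectious disease", "Pathobiology"],
    "Social media": ["Microchip"],
}

agreeing = corroborating_sources(base_profile, candidates)
validated = is_validated(base_profile, candidates)
print(agreeing, validated)
```

Requiring agreement from more than one source is what makes the process iterative: each newly corroborated source can contribute terms for the next pass.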


Veteris Technologies Solution: IBM Content Analytics Technology

• PubMed: profile data from multiple papers
• Institution/Lab Websites: profile data & multiple journal references
• Social Media: profile data from public profiles
• Client Documents: profile data from e-mails, tech notes, etc.

Validate I.D. through overlap of profile data from multiple sources

Content Analytics


Veteris Technologies Solution: Why IBM?

• IBM has a 50+ year History in Text Analysis and Discovery
• World Class Natural Language Processing (NLP) Technology
• Leading Edge NLP based Solutions (Watson)
• IBM jStart Text Analytics Expertise


Research Profile Automation POC: IBM jStart Engagement Process

Solution Drivers & Boundaries
• Clear understanding of business problem to be solved
• Business and technical management commitment
• Funding in place
• Right skills identified and committed to project
• Decision making context

Requirements & Solution Scope
• Solution definition
• Small team
• Define scope
• Map business needs and technology
• Deliverables: Use cases, Preliminary design, Tentative schedule, Initial sizing

Detailed Design
• Detailed schedule
• Finalize scope
• Final technology selections
• Deliverables: Design documents, Project schedule
• Funding approved for pilot implementation

Iterative Development
• Early prototyping
• Regular code drops
• Testing throughout cycle
• Constant feedback from users
• Modifications via change request

Deployment & Skills Transfer
• Solution deployment
• Customer self-sufficiency
• Reusable assets
• Other business areas or technology

Iterative Development, Continuous Testing

Constant feedback on Business & Technology


Research Profile Automation POC: Purpose, Goals, and Structure

• Purpose

– Automate the creation and maintenance of researcher profiles from unstructured public data sources, such as PubMed and university department pages.

• Key Goals

– Track accuracy, completeness, and currency of crawled data for 1600 researchers

– Discover additional researchers, not currently tracked by Veteris

– Document strategy for expanding the list of data sources

– Evaluate feasibility of solution for millions of researchers

• Structure

– The pilot will be created and deployed at IBM and, as such, will not be accessible outside the IBM network; it can be demonstrated to Veteris

– The project is structured in two phases and is designed to provide Veteris with an understanding of the value and usage of the technology early in the project, and to reduce risk



Researcher Profile Automation POC: Researcher Profile Fields

• A researcher’s profile is made up of three types of data

– Key data, which comprises the name

– Contact Fields, such as

• Address

• Email

• Phone

– Profile fields such as

• Profile type

• Tier 1/2/3

• Scientific Area

• Discipline
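The three groups of fields listed above map naturally onto a small record type. The sketch below is an assumption about shape only: the class names, the choice of lists for tiers, and the placement of sample terms into tiers are illustrative and are not taken from the Veteris data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContactFields:
    address: Optional[str] = None
    email: Optional[str] = None
    phone: Optional[str] = None

    def discovered(self) -> int:
        """Count the contact fields that currently hold a value."""
        return sum(value is not None for value in (self.address, self.email, self.phone))


@dataclass
class ProfileFields:
    profile_type: Optional[str] = None
    tier1: List[str] = field(default_factory=list)
    tier2: List[str] = field(default_factory=list)
    tier3: List[str] = field(default_factory=list)
    scientific_area: Optional[str] = None
    discipline: Optional[str] = None


@dataclass
class ResearcherProfile:
    name: str  # key data
    contact: ContactFields = field(default_factory=ContactFields)
    profile: ProfileFields = field(default_factory=ProfileFields)


# Sample values reuse the fictional "Dr. Jane Researcher" from the earlier slide;
# which term belongs to which tier is a guess made for this example.
jane = ResearcherProfile(
    name="Dr. Jane Researcher",
    contact=ContactFields(
        address="Auburn University, Department of Sciences",
        phone="(333) 222-1111",
    ),
    profile=ProfileFields(tier1=["Biosensor"], tier2=["Microchip"], tier3=["Pathobiology"]),
)

print(jane.contact.discovered())  # 2
```

Keeping contact and profile fields in separate groups makes it straightforward to report completeness per group.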



Research Profile Automation POC: Solution Architecture

IBM Content Analytics Document Server (UIMA Pipeline)

File System Crawler

Research Profile

Research Profile Model

ICA-LanguageWare Resource Workbench

E-Utilities

Predictive Software for Vendor Consolidation

Consumable Technology Product Profile

JDBC UIMA CAS Consumer

Deploy Model

Create/Update Profiles

Seed PubMed Search (Researcher Name)

Veteris Taxonomy
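The "Seed PubMed Search (Researcher Name)" step that feeds E-Utilities can be sketched as a URL builder for the NCBI ESearch endpoint. This only shows how such a query might be formed; the helper name, the `retmax` value, and the sample author string are assumptions, and the POC's actual crawler code is not shown in the slides.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def seed_search_url(researcher_name, retmax=20):
    """Build an ESearch URL that looks up PubMed records by author name."""
    params = {
        "db": "pubmed",
        "term": f"{researcher_name}[Author]",  # PubMed author field tag
        "retmax": retmax,                      # maximum number of IDs to return
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical author string in PubMed's "Lastname Initials" style.
url = seed_search_url("Researcher J")
print(url)
```

The returned PubMed IDs would then be fetched and written to disk for the file system crawler to pick up, which is one reading of the diagram rather than a documented detail.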


Research Profile Automation POC: Measurement Goals

• Maintain profile data for 1600 researchers
• Track accuracy of data crawled
  – % of names correctly identified
    • Name + institution
    • Name + at least one contact field
    • Name + overlap of some profile data
• Track completeness of data crawled
  – % of contact fields discovered per researcher
  – % of fields verified (identical to base data)
  – # of rows per researcher
• Track currency of data
  – % of proposed updates that are newer information
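The accuracy and completeness goals can be made concrete with a short sketch. The record layout (plain dictionaries), the field names, and the sample values are assumptions for illustration; the slides define the criteria but not how they were computed.

```python
CONTACT_FIELDS = ("address", "email", "phone")


def accuracy_matches(base, crawled):
    """Report which of the three name-match criteria a crawled record meets."""
    same_name = base["name"] == crawled["name"]
    institution_match = (
        base.get("institution") is not None
        and base.get("institution") == crawled.get("institution")
    )
    contact_match = any(
        bool(base.get(f)) and base.get(f) == crawled.get(f) for f in CONTACT_FIELDS
    )
    profile_overlap = bool(set(base.get("terms", ())) & set(crawled.get("terms", ())))
    return {
        "name_institution": same_name and institution_match,
        "name_contact": same_name and contact_match,
        "name_profile_overlap": same_name and profile_overlap,
    }


def contact_completeness(crawled):
    """Percentage of contact fields discovered for one researcher."""
    found = sum(1 for f in CONTACT_FIELDS if crawled.get(f))
    return 100.0 * found / len(CONTACT_FIELDS)


# Hypothetical records; the e-mail address is a placeholder.
base = {
    "name": "Jane Researcher",
    "institution": "Auburn University",
    "phone": "(333) 222-1111",
    "terms": ["biosensor", "molecular biology"],
}
crawled = {
    "name": "Jane Researcher",
    "institution": "Auburn University",
    "email": "jane@example.org",
    "terms": ["biosensor", "microchip"],
}

matches = accuracy_matches(base, crawled)
completeness = contact_completeness(crawled)
print(matches, round(completeness, 1))
```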


Research Profile Automation POC: Measurement Goal Results

• Maintain profile data for 1600 researchers
  – The number of known researchers' names found: 1044 (Strong, Medium and Weak candidates) – 65%
  – The number of known researchers' names found: 925 (Strong and Medium candidates) – 57%
• Track accuracy of data crawled (based on Strong and Medium candidates)
  – % of names correctly identified
    • Name + institution – 20.3%
    • Name + at least one contact field – 8.6%
    • Name + overlap of some profile data – 47%
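The two coverage figures on this slide follow directly from the counts. The short calculation below only restates that arithmetic; the variable names are made up for the example.

```python
TOTAL_KNOWN = 1600           # researchers tracked in the POC
found_all_tiers = 1044       # Strong, Medium and Weak candidates
found_strong_medium = 925    # Strong and Medium candidates only

# Weak-only candidates are the difference between the two counts.
weak_only = found_all_tiers - found_strong_medium

pct_all_tiers = 100.0 * found_all_tiers / TOTAL_KNOWN
pct_strong_medium = 100.0 * found_strong_medium / TOTAL_KNOWN

print(weak_only)                      # 119
print(round(pct_all_tiers, 2))        # 65.25
print(round(pct_strong_medium, 2))    # 57.81
```

The slide's 57% corresponds to truncating 57.81%; rounding would give 58%.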


Research Profile Automation POC: Measurement Goal Results (continued)

• Track completeness of data crawled (based on Strong, Medium and Weak candidates)
  – % of contact fields discovered per researcher
    • 418 emails were discovered across all known researchers
    • 212 phone numbers were discovered across all known researchers
    • 681 addresses were discovered across all known researchers
  – % of fields verified (identical to base data)
    • Email – 8.6% (90 emails)
    • Address – 0%
    • Phone – 0%
    • Profile type – 53.3% (557 profile types)
    • Tier 1 – 20% (216 Tier 1 values)
    • Tier 2 – 3.9% (41 Tier 2 values)
  – # of rows per researcher – 23 (based on Strong and Medium candidates)


Research Profile Automation POC: Measurement Goal Results (continued)

• Track currency of data (based on Strong and Medium candidates)
  – % of proposed updates that are newer information
    • Email – 29% (270 emails)
    • Address – 65% (605 addresses)
    • Phone – 1.83% (17 phones)
    • Organization – 69% (641 organizations)
    • Profile type – 84.4% (871 profile types)
    • Tier 1 – 96% (890 Tier 1 values)
    • Tier 2 – 45% (423 Tier 2 values)
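The reported percentages for completeness and currency can be rechecked against the candidate counts from the earlier results slide. The denominators used here (1044 for Strong, Medium and Weak; 925 for Strong and Medium) are an assumption, since the slides do not state them explicitly, and the one-point tolerance allows for the whole-number rounding the slides appear to use.

```python
def recheck(reported, denominator, tolerance=1.0):
    """Return {field: recomputed_pct} for entries whose recomputed percentage
    differs from the reported one by more than `tolerance` points."""
    mismatches = {}
    for field_name, (count, reported_pct) in reported.items():
        recomputed = 100.0 * count / denominator
        if abs(recomputed - reported_pct) > tolerance:
            mismatches[field_name] = round(recomputed, 1)
    return mismatches


# (count, reported %) pairs copied from the slides.
verified = {
    "Email": (90, 8.6),
    "Profile type": (557, 53.3),
    "Tier 1": (216, 20.0),
    "Tier 2": (41, 3.9),
}
currency = {
    "Email": (270, 29.0),
    "Address": (605, 65.0),
    "Phone": (17, 1.83),
    "Organization": (641, 69.0),
    "Profile type": (871, 84.4),
    "Tier 1": (890, 96.0),
    "Tier 2": (423, 45.0),
}

verified_mismatches = recheck(verified, denominator=1044)
currency_mismatches = recheck(currency, denominator=925)
print(verified_mismatches)
print(currency_mismatches)
```

Under these assumed denominators every figure is consistent except the currency value for profile type: 871 of 925 is about 94.2%, not 84.4%. The slides do not indicate whether the count, the percentage, or the denominator is the one that is off.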


Research Profile Automation POC: Lessons Learned

• Validated the Viability of Research Profile Management Automation using Unstructured Library Sources
  – Proved out an Automated and Scalable Process for Research Profile Management
  – Can Leverage Industry Taxonomies of Research Terms to Build Robust Profiles
  – Built a Solution Architecture that is Extendable to Include Additional Library Sources
  – Discovered that the Process Yielded New, Previously Unknown Relationships between Researchers with Similar Profiles (~50,000 in this case)
• PubMed / PubMed Central Content Source
  – Abstract Limitations / Full Publication Copyright Complexities
  – Good Source for Domain Specific Descriptor Information
  – Not an Optimal Source of Researcher Contact Information
  – Searching Approach
    • Retrieve PubMed and PubMed Central documents by Domain Descriptors instead of name


Research Profile Automation POC: What is Next?

• Optimize and Expand Library Sources to Enhance Profile Content
  – Optimize PubMed Content Retrieval
  – Tune Extraction Model to Improve Recognition of Entities and Relationships
  – Leverage Additional Library Sources to Round Out Profiles
• Retrieve PubMed and PubMed Central documents by Domain Specific Descriptors instead of name
• Mirror automated profile process for research “tool” products
  – Used for automated alignment with researcher profiles in Veteris predictive software


Thank You

John Kamins, CEO Veteris Technologies [email protected]

Keyur Dalal, IBM jStart Emerging Technologies [email protected]