Linguamatics – David Milward - ChemAxon

Post on 11-Feb-2022


Chemically Informed Text Mining

David Milward

Linguamatics

Chemaxon UGM Budapest 2013

© Linguamatics 2013

Linguamatics: Agile Text Mining

Boston Cambridge

I2E: agile, scalable, real-time NLP-based text mining

Fact extraction and knowledge synthesis

Fortune 500

• Pharma/Biotech: including 9 of the top 10
• Healthcare: including Kaiser Permanente
• Government: including FDA


Chemical Searching combined with Text Searching

• Melting points for exemplified compounds in patents

Patent Data from IFI Claims Direct


A Versatile Toolbox for Finding Information …

Terminologies

• Search for e.g. cancer and get synonyms and children:
• Malignant neoplasms, Malignant tumor …
• Leukaemia, Lymphoma, Astrocytoma …

Linguistics

Regular Expressions

• e.g. microRNA: let-?\d+.* mirn?a?-?\d+.*

Chemical Substructure

High Throughput

• Simultaneous processing of large numbers of items e.g.
• 500 genes from microarray experiment
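The two microRNA patterns on this slide can be tried out directly. The following is a minimal Python sketch, not a description of how I2E applies them: combining the two patterns into one expression, matching case-insensitively, and requiring a whole-token match are all assumptions made for illustration, and the helper names `is_microrna` and `find_micrornas` are hypothetical.

```python
import re

# The two patterns are copied from the slide. Joining them with "|",
# ignoring case, and matching whole tokens are assumptions for this sketch.
MICRORNA_PATTERNS = [r"let-?\d+.*", r"mirn?a?-?\d+.*"]
MICRORNA_RE = re.compile(
    "|".join(f"(?:{p})" for p in MICRORNA_PATTERNS),
    re.IGNORECASE,
)


def is_microrna(token: str) -> bool:
    """Return True if the whole token matches one of the slide's patterns."""
    return MICRORNA_RE.fullmatch(token) is not None


def find_micrornas(text: str) -> list[str]:
    """Split on whitespace, trim surrounding punctuation, keep matching tokens."""
    tokens = (raw.strip(".,;:()[]") for raw in text.split())
    return [t for t in tokens if is_microrna(t)]


if __name__ == "__main__":
    sample = "Expression of let-7a and miR-21 was reduced, unlike MIRN145."
    print(find_micrornas(sample))
```

Because each pattern requires at least one digit after the prefix, ordinary words such as "letter" or "mirror" are not matched.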


… and Presenting it Efficiently

Identify Extract Synthesize Analyze

• Pie charts for drill down
• Trending over time
• Interaction networks
• Mind maps with clustering
• Clustered results table
• RDF/BEL for network modelling

Figure: example BEL network with the nodes bp(apoptosis), p(C), taof(p(A)), microRNA(Q), kaof(p(D)), p(D, P@Y), p(B), catof(p(R))

Legend:

• bp: biological process
• p: protein abundance
• microRNA: microRNA abundance
• catof: catalytic activity
• kaof: kinase activity
• taof: transcriptional activity
• P@Y: phosphorylation at unspecified tyrosine
• edges: direct causation
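To make the nested notation of the BEL-style node labels above easier to read (for example kaof(p(D)) or p(D, P@Y)), here is a small Python sketch that parses such a term into a tree and spells it out using the slide's legend. It is only an illustration: the abbreviation-to-label mapping is inferred from the legend on the slide, the functions `split_args`, `parse_term` and `describe` are hypothetical helpers, and none of this reflects how I2E produces BEL output.

```python
# Mapping inferred from the slide's legend; treat it as an assumption.
LEGEND = {
    "bp": "biological process",
    "p": "protein abundance",
    "microRNA": "microRNA abundance",
    "catof": "catalytic activity",
    "kaof": "kinase activity",
    "taof": "transcriptional activity",
}


def split_args(inner: str) -> list[str]:
    """Split on commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(inner):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(inner[start:i].strip())
            start = i + 1
    args.append(inner[start:].strip())
    return args


def parse_term(term: str):
    """Parse 'f(a, g(b))' into ('f', ['a', ('g', ['b'])]); plain names stay strings."""
    term = term.strip()
    if "(" not in term:
        return term
    if not term.endswith(")"):
        raise ValueError(f"unbalanced term: {term!r}")
    name, inner = term[:-1].split("(", 1)
    return (name.strip(), [parse_term(a) for a in split_args(inner)])


def describe(node) -> str:
    """Render a parsed term in words, using the legend where a label is known."""
    if isinstance(node, str):
        return node
    name, args = node
    return f"{LEGEND.get(name, name)} of " + ", ".join(describe(a) for a in args)


if __name__ == "__main__":
    for label in ["bp(apoptosis)", "kaof(p(D))", "p(D, P@Y)"]:
        print(label, "->", describe(parse_term(label)))
```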


ChemAxon Integration

Mol files

Mol conversion with Filtering

5.7 g (56.7 mmol) of triethylamine in 20 ml methylene chloride are added dropwise at room temperature to a solution of 10 g (56.7 mmol) 2-hydroxymethyl-6-methylene-1,4-dithiepane

I2E Index

Name-to-Structure

I2E Query with Substructure/Similarity
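The example sentence on this slide pairs each mass with an amount in mmol, which makes it a convenient test case for text extraction. The Python sketch below pulls out those pairs with a regular expression and derives the molar mass each pair implies, a figure that could serve as a sanity check on a structure generated from the name. The pattern, the `extract_amounts` helper, and the output fields are assumptions for illustration; the name-to-structure step itself is not shown.

```python
import re

# Example sentence quoted from the slide.
SENTENCE = (
    "5.7 g (56.7 mmol) of triethylamine in 20 ml methylene chloride are "
    "added dropwise at room temperature to a solution of 10 g (56.7 mmol) "
    "2-hydroxymethyl-6-methylene-1,4-dithiepane"
)

# Hypothetical pattern: "<mass> g (<amount> mmol) [of] <name>".
AMOUNT_RE = re.compile(
    r"(?P<grams>\d+(?:\.\d+)?)\s*g\s*"
    r"\((?P<mmol>\d+(?:\.\d+)?)\s*mmol\)\s*"
    r"(?:of\s+)?(?P<name>\S+)"
)


def extract_amounts(text: str) -> list[dict]:
    """Return one record per 'mass (mmol) name' mention found in the text."""
    records = []
    for match in AMOUNT_RE.finditer(text):
        grams = float(match.group("grams"))
        mmol = float(match.group("mmol"))
        records.append(
            {
                "name": match.group("name").rstrip(".,;"),
                "grams": grams,
                "mmol": mmol,
                # Molar mass implied by the stated mass and amount (g/mol).
                "implied_g_per_mol": grams / mmol * 1000.0,
            }
        )
    return records


if __name__ == "__main__":
    for record in extract_amounts(SENTENCE):
        print(record)
```

The solvent mention ("20 ml methylene chloride") is skipped because it carries no mmol amount.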


I2E Web Services API (WSAPI)

Client

• I2E Client
• Pipeline Pilot Components
• WSAPI Web View
• Sample Web GUI
• YOUR APPLICATION HERE!

I2E WSAPI

I2E Server

• Indexing tasks
• Querying tasks
• Class matching
• Index/Query Publishing
• Administration Tasks


I2E WSAPI Examples


Thank You!

For more information…

Please visit our table or www.linguamatics.com

Webinars:

www.linguamatics.com/welcome/events/webinars.html

Contact: Phil Hastings

Email: phil.hastings@linguamatics.com
