An Overview - A Tool For Recognizing UMLS Concepts in Text
An Overview
Presented December 8, 2005
What is it?
What is it good for?
Who uses it?
What does it look like?
Where can I get it?
Requirements, Terms and Conditions
Customization, Common Options
Team Members and Points of Contact
What is it?
Tool for discovering controlled medical vocabulary terms (via UMLS Concepts) within documents
Tokenizes documents into sentences, phrases, terms, words
Matches phrases to closest UMLS Concepts
Java implementation of MetaMap
It is middleware. There is no GUI
What is it good for?
Information Extraction tasks
Classification/Categorization tasks
Text Summarization/Question & Answer tasks
Data-mining tasks
Knowledge discovery tasks
Text Understanding tasks
UMLS Concept based indexing and retrieval
NLP Tasks
Who uses it?
Medical Problem Extraction from Clinical Documents
Extraction of Concepts from Chief Complaints
Extraction of Concepts from Clinical Narratives
Extracting Diagnoses from Discharge Summaries
Concept-Value Pair Extraction from Semi-Structured Echocardiogram Reports
Enzyme Class Annotation
Term Identification in the Biomedical Literature
Extracting Molecular Binding Relationships from Biomedical Text
Retinoblastoma
What is retinoblastoma?
Retinoblastoma is a rare type of eye cancer that develops in the retina, which is the part of the eye that detects light and color. Although this disorder can occur at any age, it usually develops in young children.
Extracts UMLS concepts from text
Meta Mapping (1000): C0496836 (Malignant neoplasm of eye, unspecified) [Neoplastic Process]
[Figure: Medical Text]
[Figure: UMLS concepts related to Pupil (C0034121)]
Relationship types: Inverse_isa, Adjacent_to, Connected_to, Constitutes
Related concepts: Uveal Diseases; Iris (Eye); Pupil of left eye; Pupil of right eye; Anatomical conduit; Uvea; Iris structure; Anatomical space
> mmtx --fileName=retnoblastoma.txt
Phrase: "of eye cancer"
Meta Candidates (7)
1000 Eye Cancer (Malignant neoplasm of eye, unspecified) [Neoplastic Process]
861 Cancer (Malignant Neoplasms) [Neoplastic Process]
861 Cancer (Malignant neoplasm, primary (morphologic abnormality)) [Neoplastic Process]
861 Cancer (Cancer Genus) [Invertebrate]
694 Eye [Body Part, Organ, or Organ Component]
694 Eye (Entire eye) [Body Part, Organ, or Organ Component]
638 Ophthalmic [Spatial Concept]
Meta Mapping (1000)
1000 Eye Cancer (Malignant neoplasm of eye, unspecified) [Neoplastic Process]
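Each candidate line in the listing above follows a regular shape: a score, a matched name, an optional text in parentheses, and a semantic type in square brackets. A minimal sketch of reading one such line in Java is given below. The class `CandidateLine` and its field names are hypothetical and are not part of MMTx; the regular expression is inferred only from the lines shown on this slide, so other output variants may need a different pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for one candidate line of the listing above (not part of MMTx). */
final class CandidateLine {
    // Groups: 1 = score, 2 = matched name, 3 = optional text in parentheses, 4 = semantic type.
    private static final Pattern LINE = Pattern.compile(
            "^\\s*(\\d+)\\s+(.+?)(?:\\s+\\((.*)\\))?\\s+\\[(.+)\\]\\s*$");

    final int score;
    final String name;
    final String parenthesized; // null when the line has no "(...)" part
    final String semanticType;

    private CandidateLine(int score, String name, String parenthesized, String semanticType) {
        this.score = score;
        this.name = name;
        this.parenthesized = parenthesized;
        this.semanticType = semanticType;
    }

    /** Parses a single line, or throws if the line does not have the assumed shape. */
    static CandidateLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a candidate line: " + line);
        }
        return new CandidateLine(Integer.parseInt(m.group(1)), m.group(2), m.group(3), m.group(4));
    }

    public static void main(String[] args) {
        CandidateLine c = parse(
                "861 Cancer (Malignant neoplasm, primary (morphologic abnormality)) [Neoplastic Process]");
        System.out.println(c.score + " | " + c.name + " | " + c.parenthesized + " | " + c.semanticType);
    }
}
```

The greedy match inside the parentheses is deliberate, so that nested parentheses such as "(morphologic abnormality)" stay inside the captured text.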
Java Container Classes
Document
Sections
Sentences
Phrases
FinalMappings
UMLS_Concepts
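The nesting of these containers can be pictured with a small Java sketch. The classes below are hypothetical stand-ins that only mirror the level names on the slide; they are not the MMTx classes, and their fields and constructors are assumptions made for illustration. The example data is the retinoblastoma mapping shown earlier in the deck.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins for the container levels named on the slide; not the MMTx classes. */
final class ContainerSketch {
    static final class Concept {
        final String cui;
        final String name;
        Concept(String cui, String name) { this.cui = cui; this.name = name; }
    }

    static final class FinalMapping {
        final int score;
        final List<Concept> concepts;
        FinalMapping(int score, List<Concept> concepts) { this.score = score; this.concepts = concepts; }
    }

    static final class Phrase {
        final String text;
        final List<FinalMapping> finalMappings;
        Phrase(String text, List<FinalMapping> finalMappings) {
            this.text = text;
            this.finalMappings = finalMappings;
        }
    }

    static final class Sentence {
        final List<Phrase> phrases;
        Sentence(List<Phrase> phrases) { this.phrases = phrases; }
    }

    static final class Section {
        final List<Sentence> sentences;
        Section(List<Sentence> sentences) { this.sentences = sentences; }
    }

    static final class Document {
        final List<Section> sections;
        Document(List<Section> sections) { this.sections = sections; }
    }

    /** Walks the hierarchy top-down and collects every concept identifier. */
    static List<String> conceptIds(Document doc) {
        List<String> ids = new ArrayList<>();
        for (Section section : doc.sections) {
            for (Sentence sentence : section.sentences) {
                for (Phrase phrase : sentence.phrases) {
                    for (FinalMapping mapping : phrase.finalMappings) {
                        for (Concept concept : mapping.concepts) {
                            ids.add(concept.cui);
                        }
                    }
                }
            }
        }
        return ids;
    }

    /** Builds a one-phrase document from the retinoblastoma example in the deck. */
    static Document example() {
        Concept eyeCancer = new Concept("C0496836", "Malignant neoplasm of eye, unspecified");
        FinalMapping mapping = new FinalMapping(1000, Arrays.asList(eyeCancer));
        Phrase phrase = new Phrase("of eye cancer", Arrays.asList(mapping));
        Sentence sentence = new Sentence(Arrays.asList(phrase));
        Section section = new Section(Arrays.asList(sentence));
        return new Document(Arrays.asList(section));
    }

    public static void main(String[] args) {
        System.out.println(conceptIds(example()));
    }
}
```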
API Methods and Constructors
MMTxAPI
Document processDocument(File pFile)
Document processDocument(String pDocumentText)
void processSentence(Sentence pSentence)
Sentence processSentence(String pSentenceText)
Sentence processString(String pString, boolean pTrmPrcsng)
Phrase processTerm(String pTerm)
MMTxAPI()
MMTxAPI(String[] args)
How does it Work?
[Figure: processing pipeline]
Input: Medical Text
Stages: Tokenization, Noun Phrase Parser, Variant Generation, Candidate Retrieval, Evaluation, Final Mapping
Also shown: POS Tagger Client, Lexical Lookup
Containers: Document, Sections, Sentences, Phrases, FinalMappings, UMLS_Concepts
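To make the flow through the stages named on this slide more concrete, here is a toy sketch in Java. It is not the MMTx algorithm: it skips the noun phrase parser, POS tagging, lexical lookup and variant generation, and it replaces the real evaluation with a plain word-overlap ratio. The class `ToyPipeline`, its stop-word list and its four-entry vocabulary (names copied from the candidate listing in this deck) are all assumptions made for illustration, so the scores it prints will not match MMTx scores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy walk-through of a few stages; the scoring is a plain word-overlap ratio, not MMTx's. */
final class ToyPipeline {
    // Stand-in vocabulary: concept names copied from the candidate listing in the deck.
    static final List<String> VOCABULARY = Arrays.asList("Eye Cancer", "Cancer", "Eye", "Ophthalmic");
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("of", "the", "a", "an"));

    /** Tokenization: lower-case, split on non-letters, drop stop words. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Evaluation: 1000 * shared words / size of the longer word list (integer division). */
    static int score(List<String> phraseTokens, String entry) {
        List<String> entryTokens = tokenize(entry);
        int shared = 0;
        for (String t : entryTokens) {
            if (phraseTokens.contains(t)) {
                shared++;
            }
        }
        int longer = Math.max(phraseTokens.size(), entryTokens.size());
        return longer == 0 ? 0 : 1000 * shared / longer;
    }

    /** Candidate retrieval: vocabulary entries sharing at least one word with the phrase. */
    static List<String> candidates(List<String> phraseTokens) {
        List<String> found = new ArrayList<>();
        for (String entry : VOCABULARY) {
            if (score(phraseTokens, entry) > 0) {
                found.add(entry);
            }
        }
        return found;
    }

    /** Final mapping: the highest-scoring candidate, or null when nothing matches. */
    static String bestMapping(String phrase) {
        List<String> tokens = tokenize(phrase);
        String best = null;
        int bestScore = 0;
        for (String entry : candidates(tokens)) {
            int s = score(tokens, entry);
            if (s > bestScore) {
                bestScore = s;
                best = entry;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(bestMapping("of eye cancer"));
    }
}
```

Because there is no variant generation in this sketch, "Ophthalmic" is never retrieved for the phrase "of eye cancer", whereas the real output in the deck does list it.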
Where can I get it?
http://mmtx.nlm.nih.gov
The download is password restricted
Preconditions:
A signed UMLS license agreement
A UMLS Knowledge Source Server account name and password
Minimum Requirements
OSs supported: Solaris, Windows NT/2000/XP, Linux, Mac OS X
Java 1.4 or better
400 MB disk space for software
3 GB disk space for each year's data per model
600 MB or more RAM/swap space
Terms and Conditions of Use
Must be a UMLS Signatory
Must customize the data to the vocabularies that you have rights to use
The MMTx software is soon to be under an open source agreement:
Attribution, redistribution in-total, no NLM endorsement, indemnity
Can I customize the data?
You will need to customize it because:
You need to extract only those vocabularies you have rights to
You should extract only those vocabularies that make sense for your application
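How can I customize the data?
[Figure: data customization workflow]
UMLS Metathesaurus release files are input to MetamorphoSys, which produces customized UMLS Metathesaurus files
The customized UMLS Metathesaurus files are input to the MMTx DataFileBuilder, which produces MMTx Indexes
The MMTx Indexes are used by MMTx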
How do I tune it?
Limit matches to conservative variation
Noun/adjective derivations, unique acronyms and expansions
Filter by semantic type(s)
Filter by vocabulary source(s)
Match on (longer) composite phrases
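The effect of filtering by semantic type can be illustrated on the candidate list shown earlier in the deck. The sketch below applies the filter after the fact, on already-produced candidates; it is an assumption-laden illustration and not how MMTx implements its own option, whose name and syntax are not given in these slides. The class `SemanticTypeFilter` and its `Candidate` type are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Post-hoc filter over candidate results; an illustration, not MMTx's own option handling. */
final class SemanticTypeFilter {
    static final class Candidate {
        final int score;
        final String name;
        final String semanticType;
        Candidate(int score, String name, String semanticType) {
            this.score = score;
            this.name = name;
            this.semanticType = semanticType;
        }
    }

    /** Keeps only the candidates whose semantic type is in allowedTypes, preserving order. */
    static List<Candidate> filter(List<Candidate> candidates, Set<String> allowedTypes) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : candidates) {
            if (allowedTypes.contains(c.semanticType)) {
                kept.add(c);
            }
        }
        return kept;
    }

    /** A subset of the candidates listed in the deck for the phrase "of eye cancer". */
    static List<Candidate> deckCandidates() {
        return Arrays.asList(
                new Candidate(1000, "Eye Cancer", "Neoplastic Process"),
                new Candidate(861, "Cancer", "Neoplastic Process"),
                new Candidate(861, "Cancer", "Invertebrate"),
                new Candidate(694, "Eye", "Body Part, Organ, or Organ Component"),
                new Candidate(638, "Ophthalmic", "Spatial Concept"));
    }

    public static void main(String[] args) {
        Set<String> allowed = new HashSet<>(Arrays.asList("Neoplastic Process"));
        for (Candidate c : filter(deckCandidates(), allowed)) {
            System.out.println(c.score + " " + c.name + " [" + c.semanticType + "]");
        }
    }
}
```

Restricting the list to "Neoplastic Process" drops the "Cancer Genus" reading, which is tagged [Invertebrate] in the deck's output and is unlikely to be wanted in clinical text.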
Team Members and Points of Contact
Team Members: Alan Aronson, Jim Mork, Guy Divita, Willie Rogers & Cliff Gay
Email: [email protected]
http://mmtx.nlm.nih.gov