System for-health-diagnosis


Page 1: System for-health-diagnosis

Major project in Information Retrieval and Extraction (CSE474)

International Institute of Information Technology, Hyderabad

By: Anand Kalburgi (201305648), Arnav Singh (201305626), Saumya Pathak (201101141), Vennela Miryala (201125034)

Presented to: Mr. Vasudeva Varma

Mentor: Sandeep Sharma

Page 2: System for-health-diagnosis

Introduction

Problem statement: ➢ Given some initial symptoms, efficiently detect the possible diseases.

Resources: ➢ WebMD and Mayo Clinic datasets.

Page 3: System for-health-diagnosis

Related Works

➢ WebMD’s Symptom Checker online tool

Many similar online healthcare systems, such as:

➢ http://androctor.com

➢ www.mayoclinic.org

Page 4: System for-health-diagnosis

System Components

➢ A simple User Interface

➢ A retrieval system that takes symptoms as input and yields possible disease conditions as output.

➢ An extensive index (both forward and inverted) of diseases vs. symptoms.

Page 5: System for-health-diagnosis

Architecture

Fig. Case diagram for the execution of the diagnostic system

Page 6: System for-health-diagnosis

Challenges

➢ The two websites use different data formats.

➢ Merging the indices.

➢ Stopwords, unwanted words and characters.

➢ Recursive AI feature.

Page 7: System for-health-diagnosis

Tools and technologies used

➢ Eclipse as an IDE

➢ PHP and Java for crawling

➢ Jsoup library for fetching HTML pages from links

➢ MetaMap

➢ Python

Page 8: System for-health-diagnosis

Approach in phases

Crawling phase:

➢ Using the Jsoup library, PHP and Python

➢ Website ----------> text files (1 per disease)

➢ Each disease file contains the crawled text related to that disease.
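The "one text file per disease" step can be sketched roughly as follows. This is a minimal illustration, not the project's crawler: it uses Python's standard `html.parser` instead of Jsoup, assumes the page HTML has already been downloaded, and the names `TextExtractor`, `html_to_text` and `save_disease_page` are hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML page, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def save_disease_page(disease, html, out_dir):
    """Write the text of one crawled page to <out_dir>/<disease>.txt."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: lower-case, spaces replaced by underscores.
    path = out_dir / (disease.lower().replace(" ", "_") + ".txt")
    path.write_text(html_to_text(html), encoding="utf-8")
    return path
```

Calling `save_disease_page("Common Cold", html, "crawled")` would create `crawled/common_cold.txt`; the file naming scheme is an assumption made for this sketch.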

Page 9: System for-health-diagnosis

Metamap phase

➢ The Part-of-Speech Tagger server, the Word Sense Disambiguation (WSD) server and the MetaMap server

➢ Java API

➢ Phrase-wise formatted output

➢ Parsing according to the attribute “Semantic type = [sosy]” (sign or symptom) to obtain the symptoms
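The semantic-type filter can be illustrated with a short sketch. The slides say the output was obtained phrase-wise through the Java API; the Python code below instead assumes, purely for illustration, pipe-delimited lines with the concept name in the fourth field and the bracketed semantic types in the sixth. The function name `extract_symptoms` is hypothetical and the sample identifiers and scores in the usage note are made up.

```python
def extract_symptoms(lines, semantic_type="sosy"):
    """Return the unique concept names tagged with the given semantic type."""
    symptoms = []
    seen = set()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 6:
            continue  # skip lines that do not match the assumed layout
        name = fields[3].strip()
        # The sixth field is assumed to look like "[sosy]" or "[sosy,fndg]".
        types = [t.strip() for t in fields[5].strip("[] ").split(",")]
        key = name.lower()
        if semantic_type in types and key not in seen:
            seen.add(key)
            symptoms.append(name)
    return symptoms
```

For example, given the made-up lines `"doc1|MMI|1.0|Chest Pain|CUI-1|[sosy]"` and `"doc1|MMI|1.0|Influenza|CUI-2|[dsyn]"`, only `Chest Pain` is kept, because `dsyn` (disease or syndrome) is not the requested semantic type.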

Page 10: System for-health-diagnosis

Index creation phase

➢ Disease vs. symptom forward index.

➢ Symptom vs. disease inverted index.

➢ Built using Python.

➢ Merging both indices, keeping OFFSET as the mapping attribute.
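A minimal sketch of the two indices, assuming each source has already been reduced to a mapping from disease name to a list of symptoms. The function names are hypothetical, the merge is a plain union keyed on lower-cased disease names, and the OFFSET mapping attribute mentioned above is not modeled because the slides do not describe it.

```python
from collections import defaultdict


def merge_forward_indices(*indices):
    """Union the symptom sets of diseases that appear in several sources."""
    merged = defaultdict(set)
    for index in indices:
        for disease, symptoms in index.items():
            # Lower-casing is an assumed normalization so that the same
            # disease from two sources ends up under one key.
            merged[disease.lower()].update(s.lower() for s in symptoms)
    return dict(merged)


def invert(forward):
    """Build the symptom -> diseases index from the disease -> symptoms index."""
    inverted = defaultdict(set)
    for disease, symptoms in forward.items():
        for symptom in symptoms:
            inverted[symptom].add(disease)
    return dict(inverted)
```

With toy inputs such as `{"Influenza": ["Fever", "Cough"]}` from one source and `{"influenza": ["fatigue"]}` from the other, the merged forward index holds a single `influenza` entry with all three symptoms, and `invert` maps each symptom back to it.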

Page 11: System for-health-diagnosis

The recursive “AI” system phase

➢ The first symptom entered by the user is mapped to its corresponding diseases in the indices.

➢ One by one, the symptoms of these diseases are displayed to the user, who is asked to choose his/her symptoms from this list.

➢ The user answers yes/no for each symptom asked.

➢ Because the process is recursive, the target list of diseases gradually becomes specific to the user's condition, and the list of diseases is output when the target list crosses the minimum threshold.
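The narrowing loop described above might look like the sketch below. Several details are assumptions, since the slides do not specify them: a "no" answer eliminates every disease that lists the symptom, questions are asked in alphabetical order, and "crossing the minimum threshold" is read as the candidate list shrinking to at most `threshold` diseases. `forward` maps each disease to a set of symptoms, `inverted` maps each symptom to a set of diseases, and `ask` is a hypothetical callback that returns `True` for "yes".

```python
def narrow_diseases(first_symptom, forward, inverted, ask, threshold=2):
    """Ask yes/no questions until at most `threshold` candidate diseases remain."""
    candidates = set(inverted.get(first_symptom, ()))
    asked = {first_symptom}
    while len(candidates) > threshold:
        # Symptoms of the remaining candidates that have not been asked yet
        # and that at least one candidate lacks, so either answer narrows the list.
        pending = sorted(
            s
            for s in set().union(*(forward[d] for d in candidates)) - asked
            if any(s not in forward[d] for d in candidates)
        )
        if not pending:
            break  # no remaining symptom can tell the candidates apart
        symptom = pending[0]
        asked.add(symptom)
        if ask(symptom):
            candidates = {d for d in candidates if symptom in forward[d]}
        else:
            candidates = {d for d in candidates if symptom not in forward[d]}
    return sorted(candidates)
```

In a console version, `ask` could be `lambda s: input(f"Do you have {s}? (yes/no) ").strip().lower() == "yes"`. Picking the question that splits the candidates most evenly, rather than the alphabetically first one, would typically need fewer questions.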

Page 12: System for-health-diagnosis

UI integration phase

➢ This whole system is now integrated as a web application using a simple online GUI template.

Page 13: System for-health-diagnosis

Conclusion

➢ We have built a system consisting of a knowledge base and a knowledge gathering system to extract relationships between diseases and determinants, symptoms, and the affected body parts.

➢ The end product is a web application with a user-friendly interface: the user enters his/her symptoms and, given this data, the system outputs the most significant disease(s). The project successfully implements the AI feature, with more user interactivity.

Page 14: System for-health-diagnosis

References

Crawling domains:

➢ http://www.webmd.com/a-to-z-guides/health-topics/default.htm

➢ http://www.mayoclinic.org/diseases-conditions/

➢ Min Ye. Text Mining for Building a Biomedical Knowledge Base on Diseases, Risk Factors, and Symptoms. Master's Thesis, Center for Bioinformatics, Saarland University, March 2011.

➢ Jsoup library for getting HTML pages from links.

➢ MetaMap software, UMLS: Unified Medical Language System. http://www.nlm.nih.gov/research/umls
