
Using linked data to interpret tables

Varish Mulwad, Tim Finin, Zareen Syed and Anupam Joshi

University of Maryland, Baltimore County November 8, 2010


Interpreting a table

Name | Team | Position | Height
Michael Jordan | Chicago | Shooting guard | 1.98
Allen Iverson | Philadelphia | Point guard | 1.83
Yao Ming | Houston | Center | 2.29
Tim Duncan | San Antonio | Power forward | 2.11

Annotations on the table:

• Team column: http://dbpedia.org/class/yago/NationalBasketballAssociationTeams

• Allen Iverson cell: http://dbpedia.org/resource/Allen_Iverson

• Relation between the Name and Team columns: dbprop:team

• Height column: map numbers as values of properties


@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix yago: <http://dbpedia.org/class/yago/> .

"Name"@en is rdfs:label of dbpedia-owl:BasketballPlayer .
"Team"@en is rdfs:label of yago:NationalBasketballAssociationTeams .

"Michael Jordan"@en is rdfs:label of dbpedia:Michael_Jordan .
dbpedia:Michael_Jordan a dbpedia-owl:BasketballPlayer .

"Chicago Bulls"@en is rdfs:label of dbpedia:Chicago_Bulls .
dbpedia:Chicago_Bulls a yago:NationalBasketballAssociationTeams .

Use Cases

• Intelligent querying over data

• Create a ‘Semantic’ knowledge base

• Data integration

• Search / query over tables

• Confirm / verify existing knowledge

• Add new knowledge to the LOD cloud

• Convert legacy data into Semantic Web formats


Motivation and Related Work


We are laying a strong foundation for the Semantic Web …

… but an old problem haunts us …

Chicken? Egg? … No chicken?

• ~14.1 billion HTML tables on the web, of which 154 million contain high-quality relational data (Cafarella et al. 2008)

• 305,632 datasets available as CSV files or spreadsheets on Data.gov (US); 7 other nations are establishing open data initiatives

• Where is the structured data?


Automate the process

• We need systems that can generate data from existing sources

• Not practical for humans to encode all this into RDF manually

Related Work

• Database to ontology mapping (Barrasa, Corcho, & Gómez-Pérez 2004), (Hu & Qu 2007), (Papapanagiotou et al. 2006), and (Lawrence 2004)

• Mapping Relational databases to RDF [W3C working group – RDB2RDF]

• Mapping spreadsheets to RDF [RDF123, XLWrap]

• Practical and helpful systems, but …

– They require significant manual work

– They do not generate linked data

• Interpreting web tables to answer complex search queries over the web tables (Limaye et al. 2010)

T2LD Framework

• Predict class for columns

• Link the table cells

• Identify and discover relations

Predicting class labels for a column

[Figure: the Team column (Chicago, Philadelphia, Houston, San Antonio). Each cell maps to an instance and a class (Class 1 to Class 4), from which the class for the column is chosen.]


Knowledge Base

Yago

Wikitology [1]: a hybrid knowledge base where structured data meets unstructured data

[1] Wikitology was created as part of Zareen Syed’s Ph.D. dissertation

Querying the Knowledge Base

Top results for each string in the Team column:

Chicago: 1. Chicago Bulls, 2. Chicago, 3. Judy Chicago
Philadelphia: 1. Philadelphia, 2. Philadelphia 76ers, 3. Philadelphia (film)
Houston: 1. Houston Rockets, 2. Houston, 3. Allan Houston

Types of the results:

Chicago: {dbpedia-owl:Place, dbpedia-owl:City, yago:WomenArtist, yago:LivingPeople, yago:NationalBasketballAssociationTeams}
Philadelphia: {dbpedia-owl:Place, dbpedia-owl:PopulatedPlace, dbpedia-owl:Film, yago:NationalBasketballAssociationTeams, …}
Houston: {…}

Scoring the classes

Possible classes for the column: dbpedia-owl:Place, dbpedia-owl:City, yago:WomenArtist, yago:LivingPeople, yago:NationalBasketballAssociationTeams, dbpedia-owl:PopulatedPlace, dbpedia-owl:Film, …

Candidate [string, class] pairs: [Chicago, dbpedia-owl:City], [Philadelphia, dbpedia-owl:City], [Houston, dbpedia-owl:City], …, [Chicago, dbpedia-owl:Film], [Philadelphia, dbpedia-owl:Film], …

E.g. processing the pair [Chicago, yago:NationalBasketballAssociationTeams]

Results for the string “Chicago” (R = rank, PR = PageRank):
(R = 1) Chicago Bulls {yago:NationalBasketballAssociationTeams} [PR = 6]
(R = 2) Chicago {dbpedia-owl:PopulatedPlace, dbpedia-owl:City} [PR = 5]
(R = 3) Judy Chicago {yago:WomenArtist, yago:LivingPeople} [PR = 4]

Score = w × (1 / R) + (1 - w) × (Normalized PageRank)

[Chicago, yago:NationalBasketballAssociationTeams] = (0.25 × 1 / 1) + (0.75 × 6 / 7) ≈ 0.893
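The scoring formula above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the normalizer 7 is simply copied from the worked example (the slides do not say where it comes from), and summing the per-string scores to get a column-level score is an assumption of this sketch.

```python
from collections import defaultdict


def pair_score(rank, page_rank, w=0.25, max_page_rank=7):
    """Score = w * (1 / R) + (1 - w) * (normalized PageRank)."""
    return w * (1.0 / rank) + (1.0 - w) * (page_rank / max_page_rank)


def score_classes(results_by_string, w=0.25, max_page_rank=7):
    """Accumulate a score per class over all cell strings of a column.

    results_by_string maps a cell string to its ranked KB results, each a
    (rank, page_rank, set_of_classes) tuple. For every string, a class gets
    the best score among the results that carry it; the per-string scores
    are then summed (an assumption, not stated on the slides).
    """
    totals = defaultdict(float)
    for results in results_by_string.values():
        best = {}
        for rank, page_rank, classes in results:
            s = pair_score(rank, page_rank, w, max_page_rank)
            for cls in classes:
                best[cls] = max(best.get(cls, 0.0), s)
        for cls, s in best.items():
            totals[cls] += s
    return dict(totals)


# The three results shown for the string "Chicago".
chicago = [
    (1, 6, {"yago:NationalBasketballAssociationTeams"}),
    (2, 5, {"dbpedia-owl:PopulatedPlace", "dbpedia-owl:City"}),
    (3, 4, {"yago:WomenArtist", "yago:LivingPeople"}),
]
scores = score_classes({"Chicago": chicago})
best_class = max(scores, key=scores.get)
```

With only the "Chicago" results available, the highest-scoring class is the one attached to the top-ranked result; adding the results for the other cells would let classes shared across the column accumulate a higher total.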


Machine Learning based Approach

• Input: table cell + column header + row data + column type

• Requery the KB with the predicted class labels as additional evidence

• Generate a feature vector for the top N results of the query

• A classifier ranks the entities within the set of possible results

• Select the highest ranked entity

• A second classifier decides whether to link or not

• Outcome: link to “NIL”, or link to the top ranked instance

Learning to Rank

• We trained an SVMrank classifier that learned to rank entities within a given set

Feature vector:

• Similarity measures: Levenshtein distance, Dice score

• Popularity measures: Wikitology score, PageRank, page length
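As a rough illustration of the feature vector listed above, the sketch below computes the two similarity measures and appends the popularity measures as given numbers. The slides do not say which units the Dice score is computed over, so character bigrams are an assumption here, and `feature_vector` is a hypothetical helper name.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def dice_score(a, b):
    """Dice coefficient over sets of character bigrams (an assumption)."""
    def bigrams(s):
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        # Strings shorter than two characters have no bigrams.
        return 1.0 if a.lower() == b.lower() else 0.0
    return 2.0 * len(x & y) / (len(x) + len(y))


def feature_vector(cell, candidate_label, wikitology_score, page_rank, page_length):
    """Similarity features followed by the popularity features."""
    return [
        levenshtein(cell.lower(), candidate_label.lower()),
        dice_score(cell, candidate_label),
        wikitology_score,
        page_rank,
        page_length,
    ]


# The popularity values here are placeholders, not real measurements.
features = feature_vector("Chicago", "Chicago Bulls", 1.0, 6, 1000)
```

One such vector would be produced for each of the top N candidates returned for a cell, and the ranker would then order the candidates.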

“To Link or not to Link …”

• A second SVM classifier

• Its feature vector included the feature vector of the top ranked entity and two additional features:

– The SVMrank score of the top ranked entity

– The difference in scores between the top two ranked entities
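The input to this second classifier could be assembled as in the sketch below. It only builds the feature vector described above; training and applying the SVM is left out. The function name is hypothetical, and treating the runner-up score as 0.0 when there is only one candidate is an assumption.

```python
def link_decision_features(ranked):
    """Build the input for the link / no-link classifier.

    ranked is a list of (svm_rank_score, feature_vector) tuples for the
    candidate entities, ordered best first.
    """
    if not ranked:
        raise ValueError("need at least one ranked candidate")
    top_score, top_features = ranked[0]
    # Assumption: with a single candidate, the runner-up score is 0.0.
    second_score = ranked[1][0] if len(ranked) > 1 else 0.0
    return list(top_features) + [top_score, top_score - second_score]


example = link_decision_features([(2.5, [6, 0.67]), (1.0, [0, 1.0])])
```

A small gap between the top two scores suggests the ranker was unsure, which is presumably why the difference is included as a feature.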


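Identify Relations

[Figure: each Name cell (Michael Jordan, Allen Iverson, Yao Ming, Tim Duncan) is paired with the Team cell in its row (Chicago, Philadelphia, Houston, San Antonio). The row pairs carry candidate relation sets (Rel ‘A’; Rel ‘A’, ‘C’; Rel ‘A’, ‘B’, ‘C’; Rel ‘A’, ‘B’), and Rel ‘A’ is shown as the relation between the two columns.]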

Relation between columns

Pairs from the Name and Team columns:

Michael Jordan - Chicago
Allen Iverson - Philadelphia
Yao Ming - Houston

Relations found for the pairs, one set per pair: {dbprop:team}, {dbprop:team, dbprop:draftTeam}, {dbprop:team}

Candidate relations: dbprop:team, dbprop:draftTeam

Scoring the relations

Pairs: Michael Jordan - Chicago, Allen Iverson - Philadelphia, Yao Ming - Houston

Relations found for the pairs, one set per pair: {dbprop:team}, {dbprop:team, dbprop:draftTeam}, {dbprop:team}

Candidates: dbprop:team, dbprop:draftTeam

Scores (starting from 0): dbprop:draftTeam = 1, dbprop:team = 3
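The scores on this slide are consistent with a simple vote: each candidate relation gets one point for every row pair it holds for. The sketch below shows that reading. It is an interpretation of the slide rather than a confirmed description of T2LD, and the knowledge-base lookup that produces the per-pair relation sets is not shown.

```python
from collections import Counter


def score_relations(relations_per_pair):
    """Count, for each candidate relation, the row pairs it was found for."""
    votes = Counter()
    for relations in relations_per_pair:
        votes.update(set(relations))  # one vote per pair, even if repeated
    return votes


# Relation sets as shown on the slide, one per (Name, Team) pair.
pair_relations = [
    {"dbprop:team"},
    {"dbprop:team", "dbprop:draftTeam"},
    {"dbprop:team"},
]
votes = score_relations(pair_relations)
best_relation, best_votes = votes.most_common(1)[0]
```

The relation with the most votes is taken as the relation between the two columns.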


Annotating web tables for the Semantic Web

Table as linked RDF

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix yago: <http://dbpedia.org/class/yago/> .

"Name"@en is rdfs:label of dbpedia-owl:BasketballPlayer .
"Team"@en is rdfs:label of yago:NationalBasketballAssociationTeams .

"Michael Jordan"@en is rdfs:label of dbpedia:Michael_Jordan .
dbpedia:Michael_Jordan a dbpedia-owl:BasketballPlayer .

"Chicago Bulls"@en is rdfs:label of dbpedia:Chicago_Bulls .
dbpedia:Chicago_Bulls a yago:NationalBasketballAssociationTeams .

Reading the statements:

"Team"@en is rdfs:label of dbpedia-owl:Team .
“Team” is the common / human name for the class dbpedia-owl:Team

dbpedia:Chicago_Bulls a yago:NationalBasketballAssociationTeams .
dbpedia:Chicago_Bulls is an instance of the type yago:NationalBasketballAssociationTeams
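To make the output format concrete, the sketch below serializes predicted column classes and linked cells as RDF using plain string formatting. Two things are assumptions of this sketch: the function and variable names are hypothetical, and it writes each statement subject-first in Turtle (`resource rdfs:label "..."@en .`), which says the same thing as the N3 `is rdfs:label of` form used on the slide. Labels are not escaped, so it only suits simple strings.

```python
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dbpedia": "http://dbpedia.org/resource/",
    "dbpedia-owl": "http://dbpedia.org/ontology/",
    "yago": "http://dbpedia.org/class/yago/",
}


def table_to_turtle(column_classes, cell_links):
    """Serialize column classes and linked cells as Turtle text.

    column_classes maps a column header to its predicted class.
    cell_links maps a label to a (resource, class) pair.
    """
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines.append("")
    for header, cls in column_classes.items():
        lines.append(f'{cls} rdfs:label "{header}"@en .')
    for label, (resource, cls) in cell_links.items():
        lines.append(f'{resource} rdfs:label "{label}"@en .')
        lines.append(f"{resource} a {cls} .")
    return "\n".join(lines)


column_classes = {
    "Name": "dbpedia-owl:BasketballPlayer",
    "Team": "yago:NationalBasketballAssociationTeams",
}
cell_links = {
    "Michael Jordan": ("dbpedia:Michael_Jordan", "dbpedia-owl:BasketballPlayer"),
    "Chicago Bulls": ("dbpedia:Chicago_Bulls", "yago:NationalBasketballAssociationTeams"),
}
turtle = table_to_turtle(column_classes, cell_links)
```

A production version would more likely use an RDF library to handle escaping and IRI validation; string formatting is used here only to keep the example self-contained.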


Results

Dataset summary

Number of tables | 15
Total number of rows | 199
Total number of columns | 56 (52)*
Total number of entities | 639 (611)*

* The number in brackets excludes columns that contained numbers


Evaluation for class label predictions


Evaluation # 1 (MAP)

• Compared the system’s ranked list of labels against a human ranked list of labels

• Metric - Mean Average Precision (MAP)

• Commonly used in the Information Retrieval domain to compare two ranked sets

• Result: MAP of 80.76 %

System ranked: 1. Person, 2. Politician, 3. President

Evaluator ranked: 1. President, 2. Politician, 3. OfficeHolder
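For readers unfamiliar with the metric, the sketch below computes average precision for one column and the mean over several columns. It uses the standard Information Retrieval definition, in which the evaluator's labels form an unordered set of relevant items; whether the authors' evaluation also used the order of the evaluator's list is not stated on the slides, so this is an assumption.

```python
def average_precision(system_ranked, relevant):
    """Average precision of a ranked list against a set of relevant labels."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for position, label in enumerate(system_ranked, start=1):
        if label in relevant:
            hits += 1
            total += hits / position  # precision at this position
    return total / len(relevant)


def mean_average_precision(columns):
    """Mean of the average precision over (system_ranked, relevant) pairs."""
    return sum(average_precision(s, r) for s, r in columns) / len(columns)


system = ["Person", "Politician", "President"]
evaluator = ["President", "Politician", "OfficeHolder"]
ap = average_precision(system, evaluator)
```

For the example rankings, the system's first label is not in the evaluator's list, so precision is only credited at positions 2 and 3.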


Evaluation # 2 (Recall)

Recall > 0.6 (75 %)

System ranked: 1. Person, 2. Politician, 3. President

Evaluator ranked: 1. President, 2. Politician, 3. OfficeHolder
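Recall for the example rankings could be computed as in the sketch below, assuming recall means the fraction of the evaluator's labels that also appear in the system's list. The slides do not define it explicitly.

```python
def recall(system_labels, evaluator_labels):
    """Fraction of evaluator labels that the system also produced."""
    expected = set(evaluator_labels)
    if not expected:
        return 0.0
    return len(expected & set(system_labels)) / len(expected)


example_recall = recall(
    ["Person", "Politician", "President"],
    ["President", "Politician", "OfficeHolder"],
)
```

Here two of the three evaluator labels (President and Politician) are recovered, which puts this example above the 0.6 threshold.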

Evaluation # 3 (Correctness)

• Evaluated whether our predicted class labels were “fair and correct”

• A class label may not be the most accurate one, but may still be correct.

– E.g. dbpedia-owl:PopulatedPlace is not the most accurate, but is still a correct label for a column of cities

• Three human judges evaluated our predicted class labels

• A category-wise breakdown for class label correctness

Overall accuracy: 76.92 %

Column: Nationality, Prediction: MilitaryConflict

Column: Birth Place, Prediction: PopulatedPlace


Evaluation for linking table cells to entities


Category-wise accuracy for linking table cells

Overall Accuracy: 66.12 %


Relation between columns

• Idea – Ask human evaluators to identify relations between columns in a given table

• Pilot Experiment – Asked three evaluators to annotate five random tables from our dataset

• Evaluators identified 20 relations

• Our accuracy: 5 out of 20 (25 %) were correct


Conclusion and Future Work

Conclusion

• We have demonstrated that it is possible to develop an automated framework for converting tables and spreadsheets to linked data

• Extending and adapting this framework for Open government data

• Discovery of new relations between entities

References

• Cafarella, M. J.; Halevy, A.; Wang, D. Z.; Wu, E.; and Zhang, Y. 2008. WebTables: exploring the power of tables on the web. Proc. VLDB Endow. 1 (1), 538-549.

• Barrasa, J.; Corcho, O.; and Gómez-Pérez, A. 2004. R2O, an extensible and semantically based database-to-ontology mapping language. In Proceedings of the 2nd Workshop on Semantic Web and Databases (SWDB2004). Vol. 3372. pp. 1069-1070.

• Hu, W., and Qu, Y. 2007. Discovering simple mappings between relational database schemas and ontologies. In Aberer, K.; Choi, K.-S.; Noy, N. F.; Allemang, D.; Lee, K.-I.; Nixon, L. J. B.; Golbeck, J.; Mika, P.; Maynard, D.; Mizoguchi, R.; Schreiber, G.; and Cudre-Mauroux, P., eds., ISWC/ASWC, volume 4825 of Lecture Notes in Computer Science, 225-238. Springer.

• Papapanagiotou, P.; Katsiouli, P.; Tsetsos, V.; Anagnostopoulos, C.; and Hadjiefthymiades, S. 2006. Ronto: Relational to ontology schema matching. In AIS SIGSEMIS Bulletin.

• Lawrence, E. D. R. 2004. Composing mappings between schemas using a reference ontology. In Proceedings of International Conference on Ontologies, Databases and Application of Semantics (ODBASE), 783-800. Springer.

• Han, L.; Finin, T.; Parr, C.; Sachs, J.; and Joshi, A. 2008. RDF123: from Spreadsheets to RDF. In Seventh International Semantic Web Conference. Springer.

• Han, L.; Finin, T.; and Yesha, Y. 2009. Finding semantic web ontology terms from words. In Proceedings of the Eighth International Semantic Web Conference. Springer.

• Limaye, G.; Sarawagi, S.; and Chakrabarti, S. 2010. Annotating and searching web tables using entities, types and relationships. In Proc. of the 36th Int'l Conference on Very Large Databases (VLDB).


This work was supported by: