Post on 24-Jan-2015
Freebase Schema
Jamie Taylor
Wednesday, December 8, 2010
Goals
• Schema: The Freebase Data Model
• Schema as API
• Schema patterns
Freebase is a collection of facts
Sofia Coppola directed Marie Antoinette
Freebase only contains nodes and links
Freebase is a Graph
Freebase is a labeled Graph
[Figure: a graph of nodes joined by labeled links: directed, parent, sibling, child, wrote, starred_in]
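The labeled-graph idea can be sketched in a few lines of Python. This is only an illustration, not how Freebase stores data: the `FACTS` list and the `outgoing` helper are hypothetical names, and the facts are limited to people and link labels that appear on these slides.

```python
from collections import defaultdict

# Each fact is one labeled link: (source node, label, target node).
# Illustrative data only; FACTS is a hypothetical name.
FACTS = [
    ("Sofia Coppola", "directed", "Marie Antoinette"),
    ("Sofia Coppola", "parent", "Francis Coppola"),
    ("Francis Coppola", "child", "Sofia Coppola"),
]


def outgoing(facts, node):
    """Group the links that leave `node` by their label."""
    links = defaultdict(list)
    for source, label, target in facts:
        if source == node:
            links[label].append(target)
    return dict(links)


print(outgoing(FACTS, "Sofia Coppola"))
```

Everything the sketch knows about a node is the set of links that touch it, which is the sense in which the graph contains only nodes and links.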
Schema
"All the things you can say about something in Freebase"
Schema is the data model for Freebase
All nodes are “/type/object”
[Figure: node with id /m/02vyw, name (/type/object/name) “Francis Coppola”, and types /people/person and /film/director]
[{ "id":"/m/02vyw", "name":null, "type":[{}]}]
Types suggest properties to use
[Figure: node /m/02vyw, of type /film/director, linked by /film/director/film to the node whose /type/object/id is /en/bram_stokers_dracula]
Queries follow schema
[{ "id": "/en/francis_ford_coppola", "/film/director/film": [{ "id":null, "name":null }]}]
Properties link the graph together
[Figure: node /m/02vyw, of type /film/director, linked by /film/director/film to /en/bram_stokers_dracula (/type/object/id), which in turn has a written_by link]
Queries follow schema
[{ "id": "/en/francis_ford_coppola", "/film/director/film": [{ "id": "/en/bram_stokers_dracula", "written_by":null }]}]
Name is returned (how to get the ID?)
How to get all the writers for all of Coppola’s movies?
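One possible answer to both questions, written as a Python structure that serializes to an MQL-style query. It assumes the usual MQL conventions that a null asks for a value to be filled in and that a nested clause can name the sub-properties wanted; the variable name `query` is just for illustration.

```python
import json

# Ask for every film Coppola directed (film "id" left as null) and, for each
# writer, both the id and the name instead of the default name-only value.
query = [{
    "id": "/en/francis_ford_coppola",
    "/film/director/film": [{
        "id": None,
        "name": None,
        "written_by": [{"id": None, "name": None}],
    }],
}]

# json.dumps turns Python's None into the JSON null used by the queries above.
print(json.dumps(query, indent=2))
```

Replacing the fixed film id with null widens the query to all films, and replacing `"written_by": null` with a clause that lists `id` requests the writer's ID alongside the name.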
Core Concepts
Instance:
• Topic: "a thing in the world"
  • Blade Runner, Ridley Scott, NBC, Last Proof
Schema:
• Types - Categorical collections of instances
• Properties - Relationships between instances
Core Concepts
An instance may have multiple Types
• "Co-Types" (Types are mix-ins)
• Arnold Schwarzenegger
  • Person, Actor, Politician, Sports Figure
Lessons from everyday vocabulary
[Figure: Wikipedia Word Frequency, plotting Frequency (0 to 20,000,000) against Rank (0 to 120)]
Data from Victor S. Grishchenko
Schema Principle #1
Use Co-Types Liberally:
Use a few large, encompassing Types to provide general information
Use several smaller, fine-grained Types to provide detailed information
Event Example:
• Film Festival
• Battle of Waterloo
Core Concepts
Properties are defined on Types
• Properties are the vocabulary for a specific Type
• An instance must be “an instance of a type” before it can use the Type’s properties to describe itself
Relational DB vs RDF
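The rule that an instance needs a Type before it can use that Type's properties can be sketched as follows. The `Topic` class is hypothetical and only mimics the rule; it also assumes that a property id is the Type id followed by one more path segment, as in /film/director/film.

```python
class Topic:
    """A toy instance: a name, a set of type ids, and property values."""

    def __init__(self, name):
        self.name = name
        self.types = {"/type/object"}  # every node has this type
        self.values = {}

    def add_type(self, type_id):
        """Make this instance an instance of one more (co-)type."""
        self.types.add(type_id)

    def set_property(self, property_id, value):
        """Record a value, but only if the owning type is already present."""
        # "/film/director/film" is defined on the type "/film/director".
        type_id, _, _ = property_id.rpartition("/")
        if type_id not in self.types:
            raise ValueError(f"{self.name} is not an instance of {type_id}")
        self.values.setdefault(property_id, []).append(value)


coppola = Topic("Francis Coppola")
coppola.add_type("/film/director")
coppola.set_property("/film/director/film", "/en/bram_stokers_dracula")
```

Calling `set_property` before `add_type` raises an error in this sketch, which mirrors the second bullet above.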
Core Concepts
• A Property Value has a specific Type
  • "Expected Type"
• A Property has exactly one Expected Type
Manufactures
Expected Type ~ RDFS Range
Core Concepts
Expected Types (Property Values):
• Value Types (literals)
  • String (two flavors), Integer, Float, DateTime, Boolean
• Object Types
  • Everything Else
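A rough sketch of how a single Expected Type could be checked, assuming Python stand-ins for the literal types listed above. The names `LITERAL_CHECKS`, `Property`, and `accepts` are hypothetical, and the short literal labels ("string", "datetime", and so on) are placeholders rather than real Freebase type ids.

```python
from dataclasses import dataclass
from datetime import date

# Stand-in checks for the value (literal) types named on the slide.
LITERAL_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, float),
    "datetime": lambda v: isinstance(v, date),
    "boolean": lambda v: isinstance(v, bool),
}


@dataclass(frozen=True)
class Property:
    name: str
    expected_type: str  # exactly one Expected Type per property


def accepts(prop, value, types_of_value=frozenset()):
    """Return True if `value` fits the property's Expected Type."""
    check = LITERAL_CHECKS.get(prop.expected_type)
    if check is not None:
        return check(value)  # value type: test the literal itself
    # Object type: the value must be an instance carrying the expected type.
    return prop.expected_type in types_of_value


born = Property("date_of_birth", "datetime")
film = Property("film", "/film/film")
print(accepts(born, date(1942, 7, 13)))
print(accepts(film, "Blade Runner", {"/film/film"}))
```

Literals are checked by their own shape, while object values are checked by the types attached to the target instance.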
/type/object
Everything in Freebase has this Type
Provides basic properties
• Type
• Name
• .......
All other Properties come from some other Type!
Contrast with /common/topic
/common/topic
"Topics"
• Things we have discourse about
• Provides properties:
  • Alias
  • Article
  • Image
  • Weblinks
• Assumed to be an "Included Type" for any "standard" type
Schema Patterns
Compound Value
Mediator
Phylogeny
Enumeration
Compound Value
Two or more properties which can only be interpreted with regard to one another
Population
• Dated Integer ("when did this location have that many people")
Movie Budget
• Dated money value
• Date, Currency, Amount
Ticker Symbol
• Exchange, Symbol
complex literal
Compound Value
{ "id": "/en/apocalypse_now", "type": "/film/film", "estimated_budget": [{ "currency": null, "amount": null, "valid_date": null }]}
[Figure: estimated_budget links the film to a compound value with currency, amount (31MM), and valid_date (1979)]
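A compound value can be pictured as a small record whose fields only make sense together. The sketch below uses a hypothetical `DatedMoneyValue` class; the amount and date come from the slide, while the currency is an assumption because the slide does not show it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatedMoneyValue:
    """Amount, currency, and date are only meaningful as a group."""
    amount: float
    currency: str
    valid_date: str


# Amount and date as shown on the slide; the currency is assumed here.
estimated_budget = DatedMoneyValue(
    amount=31_000_000,
    currency="US dollar",  # assumption: not stated on the slide
    valid_date="1979",
)

print(estimated_budget)
```

Storing the three fields on one object keeps the amount from being read without its currency or its date.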
Mediator
An annotation on the link between two Topics
• Requires an object between the two Topics
• The Topics become separated by two properties
[Figure: actor and film joined through a performance node, which also links to character]
• Also useful for indicating the dates when a relationship existed (e.g., education, employment, etc.)
combine date annotation and character = tv character
Mediator
{ "id": "/en/marie_antoinette_2006", "type": "/film/film", "starring": [{ "actor":null, "character":null }]}
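The mediator idea can be sketched as an object that sits between the actor and the film and carries the annotation. `Performance` and `characters_played` are hypothetical names; the example values are the ones used elsewhere in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Performance:
    """Mediator node: sits between a film and an actor, carries the character."""
    film: str
    actor: str
    character: str


performances = [
    Performance(film="Blade Runner", actor="Harrison Ford", character="Rick Deckard"),
]


def characters_played(performances, actor):
    """Follow actor -> performance -> (film, character)."""
    return [(p.film, p.character) for p in performances if p.actor == actor]


print(characters_played(performances, "Harrison Ford"))
```

Getting from the actor to the film takes two hops, one into the performance and one out of it, which is why the query above nests `actor` and `character` under `starring`.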
Phylogeny
Examples:
• /location/location/containedby
• /computer/computer/parent_model
• /tv/tv_program/spin_offs
Used when instances form a hierarchy
Phylogeny properties have an expected Type which is the same as the Type on which the property is defined.
Phylogeny
Why can I use the short name??
{ "id": "/en/fairfax_california", "/location/location/containedby": [{ "id": null, "containedby": [{ "id": null }] }]}
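Because a phylogeny property points from a Type back to the same Type, the hierarchy can be walked with one loop. This is a simplified sketch: `CONTAINED_BY` holds placeholder labels rather than real Freebase ids, and it allows only one container per node, whereas the query above treats containedby as a list.

```python
# Placeholder hierarchy: each key is contained by its value.
CONTAINED_BY = {
    "town": "county",
    "county": "state",
    "state": "country",
}


def ancestors(node, parent_of):
    """Follow the same property repeatedly until the top of the hierarchy."""
    chain = []
    seen = {node}
    while node in parent_of:
        node = parent_of[node]
        if node in seen:  # guard against accidental cycles in the data
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        chain.append(node)
    return chain


print(ancestors("town", CONTAINED_BY))
```

The nested query does by hand what the loop does automatically: each extra level of `containedby` is one more step up the hierarchy.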
Enumerated Value
Closed collection of “values” for a property
Constrains relations to fixed set of objects
• /people/person/gender
{ female, male, other }
• /visual_art/visual_artist/art_forms
{ drawing, painting, printmaking, photography, ... }
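A closed collection of values maps naturally onto an enumeration. The sketch below uses Python's `Enum` as a stand-in; in Freebase the enumerated values are objects rather than literals, so this simplifies the pattern. `Gender` and `parse_gender` are hypothetical names.

```python
from enum import Enum


class Gender(Enum):
    """The closed set of values listed for /people/person/gender."""
    FEMALE = "female"
    MALE = "male"
    OTHER = "other"


def parse_gender(value):
    """Accept only members of the closed set; Enum raises ValueError otherwise."""
    return Gender(value)


print(parse_gender("female"))
```

Any value outside the fixed set is rejected, which is the constraint the pattern is meant to express.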
Explore the Freebase Graph
[Figure: a graph of nodes joined by labeled links: directed, parent, sibling, child, wrote, starred_in]
Explore the Freebase Graph
[{ "id": null, "type": "/film/director"}]
Explore the Freebase Graph
[{ "id": null, "type": "/film/director", "/people/person/children": [{ "id": null, "type": "/film/director" }]}]
Explore the Freebase Graph
[{ "id": null, "type": "/film/director", "film":[ ], "/people/person/children": [{ "id": null, "type": "/film/director", "film":[ ] }]}]
Explore the Freebase Graph
[{ "id": null, "type": "/film/director", "film": [ ], "/people/person/children": [{ "id": null, "type": "/film/director", "film": [{ "name":null, "starring": [{ "actor": null }] }] }]}]
[Figure: the full graph around "Blade Runner" and "Harrison Ford". Instances (film, actor, person, performance, film character "Rick Deckard", date_of_birth 1942-07-13) link to their types; the types link to their properties (for example "date of birth", with expected_type date_time) and to their domains; the domains /film and /people hang off the root namespace / by key. Legend: /type/object + /common/topic nodes, /type/object nodes, outgoing and incoming properties, key values, literal values, namespaces.]
It’s all nodes and links!
Domains, Bases and Commons
[Figure: "domains" inside individual's "bases" and inside the "commons", joined by a "promote" arrow; example: BladeRunner]
Questions?!
Docs: www.freebase.com/docs
Wiki: wiki.freebase.com
Mailing List: lists.freebase.com