Midterm Progress Report – Stanley Roberts – July 17, 2009

TimeML Time Tagging

• Attempt to identify time references in text.
• Interpret the identified cases.
• Convert information into a standard template.
• Extract time, date, duration.
• Tag using the TimeML standard.

Segmenting Sentences

I will see you tomorrow morning.

[I] [will] [see] [you] [tomorrow] [morning.]

• All information is preserved
• Words separated by spaces
• Segmentation is indiscriminate and blind to special cases
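The segmentation step can be sketched in a few lines of Python. This is only an illustration of splitting on spaces, not the project's actual code; the function name `segment` is an assumption.

```python
def segment(sentence: str) -> list[str]:
    """Split on whitespace only; punctuation stays attached to its word."""
    return sentence.split()


tokens = segment("I will see you tomorrow morning.")
print(tokens)
# ['I', 'will', 'see', 'you', 'tomorrow', 'morning.']

# Nothing is lost: joining the tokens with spaces restores the sentence.
print(" ".join(tokens) == "I will see you tomorrow morning.")
# True
```

Note that "morning." keeps its period, which is what "blind to special cases" looks like in practice.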

Chunking Compound Words

The site dates to the stone age.

Segmented: [The] [site] [dates] [to] [the] [stone] [age.]

Chunked: [The] [site] [dates] [to] [the] [stone age.]
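One possible way to merge compound words is a lookup against a list of known word pairs. The sketch below assumes such a list exists; `COMPOUNDS`, `normalize`, and `chunk_compounds` are hypothetical names, and the slides do not say how compounds are actually detected.

```python
# Hypothetical compound lexicon; the real system's list is not shown in the slides.
COMPOUNDS = {("stone", "age")}


def normalize(token: str) -> str:
    """Lowercase and drop surrounding punctuation for lookup purposes only."""
    return token.lower().strip(".,;:!?")


def chunk_compounds(tokens: list[str], compounds=COMPOUNDS) -> list[str]:
    """Merge adjacent tokens that form a known two-word compound."""
    out = []
    i = 0
    while i < len(tokens):
        pair = (normalize(tokens[i]), normalize(tokens[i + 1])) if i + 1 < len(tokens) else None
        if pair in compounds:
            out.append(tokens[i] + " " + tokens[i + 1])  # keep original spelling
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


print(chunk_compounds("The site dates to the stone age.".split()))
# ['The', 'site', 'dates', 'to', 'the', 'stone age.']
```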

Tagging Words

• IN – prepositions
• MONEY
• NUM – pure numbers
• POINT – “this”, “that”, “next”, “last”
• POSTPROP – postpositions, “ago”
• QTY – quantity, “many”, “few”
• RELATIVE – “later”
• TIME – “year”, “month”
• TIMEPROP – proper names, “Wednesday”

Tagging Words (page 2)

On Saturday January 29, 1955 we went to the park.

[On IN] [Saturday TP] [January TP] [29 NUM] [1955 NUM] [we] [went] [to IN] [the] [park.]

I will be here tomorrow.

[I] [will] [be] [here] [tomorrow. TIME]
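A dictionary lookup is one simple way to produce tags like those in the two examples. The sketch below is an assumption about how such a tagger could look; the `LEXICON` contents and the function name `tag` are hypothetical, and it writes out TIMEPROP in full where the slides abbreviate it as TP.

```python
import re

# Hypothetical lexicon covering only the words used on these slides.
LEXICON = {
    "on": "IN", "to": "IN",
    "this": "POINT", "that": "POINT", "next": "POINT", "last": "POINT",
    "ago": "POSTPROP",
    "many": "QTY", "few": "QTY",
    "later": "RELATIVE",
    "year": "TIME", "month": "TIME", "tomorrow": "TIME",
    "wednesday": "TIMEPROP", "saturday": "TIMEPROP", "january": "TIMEPROP",
}


def tag(tokens):
    """Return (token, tag) pairs; the tag is None for words outside the lexicon."""
    tagged = []
    for tok in tokens:
        word = tok.strip(".,;:!?").lower()
        if re.fullmatch(r"\d+", word):
            tagged.append((tok, "NUM"))  # pure numbers
        else:
            tagged.append((tok, LEXICON.get(word)))
    return tagged


for tok, label in tag("On Saturday January 29, 1955 we went to the park.".split()):
    print(tok, label or "")
```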

Chunking Time Related Phrases

• Combinations of tagged words are matched to predefined templates
• The templates attempt to find relevant results and filter noise.

Chunking Time Related Phrases

On Saturday January 29, 1955 we went to the park.

[On IN] [Saturday TP] [January TP] [29 NUM] [1955 NUM] [we] [went] [to IN] [the] [park.]

• INTTNN – matched to template
• IN – not matched, reference ignored
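Template matching can be pictured as collecting runs of consecutive tagged words and keeping only the runs whose tag sequence is a known template. This is a minimal sketch under that assumption; `TEMPLATES` and `match_templates` are hypothetical, and the tags are spelled out rather than abbreviated as in INTTNN.

```python
# Hypothetical template set; each template is a sequence of tags.
TEMPLATES = {
    ("IN", "TIMEPROP", "TIMEPROP", "NUM", "NUM"),  # "On Saturday January 29, 1955"
    ("TIME",),                                     # "tomorrow"
    ("TIMEPROP",),                                 # "Monday"
}


def match_templates(tagged, templates=TEMPLATES):
    """Return the phrases whose run of consecutive tags equals a template."""
    matches, run = [], []
    for tok, label in list(tagged) + [("", None)]:  # sentinel flushes the last run
        if label is not None:
            run.append((tok, label))
            continue
        if run and tuple(t for _, t in run) in templates:
            matches.append(" ".join(w for w, _ in run))
        run = []
    return matches


tagged = [
    ("On", "IN"), ("Saturday", "TIMEPROP"), ("January", "TIMEPROP"),
    ("29,", "NUM"), ("1955", "NUM"), ("we", None), ("went", None),
    ("to", "IN"), ("the", None), ("park.", None),
]
print(match_templates(tagged))
# ['On Saturday January 29, 1955']
```

The lone "to" forms the run ("IN",), which is not a template, so it is dropped as noise.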

Value Extraction

On Saturday January 29, 1955 we went to the park.

[On IN] [Saturday TP] [January TP] [29 NUM] [1955 NUM] [we] [went] [to IN] [the] [park.]

• INTTNN – matched to template
• Converts extracted data to standard format defined by TimeML Annotation Guidelines.
• value=“YYYY-MM-DD”
• value=“1955-01-29”
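As an illustration of the conversion to YYYY-MM-DD, the sketch below pulls month, day, and year out of a matched phrase and formats them with Python's `datetime.date`. It assumes the phrase contains all three parts and that a four-digit number is the year; `extract_date_value` is a hypothetical name.

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}


def extract_date_value(phrase: str) -> str:
    """Build an ISO 8601 date string from a phrase with month, day and year."""
    month = day = year = None
    for tok in phrase.split():
        word = tok.strip(".,").lower()
        if word in MONTHS:
            month = MONTHS[word]
        elif word.isdigit():
            if len(word) == 4:      # assumption: four digits means a year
                year = int(word)
            else:
                day = int(word)
    return date(year, month, day).isoformat()


print(extract_date_value("On Saturday January 29, 1955"))
# 1955-01-29
```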

Value Extraction – Smart Tag

Tomorrow we will go to the park.

[Tomorrow TIME] [we] [will] [go] [to IN] [the] [park.]

• T – matched to template
• Using contextual date from document
• From last slide value=“1955-01-29”
• value=“1955-01-30” -> uses context
• Attempts to update contextual data with most recent information
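Resolving a relative word against the contextual date amounts to adding a day offset. The sketch below shows the idea; the `OFFSETS` table (including "today" and "yesterday", which the slides do not mention) and the name `resolve_relative` are assumptions.

```python
from datetime import date, timedelta

# Hypothetical offsets in days relative to the contextual date.
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_relative(word: str, context: date) -> date:
    """Shift the contextual date by the offset associated with the word."""
    return context + timedelta(days=OFFSETS[word.lower()])


context = date(1955, 1, 29)           # contextual date from the previous example
resolved = resolve_relative("Tomorrow", context)
print(resolved.isoformat())
# 1955-01-30

context = resolved                    # the newest date becomes the new context
```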

Value Extraction – Smart Tag

Monday we will go to the park.

[Monday TIMEPROP] [we] [will] [go] [to IN] [the] [park.]

• TP – matched to template
• Using contextual date from document
• From last slide value=“1955-01-29”
• Saturday
• value=“1955-01-31” -> uses context
• Attempts to update contextual data with most recent information

Value Extraction – Smart Tag

Monday 10:30 we will go to the park.

[Monday TIMEPROP] [10:30 TIMEPROP] [we] [will] [go] [to IN] [the] [park.]

• TPTP – matched to template
• Using contextual date from document
• From last slide value=“1955-01-29”
• Saturday
• value=“1955-01-31T10:30” -> uses context
• Attempts to update contextual data with most recent information

Type Extraction – TimeX3 std.

TimeX3 specifies time phrases should be tagged with one of three types:

• Date – value=“1955-01-29”
• Time – value=“1955-01-29T24:00”
• Time – value=“T24:00”
• Duration – “4 months” -> value=“P4M”
• Duration – “20 minutes” -> value=“PT20M”
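The duration values follow the ISO 8601 pattern: "P", then a count and a unit letter, with a "T" inserted before hours, minutes, and seconds. That "T" is what separates months (P4M) from minutes (PT20M). The sketch below handles only simple "number unit" phrases; the `UNITS` table and the name `duration_value` are assumptions.

```python
# Unit letter and whether the unit belongs to the time part (needs the "T" marker).
UNITS = {
    "year": ("Y", False), "month": ("M", False),
    "week": ("W", False), "day": ("D", False),
    "hour": ("H", True), "minute": ("M", True), "second": ("S", True),
}


def duration_value(phrase: str) -> str:
    """Convert a phrase such as '4 months' into an ISO 8601 duration string."""
    count, unit = phrase.lower().split()
    letter, is_time = UNITS[unit.rstrip("s")]   # accept singular and plural
    return f"P{'T' if is_time else ''}{int(count)}{letter}"


print(duration_value("4 months"))
# P4M
print(duration_value("20 minutes"))
# PT20M
```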