Text Pattern Formation For Information Extraction
Lidia M. Pivovarova, Saint-Petersburg State University
The Ph.D. advisor: prof. Valery Sh. Rubashkin
NLDB 2008
FACTORS: a system designed to monitor the underlying characteristics of a subject domain
General System Description
[Diagram. Components: Texts; Morph. analyzer; Semantic analyzer; The Ontology; Search Patterns; Situation State. Processing steps: lemmatization, part-of-speech tagging, semantic mark-up.]
The Factors
Factors are the required information aspects (~100 factors).
Factors:
- qualitative, e.g. social tension, investment attractiveness, level of sovereignty, human rights activity
- quantitative, e.g. the number of unemployed, an average salary, the inflation level, the amount of import
Numerical values
Qualitative factors: very small, small, less than average, average, more than average, large, very large.
Quantitative factors: the number + <unit>, e.g.
- an average salary -> monetary unit (ruble, $, …)
- the number of unemployed -> no units
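The two kinds of factor values described above could be modelled roughly as follows. This is only a sketch under assumptions: the class names `QualitativeValue`, `Factor` and `FactorValue` are hypothetical, and the sample numbers are made up for illustration; none of them come from FACTORS itself.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class QualitativeValue(IntEnum):
    """The seven-step scale from the slide, ordered from lowest to highest."""
    VERY_SMALL = 1
    SMALL = 2
    LESS_THAN_AVERAGE = 3
    AVERAGE = 4
    MORE_THAN_AVERAGE = 5
    LARGE = 6
    VERY_LARGE = 7


@dataclass(frozen=True)
class Factor:
    name: str
    quantitative: bool
    # Allowed units of a quantitative factor; an empty tuple means "no units".
    units: Tuple[str, ...] = ()


@dataclass(frozen=True)
class FactorValue:
    factor: Factor
    scale: Optional[QualitativeValue] = None  # used by qualitative factors
    number: Optional[float] = None            # used by quantitative factors
    unit: Optional[str] = None

    def __post_init__(self):
        if self.factor.quantitative:
            if self.number is None:
                raise ValueError("a quantitative factor needs a number")
            if self.unit is not None and self.unit not in self.factor.units:
                raise ValueError(f"unexpected unit: {self.unit}")
        elif self.scale is None:
            raise ValueError("a qualitative factor needs a scale value")


salary = Factor("average salary", quantitative=True, units=("ruble", "$"))
unemployed = Factor("number of unemployed", quantitative=True)
tension = Factor("social tension", quantitative=False)

# Illustrative values only (invented numbers, not results from the system).
values = [
    FactorValue(salary, number=15000, unit="ruble"),
    FactorValue(unemployed, number=120000),
    FactorValue(tension, scale=QualitativeValue.LARGE),
]
```

Using an ordered enumeration for the qualitative scale keeps comparisons such as "small is below average" cheap, while the unit list on each factor mirrors the idea that units are tied to the factor rather than to the text.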
The Patterns
Qualitative factors -> “factor + numerical value” patterns,
e.g. Social tension <-- spontaneous meeting (large)
Quantitative factors -> “only factor” patterns,
e.g. The number of unemployed <-- become unemployed
Search algorithm:
1) find a pattern
2) find a number + unit
3) if not found, find words such as large, small, increase, decrease, etc.
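The three search steps can be sketched as below. The function name `find_value`, the cue-word list and the tokenizer are assumptions made for illustration. The sketch matches lowercased tokens exactly, whereas the system described here works on lemmatized, semantically marked-up text, and it only recognizes a unit that directly follows the number.

```python
import re
from typing import Optional, Sequence, Tuple

# Hypothetical cue words for step 3; the slide names only these four plus "etc."
CUE_WORDS = ("large", "small", "increase", "decrease")

# Numbers, alphabetic words, or a dollar sign.
TOKEN = re.compile(r"[0-9]+(?:\.[0-9]+)?|[^\W\d]+|\$")

Result = Tuple[str, Optional[str], Optional[str]]  # (kind, value, unit)


def find_value(text: str, pattern: Sequence[str],
               units: Sequence[str] = ()) -> Optional[Result]:
    tokens = TOKEN.findall(text.lower())

    # Step 1: find the pattern (here: every pattern word occurs in the text).
    if not all(word in tokens for word in pattern):
        return None

    # Step 2: find a number, followed by a unit if the factor has units.
    for i, tok in enumerate(tokens):
        if tok[0] in "0123456789":
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if not units:
                return ("number", tok, None)
            if nxt in units:
                return ("number", tok, nxt)

    # Step 3: otherwise fall back to cue words such as "large" or "increase".
    for tok in tokens:
        if tok in CUE_WORDS:
            return ("cue", tok, None)

    # Pattern found but no value; the slide does not say what happens here.
    return ("unknown", None, None)


pattern = ["become", "unemployed"]
with_number = find_value("3500 people become unemployed this month", pattern)
with_cue = find_value("Many workers become unemployed as layoffs increase", pattern)
no_pattern = find_value("The weather is fine", pattern)
```

Under these assumptions `with_number` carries the number found in step 2, `with_cue` falls through to step 3, and `no_pattern` is `None` because step 1 fails.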
Pattern Formation Process
A pattern is a set of words and ontology concepts.
The ontology provides:
- pattern generalization
- synonym accumulation
- information about units
Pattern formation: the user marks a relevant fragment in a text or chooses a concept from the ontology.
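A pattern that mixes plain words with ontology concepts might look like the sketch below. The toy `Ontology` class, the concept names (`protest`, `meeting`, `strike`) and their synonyms are all hypothetical; they only illustrate how generalization and synonym accumulation could let one pattern cover several wordings. Only single-word synonyms are handled.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Optional, Set


@dataclass
class Concept:
    name: str
    synonyms: Set[str] = field(default_factory=set)
    parent: Optional[str] = None


class Ontology:
    """A toy concept hierarchy with synonyms (not the FACTORS ontology)."""

    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}

    def add(self, name: str, synonyms: Iterable[str] = (),
            parent: Optional[str] = None) -> None:
        self.concepts[name] = Concept(name, set(synonyms) | {name}, parent)

    def is_a(self, name: Optional[str], ancestor: str) -> bool:
        # Walk up the parent chain until the ancestor or the root is reached.
        while name is not None:
            if name == ancestor:
                return True
            concept = self.concepts.get(name)
            name = concept.parent if concept else None
        return False

    def words_for(self, concept: str) -> Set[str]:
        """All words expressing the concept or any of its descendants."""
        words: Set[str] = set()
        for c in self.concepts.values():
            if self.is_a(c.name, concept):
                words |= c.synonyms
        return words


@dataclass(frozen=True)
class Pattern:
    factor: str
    words: FrozenSet[str] = frozenset()
    concepts: FrozenSet[str] = frozenset()

    def matches(self, tokens: Iterable[str], ontology: Ontology) -> bool:
        tokens = set(tokens)
        # Every literal word must be present ...
        if not self.words <= tokens:
            return False
        # ... and every concept must be expressed by at least one token.
        return all(ontology.words_for(c) & tokens for c in self.concepts)


onto = Ontology()
onto.add("protest")
onto.add("meeting", synonyms=["rally"], parent="protest")
onto.add("strike", parent="protest")

pattern = Pattern("social tension",
                  words=frozenset({"spontaneous"}),
                  concepts=frozenset({"protest"}))

matched = pattern.matches("a spontaneous rally took place".split(), onto)
```

Choosing the broader concept `protest` instead of the word "meeting" is what generalizes the pattern: the same pattern then also covers "rally" and "strike" without being edited.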
Example
As is known, the European Union strictly demanded that Latvia close both generating units of the Ignalinskaya nuclear power station. It also promised to remit 3 billion euros for this goal.
Factors:
- The EU pressure on Latvia.
- The financial aid of the EU to Latvia.