Knowledge Diversity in Media Content Analysis An Analytico ...
Devika P. Madalli and A.R.D. Prasad
DRTC, Indian Statistical Institute Bangalore, INDIA
Knowledge Diversity in Media
Content Analysis – An
Analytico-synthetic Approach
UDCC Classification and Ontologies Seminar, The Hague, Sept 2011
Premise
• Knowledge is increasingly characterized by diversity that
results in complexities
• Socio-cultural background and purposes contribute to
knowledge diversity.
• Analytico-synthetic approach for topical diversity
• Facet analysis (FA) and synthesis for use in annotations for
Media Content Analysis within the EC funded 'Living
Knowledge' Project.
Introduction
As an asset, diversity of knowledge could lead to
projections of semantic correlations, bias and
opinion mining.
Background – LivingKnowledge Project
• ‘Living Knowledge’ (LK) [EU FET project no. 231126]
considers diversity as an asset and aims to make it
traceable, understandable and exploitable, with the goal of
improving navigation and searching in very large datasets
(Maltese et al., 2009).
Aims of the project
• Study the effects of diversity and time on opinions and
bias in topics of socio-economic relevance, especially for
seamless representation and exchange of information.
• Intuitive search and navigation tools (e.g. search
engines) need to produce more insightful, better
organized, aggregated and easier-to-understand output.
Scope of LivingKnowledge Project
•LK employs interdisciplinary competence from philosophy
of science, cognitive science, library science and semiotics.
•The proposed solution is based on the foundational notions
of context and its ability to localize meaning, and the notion
of facet, as from library science, and its ability to organize
knowledge as a set of interoperable components (i.e. facets).
• Diversity of knowledge can be attributed to socio-cultural
interactions and dynamics that cast different aspects of
the same or similar concepts in the same or similar domains.
Facet Analysis and Faceted Classification
•‘Facet’ has been used in different connotations.
• Facet is used synonymously with category, attribute, class,
group, concept, etc. (La Barre, 2006).
Facet Analysis and Faceted Classification (Contd.)
•Facet is a distinct division of domain
•Facet is the set of classes obtained by applying
characteristics successively
•Many facets together make a domain (different
domains have different combinations of facets)
Facet Analysis
• FA is an intellectual process leading to analysis of
a subject into its facets according to postulates and
principles (S.R. Ranganathan)
• It results in sorting of terms in a given domain
into homogeneous, mutually exclusive facets, each
derived from the parent universe (CIU) by a single
characteristic of division at each level in a
hierarchy.
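The sorting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the terms are borrowed from the Political Science example later in these slides, while the `KIND_OF_CONCEPT` mapping and both function names are hypothetical and not part of the authors' method.

```python
from collections import defaultdict


def divide_by_characteristic(terms, characteristic):
    """Split terms into classes using a single characteristic of division.

    `characteristic` maps each term to exactly one class label, so every
    term lands in one and only one class.
    """
    classes = defaultdict(list)
    for term in terms:
        classes[characteristic(term)].append(term)
    return dict(classes)


def is_mutually_exclusive(classes):
    """Return True if no term occurs in more than one class."""
    seen = set()
    for members in classes.values():
        for term in members:
            if term in seen:
                return False
            seen.add(term)
    return True


# Hypothetical characteristic: the kind of concept each term names.
KIND_OF_CONCEPT = {
    "Nationalism": "Entity",
    "Socialism": "Entity",
    "Election": "Action",
    "Nomination": "Action",
    "Europe": "Space",
    "Asia": "Space",
}

# Iterating over the dict yields its keys, i.e. the terms to be sorted.
facets = divide_by_characteristic(KIND_OF_CONCEPT, KIND_OF_CONCEPT.get)
print(facets)
print(is_mutually_exclusive(facets))
```

Because the characteristic assigns one label per term, the resulting classes are homogeneous with respect to that characteristic and cannot overlap.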
Media Content Analysis (MCA)
•SORA (Institute for Social Research and
Consulting Ogris & Hofinger GmbH) carries out
Analysis of Media Content.
•MCA is carried out in two phases:
•Phase 1: Facts are entered in a ‘codebook’ that
formalises the representation of the media resources
•Phase 2: Media analysis is based on annotations and
coding resources in order to answer some typical
questions.
Example of Facetization in MCA
• To reconstruct the mediated discourse we have to
describe the relevant actors, patterns of interpretation,
differences between actors, topics and different degrees of
diversity. Example: ‘European elections and integration.’
•The basic research questions are:
•What (sub-)topics occur to what extent in the mediated
discourse on ‘integration’?
•What actors in what roles are present in the mediated
discourse on ‘integration’?
•What patterns of interpretation occur in the mediated
discourse on ‘integration’?
Example of Facetization in MCA
•Integration refers to the two chosen sub-topics (labour
market and religion) and raises these questions:
•What are the main [topics, actors and countries, arguments,
frames] related to integration?
•Which of these [topics, actors and countries, arguments,
frames] are the most [controversial, accepted, subjective,
biased, etc.]?
•What are the main [politicians, parties, organisations etc.]
discussing integration in a [negative, positive, neutral] context?
•Which [politicians, parties, organisations etc.] have changed
their discourse on integration (i.e. from positive to negative)?
•What time periods are most important for integration, and
what other events are correlated to these periods?
•How developed is the discursive character of statements from
different [politicians, parties, organisations etc.]?
Analytico-synthetic Approach - Topical Annotations of
Media Content
•Library-based faceted knowledge organization systems build
representations by dividing domains into distinct facets
based on principles of shared unambiguous characteristics
between the member concepts (Bhattacharya, 1979).
•The same approach has been adopted for MCA, identifying
the actor indicators 'Who' - 'What' and 'To Whom' - 'What'.
•‘Content’ in media is analysed following the faceted
approach for topical representation. ‘Topic' is used to
interrelate the other indicators in a subject-based representation.
Figure: Faceted representation of codebook indicators
Topic
• Facets: [D]iscipline, [E]ntity, [P]roperty, [A]ction, [M]odifier
• Can be represented using Common Isolates: Space, Time, Persons, Form, … etc.
Indicators (Facets)
• Who: Actor, Roles, Affiliation, Scope, Form
• What: Topic, Claims, Frames, Opinion
• To Whom: Addressee, Object, Actor
• What Channel: Mass Media, Forum, Blog
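One way to picture a codebook entry built from these indicators is as a small record type. The sketch below is an assumption throughout: the field names follow the figure labels, the grouping of sub-facets under Who, What, To Whom and What Channel is inferred from the figure, and the sample values are invented for illustration and do not come from the SORA codebook.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Who:
    actor: str
    role: str = ""
    affiliation: str = ""


@dataclass
class What:
    topic: str
    claims: list = field(default_factory=list)
    frames: list = field(default_factory=list)
    opinion: str = ""


@dataclass
class Annotation:
    who: Who
    what: What
    to_whom: str  # the addressee of the statement
    channel: str  # e.g. "Mass Media", "Forum", "Blog"


# Hypothetical values, used only to show the shape of one entry.
annotation = Annotation(
    who=Who(actor="Party X", role="Speaker", affiliation="National parliament"),
    what=What(topic="Integration", claims=["Labour market access"], opinion="positive"),
    to_whom="Voters",
    channel="Blog",
)
print(asdict(annotation))
```

Keeping 'What' as its own structure leaves room for the topic to carry the D, E, P, A and M facets separately from the actor-related indicators.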
Multidimensional Faceted Representation for Diversity
• A faceted representation for Political Science,
built with the POPSI technique, is shown.
Topic
Nationalism as an issue in forthcoming European Elections 2012
[M]odifier
  Space Modifiers: Asia, Australia, Europe
  Time Modifiers: 20th Century, 21st Century, 2012
[D]iscipline: Economics, Law, Political Science, Sociology
[E]ntity: Socialism, Communism, Fascism, Nationalism
[A]ction: Nomination, Administration, Representation, Election
Examples
• Example: “Nationalism as an issue in forthcoming European Elections
2012”.
Analyzing into facets: the focus is on European Elections,
Nationalism, location and time.
Concept Identification: Political Science [Discipline], Nationalism
[Entity], Election [Action], Forthcoming [Modifier for Action],
European [Space], 2012 [Time]
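The concept identification step can be approximated by looking up schedule terms in the topic string. The sketch below assumes the facet schedule from the Political Science slide and uses naive case-insensitive substring matching, which is a simplification chosen for brevity rather than the project's actual procedure; the function name `identify_concepts` is hypothetical.

```python
# Facet schedule copied from the Political Science example on the slides.
SCHEDULE = {
    "Discipline": ["Economics", "Law", "Political Science", "Sociology"],
    "Entity": ["Socialism", "Communism", "Fascism", "Nationalism"],
    "Action": ["Nomination", "Administration", "Representation", "Election"],
    "Space": ["Asia", "Australia", "Europe"],
    "Time": ["20th Century", "21st Century", "2012"],
}


def identify_concepts(topic, schedule=SCHEDULE):
    """Return {facet: [terms]} for the schedule terms found in the topic.

    Matching is a plain case-insensitive substring test, so "Election"
    matches "Elections" and "Europe" matches "European".
    """
    text = topic.lower()
    found = {}
    for facet, terms in schedule.items():
        hits = [term for term in terms if term.lower() in text]
        if hits:
            found[facet] = hits
    return found


topic = "Nationalism as an issue in forthcoming European Elections 2012"
print(identify_concepts(topic))
```

The discipline is not found by this lookup because it is not named in the topic string; in the slide's example it is supplied by the domain being indexed, and the modifier 'Forthcoming' is likewise added by the analyst.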
Coordination
• Subject strings from Political Science make up the
alphabetical subject index file:
•Political Science [Discipline], Election [Action],
Free [Modifier for Action], Egypt [Space]
•Political Science [Discipline], Election [Action],
Europe, Georgia [Space]
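A minimal sketch of how such strings could be assembled and filed is given below. The helper `subject_string` and its parameters are hypothetical, and a plain lexical sort stands in for whatever filing rules the actual index follows.

```python
def subject_string(discipline, action, modifier=None, space=None):
    """Render facets as 'Term [Facet], ...' in the order used on the slide."""
    parts = [f"{discipline} [Discipline]", f"{action} [Action]"]
    if modifier:
        parts.append(f"{modifier} [Modifier for Action]")
    if space:
        parts.append(f"{', '.join(space)} [Space]")
    return ", ".join(parts)


entries = [
    subject_string("Political Science", "Election", modifier="Free", space=["Egypt"]),
    subject_string("Political Science", "Election", space=["Europe", "Georgia"]),
]

# Here the alphabetical file is simply the sorted list of strings.
index = sorted(entries)
for line in index:
    print(line)
```

Since every string starts with the discipline and then the action, related entries file next to each other and differ only in their modifiers.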
Application in MCA
• In manual annotation, annotators pick a few key terms to assign to the resources being annotated.
• A faceted annotator is equipped with subject strings that provide context for the content of a media resource.
• The advantages are:
• The annotator can pick the relevant context from a list of suggestions in the form of fully qualified, modulated strings of descriptors.
• Such subject indexing is based on principles under which context is intrinsically provided.
• For example, if the query is ‘Elections in Egypt’, the retrieved set contains the strings with ‘Elections’, and because Egypt is a space element, the concept is disambiguated using ‘Egypt’ as context.
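The 'Elections in Egypt' case can be sketched as a lookup over facet-labelled strings. The two strings are the ones from the Coordination slide; the parsing pattern and the function names `parse` and `search` are assumptions made for this illustration, not the project's retrieval code.

```python
import re

STRINGS = [
    "Political Science [Discipline], Election [Action], "
    "Free [Modifier for Action], Egypt [Space]",
    "Political Science [Discipline], Election [Action], Europe, Georgia [Space]",
]

# One or more comma-separated terms followed by their facet label in brackets.
FACET_RE = re.compile(r"\s*(.+?) \[([^\]]+)\](?:,|$)")


def parse(string):
    """Parse 'Term [Facet], ...' into {facet: [terms]}."""
    return {
        facet: [term.strip() for term in terms.split(",")]
        for terms, facet in FACET_RE.findall(string)
    }


def search(action, space, strings=STRINGS):
    """Return the strings whose Action and Space facets match the query."""
    hits = []
    for string in strings:
        facets = parse(string)
        if action in facets.get("Action", []) and space in facets.get("Space", []):
            hits.append(string)
    return hits


print(search("Election", "Egypt"))
```

Both strings mention 'Election', but only the one carrying 'Egypt' in its Space facet is returned, which is the disambiguation by context that the slide describes.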
EU Corpus and FA
• For the purpose of the LK project, web resources
in the domain 'EU Elections' serve as the corpus,
maintained by EA.
• Media content is analysed into facets and subject
strings are derived in a principled approach.
Example
• POPSI categories and entries are made into a classaurus.
• Annotations are presented for codebook application.
Advantages – Domain-specific Annotations
• Domain-specific annotations can be deployed to
provide subject or topical approach to content.
• The context is driven by topical concepts at broader or
narrower levels to a given term or by space or temporal
modifiers.
• In the LK project, facetization enables the MCA to retrieve
media discourse by certain concepts that are represented
by facets and by intrinsic relations between such facets.
• In the SORA model keywords were assigned by
annotators, but in the faceted model a faceted, domain-based
ontology, as shown, is deployed to provide the semantic
context.
09/11/2001
Today, our fellow citizens, our way of life, our
very freedom came under attack in a series of
deliberate and deadly terrorist acts. Immediately
following the first attack, I implemented our
government's emergency response plans. Our
military is powerful, and it's prepared. Our
emergency teams are working in New York City
and Washington, D.C., to help with local rescue
efforts – 9/11/2001
09/11/2011
"These past 10 years underscore the bonds
between all Americans. We have not
succumbed to suspicion and mistrust. After
9/11, President (George) Bush made clear
what we reaffirm today: the United States
will never wage war against Islam or any
religion. Immigrants come here from all
parts of the globe,"
The progeny of Muslim asylum seekers mar a
moment of silence for 9/11 victims by burning
an American flag at a London memorial
The time and topical change/evolution
09/11/2001
Invasion of Iraq- 2003
Afghanistan War - 2002
Mumbai Attack- 2008
Operation New Dawn - 2010
Ground Zero- 2011
The faceted approach
• The facets are:
Topic: 09/11/2001
Time: 2001, 2011
Space: U.S, India, Afghanistan, Iraq ….
•The diversity in context can be expressed by the combinations of the
above facets for particular instances
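Combining the facet values listed above can be sketched with a Cartesian product. The values are taken from the slide, where the Space list is open-ended, so the four places used here are only the ones it names. Not every generated combination need correspond to an actual media instance.

```python
from itertools import product

facets = {
    "Topic": ["09/11/2001"],
    "Time": ["2001", "2011"],
    "Space": ["U.S", "India", "Afghanistan", "Iraq"],
}

# Each context takes one value from every facet.
contexts = [dict(zip(facets, combo)) for combo in product(*facets.values())]

print(len(contexts))
for context in contexts[:3]:
    print(context)
```

Filtering this list down to the combinations that actually occur in the corpus would give the distinct contexts in which the topic is discussed.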
Conclusion
• The analytico-synthetic approach facilitates
identifying the distinct facets of a domain and synthesizing
the facets to represent distinct contexts.
• The method helps to deal with diversity, as background
knowledge is modelled in a principled approach that leads
to a flexible (meccano) arrangement of divisions of
domains that can be combined and reused as per context.
THANK YOU!