August 29, 2001 Melanie Martin - AI Seminar 1
AI Seminar
Our web page is at:
www.cs.nmsu.edu/~gradrep
Under “Events” in left frame
Identifying Ideological Point of View
Melanie Martin
August 29, 2001
Outline of this presentation
What is AI???
Introduction and Motivation
The proposed system
Ideology
Discourse
Statistical NLP and Machine Learning
Internet
Conclusion
What is AI???
“The practice of designing systems that possess and acquire knowledge and reason with knowledge.” (Tanimoto 1987)
“The design and study of computer programs that behave intelligently.” (Dean, Allen, Aloimonos 1995)
“The branch of computer science concerned with making computers behave like humans.” (Webopedia)
What is AI???
But then, what is intelligence???
– “the capacity for learning, reasoning, understanding, and similar forms of mental activity; aptitude in grasping truths, relationships, facts, meanings, etc.” (Webster’s Encyclopedic Unabridged Dictionary of the English Language 1996)
What is AI???
Agents
Data Mining
Expert Systems
Games and Search
Knowledge Representation
Machine Learning
– Theory, Case-Based, Rule Learning, ...
Natural Language Processing
Planning
Robotics
Speech
Theorem Proving
Vision & Pattern Recognition
Categories under AI on Cora
http://cora.whizbang.com/
What is AI???
Goals in AI
– Engineering: Solve real-world problems. Build systems that exhibit intelligent behavior.
– Scientific: Understand what kind of computational mechanisms and knowledge are needed for modeling intelligent behavior.
What is AI???
Do we really want to model humans?
– Seem like our best example, but….
– Should we build airplanes with wings that flap like birds?
How do we know we did it?
– Turing test?
• Focus on behavior instead of internal algorithm
• Defines success in terms of human intelligence
• Not well founded
What is AI???
A couple of recurring issues:
– How important is cognitive modeling in our systems?
– How do we balance scientific and engineering goals?
– How do we evaluate our system?
What is AI???
So let’s get to the system we want to talk about today…..
This system will be in the area of Natural Language Processing aka Computational Linguistics
Introduction and Motivation
Your back hurts, so you go to the web to find out what you can do, but there is too much information!
You are still bothered by the Florida election results and want to read a few sample articles with differing points of view. How can you find them?
Introduction and Motivation
Suppose we could take information from web pages and Usenet newsgroups on a given topic and segment, classify or cluster it by ideological point of view…..
This talk is about what it might take to develop such a system.
Introduction and Motivation
Sounds like a cool toy, but would it make any research contribution?
Areas where it could contribute:
– natural language understanding
– information retrieval
– information extraction
– internet structure
Introduction and Motivation
But will it save the world?
Maybe not, but there is social value in analyzing ideological point of view
– find implicit ideological content
– better informed, more rational discussion of important issues
The Proposed System
Let’s recall what we want to do:
Build a system that could take information from web pages and Usenet newsgroups on a given topic and segment, classify or cluster it by ideological point of view…..
The Proposed System
[Diagram: User inputs topic → Search Engine → Internet (Web pages, Usenet) → Topic Classifier, Filter → Set of documents on topic → Ideological Classifier → Docs on topic classified by IPV (ideological point of view)]
The Proposed System
Immediately some issues arise:
– Can we come up with a definition of ideological point of view that is computationally feasible?
– To what extent do we need to understand the text?
– Would modeling human text understanding help?
The Proposed System
More issues:
– Can the structure of the internet help us?
– What kind of knowledge is needed and can it be learned?
– How are we going to evaluate our system?
Ideology
Working definition from van Dijk: “Ideologies are the fundamental beliefs of a group and its members.”
– No negative evaluation
– Subjective, since beliefs are subjective
– Discourse plays a key role in development and promulgation of ideologies
Ideology
What do we mean by groups?
– More than one person
– Fewer than the entire society or culture
– Some level of permanency or common goals
– Some membership criteria
– Member identification with the group
– Basis for self-definition and commonality
– Structure, possibly informal
Ideology
General strategy of most ideological discourse (van Dijk’s Ideological Square):
– Emphasize positive things about Us
– Emphasize negative things about Them
– De-emphasize negative things about Us
– De-emphasize positive things about Them
Polarization; Us versus Them
Ideology
How are these strategies instantiated in discourse?
– What is there:
• argument structure
• syntactic patterns
• style and non-literal language
• actor descriptions
• thematic structure
• topoi
Ideology
– What is not there:
• implication
• presupposition
• inference
• goals and plans
Ideology
Disclaimers, selected examples:
– Apparent Negation: I have nothing against X, but...
– Apparent Concession: They may be very smart, but...
– Apparent Empathy: They may have had problems, but...
– Apparent Effort: We do everything we can, but...
Positive self-representation and face keeping
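One way the disclaimer types above might be turned into classifier features is simple pattern matching. The sketch below is only an illustration: the regular expressions and the names `DISCLAIMER_PATTERNS` and `disclaimer_features` are assumptions made for this example, not van Dijk's formulation and not part of the proposed system.

```python
import re

# Hypothetical patterns, one per disclaimer type listed on the slide.
# The exact wording of each regex is an illustrative assumption.
DISCLAIMER_PATTERNS = {
    "apparent_negation": re.compile(r"\bI have nothing against\b.*?\bbut\b", re.I),
    "apparent_concession": re.compile(r"\bthey may be (?:very )?\w+\b.*?\bbut\b", re.I),
    "apparent_empathy": re.compile(r"\bthey may have had\b.*?\bbut\b", re.I),
    "apparent_effort": re.compile(r"\bwe do everything we can\b.*?\bbut\b", re.I),
}


def disclaimer_features(sentence):
    """Map each disclaimer type to 1 if its pattern matches the sentence, else 0."""
    return {
        name: int(bool(pattern.search(sentence)))
        for name, pattern in DISCLAIMER_PATTERNS.items()
    }


features = disclaimer_features("They may be very smart, but they are wrong.")
```

Surface patterns like these would miss paraphrases, so they would at best be one weak cue among the features discussed in this section.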
Ideology
Linguistics
– van Dijk (1998)
– Blommaert & Verschueren (1998)
– Wang (1993)
– Wortham & Locher (1996)
Ideology
The Systems
– Ideology Machine - 1965 to 1973 - Abelson et al.
– Tale-Spin - 1976 - Meehan
– Politics - 1979 - Carbonell
– Pauline - 1987 - Hovy
– Viewgen - 1991 - Ballim & Wilks
– Tracking Point of View in Narrative - 1994 - Wiebe
– Spin Doctor - 1994 - Sack
– Terminal Time - 2000 - Mateas et al.
Ideology
Some issues
– Evaluation!!!
– Hard-coded knowledge
– Domain dependence
– Cognitive plausibility
– More precise definitions
Ideology
What do we want to take with us?
– van Dijk’s definitions augmented by Sack and Wiebe
– mine everything for clues to ideological point of view
Discourse
Now that we have a working definition of ideology and some ideas about things that might be clues, the question becomes: how do we find them?
First we are going to look at theories of discourse structure that might be useful.
Discourse
Computational Linguistics
– Hobbs (1979)
– Mann & Thompson (RST) (1988)
– Grosz & Sidner (G&S) (1986)
– Morris & Hirst (Lexical chains) (1991)
Psycholinguistics
– Kintsch (1994)
Discourse
Issues
– do we need it at all?
– implementation
• Hobbs, G&S, RST
– finite number of fixed primitives
• Hobbs, RST
– world knowledge
• Hobbs
– domain specific
Discourse
A reasonable first approach: Lexical Chains (Morris & Hirst)
Sequences of related words spanning a topical unit in the text
– based on lexical cohesion
– encapsulates context
– helps identify key phrases
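To make the idea of a lexical chain concrete, here is a minimal sketch of greedy chain building. It is a simplification, not Morris & Hirst's procedure: they relied on a real thesaurus, whereas the `CATEGORY` table, the `related` test, and the `max_gap` distance limit below are toy assumptions introduced for this example.

```python
# Toy stand-in for a thesaurus: each word maps to a hypothetical category.
# This table is an assumption for illustration only.
CATEGORY = {
    "election": "voting", "ballot": "voting", "recount": "voting", "vote": "voting",
    "court": "law", "judge": "law", "ruling": "law",
}


def related(a, b):
    """Two words count as related if they are identical or share a category."""
    if a == b:
        return True
    ca, cb = CATEGORY.get(a), CATEGORY.get(b)
    return ca is not None and ca == cb


def lexical_chains(words, max_gap=5):
    """Greedily group (position, word) pairs into chains of related words.

    A word joins the first chain whose last member is related to it and lies
    at most max_gap positions back; otherwise it starts a new chain.
    """
    chains = []
    for pos, word in enumerate(words):
        if word not in CATEGORY:
            continue  # skip words the toy thesaurus does not cover
        for chain in chains:
            last_pos, last_word = chain[-1]
            if pos - last_pos <= max_gap and related(word, last_word):
                chain.append((pos, word))
                break
        else:
            chains.append([(pos, word)])
    return chains


text = ("the court halted the recount after the ballot dispute "
        "and the judge issued a ruling on the vote")
chains = lexical_chains(text.split(), max_gap=20)
```

With a generous `max_gap` the sample sentence yields one "law" chain and one "voting" chain. A small `max_gap` splits them, which hints at how chain boundaries could feed topic segmentation.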
Discourse
Lexical chains could help us in:
– topic segmentation
– intentional structure
– lexical features for a classifier
Discourse
Lexical chains are easy to implement, but are unlikely to be sufficient…
For the next approximation: RST
– Marcu’s implementation incorporating G&S
– Mostly used for summarization and generation
– Would help get at the argument structure of the text
Discourse
Would most likely use RST to generate features for a classifier or as input to a pattern recognizer
Nuclei spans help pick out the more important segments of text
Produces a tree that gives the rhetorical structure of the text
Discourse
None of the discourse theories look like they are going to stand alone
– may be able to give us structural, lexical and other features
– need to consider classification or clustering based on these features
– so we turn to….
Statistical NLP and ML
Two techniques we will consider
– Latent Semantic Analysis
– Probabilistic Classification
Statistical NLP and ML
Issues
– clustering versus classification
• categories may not be predefined
• may want to take a variety of features into account
– favor learning over hard-coding knowledge
– supervised versus unsupervised
• cost of annotated training data
Statistical NLP and ML
Latent Semantic Analysis
– text represented as a matrix
• entries are weighted frequency of word in context
– semantic space obtained through SVD
• words appearing in similar context have similar feature vectors
– characterizes semantic content of words in context
Statistical NLP and ML
Why LSA is a good choice here
– semantics is a key component of ideological discourse
– clustering without need for predefined categories
– already shown useful for:
• summarization (Ando 2000)
• text segmentation (Choi 2001)
• measuring text coherence (Foltz 1998)
Statistical NLP and ML
But LSA doesn’t use all of the stuff we just spent all this time talking about…
What if it doesn’t work very well?
Another option is a probabilistic classifier
– assigns most probable class to an object based on a probability model
Statistical NLP and ML
Probability model
– defines joint distribution of variables
• set of feature variables and a class variable
Wiebe and Bruce (1995) got around the issue of not knowing the classes in advance by breaking up the problem and using a series of classifiers
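As a concrete instance of a classifier that assigns the most probable class under a probability model, the sketch below uses naive Bayes with add-one smoothing. This model is chosen here only because it is the simplest joint distribution over features and a class variable; it is not the model Wiebe and Bruce used, and the class labels `group_1` and `group_2` and the training features are invented for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Collect counts from (feature_list, class_label) training pairs."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, label in examples:
        class_counts[label] += 1
        for f in features:
            feature_counts[label][f] += 1
            vocabulary.add(f)
    return class_counts, feature_counts, vocabulary


def classify(features, model):
    """Return the class maximizing log P(class) + sum of log P(feature | class)."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        # Add-one smoothing keeps unseen feature/class pairs from zeroing out.
        denom = sum(feature_counts[label].values()) + len(vocabulary)
        for f in features:
            if f in vocabulary:  # ignore features never seen in training
                score += math.log((feature_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: features are words, labels are invented groups.
training = [
    (["recount", "fair", "rights"], "group_1"),
    (["fair", "count", "rights"], "group_1"),
    (["recount", "fraud", "stolen"], "group_2"),
    (["stolen", "fraud", "court"], "group_2"),
]
model = train_naive_bayes(training)
predicted = classify(["fair", "rights"], model)
```

The features need not be words. Discourse or lexical-chain features of the kind discussed earlier could be passed in the same way.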
Statistical NLP and ML
Maybe this will work after all and we can use some of the features we have been talking about
Which features to use can be determined statistically, using the goodness of fit of graphical models
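One common goodness-of-fit measure is the likelihood-ratio statistic G² = 2 Σ O ln(O / E). The sketch below computes it for a single feature against the class variable, with expected counts taken from the independence model. Treating each feature separately like this is a simplifying assumption for illustration; fitting full graphical models over many features is more involved, and the function name `g_squared` and the counts are hypothetical.

```python
import math


def g_squared(table):
    """Likelihood-ratio statistic for a contingency table (list of rows).

    Expected counts come from the independence model, so a large value
    means independence fits poorly and the feature carries information
    about the class.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / n
                g2 += 2.0 * observed * math.log(observed / expected)
    return g2


# Hypothetical counts. Rows: feature present / absent. Columns: two classes.
informative = g_squared([[30, 10], [10, 30]])
uninformative = g_squared([[10, 10], [10, 10]])
```

For a 2x2 table the statistic is compared against a chi-square distribution with one degree of freedom, where 3.84 is the usual 0.05 critical value.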
Statistical NLP and ML
Both methods seem to have a lot of potential
LSA would be easier to implement
– possibly a baseline for evaluation of probabilistic classifiers
Less linguistic knowledge gain likely with LSA
Internet
We would like to mine the structure of the internet
– see if there is a correspondence with groups
– improved IR by topic
– figure out what search engine to use as a base for our system
Internet
Structure papers
– Kleinberg (1997)
– Kleinberg et al. (1999)
– Terveen et al. (1999)
– Whittaker et al. (1998)
Internet
Issues
– topic or query disambiguation
– what is a minimal unit
– how to use the structure of the web
• finding authorities
• communities and subgraphs
– Evaluation!!!
Internet
Kleinberg (1997)
– link based model
– hub - links to many related authorities
– authority
– iterative weighting algorithm that converges (rapidly in practice)
– can disambiguate authorities by sense
– can be used to trawl for cyber communities
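The iterative hub and authority weighting can be sketched in a few lines. This is a minimal illustration of the mutual-reinforcement idea on an invented link graph, not a reproduction of Kleinberg's full method (which first builds a query-focused subgraph). The function name `hits`, the page names, and the fixed iteration count are assumptions.

```python
import math


def hits(links, iterations=50):
    """Iteratively compute hub and authority scores.

    links maps each page to the list of pages it links to. Both score
    dicts are rescaled to unit length after every round.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        # Hub update: sum the authority scores of pages linked to.
        hub = {p: sum(auth[dst] for dst in links.get(p, [])) for p in pages}
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical link graph: three hub-like pages pointing at two targets.
graph = {
    "h1": ["a1", "a2"],
    "h2": ["a1", "a2"],
    "h3": ["a1"],
}
hub_scores, auth_scores = hits(graph)
top_authority = max(auth_scores, key=auth_scores.get)
```

In this graph the page with the most in-links from good hubs gets the highest authority score, and a page pointing at fewer authorities gets a lower hub score.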
Conclusion
It seems that such a system can be built
– find a good search engine
– use Kleinberg’s algorithm to improve collection of documents retrieved
– use LSA and/or a probabilistic classifier to handle the ideological point of view
– with a probabilistic classifier use features discussed in the ideology and discourse sections
The End
Thanks for listening!
If you want to know more, my Comprehensive Exam paper is at:
www.CS.NMSU.Edu/~mmartin/courses/comps_all.html