Post on 24-Jan-2016
I501 – Introduction to Informatics
jbollen@indiana.edu http://informatics.indiana.edu/jbollen/I501
Informatics and computing
Lecture 8 – Fall 2010
From swarming to collaborative filtering.
http://www.csml.ucl.ac.uk/images/Netflix_Prize.jpg
Informatics:
a possible parsing
Computer Science
STOP! ;-)
Let’s Observe Nature!
• What do you see? Plants typically branch out.
• How can we model that?
  - Observe the distinct parts
  - Color them
  - Assign symbols
• Build a model:
  - Initial state: b
  - b -> a
  - a -> ab
• Doesn’t quite work!
[Figure: Psilophyta/Psilotum plant with branch segments labeled a and b]
Complex systems approach: looking at nature
• A complex system is any system featuring a large number of interacting components (agents, processes, etc.) whose aggregate activity is nonlinear, i.e. not derivable from the summation of the activity of individual components
• Network identity: components form aggregate structures or functions that require more explanatory devices than those used to explain the components
  - Genetic networks, immune networks, neural networks, social insect colonies, social networks, distributed knowledge systems, ecological networks
• Bottom-up methodology
  - Collections of simple units interacting to form a more complex whole
  - Study of simple rules that produce complex behavior
  - Discovery of global patterns of behavior
What about our plant?
• An accurate model requires:
  - Varying angles
  - Varying stem lengths
  - Randomness
• The Fibonacci model is similar, e.g. sneezewort
[Figure: Psilophyta/Psilotum plant with branch segments labeled a and b]
Fibonacci Numbers!
Rewriting production rules
• Initial state: A
• A -> B
• B -> AB

n=0 : A
n=1 : B
n=2 : AB
n=3 : BAB
n=4 : ABBAB
n=5 : BABABBAB
n=6 : ABBABBABABBAB
n=7 : BABABBABABBABBABABBAB

The length of the string follows the Fibonacci sequence: 1 1 2 3 5 8 13 21 34 55 89 ...
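The rewriting rules can be tried out directly. What follows is a minimal sketch, not part of the original slides: the function names `rewrite` and `generations` are made up for illustration, and the only inputs taken from the lecture are the initial state A and the rules A -> B, B -> AB. All symbols are rewritten in parallel at each step.

```python
def rewrite(state: str, rules: dict[str, str]) -> str:
    """Apply the production rules once, to every symbol in parallel."""
    return "".join(rules.get(symbol, symbol) for symbol in state)


def generations(axiom: str, rules: dict[str, str], count: int) -> list[str]:
    """Return the strings for n = 0 .. count - 1."""
    strings = [axiom]
    for _ in range(count - 1):
        strings.append(rewrite(strings[-1], rules))
    return strings


RULES = {"A": "B", "B": "AB"}  # production rules from the slide

if __name__ == "__main__":
    for n, s in enumerate(generations("A", RULES, 8)):
        print(f"n={n} : {s} (length {len(s)})")
```

The string lengths grow as 1, 1, 2, 3, 5, 8, 13, 21 because each string is the previous one rewritten, and every B contributes two symbols while every A contributes one.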
Fibonacci numbers in Nature
Livio (2003) The Golden Ratio: The Story of PHI, the World's Most Astonishing Number
Another example: flocking in nature
• Flocking occurs when large groups of animals of the same species form aggregates that behave like a coherent, single entity
  - Herds, flocks, schools, swarms, humans
• Properties:
  - Collective flight, migration, foraging, “drafting”
  - Coherence: the aggregate has its own distinguishable system behavior and form
  - Adaptive: the behavior of the aggregate responds and adapts to external events (predators)
  - Coordination: the behavior of individuals seems to be indicative of central control or symbolic/long-range communication, but isn’t
How to model flocking behavior?
• Describing properties of aggregate behavior will only go so far:
  - Study shapes of aggregate
  - Situations in which it occurs
  - Dynamics, features of behavior
  - Biologists fixing radios?
• Lessons from complex systems:
  - Complex systems behavior: not derivable from the summation of the activity of individual components
  - Network identity: components form aggregate structures or functions that require more explanatory devices than those used to explain the components ~ emergence
  - Bottom-up methodology: collections of simple units interacting to form a more complex whole; study of simple rules that produce complex behavior
Parrish (2002) – Self-organized fish schools
Models of flocking behavior
• Boids: Craig Reynolds, “Flocks, Herds and Schools”, SIGGRAPH 21(4), 1987
  - Visual model of bird flocks
  - Lack of centralized control
  - Lack of symbolic communication
• General approach: local computation, i.e. each individual maximizes:
  - Collision avoidance: steer away from impact
  - Speed matching: match speed of neighboring birds
  - Flock centering: steer towards perceived flock center
• Flock behavior emerges from interactions of large groups of such construed individuals
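The three local rules can be sketched in a few lines. This is an illustrative toy in two dimensions, not Reynolds' implementation: the `Boid` class, the neighborhood radius, the rule weights and the speed cap are all assumptions chosen only so that the simulation stays stable.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Boid:
    x: float
    y: float
    vx: float
    vy: float


def step(boids, radius=10.0, min_dist=2.0,
         w_avoid=0.05, w_match=0.05, w_center=0.01, max_speed=2.0):
    """Advance the flock one tick; each boid only looks at nearby boids."""
    updated = []
    for b in boids:
        near = [o for o in boids
                if o is not b and math.hypot(o.x - b.x, o.y - b.y) < radius]
        vx, vy = b.vx, b.vy
        if near:
            n = len(near)
            # Flock centering: steer towards the perceived (local) center.
            vx += w_center * (sum(o.x for o in near) / n - b.x)
            vy += w_center * (sum(o.y for o in near) / n - b.y)
            # Speed matching: nudge velocity towards the neighbors' average.
            vx += w_match * (sum(o.vx for o in near) / n - b.vx)
            vy += w_match * (sum(o.vy for o in near) / n - b.vy)
            # Collision avoidance: steer away from boids that are too close.
            for o in near:
                if math.hypot(o.x - b.x, o.y - b.y) < min_dist:
                    vx += w_avoid * (b.x - o.x)
                    vy += w_avoid * (b.y - o.y)
        speed = math.hypot(vx, vy)
        if speed > max_speed:  # cap the speed (assumed safeguard)
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        updated.append(Boid(b.x + vx, b.y + vy, vx, vy))
    return updated


def make_flock(count=30, size=20.0, seed=0):
    """Scatter boids at random positions with random initial velocities."""
    rng = random.Random(seed)
    return [Boid(rng.uniform(0, size), rng.uniform(0, size),
                 rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(count)]
```

Calling `step` repeatedly on `make_flock()` moves the boids using only information about their neighbors; there is no leader and no global plan anywhere in the code.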
Ant trails: emergent organization driven by communication
• Problem: optimize location and extraction of food source
  - Lack of centralized control
  - Lack of symbolic communication
• General modeling approach: local computation leads to higher-order emergent computation
  - Walk algorithm probabilistic, but biased by pheromone concentration
  - Ants leave pheromone trail when food is found
  - Pheromone evaporates with time
  - Find shortest path
• Note: ~ greedy algorithm: hill-climbing on trail strength leads to adaptive, collective behavior
• Approaches to address the traveling salesman problem: BIOS group, S. Kauffman (Santa Fe); see also M. Dorigo (2006), Ant Colony Optimization, IEEE Computational Intelligence Magazine, for an overview
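A toy version of the trail mechanism shows how the shortest path can win without any ant comparing paths. This sketch is not the Ant Colony Optimization algorithm from the cited overview: it assumes a fixed set of alternative paths to one food source, a deposit of 1/length per trip, and a hypothetical function name `simulate_trails`.

```python
import random


def simulate_trails(lengths, ants_per_step=100, steps=50,
                    evaporation=0.1, seed=0):
    """Return the final pheromone level on each path to the food source."""
    rng = random.Random(seed)
    pheromone = [1.0] * len(lengths)  # start without any preference
    for _ in range(steps):
        deposits = [0.0] * len(lengths)
        for _ in range(ants_per_step):
            # Probabilistic walk, biased by pheromone concentration.
            path = rng.choices(range(len(lengths)), weights=pheromone)[0]
            # Assumption: shorter trips lay more pheromone per unit of time.
            deposits[path] += 1.0 / lengths[path]
        # Pheromone evaporates, then the new deposits are added.
        pheromone = [(1.0 - evaporation) * p + d
                     for p, d in zip(pheromone, deposits)]
    return pheromone
```

With `simulate_trails([1.0, 2.0])` the first (shorter) path ends up with more pheromone: each ant only follows the local trail strength, and the positive feedback between choosing a path and reinforcing it does the rest.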
Abstracted: Stigmergy
• Stigma + ergon: sign + work
• Indirect communication between various agents through the environment, traces in the environment
  - Lack of centralized control
  - Environment provides substrate for communication, information storage
  - Constrains individual agents
• Emergence of complex, collective and goal-directed behavior
  - Observed in social insects: termites, ants, bees
  - Increasingly applied to social phenomena and technology/engineering
See:
• Heylighen F. (2007). Why is Open Access Development so Successful? Stigmergic organization and the economics of information, in: B. Lutterbeck, M. Baerwolff & R. A. Gehring (eds.), Open Source Jahrbuch 2007, Lehmanns Media, 2007, p. 165-180.
• http://www.mitpressjournals.org/doi/abs/10.1162/106454699568692
Probabilistic cleaning: ants
• Very simple rules for colony clean-up:
  - Pick dead ant: if a dead ant is found, pick it up (with probability inversely proportional to the quantity of dead ants in vicinity) and wander.
  - Drop dead ant: if dead ants are found, drop the ant (with probability proportional to the quantity of dead ants in vicinity) and wander.
Figure by Marco Dorigo in Real ants inspire ant algorithms
See also: J. L. Deneubourg, S. Goss, N. Franks, A. Sendova-Franks, C. Detrain, L. Chretien. “The Dynamics of Collective Sorting: Robot-Like Ants and Ant-Like Robots”. From Animals to Animats: Proc. of the 1st Int. Conf. on Simulation of Adaptive Behaviour. 356-363 (1990).
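The pick and drop rules can be written down as a small simulation. This is a sketch under stated assumptions, not the model from the cited paper: the world is a one-dimensional ring of cells, a single worker ant wanders at random, and the exact probability formulas in `pick_probability` and `drop_probability` are invented for illustration.

```python
import random


def neighbors(cells, pos, radius):
    """Count dead ants within `radius` cells of `pos`, excluding `pos`."""
    size = len(cells)
    return sum(cells[(pos + d) % size]
               for d in range(-radius, radius + 1) if d != 0)


def pick_probability(nearby):
    # Falls as the number of dead ants nearby rises (assumed form).
    return 1.0 / (1.0 + nearby)


def drop_probability(nearby, radius):
    # Proportional to the number of dead ants nearby (assumed form).
    return nearby / (2.0 * radius)


def clean_up(cells, steps=20000, radius=2, seed=0):
    """One worker wanders over a ring of cells (1 = dead ant, 0 = empty)."""
    rng = random.Random(seed)
    cells = list(cells)
    pos, carrying = 0, False
    for _ in range(steps):
        nearby = neighbors(cells, pos, radius)
        if not carrying and cells[pos] == 1:
            if rng.random() < pick_probability(nearby):
                cells[pos], carrying = 0, True
        elif carrying and cells[pos] == 0:
            if rng.random() < drop_probability(nearby, radius):
                cells[pos], carrying = 1, False
        pos = (pos + rng.choice((-1, 1))) % len(cells)  # wander
    if carrying:  # put down a carried ant at the end so none are lost
        while cells[pos] == 1:
            pos = (pos + 1) % len(cells)
        cells[pos] = 1
    return cells
```

Isolated dead ants are likely to be picked up and unlikely to be dropped anywhere but next to others, so piles tend to grow; the number of dead ants never changes.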
Ant-inspired robots
• Rules (Beckers et al., 1994):
  - Move: with no sensor activated, move in a straight line
  - Obstacle avoidance: if an obstacle is found, turn by a random angle to avoid it and move
  - Pick up and drop: robots can pick up a number of objects (up to 3). If the shovel contains 3 or more objects, the sensor is activated and the objects are dropped. The robot backs up, chooses a new angle and moves.
• Results in clustering
  - The probability of dropping items increases with the quantity of items in vicinity
Figure from R. Beckers, O. E. Holland, and J. L. Deneubourg (1994). “From local actions to global tasks: Stigmergy and collective robotics”. In Artificial Life IV.
Beckers et al. experiments
Luc Steels et al: ant algorithms
http://www.youtube.com/watch?v=93LwvuxDbfU
Adaptive information systems
Swarm Smarts. 78. Scientific American March 2000. ERIC BONABEAU
Johan Bollen (1994): adaptive hypertext systems
Recommender systems: general principles
• People ~ n-dimensional vectors
  - Person = { CD/book purchases, DVDs rented, … }
  - Vector is a representation of the consumer. Entries can be weighted (TF-IDF etc.)
  - “Vector Space Model”
• Calculate similarity of users:
  - Correlation of user vectors
  - Cosine similarity
• Group consumers according to similarity: clustering
• Similar users: discrepancies in vectors are recommendations
• Used for all sorts of applications
  - Similar problem to “bag of words”
  - Multiple user personalities?
  - Orthogonality?
  - Same = better??
[Figure: buyer vectors plotted on the axes Shameboy and Plastic Operator; the angle between two vectors indicates consumer similarity]
Items:   [Shameboy, Plastic Operator, Figurine, …]
Buyer 1: [1, 1, 0, 0, 0, …]
Buyer 2: [1, 0, 0, 0, 0, …]
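The two buyers can serve as a worked example of cosine similarity and of reading discrepancies as recommendations. This is only a sketch: the vectors are cut to their first three entries for illustration, and the `recommend` function with its `min_similarity` threshold of 0.5 is an assumption, not something the slides specify.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two purchase vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(target, others, items, min_similarity=0.5):
    """Suggest items that sufficiently similar users have but `target` lacks."""
    suggestions = []
    for other in others:
        if cosine_similarity(target, other) >= min_similarity:
            for item, mine, theirs in zip(items, target, other):
                if theirs and not mine and item not in suggestions:
                    suggestions.append(item)
    return suggestions


ITEMS = ["Shameboy", "Plastic Operator", "Figurine"]
BUYER_1 = [1, 1, 0]  # truncated to three items for illustration
BUYER_2 = [1, 0, 0]

if __name__ == "__main__":
    print(cosine_similarity(BUYER_1, BUYER_2))      # 1 / sqrt(2), about 0.707
    print(recommend(BUYER_2, [BUYER_1], ITEMS))     # ['Plastic Operator']
```

Buyer 2 shares Shameboy with Buyer 1, so the vectors are fairly close; the one entry where they differ, Plastic Operator, becomes the recommendation for Buyer 2.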
Tracking scientists (they are people too!)
http://informatics.indiana.edu/jbollen/PLosONEmap
André Skupin
Borner/Ketan (2004)
PNAS 101(1)
Highly recommended:
http://www.scimaps.org/
Bollen J, Van de Sompel H, Hagberg A, Bettencourt L, Chute R, et al. 2009 Clickstream Data Yields High-Resolution Maps of Science. PLoS ONE 4(3): e4803. doi:10.1371/journal.pone.0004803
We’re all ants now?
[Figure: users reach documents through an interface, with a recommender in the loop]
• User vectors: represent individual trail/exploration in n-dimensional information space
• Recommender systems: bias probabilistic exploration paths of users based on others’ actions
  - Higher probability of following existing trails
• Analogy: set of user vectors + recommender system ~ ant trails
  - Solving traveling salesman in n dimensions? ;-)
  - Modeling fads, hypes, flashcrowds in cyberspace, self-fulfilling prophecies, but also long tail effects, more optimized exploration of information space?
  - Which features of recommender systems promote either of the above?
  - Cf. youtube.com: “other users are watching” vs. batch-processed recommendations
• Emergence of COMPUTATIONAL SOCIAL SCIENCE
  - http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2745217/
  - Lazer et al. (2009). Life in the network – the coming age of computational social science. Science, 2009 February 6; 323(5915): 721–723.
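The ant-trail analogy can be made concrete with a toy model of users choosing documents. This is a sketch with invented names and parameters (`browse`, `bias`), not a model from the lecture: each user picks a document with a probability that grows with the clicks it has already received, which plays the role of the pheromone trail.

```python
import random


def browse(num_documents=5, num_users=1000, bias=1.0, seed=0):
    """Return click counts per document after `num_users` visits."""
    rng = random.Random(seed)
    clicks = [0] * num_documents
    for _ in range(num_users):
        # Every document keeps a baseline weight of 1; earlier clicks
        # (the "trail") add to it in proportion to `bias`.
        weights = [1.0 + bias * c for c in clicks]
        choice = rng.choices(range(num_documents), weights=weights)[0]
        clicks[choice] += 1
    return clicks
```

Setting `bias=0` gives every document the same chance on every visit, while larger values let early, accidental leads reinforce themselves, which is one simple way to think about fads and self-fulfilling prophecies.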
Next week readings
1. Gough (2009) Training for Peer Review. Science Signaling 2(85), tr2. [DOI: 10.1126/scisignal.285tr2]
2. Monastersky (2005) The number that is devouring science. Chronicle of Higher Education, Section: Research & Publishing, Volume 52, Issue 8, Page A12
3. Eysenbach G (2006) Citation Advantage of Open Access Articles. PLoS Biol 4(5): e157. doi:10.1371/journal.pbio.0040157
4. Lance Fortnow (2009) Time for Computer Science to Grow Up. Communications of the ACM, August, 52(8). doi:10.1145/1536616.1536631
Next up: your proposal assignment
Your proposal assignment
From the syllabus:
“You have a dream. Write it down in the form of a proposal for the NSF Graduate Research Fellowship Program (CISE field of study). This proposal accounts for 20% of your final grade. 10% for the final presentation in class.”
What is the NSF Graduate Research Fellowship Program?
“The program recognizes and supports outstanding graduate students who are pursuing research-based master's and doctoral degrees in fields within NSF's mission. The GRFP provides three years of support for the graduate education of individuals who have demonstrated their potential for significant achievements in science and engineering research. The ranks of NSF Fellows include individuals who have made transformative breakthroughs in science and engineering research and have become leaders in their chosen careers and Nobel laureates.”
http://www.nsf.gov/pubs/2010/nsf10604/nsf10604.htm
How does it work?
NSF wants:
• Personal Profile
• Education and Work Experience
• Planned Graduate Program
• ** Personal Statement (2p)
• ** Previous Research Experience (2p)
• ** Proposed Plan of Research and References (2p)
I want:
1. The items marked with **, i.e. a total of 6 pages + references = 20% of grade
2. A 15-minute in-class presentation of your work (December 1 and December 8) = 10% of grade
Formatting
• ** Personal Statement (2p)
• ** Previous Research Experience (2p)
• ** Proposed Plan of Research and References (2p)
• Maximum length of two pages, including all references, citations, charts, figures, and images.
• Standard 8.5" x 11" page size, 12-point, Times New Roman font, 1" margins on all sides, and must be single spaced or greater
• No hyperlinks, only citations in References Cited section. Images may be included in the page limits.
Personal statement
Important questions to ask yourself before starting the essay:
• Why are you fascinated by your research area?
• What examples of leadership skills and unique characteristics do you bring to your chosen field?
• What personal and individual strengths do you have that make you a qualified applicant?
• How will receiving the fellowship contribute to your career goals?
• How do these activities address the Intellectual Merit and Broader Impacts criteria?
Example:
http://www.mitbrandon.com/nsfstatement.shtml
Previous Research Experience
Important questions to ask yourself before starting the essay:
• What are all of your applicable experiences?
• For each experience, what were the key questions, methodology, findings, and conclusions?
• Did you work in a team and/or independently?
• How did you assist in the analysis of results?
• How did your activities address the Intellectual Merit and Broader Impacts criteria?
Example:
http://rachelcsmith.com/NSF/DisturbancePRE.pdf
Proposed Plan of Research
Review criteria:
http://www.nsfgrfp.org/how_to_apply/review_criteria
http://www.nsf.gov/pubs/2002/nsf022/bicexamples.pdf
Intellectual merit:
How important is the proposed activity to advancing knowledge and understanding within its own field or across different fields?
How well qualified is the proposer (individual or team) to conduct the project? (If appropriate, the reviewer will comment on the quality of prior work.)
To what extent does the proposed activity suggest and explore creative, original, or potentially transformative concepts?
How well conceived and organized is the proposed activity?
Is there sufficient access to resources?
Proposed Plan of Research
Review criteria:
http://www.nsfgrfp.org/how_to_apply/review_criteria
http://www.nsf.gov/pubs/2002/nsf022/bicexamples.pdf
Broader Impacts
– Activities and projects that:
How well does the activity advance discovery and understanding while promoting teaching, training, and learning?
How well does the proposed activity broaden the participation of underrepresented groups (e.g., gender, ethnicity, disability, geographic, etc.)?
To what extent will it enhance the infrastructure for research and education, such as facilities, instrumentation, networks, and partnerships?
Will the results be disseminated broadly to enhance scientific and technological understanding?
What may be the benefits of the proposed activity to society?
General words of wisdom
You need to mind the formal criteria, checklists etc., but what really matters:
- Don’t focus so much on the task or burden of writing a proposal, but on the pleasure of outlining an interesting, relevant and successful research agenda.
- You are asking for support ($$$). Someone will make a decision to support your research. They need to see a compelling reason to do so. Your essay must make a good scientific and societal case for why one should invest in your idea and professional development.
- Make clear that you are qualified and well-positioned to execute what you propose.
- Start with the big issues. The limit is 2 pages, so try to be as succinct and to the point as you can. Make it work. Focus on the why, then on the how.
- Quality of exposition matters. Don’t annoy reviewers with jargon, crummy grammar, or overly long sentences.
- Be mindful of your audience. Your reviewers will be experts, but not to the degree that you may be.
About assignment 2
Some changes you want to be mindful of:
- NEW deadline = December 1st, 4PM (16:00)
- Submission: two types
  - Partial: regular submission of Assignment 2 as planned. Graded on a 25-point scale. Grade for assignment 1 is maintained.
  - Full: submission of 1 single assignment that comprises and integrates both assignments 1 and 2. Graded on a 50-point scale. Expectation: SIGNIFICANT improvement in the portion relevant to assignment 1. I will grade accordingly. Correct answer to algorithm is NOT sufficient.