January 19, 2005 Weblog research HCS lab Weblogs for Research(ers) Anjo Anjewierden Human Computer Studies laboratory Faculty of Science University of Amsterdam http://anjo.blogs.com Many thanks to Lilia Efimova, Rogier Brussee, Robert de Hoog, Stephanie Hendrick and the blogosphere in general
January 19, 2005 Weblog research

HCS lab

Weblogs for Research(ers)

Anjo Anjewierden

Human Computer Studies laboratory
Faculty of Science
University of Amsterdam
http://anjo.blogs.com

Many thanks to Lilia Efimova, Rogier Brussee, Robert de Hoog, Stephanie Hendrick and the blogosphere in general

What is a weblog (1)?

• Most common descriptive definition: a weblog is
  – a personal journal,
  – updated regularly,
  – published on the internet; and
  – posts (entries) appear in reverse chronological order

What is a weblog (2)?

• Weblogs are social as they encourage others to participate using two mechanisms:
  – Posts have an explicit point of reference called a permalink
  – Permalinks make it possible for people to link to each other’s posts: share and discuss
  – Readers, possibly without a weblog, are invited to join as all posts have a comment link

Anatomy of Weblogs

• For example: my weblog

Weblog Research is about …

• Humans who share findings, thoughts, ideas and sometimes feelings in their weblogs

• Computers which make it possible to create weblogs, read weblogs, and to comment and to link

• Studies which analyse why and how people blog about what and to whom

• Laboratory: weblog researchers need a stable environment in which to conduct their research

Do we want to research weblogs …

• Blog (short for weblog, we-blog) was named Word of the Year 2004 by Merriam-Webster. Related words: to blog, blogger, blogging, blogosphere, etc.

• Communications of the ACM (CACM) carried a special issue on weblogs (December 2004)

• Unfiltered and public: for the first time we get access to a large body of material on a particular person, written by that same person

• Research relevance: social studies, Knowledge Management (for professional weblogs), education, linguistics … and even the term Semantic Blogging (combining Semantic Web and blogging) has been coined

• Compare Digital Cities research by Beckers / Van den Bersselaar (at SWI)

BlogTrace the Laboratory (1)

• Weblogs are represented as HTML pages
  – Complex layout, difficult to find the posts
  – Manual research is extremely labour intensive
  – There is a serious lack of tools that support weblog research

BlogTrace the Laboratory (2)

• BlogTrace spider makes data collection and research a lot easier
  – Automatically extracts posts from the HTML
  – Generates the link structure of the weblog and represents it as RDF/OWL
  – Generates an RSS feed that contains all posts for a weblog
  – Implemented using induction algorithms, which learn what are posts and what is layout

Ontologies used in BlogTrace

• DC: Dublin Core (names, dates, descriptions)
• FOAF: Friend of a Friend (documents, people)
• RSS 1.0 (RDF): RDF Site Summary (representation of full posts)
• Link ontology, for example a link (href in HTML) becomes:
  – Link link:sourceDocument <http://…/>;
  – Link link:targetDocument <http://…/>;
  – Link link:anchorText “interesting site”;
  – Etc.
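
As a rough illustration of the link ontology above, the following Python sketch extracts `href` links from an HTML fragment and prints one record per link with the three properties named on the slide. It is not BlogTrace's code: the `LinkExtractor` class, the `extract_links` and `to_turtle` helpers, the blank-node layout and the example URLs are all assumptions made for illustration, and the namespace behind the `link:` prefix is left undeclared because the slides do not give it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect source, target and anchor text for every <a href=...>."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url
        self.links = []          # finished link records
        self._target = None      # href of the <a> currently open, if any
        self._text = []          # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they came from.
                self._target = urljoin(self.source_url, href)
                self._text = []

    def handle_data(self, data):
        if self._target is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._target is not None:
            # Collapse runs of whitespace in the anchor text.
            anchor = " ".join("".join(self._text).split())
            self.links.append({
                "sourceDocument": self.source_url,
                "targetDocument": self._target,
                "anchorText": anchor,
            })
            self._target = None


def extract_links(source_url, html):
    """Return the list of link records found in one HTML page."""
    parser = LinkExtractor(source_url)
    parser.feed(html)
    parser.close()
    return parser.links


def to_turtle(link):
    """Render one link record in a Turtle-like form (hypothetical layout)."""
    text = link["anchorText"].replace("\\", "\\\\").replace('"', '\\"')
    return (
        "[] a link:Link;\n"
        f"   link:sourceDocument <{link['sourceDocument']}>;\n"
        f"   link:targetDocument <{link['targetDocument']}>;\n"
        f'   link:anchorText "{text}".'
    )


if __name__ == "__main__":
    page = '<p>An <a href="/post/1">interesting  site</a>.</p>'
    for link in extract_links("http://example.org/blog/", page):
        print(to_turtle(link))
```

Keeping the extraction step separate from the rendering step means the same records could feed either an RDF store or a plain table.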

Weblogs can now be studied …

• Even using Semantic Web technology (RDF/OWL)

link:WeblogPostLink
    rdfs:subClassOf link:SimpleLink;
    rdfs:comment "A WeblogPostLink is a SimpleLink if and only if both the source and the target documents are weblog posts (RSS items).";
    rdfs:label "WeblogPostLink";
    owl:intersectionOf (
        link:SimpleLink
        [ a owl:Restriction;
          owl:onProperty link:sourceDocument;
          owl:someValuesFrom rss:item ]
        [ a owl:Restriction;
          owl:onProperty link:targetDocument;
          owl:someValuesFrom rss:item ]
    ).

link:WeblogPostLink
    rdfs:subClassOf link:Link;
    rdfs:comment "A WeblogPostLink is a Link if and only if both the source and the target documents are weblog posts (RSS items)";
    owl:intersectionOf (
        link:Link
        [ a owl:Restriction;
          owl:onProperty link:sourceDocument;
          owl:someValuesFrom rss:item ]
        [ a owl:Restriction;
          owl:onProperty link:targetDocument;
          owl:someValuesFrom rss:item ]
    ).
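
Read procedurally, the definition says that a link counts as a `link:WeblogPostLink` when both of its end points are RSS items. A minimal Python sketch of that test follows, assuming links are plain records and the set of known post URLs has already been collected. The `Link` type, the `is_weblog_post_link` function and the URLs are illustrative names, not part of BlogTrace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str   # URL of the document that contains the link
    target: str   # URL the link points to


def is_weblog_post_link(link, post_urls):
    """True when both end points of the link are known weblog posts."""
    return link.source in post_urls and link.target in post_urls


if __name__ == "__main__":
    posts = {"http://a.example/post/1", "http://b.example/post/7"}
    links = [
        Link("http://a.example/post/1", "http://b.example/post/7"),
        Link("http://a.example/post/1", "http://news.example/story"),
    ]
    for link in links:
        print(link.target, is_weblog_post_link(link, posts))
```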

Some Weblog Research Questions

• Weblog communities
  – Do they exist?
  – How can they be defined and found?
  – What is the social structure?
  – What are the conventions in the community?
• Text analysis of weblogs
  – What do people blog about (terms, topics)?
  – Do they share terminology?
  – Can personal conceptualisations be extracted?
• Conversations
  – Can linked weblog posts be seen as conversations?
  – Can we identify when there is a “knowledge flow”?

Implementations and Papers

• Weblog communities:
  – Visual Settlements
  – Graphically displays weblog community linkage based on a “weblog is a city” metaphor
  – Community determined by “Virtual Settlements” paper (Efimova & Hendrick, 2005)
• Text analysis of weblogs:
  – Sigmund (Anjewierden, Brussee & Efimova, 2004)
  – Co-occurrence based statistical algorithm that identifies concepts and their relations for a weblog
• Conversations:
  – Knowledge flows (Anjewierden, De Hoog, Brussee & Efimova, 2005)
  – Hypothesis: chance of a knowledge flow is greater when the sender and receiver share conceptualisations

Visual Settlements

• Idea
  – Can we compress a weblog to a single picture?
  – Such that we can use the picture to compare it to other weblogs in a community
  – And, of course, learn something …
• Inspiration
  – Maps in general
  – Books by Edward Tufte on “Information Design”
    • The Visual Display of Quantitative Information (1983)
    • Envisioning Information (1990)
    • Beautiful Evidence (2005; forthcoming)
  – (Discovered Tufte by blog reading)

My blog as a Visual Settlement

Anatomy of Visual Settlements

Without links in the community (house)

I link to someone (I’m at work)

Someone links to me (I’m in the park)

Size: number of words in the post

Layout: if I link to earlier posts they are close

Time: early post in center, radiate outwards
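
The glyph rules listed above amount to a small classification of each post. The Python sketch below is one possible reading, not the Visual Settlements implementation. The slides do not say what happens when a post both links out and is linked to, so the precedence used here (an outgoing link wins) is an arbitrary assumption, and `Post`, `glyph` and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Post:
    words: int       # number of words in the post (drives glyph size)
    links_out: int   # links from this post to others in the community
    links_in: int    # links from others in the community to this post


def glyph(post):
    """Pick the building type for a post (precedence is an assumption)."""
    if post.links_out > 0:
        return "work"    # I link to someone
    if post.links_in > 0:
        return "park"    # someone links to me
    return "house"       # no links in the community


if __name__ == "__main__":
    for post in (Post(120, 0, 0), Post(300, 2, 0), Post(80, 0, 1)):
        print(glyph(post), post.words)
```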

Sigmund

• Idea
  – Using co-occurrence to determine whether terms are related
  – Related terms might point to conceptualisations of the blogger
  – And, these conceptualisations might be shared by other bloggers
• Supported by
  – Tools that are part of my regular research on methods to support ontology development from documents
  – In particular: term extraction and named entity recognition
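
The co-occurrence idea can be illustrated with a small Python sketch. It is not the Sigmund algorithm, whose statistics are not described on these slides. It simply counts how often two terms appear in the same post and keeps the pairs that do so at least `min_count` times. The tokeniser, the stop-word list, the threshold and the sample posts are assumptions chosen for the example.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stop-word list; a real term extractor would do far more.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def terms(post):
    """Lower-cased word set of a post, minus the stop words."""
    return {w for w in re.findall(r"[a-z]+", post.lower()) if w not in STOP_WORDS}


def cooccurrences(posts):
    """Count, for every unordered term pair, the posts that contain both."""
    counts = Counter()
    for post in posts:
        for pair in combinations(sorted(terms(post)), 2):
            counts[pair] += 1
    return counts


def related_pairs(posts, min_count=2):
    """Term pairs that co-occur in at least min_count posts."""
    return {pair for pair, n in cooccurrences(posts).items() if n >= min_count}


if __name__ == "__main__":
    blog = [
        "Knowledge management and weblogs",
        "Weblogs for knowledge sharing",
        "Ontology development from documents",
    ]
    print(sorted(related_pairs(blog)))
```

Raw counts favour frequent words, which is why a statistical measure of association would normally replace the fixed threshold used here.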

Making a Difference

• Idea
  – In a community of bloggers it is likely terminology is shared
  – Finding the shared terms is interesting (see Sigmund)
  – But a blogger is a person and not a web page
  – So, what makes them different?
• Implementation
  – Run Sigmund on all blogs in a community
  – Find terms that are common for a particular blog and not common for others in the community
  – Example: Making a Difference post
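
One way to read "common for a particular blog and not common for others" is as a pair of frequency thresholds. The Python sketch below is such a reading, not the implementation behind the Making a Difference post. It takes per-blog term lists as input (in the talk these come from Sigmund). The function name `distinctive_terms`, both thresholds and the sample data are assumptions.

```python
from collections import Counter


def distinctive_terms(blogs, blog_name, min_own=0.05, max_other=0.01):
    """Terms frequent in one blog and rare in every other blog.

    blogs maps a blog name to the list of terms extracted from its posts.
    Frequencies are relative: occurrences of a term / all term occurrences.
    """
    def rel_freq(term_list):
        counts = Counter(term_list)
        total = sum(counts.values()) or 1   # avoid dividing by zero
        return {t: n / total for t, n in counts.items()}

    own = rel_freq(blogs[blog_name])
    others = [rel_freq(tl) for name, tl in blogs.items() if name != blog_name]
    return {
        term
        for term, freq in own.items()
        if freq >= min_own
        and all(other.get(term, 0.0) <= max_other for other in others)
    }


if __name__ == "__main__":
    community = {
        "blog_a": ["ontology", "weblog", "ontology", "tufte"],
        "blog_b": ["weblog", "knowledge", "weblog", "community"],
    }
    print(sorted(distinctive_terms(community, "blog_a")))
```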

Knowledge Flows

• Idea and Motivation
  – When bloggers link to a post of other bloggers
  – Could it be a “knowledge flow”?
  – Motivated by potential use as a knowledge management tool
• Implementation
  – Use Sigmund’s co-occurrence algorithm
  – Term overlap in linked posts is the main metric
  – Make a distinction between shared and agreed terms (used by both bloggers) and private terms (used by one of the bloggers)
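
The term-overlap metric can be sketched with plain set operations. The following Python is an illustration only: the slides do not say how overlap is normalised, so the Jaccard ratio used here is an assumption, as are the function names and the sample terms. In the talk the term sets would come from Sigmund rather than being typed in by hand.

```python
def split_terms(sender_terms, receiver_terms):
    """Separate terms used by both bloggers from terms used by only one."""
    sender, receiver = set(sender_terms), set(receiver_terms)
    return {
        "shared": sender & receiver,
        "private_sender": sender - receiver,
        "private_receiver": receiver - sender,
    }


def term_overlap(sender_terms, receiver_terms):
    """Jaccard ratio: shared terms / all distinct terms (0.0 if both empty)."""
    sender, receiver = set(sender_terms), set(receiver_terms)
    union = sender | receiver
    return len(sender & receiver) / len(union) if union else 0.0


if __name__ == "__main__":
    linked_post = ["knowledge", "flow", "community", "weblog"]
    linking_post = ["knowledge", "flow", "ontology"]
    print(split_terms(linked_post, linking_post))
    print(term_overlap(linked_post, linking_post))
```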

Weblogs for Researchers

• Experiment (Metis project)
  – Six researchers (previously non-bloggers) started a weblog to get hands-on experience
  – Two gave up rather early
  – One thinks about underpants when blogging
  – Three (includes myself) continue after the experiment finished
• Evaluation
  – Posts are not emails (everybody can read them!)
  – Posts are not academic papers
  – Developing a blogging style (how and about what you blog) is difficult and different for everybody

Conclusions (1)

• Blogging as a tool for researchers
  – Try it!
  – Works for me, both reading and writing
  – By sharing ideas on your blog, you may get help!

Conclusions (2)

– Enormous amount of data (paradise for someone like me)

– Tempting to continue my own weblog research

– If others have better ideas than I have, and some do, I gladly return to my role as supporting others to do their weblog research