1 Peter Fox Xinformatics – ITEC, CSCI, ERTH 4400/6400 Week 3, February 4, 2014 Information systems ‘theory’

Page 1:

Peter Fox

Xinformatics – ITEC, CSCI, ERTH 4400/6400

Week 3, February 4, 2014

Information systems ‘theory’

Page 2:

Contents
• Review of last class
• Discussion of reading
• Information systems theory and principles covering a range of traditional foundation aspects
• Next class(es) and assignments

Page 3:

Reading
• http://en.wikipedia.org/wiki/Use_case
• http://alistair.cockburn.us/index.php/Use_cases,_ten_years_later
• Or questions about last week’s material?

Page 4:

Systems
• Regardless of the type of system, be it an irrigation system, a communications relay system, an information system, or whatever, all systems have three basic properties:
– A system has a purpose, such as distributing water to plant life, bouncing a communications signal around the country to consumers, or producing information for people to use in conducting business.
– A system is a grouping of two or more components which are held together through some common and cohesive bond. The bond may be water as in the irrigation system, a microwave signal as used in communications, or, as we will see, data in an information system.
– A system operates routinely and, as such, it is predictable in terms of how it works and what it will produce.

Page 5:

Thinking in systems
• A system consists primarily of three things (Meadows)
– Elements
– Interconnections
– Function/ Purpose
• Three attributes
– Resilience
– Self Organization
– Hierarchy

Page 6:

Twelve Leverage Points
• 12. Constants, parameters, numbers (such as subsidies, taxes, standards)
• 11. The size of buffers and other stabilizing stocks, relative to their flows
• 10. Structure of material stocks and flows (such as transport network, population age structures)
• 9. Length of delays, relative to the rate of system changes
• 8. Strength of negative feedback loops, relative to the effect they are trying to correct against
• 7. Gain around driving positive feedback loops
• 6. Structure of information flow (who does and does not have access to what kinds of information)
• 5. Rules of the system (such as incentives, punishment, constraints)
• 4. Power to add, change, evolve, or self-organize system structure
• 3. Goal of the system
• 2. Mindset or paradigm that the system (its goals, structure, rules, delays, parameters) arises from
• 1. Power to transcend paradigms

Page 7:

First information system?
• The first on-line, real-time, interactive, database system was double-entry bookkeeping, which was developed by the merchants of Venice in 1200 A.D. (Bryce’s Law).
• Truth, author’s opinion or urban legend?

Page 8:

Data-Information-Knowledge Ecosystem

[Figure: the data-information-knowledge ecosystem. Labels: Data, Information, Knowledge; Producers, Consumers; Creation, Gathering; Presentation, Organization; Integration, Conversation; Context; Experience]

Page 9:

Presentation
• See reading for this week
• Separation of content from presentation!!
• The theory here is more empirical or semi-empirical
• It is developed from a solid understanding of minimizing information uncertainty, beginning with content, context and structural considerations and, as we will see, adding cognitive and social factors to reduce uncertainty.
• Physiology for humans, color, …

Page 10:

Organization
• But also, organization of information presentation, e.g. layout on a web page, in a table, or figure, or report
• Also (again) content, context and structure
• Think about how you organize your
– Class notes
– Calendar and assignment schedule
– Your social life (just kidding)
– Assignments
– Do, or do not, connect with others’ ways of organizing!
• A system??
– Elements, Interconnections, Function/ Purpose

Page 11:

All take a deep breath

Page 12:

THE PHYSICS OF INFORMATION

© 2005 EvREsearch LTD

Page 13:

Equations!

Page 14:

Information theory
• Entropy and randomness
– Critical for information system design
– More on why in a few slides

Page 15:

Huh? Entropy?
• No, you are not in a physics class
• Information is always a measure of the decrease of uncertainty at a receiver.
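As a small worked illustration of "decrease of uncertainty at a receiver", here is a minimal Python sketch. It assumes the simplest case, in which the receiver considers all remaining possibilities equally likely; the numbers 8 and 2 and the function name are illustrative choices, not from the lecture.

```python
import math

def uncertainty_bits(n_equally_likely):
    """Uncertainty, in bits, when n outcomes are all equally likely."""
    return math.log2(n_equally_likely)

before = uncertainty_bits(8)   # receiver cannot tell 8 possibilities apart
after = uncertainty_bits(2)    # a message rules out all but 2 of them
information = before - after   # information received = uncertainty removed

print(f"{before} bits before, {after} bits after, {information} bits gained")
```

A message that leaves the receiver exactly as uncertain as before carries zero information under this definition, however long it is.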

Page 16:

Entropy and Rates
• R = H(x) - H(x|y) (Shannon, 1948)
• R = rate of transmission; the conditional entropy H(x|y) measures the average ambiguity of the received signal
• $H(x) = -\sum_i p(x_i) \log_2 p(x_i)$, where p(x_i) is the probability mass function of outcome x_i.

"The entropy rate of a data source means the average number of bits per symbol needed to encode it. Shannon's experiments with human predictors show an information rate of between 0.6 and 1.3 bits per character, depending on the experimental setup; the PPM compression algorithm can achieve a compression ratio of 1.5 bits per character in English text." (Wikipedia, reference in notes)
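The quantities on this slide can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the joint distribution below (a fair bit that is flipped 10% of the time in transit) is made up for illustration, and the function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum p * log2(p), skipping zero terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def transmission_rate(joint):
    """Return H(x), H(x|y) and R = H(x) - H(x|y) for a joint {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    h_x = entropy(px.values())
    # H(x|y) = H(x, y) - H(y): the ambiguity left about x after seeing y
    h_x_given_y = entropy(joint.values()) - entropy(py.values())
    return h_x, h_x_given_y, h_x - h_x_given_y

# Assumed channel: x is sent, y is received, 10% of bits arrive flipped
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
h_x, h_x_given_y, rate = transmission_rate(joint)
print(f"H(x)={h_x:.3f}  H(x|y)={h_x_given_y:.3f}  R={rate:.3f} bits/symbol")
```

With these assumed numbers the source carries 1 bit per symbol, but roughly 0.47 bits of ambiguity remain at the receiver, so only about 0.53 bits per symbol get through.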

Page 17:

Not a perfect story
• Many authors criticize the use of the term entropy, and physics of information
• Information conservation, diffusion, viscosity, advection, dissipation, instability, steady state, conversion … sort of all make some sense…
• But entropy arises in thermodynamics, not directly in other places

Page 18:

That’s not going to stop us!
• However the idea is very relevant to
– modeling (sometimes equations)
– design (variables)
– architecture (how they are put together)
– as well as how we “condition the system”
• We’ll revisit the components of information soon but first let’s take some examples

Page 19:

For information systems?
• 31045?
• (03) 1045
• 783-1045
• +161397831045
• What helps reduce entropy / uncertainty?
• Notice: ‘signs’ as information representations

Page 20:

Information integrity
• We’ve seen that the information (content) of a random variable is defined as minus the sum of p × log p, i.e. $H = -\sum_i p_i \log p_i$, where p = probability. It represents the uncertainty of the variable.
• In later classes we cover cognitive and social factors as conditioning that lowers the conditional entropy, thus reducing the uncertainty and increasing information content and value
• We will cover semiotics (signs) as a prelude to visualization as a presentation mechanism for information

Page 21:

Think of web pages

Page 22:

Not worst but poor

Page 23:

One more

Page 24:

Information gain/loss
• The mutual information of two variables defines how much information one variable contains about the other.
• It is therefore defined as the decrease of the uncertainty of one variable by knowing the other.
• In probabilistic terms, the entropy of one variable decreases by conditioning on the other.
• What does this mean for an information system? E.g. a website or web service?
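One way to make the website question concrete is to compute mutual information for a toy case. The sketch below is illustrative only: the two "menus" and their probabilities are invented, X stands for the page a visitor needs and Y for the menu label they are shown, and `mutual_information` is a hypothetical helper name.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a joint distribution given as {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

# Assumed site A: each label points at exactly one kind of page
clear_labels = {("courses", "Courses"): 0.5, ("staff", "People"): 0.5}

# Assumed site B: labels are statistically independent of what lies behind them
vague_labels = {
    ("courses", "Resources"): 0.25, ("courses", "More"): 0.25,
    ("staff", "Resources"): 0.25, ("staff", "More"): 0.25,
}

mi_clear = mutual_information(clear_labels)
mi_vague = mutual_information(vague_labels)
print(f"clear labels: {mi_clear:.1f} bit, vague labels: {mi_vague:.1f} bits")
```

In this toy setting the clear labels remove the full 1 bit of uncertainty about which page is needed, while the vague labels remove none.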

Page 25:

Information retrieval
• A vast field (metrics with theory behind them)
– Precision (very relevant to this class): the fraction of the items retrieved that are relevant to the user's information need.
– Recall: the fraction of the items that are relevant to the query that are successfully retrieved.
– Fall-out: the proportion of non-relevant documents that are retrieved, out of all non-relevant documents available, e.g. 0 fall-out if you return 0 items!
– F-measure: the weighted harmonic mean of precision and recall
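The four metrics above can be computed directly from sets of document identifiers. This is a minimal sketch, assuming every retrieved item belongs to the collection; the document IDs, the collection size of 20, the `beta` weighting parameter, and the convention of returning 0.0 when a denominator is empty are all illustrative assumptions.

```python
def retrieval_metrics(retrieved, relevant, collection_size, beta=1.0):
    """Precision, recall, fall-out and F-measure for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)                   # relevant AND retrieved
    non_relevant_total = collection_size - len(relevant)

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    fall_out = ((len(retrieved) - hits) / non_relevant_total
                if non_relevant_total else 0.0)

    # Weighted harmonic mean of precision and recall (beta=1 gives F1)
    denom = beta**2 * precision + recall
    f_measure = (1 + beta**2) * precision * recall / denom if denom else 0.0
    return precision, recall, fall_out, f_measure

# Hypothetical query against a 20-document collection
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d7", "d9", "d11"]
p, r, fo, f1 = retrieval_metrics(retrieved, relevant, collection_size=20)
print(f"precision={p:.2f} recall={r:.2f} fall-out={fo:.2f} F1={f1:.2f}")

# Returning nothing gives zero fall-out, as the slide notes, but also zero recall
print(retrieval_metrics([], relevant, collection_size=20))
```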

Page 26:

Models of IR

Page 27:

Machine Learning
• Daniel Wilkerson: “The precise study of how to make decisions with only incomplete information is deep”.
• Part of information systems design (and architecture and implementation underneath) is to ensure people make the best and most robust decisions in the face of uncertainty.

Page 28:

Context
• Internal - Human context, tacit knowledge
– Curiosity
– User profiles
– Analytics
• External
– Domain context
– Skill/ education context
– Organizational
– Procedural/ process
– Unknown/ random – what is an example of this?

Page 29:

Structure
• Is information stored or only presented?
• Structural representation of information content can bias presentation, e.g.
– Modern image capture devices (digital cameras) often convert a 2-byte integer to a float, or to a 4-byte integer. What are the implications?
• Appropriate choice of information structure can significantly decrease uncertainty, e.g. returning land images in GeoTIFF, which can encode geographic location, instead of PNG
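One implication of the integer-to-float question can be checked directly. The sketch below assumes that "float" means a 4-byte IEEE 754 single-precision value (the slide does not say which float type is meant) and uses only Python's standard `struct` module; `through_float32` is a hypothetical helper name.

```python
import struct

def through_float32(value):
    """Store an integer as a 4-byte IEEE 754 float and read it back."""
    packed = struct.pack("<f", float(value))
    return int(struct.unpack("<f", packed)[0])

# Every 2-byte unsigned integer survives the conversion unchanged...
assert all(through_float32(v) == v for v in range(2**16))

# ...but a 4-byte float has a 24-bit significand, so a 4-byte integer
# above 2**24 may come back as a different number
big = 2**24 + 1
print(big, "->", through_float32(big))
```

So under this assumption the 2-byte-to-float conversion loses nothing by itself, but it changes the structure a consumer sees, and the reverse assumption (any 4-byte integer fits in a 4-byte float) does not hold.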

Page 30:

Content
• Presentation
– We’ve covered a fair bit of this so far
– What other factors have you thought of?
• Translation
– Almost essential when transmitting between data structures, e.g. serialization over network protocols, sometimes at multiple levels: HTTP, TCP/IP
• Encoding
– Lossless (Huffman, entropy based, 1952)
– Lossy (MPEG, JPEG)
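To show what "lossless, entropy based" means in practice, here is a compact Huffman coding sketch in Python. It is an illustration rather than a production encoder: it builds the code from character counts in a single made-up message, represents the output as a string of "0"/"1" characters instead of packed bits, and all function names are hypothetical.

```python
import heapq
import math
from collections import Counter

def huffman_code(text):
    """Build a Huffman code {symbol: bitstring} from symbol frequencies."""
    counts = Counter(text)
    if len(counts) == 1:                       # degenerate one-symbol input
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code})
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, low = heapq.heappop(heap)       # two least frequent subtrees
        w2, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(text, code):
    return "".join(code[s] for s in text)

def decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:                     # codes are prefix-free
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

message = "information is the decrease of uncertainty"
code = huffman_code(message)
bits = encode(message, code)
assert decode(bits, code) == message           # lossless round trip

counts = Counter(message)
entropy = -sum(n / len(message) * math.log2(n / len(message))
               for n in counts.values())
avg_len = len(bits) / len(message)
print(f"entropy {entropy:.2f} bits/char, Huffman {avg_len:.2f} bits/char")
```

The average code length lands between the entropy of the character distribution and one bit above it, which is the sense in which Huffman coding is "entropy based". A lossy encoder, by contrast, would not pass the round-trip check.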

Page 31:

Organizational control of content
• Of encoding standards, e.g.
– Lempel-Ziv-Welch (LZW) was proprietary for many years (until 2003) and was used in the GIF format encoding, http://en.wikipedia.org/wiki/Lempel–Ziv–Welch
– Moving Picture Experts Group (MPEG) and ISO/IEC JTC1/SC29/WG11, mpeg.org
– Joint Photographic Experts Group (JPEG), jpeg.org

Page 32:

Noise
• Most often refers to ‘data’ but does apply to information
• Uncertainty, especially any that is introduced, is a source of noise or, more accurately, bias in the use or interpretation of the information
• Noise/ bias is context and structure dependent
• Noise/ bias contamination is rampant in information systems
• Quality control and verification are less developed for information sources, e.g. ‘people do not report problems’

Page 33:

Mode of noise introduction

[Figure: communication model from Shannon and Weaver (1949), annotated for the web. Labels: Information Source; Web Content, Structure; Noise source; Web browser?; HTML page, user; Msg? Signal? Recvd? Msg?]

Page 34:

Means of conduct

Page 35:

In Information Systems:
• An example of inductive research:
– Gather data
– Analyze and reanalyze the data
– Organize the data within broad topics
– Create categories within the topics
– Identify relationships among the categories
– Synthesize the patterns into conclusions

Page 36:

Must be inductive? (Haverty)
• It does not have an existing body of theory which typically guides the work of a field
– Theory constrains acceptable solutions through formal validation
– Without it, IAs (Information Architectures) tend to treat each problem as novel
• Also, it supports emergent phenomena
– The IA domain has a small set of initial components and a relatively simple set of rules
– These lead to a large number of complex patterns

Page 37:

Content, structure, navigation, interaction
• In any given information system, there are many interactions that can emerge when people use it, influenced by the IA of the site
• IAs use combinations of these components to define the framework that constrains user interactions
– Problem: we don’t understand well how to study and design for emerging user experiences
– We don’t know how each contributes to the user experience
• This is why we need inductive analysis

Page 38:

Constructive induction (ci)
• IA as constructive induction
– This is a process for generating a design solution using two intertwined searches
– First: identify the most adequate representational framework for the problem
– Second: locate the best design solution within the framework and translate it to the problem at hand
• ci is useful when existing theory cannot adequately explain the object of study

Page 39:

What are the steps for applying ci?
• Well, actually, the steps are exactly those for use case development, modeling, design and implementation
• Thus the need for experience in preparing a use case.

Page 40:

Interaction theory
• We can come to a system with an “information task”
• Problem-solving: we go through a patterned process and end with a relevance judgment
• We can also have chance encounters, encounters with information, scanning activities
• These are less patterned but still end with some type of judgment
• Then we browse, navigate, search, evaluate…
• Information interaction is the basis of the person’s use experience

Page 41:

Deductive Information Systems

Page 42:

But wait!
• We develop and implement means (designs, architectures, systems, etc.) that perpetuate these two modes of investigation
• That’s a good thing? Right?
• Well, sometimes…

Page 43:

So what about abductive IS?
• This is another warm up for next class
• Abductive reasoning starts when an inquirer considers a set of seemingly unrelated facts, armed with an intuition that they are somehow connected.
• The term abduction is commonly presumed to mean the same thing as hypothesis; however, an abduction is actually the process of inference that produces a hypothesis as its end result

Page 44:

Huh abduction?

Abduction is a method of logical inference, introduced by C. S. Peirce, which comes prior to induction and deduction and for which the colloquial name is to have a "hunch"

Page 45:

Is abductive reasoning new?
• NO – but we’ve beaten it out of modern information systems…..
• Why?
– Closed world approaches – huh?
– We’ve programmed “systems”
– Too much data/ information
– We lost sight of other options

Page 46:

Abductive Information System?
• What would this look like?
• If you accept that induction is fundamentally part of how most (all) information systems are developed, then how would you allow for abduction before induction may be possible?

Page 47:

Abductive Information System?
• Choices?
– More or less
• Presentation?
– How would that look different?
• Design factors?
– To invoke the human side
• Architecture factors?
– Hide what’s not needed, but expose what is
• Cognitive factors?

Page 48:

Geographic Information Systems
• Why mention a specific IS?
• Geography!
– Spatial
– Provides context
– Provides structure
– Often predetermines content form
• Wikipedia: “the term describes any information system that integrates, stores, edits, analyzes, shares, and displays geographic information”
• Discuss: a lesson for constraining uncertainty!

Page 49:

Questions on?
• About systems
• Information systems
• The elements of theory so far
– Entropy/ uncertainty
• Content, context, structure
• Presentation, organization, noise
• Induction, deduction, abduction

Page 50:

Reading for this week
• Is retrospective but … relates to a coming assignment
– Information entropy
– Information Is Not Entropy, Information Is Not Uncertainty!
– More on entropy
– Context
– Information retrieval
– Abductive reasoning

Page 51:

What is next
• Week 4 – Foundations; semiotics, library, cognitive and social science and class exercise - information modeling
• Assignment 2