Post on 05-Sep-2020
Analysing Qualitative data
Amela Karahasanović, amela@sintef.no
Don't forget the quiz on working with humans and qualitative analysis on Friday!
Plan for today
• Advantages and disadvantages of qualitative data
• Grounded theory
– Experimental research versus grounded theory
– How to do it?
• Analyzing the content – coding
• Ensuring high-quality analysis
– Validity
– Reliability
• Do you like my app?
• What are they doing when they use your app?
• What kind of problems might they have?
• What do they do in their spare time?
• Quantitative data: time, error rate, rankings
[Figure: chart for Subject_id 15, Task 2 (labels: Sum of Seconds, Minutes, Visited). Legend: Unified Library Application - Quit, Unified Library Application, Title Information, ReturnItem, Reserve Title, Lend Item, Insert Title Window, Find Title, Compilation]
Data collection methods
• Participant observations
• Interviews
• Documentation (text, pictures)
• Video and audio material
• Diaries
• Open-ended questions in surveys
Qualitative data
• Data represented as words and pictures, not numbers
• Coming from educational and social sciences to study the complexity of human behaviour
– Motivation, communication, understanding
• HCI blends technology and human behaviour
Advantages and disadvantages
• (+) Enables digging into the complexity of the problems rather than abstracting it
• (+) The results are richer and more informative
• (-) Qualitative data analysis is more labor-intensive than quantitative analysis
• (-) The results are considered "softer" and "fuzzier"
• (-) The results are more difficult to summarize or simplify
Decision support tools in ATM
Interviews, observer notes, survey, video and audio material, screen captures, log files
14:04 What prohibited you in making the optimal decisions?
"Definitely, the radar screen was very important. By that I could decide if an airplane was ready for push back. The human/machine interface is important. This update took too long time, maybe 2-3 sec. It should be 1 sec maximum. The interfaces are important, overlapping aircraft symbols, is confusing, the clarification is not clear. My mental work flow will slow down. Another point is that the preview of what is coming next has to be well defined."
• Four controllers – 30 minutes each; questions about decisions, the process in control tower, the tools that were used, the experiment
Stages of qualitative analysis
1. Start with the data set containing information about the substance (e.g. communication between ATCOs) and identify its major components (e.g. giving instructions, asking for more information, negotiating)
2. Study the properties and dimensions of each component (the nature of the components and their relationships)
3. Use the knowledge about each component to understand the original substance
Online behaviour of internet users
• First stage – users' behaviour is affected by their personality, education, computer experience
• Second stage – study each of these components: read literature describing types of personality, their development, and their effects on social behaviour
• Third stage – go back and examine how each of these components influences online behaviour
• Note: the experience of the researcher is critical for the discovery process
Grounded theory
• Not a theory
• Qualitative research method
• Goal – develop a theory grounded in systematically collected and analyzed data
• Used either as a methodology or just as a set of instructions, such as a coding procedure
• Can be applied to different methods (ethnography, case studies and interviews)
GT - Inductive method different from experimental research
Experimental research: Theory -> Hypothesis -> Study -> Data -> Y/N
Grounded theory: Study -> Data -> Theory
– Several rounds; reverse engineering
Grounded theory cont.
• GT is simply the discovery of emerging patterns in data
• Conceptualizing patterns and acting in terms of them
• Sequential and iterative
Remember!
• No pre-formed hypothesis!
• No favorite solutions!
• Creativity and open mind!
• Let the data lead you!
Procedures for grounded theory
• Open coding
– Identify phenomena
• Development of the concepts
– Group phenomena into concepts
• Grouping concepts into categories
– Grouping and interpretation
• Formation of a theory
– Create inferential and predictive statements on phenomena
Advantages
• Systematic approach to analysis of qualitative data
• Allows generating theory grounded in data and coding
• One can study data early on, and formulate and refine the theory through constant interplay between data collection and analysis
Disadvantages
• One can be overwhelmed by details
• Theories might be difficult to evaluate
– Textual data, less strict measures, coding
• Might be biased
• Keep in mind
– Be open-minded and creative
– Listen to the data
Analysis - coding
• More than word counting
• "Involves interacting with data, making comparisons between data, and so on, and in doing so, deriving concepts to stand for those data, then developing these concepts in terms of their properties and dimensions"
Coding
• Extracting values for quantitative variables from qualitative data in order to perform some quantitative analysis
• Common assumption: quantitative data are objective and qualitative data are subjective
• In fact, subjectivity/objectivity is orthogonal to whether the data are qualitative or quantitative
Examples
• "Ola, Kari and Rune were the only participants at the meeting" -> num_participants = 3
• "Kari said that this particular INF2260 lesson was really easy to understand, and not very complex at all compared to other classes" -> complexity = low
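The two examples above can be recorded as coded variables stored next to the original statement, so the source text stays available if coding loses information. A minimal Python sketch; the CodedStatement class and its field names are hypothetical and chosen only for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class CodedStatement:
    """A raw statement together with the variables a coder extracted from it."""
    text: str                                      # original data, kept verbatim
    variables: dict = field(default_factory=dict)  # coded values


statements = [
    CodedStatement(
        text="Ola, Kari and Rune were the only participants at the meeting",
        variables={"num_participants": 3},
    ),
    CodedStatement(
        text=("Kari said that this particular INF2260 lesson was really easy to "
              "understand, and not very complex at all compared to other classes"),
        variables={"complexity": "low"},
    ),
]

# Merge all coded variables into one dict for later quantitative analysis.
coded = {name: value for s in statements for name, value in s.variables.items()}
print(coded)
```

Keeping the text and the coded value together makes it possible to go back and recode when coders disagree.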
Problems
• Possible loss of information
• Subjects use different words for the same phenomenon or the same word for different phenomena
• Subjects use straightforward words that mask the meaning
– "Low complexity of the code" -> easy to read, easy to understand or just small
• Things can be rated differently by different subjects (average, high, low)
A priori
• Codes from the literature
• Analysis
• Several coders
• Reliability check to see if the coding is consistent
• Works fine for known domains
A priori coding
"Okay, it worked well. Then I looked through the class diagrams, okay. . . Then I understood better how it worked. But, okay, after that I looked at the task. Just make changes in. . ." -> search, explore, action
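With a priori coding the code list is fixed before the analysis, so it is easy to check that coders only used codes from that list. A minimal Python sketch, assuming the coding itself is done by a human; the function name unknown_codes and the second segment are hypothetical:

```python
from collections import Counter

# A priori code list. In practice it comes from the literature; here it
# holds only the three codes from the example above.
CODEBOOK = {"search", "explore", "action"}


def unknown_codes(coded_segments, codebook):
    """Return codes that were assigned but are not in the a priori code list."""
    used = {code for _, codes in coded_segments for code in codes}
    return used - codebook


# Each entry is (text segment, codes assigned by a human coder).
coded_segments = [
    ("Okay, it worked well. Then I looked through the class diagrams, okay. . .",
     ["search", "explore", "action"]),
    # Hypothetical segment and code, added only to show the check failing.
    ("(hypothetical segment)", ["confusion"]),
]

print(unknown_codes(coded_segments, CODEBOOK))

# How often each code was used across all segments.
frequencies = Counter(code for _, codes in coded_segments for code in codes)
```

A code outside the list is either a coding mistake or a sign that the a priori list does not fit the data well.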
Emergent coding
• Appropriate for new topics
• Several researchers examine the data and develop key coding categories
• Comparison, discussion, common list
• Multiple coders do the coding
• Reliability measures calculated; if ok, proceed with the coding; if not, go back
Emergent coding cont.
"Definitely, the radar screen was very important. By that I could decide if an airplane was ready for push back. The human/machine interface is important. This update took too long time, maybe 2-3 sec. It should be 1 sec maximum." -> push back decision, waiting
Identifying coding categories
• Very important as they lead the analysis
• Demanding
• Codes come from
– Theoretical framework
– Researcher's interpretation (researcher-denoted concepts)
– Participants (in-vivo codes)
Theoretical framework
• We start research by literature review and identifying theoretical framework related to our research topic
• Difficulties experienced by senior citizens when using computers
– Human capabilities: cognitive, physical, perceptual
• Taxonomies
– Categories of users, tasks, errors
Researcher denoted concepts
• Identify patterns, opinions, behaviour in your data -> codes
• Open coding
• "I was looking for 'find' …and it was not there…It was so irritating" -> find, frustration
• In-vivo codes
– Participants often have good descriptions
– Use them in your coding
– "Curriculum integration" from one parent's response, name of a TV show in the analysis of QoE
• Building a code structure
– Participants express the same ideas in different ways
– Code list – nomenclature
– Several levels (different levels of detail)
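A code list with several levels of detail can be kept as a nested structure. A minimal Python sketch; the tree contents are only an illustration (the "emotion" branch is hypothetical), and code_paths is a made-up helper name:

```python
# Nested code list: each code maps to its more detailed sub-codes.
CODE_TREE = {
    "human capabilities": {
        "cognitive": {},
        "physical": {},
        "perceptual": {},
    },
    "emotion": {            # hypothetical branch, for illustration only
        "frustration": {},
    },
}


def code_paths(tree, prefix=()):
    """Yield every code as a tuple path from the top level down."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield path
        yield from code_paths(children, path)


paths = list(code_paths(CODE_TREE))

# Codes at a chosen level of detail (path length 1 = top level only).
top_level = [p[-1] for p in paths if len(p) == 1]
detailed = [p[-1] for p in paths if len(p) == 2]
```

Coding at the detailed level and reporting at the top level is one way to use such a structure.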
Coding the text
• Read the text (watch the video) before starting the coding
• It can be difficult to find anything interesting, or there can be too many interesting things
• Procedure:
– Look for specific items
– Ask questions constantly about the data
– Make comparisons constantly at various levels
Look for key items
• Some statements have more valuable information
– Objectives: computers for education
– Actions: click on
– Outcomes: error message appeared
– Consequences: I stopped using it
– Causes: my old laptop
– Context: I was on the bus
– Strategies: I first browse
Ask questions about data
• The art of asking questions in a larger context
– Sensitizing questions
• What is happening here? What did the user click? How did she reach www.ifi.uio.no?
– Theoretical questions
• What is the relationship between two factors? How does interaction change over time?
Making comparisons of data
• Compare instances under different categories
– Frequency for different capabilities (physical, cognitive, perceptual) for elderly
• Compare the results between different groups
– Age, background, family support
• Compare to previously published results
– Same/contradictory, related studies
• Computer software
– NVivo, Concordance, SPSS TextSmart
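Comparing code frequencies between groups can also be done with a few lines of code once the data is coded. A minimal Python sketch; the age groups and the coded instances are invented for illustration and do not come from any study:

```python
from collections import Counter, defaultdict

# Hypothetical coded instances: (participant group, code).
instances = [
    ("65-74", "physical"),
    ("65-74", "cognitive"),
    ("65-74", "cognitive"),
    ("75+", "physical"),
    ("75+", "perceptual"),
    ("75+", "physical"),
]


def frequencies_by_group(instances):
    """Count how often each code occurs within each group."""
    counts = defaultdict(Counter)
    for group, code in instances:
        counts[group][code] += 1
    return counts


counts = frequencies_by_group(instances)
for group in sorted(counts):
    print(group, dict(counts[group]))
```

The counts only show where groups differ; interpreting why still requires going back to the data.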
Ensuring high-quality analysis
• Subjective analysis
– Which category? Are they in the same group? Are 'good' and 'ok' the same?
• Validity
– Use of well-established and well-documented procedures to increase the accuracy of findings; did we get it right?
• Reliability
– Consistency of results; would other researchers draw the same conclusions based on the same data set?
Validity
• Construction of a database with the collected data material: raw data (notes, documents, photos…) and results of the analysis
• Also increases reliability
• Data source triangulation
– Interviews, observations, diaries
• Avoid having pet theories
• Consider alternative theories
Reliability
• Same word might have different meanings
• Body language, facial expressions, drawings might have different meanings
• Large studies -> different coders analyse different data subsets
• Different people should code in the same way
• Intra-coder reliability (stability)
– Whether the same coder codes the same way throughout the whole process; would he or she do the same next time? (50% A, 30% B, 20% C)
• Inter-coder reliability (reproducibility; investigator triangulation)
– Whether different coders would code the same way; multiple coders with different backgrounds
• To achieve good reliability
– Good coding instructions
– Training
– Test on a limited amount of data
• Reliability measure
– % agreement = the number of cases coded the same way by multiple coders / the total number of cases
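The % agreement measure can be computed directly from the two coders' code lists. A minimal Python sketch for two coders; the function name and the example codes are hypothetical:

```python
def percent_agreement(codes_a, codes_b):
    """Share of cases that two coders coded the same way (0.0 to 1.0)."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same cases")
    if not codes_a:
        raise ValueError("no cases to compare")
    same = sum(a == b for a, b in zip(codes_a, codes_b))
    return same / len(codes_a)


# Hypothetical codes assigned to the same five cases by two coders.
coder_1 = ["physical", "cognitive", "perceptual", "physical", "cognitive"]
coder_2 = ["physical", "cognitive", "perceptual", "cognitive", "cognitive"]

agreement = percent_agreement(coder_1, coder_2)

# Indices of the cases the coders should discuss.
disagreements = [i for i, (a, b) in enumerate(zip(coder_1, coder_2)) if a != b]
print(agreement, disagreements)
```

Listing the cases where the coders disagree is often more useful than the number itself, because those cases show where the coding instructions are unclear.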
• Coders can code the same way by chance
• Cohen's Kappa (normally 0-1; 0 – coded the same by chance; 1 – perfect reliability)
• K = (Pa - Pc)/(1 - Pc)
• Pa – percentage of cases on which the coders agree
• Pc – percentage of agreed cases by chance
• More than 60% (K > 0.6) is satisfactory
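The formula K = (Pa - Pc)/(1 - Pc) can be sketched in Python for two coders. This is an illustration, not a validated implementation: it assumes Pc is estimated per category as the product of the two coders' proportions for that category, summed over categories, and the example codes are hypothetical:

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's Kappa for two coders who coded the same cases."""
    n = len(codes_a)
    if n == 0 or n != len(codes_b):
        raise ValueError("both coders must code the same, non-empty set of cases")

    # Pa: observed proportion of cases coded the same way.
    pa = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Pc: agreement expected by chance, from each coder's category proportions.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    pc = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)

    if pc == 1:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return (pa - pc) / (1 - pc)


# Hypothetical codes assigned to the same five cases by two coders.
coder_1 = ["physical", "cognitive", "perceptual", "physical", "cognitive"]
coder_2 = ["physical", "cognitive", "perceptual", "cognitive", "cognitive"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 4))
```

Here the coders agree on 4 of 5 cases (Pa = 0.8), but Kappa is lower than that because part of the agreement would be expected by chance.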
[Table: proportions of cases coded by both coders and expected when coded by chance; table cells not recovered]
• 0.26, 0.12, 0.35 – coded the same by both coders
• 7% coded physical by coder 1, and cognitive by coder 2
• Expected agreement when the data is coded by chance: 0.37*0.39 = 0.14
• K = (Pa - Pc)/(1 - Pc)
• Pa = 0.26 + 0.12 + 0.35 = 0.73
• Pc = 0.14 + 0.04 + 0.18 = 0.36
• K = (0.73 - 0.36)/(1 - 0.36) = 0.58
• K > 0.6 would be ok
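The arithmetic of this example can be checked with a short Python sketch that uses only the proportions given on the slide; the function name is hypothetical:

```python
def kappa_from_proportions(agreed, chance):
    """Cohen's Kappa from per-category observed and chance agreement."""
    pa = sum(agreed)   # proportion coded the same by both coders
    pc = sum(chance)   # proportion expected to agree by chance
    return (pa - pc) / (1 - pc)


agreed = [0.26, 0.12, 0.35]   # coded the same by both coders
chance = [0.14, 0.04, 0.18]   # expected agreement by chance

k = kappa_from_proportions(agreed, chance)
print(round(k, 2))  # 0.58
```

The result is just below the 0.6 threshold, so this coding would need another round of discussion and recoding.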
Subjective versus objective coders
• Subjective/inside coders
– Designed the study, developed the coding scheme, collected the data
– (+) know the literature; know the topic; easier to interpret the data; minimal training
– (-) might be biased and unable to see new patterns, new behaviour
• Objective/outside coders
Multimedia content
• Image, audio, video, screen shots
• Cursor movement tracks, facial expressions, gestures, intonation provide a rich pool of data
• Extremely time consuming
• Same principles as for the text analysis: study the literature; define the scope, context and objectives; identify key instances you want to annotate; analysis; evaluation of the reliability
Approaches
• Manual annotation
– Labor-intensive, might be affected by coder's subjectivity
• Partially automated annotation
– Humans code some sequences that are used to train the application to establish the relationship between low-level features and high-level concepts
• Fully automated annotation
– Highly error-prone
• Other tools: annotating pictures when they are taken on a mobile phone, organizing pictures in a spreadsheet
Exercise 1 – group work
• Take your data
• Make a plan for applying GT for analysis of your data
• Make two iterations with your data
• What is your experience? Advantages? Disadvantages?
Exercise 2 – group work
• Do coding for another group
– Two coders
• What are the problems? Compare the results. What was the agreement? Which cases did you disagree on? What can you say about reliability? What have you done/can you do to assure validity?