Outline models Hidden Markov Models · 2012-05-22 · Hidden Markov Models DepmixS4 Examples...

10
Hidden Markov Models DepmixS4 Examples Conclusions depmixS4: an R-package for hidden Markov models Ingmar Visser 1 & Maarten Speekenbrink 2 1 Department of Psychology University of Amsterdam 2 Department of Psychology University College London Psychometric Computing, February 2011, Tuebingen depmix Hidden Markov Models DepmixS4 Examples Conclusions Outline Hidden Markov Models DepmixS4 Examples Speed-accuracy trade-off Dynamic Change Card Sorting depmix Hidden Markov Models DepmixS4 Examples Conclusions Example model S1 S2 O11 O21 S3 O12 O22 O13 O23 I S 1 , S 2 ,...: discrete states (latent or hidden) I O 11 , O 21 , O 12 ,...: observations (yes/no, RT, . . . ) I For example: O 11 , O 21 are items on a balance scale task I States represent different strategies that change through learning I Dependency between S and O forms the measurement model I Dependency between S’s forms the dynamic part of the depmix Hidden Markov Models DepmixS4 Examples Conclusions Dependent mixture model formulation S1 S2 O11 O21 S3 O12 O22 O13 O23 A A B B B B B B 1. S t = AS t -1 + ξ t , A, a transition matrix 2. O t = B(S t )+ ζ t , B, an observation density 3. Pr (S t |S t -1 ,..., S 1 )= Pr (S t |S t -1 ) (Markov assumption) depmix

Transcript of Outline models Hidden Markov Models · 2012-05-22 · Hidden Markov Models DepmixS4 Examples...

Hidden Markov ModelsDepmixS4Examples

Conclusions

depmixS4: an R-package for hidden Markovmodels

Ingmar Visser1 & Maarten Speekenbrink2

1Department of PsychologyUniversity of Amsterdam

2Department of PsychologyUniversity College London

Psychometric Computing, February 2011, Tuebingen

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Outline

Hidden Markov Models

DepmixS4

ExamplesSpeed-accuracy trade-offDynamic Change Card Sorting

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Example model

S1 S2

O11 O21

S3

O12 O22 O13 O23

I S1,S2, . . .: discrete states (latent or hidden)I O11,O21,O12, . . .: observations (yes/no, RT, . . . )I For example: O11,O21 are items on a balance scale taskI States represent different strategies that change through

learningI Dependency between S and O forms the measurement

modelI Dependency between S’s forms the dynamic part of the

model

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Dependent mixture model formulation

S1 S2

O11 O21

S3

O12 O22 O13 O23

A A

BB B B B B

1. St = ASt−1 + ξt ,A, a transition matrix2. Ot = B(St) + ζt ,B, an observation density3. Pr(St |St−1, . . . ,S1) = Pr(St |St−1) (Markov assumption)

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Likelihood

Pr(O1, . . . ,OT ) =∑

q

T∏t=1

Pr(Ot |St ,A,B),

q an arbitrary hidden state sequence

I q: an enumeration of all possible state sequences (nT )I Leave out the sum over q (St known): complete data

likelihoodI Note: likelihood is not computed directly (impractical for

large T )

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Relationship to other models

1. Latent Markov model2. Dependent Mixture model3. Bayesian network (with latent variables)4. State-space model (discrete)5. Symbolic dynamic model6. Regime switching models7. . . .

S1 S2

O11 O21

S3

O12 O22 O13 O23

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Why do we need them?

1. Piagetian development,conservation, balance scale

2. Concept identification learning3. Strategy switching:

Speed-accuracy trade-off4. Iowa Gambling task5. Weather Prediction task6. Climate change

[Jansen and Van der Maas, 2002]

5 10 15 20

0.0

0.2

0.4

0.6

0.8

1.0

Backwards learning curve

trials

pro

port

ion e

rrors

[Schmittmann et al., 2006]

5.0

5.5

6.0

6.5

7.0

rt0.0

0.4

0.8

corr

0.0

0.2

0.4

0.6

0.8

0 50 100 150

Pacc

Time

speed

[Dutilh et al., 2011,Visser et al., 2009]

Perth dams water inflow

year

GL

1920 1940 1960 1980 2000

0200

400

600

800

Water Corporation ofWestern Australia

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

DepmixS4

I R-packageI depmixS4 fits dependent mixture modelsI mixture components are generalized linear models (and

others . . . )I Markov dependency between components

In short: depmixS4 fits hidden Markov models of generalizedlinear models in both large N, small T as well as N=1, T largesamples.[Visser and Speekenbrink, 2010]

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Transition & initial modelEach row of the transition matrix and the initial stateprobabilities:

I is modeled as a multinomial distributionI uses the logistic link function to include covariatesI can have time-dependent covariates

-1.0 -0.5 0.0 0.5 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Transition probability

covariate

probability

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Response modelsCurrent options for the response models are models from thegeneralized linear modeling framework, and some additionaldistributions.From glm:

I normal distribution; continuous, gaussian dataI binomial (logit, probit); binary dataI Poisson (log); count dataI gamma distribution

Additional distributions:I multinomial (logistic or identity link); multiple choice dataI multivariate normalI exgaus distribution (from the gamlss package); response

time dataI it is easy to add new response distributions

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Optimization

depmixS4 uses:

I EM algorithm (interface to glm functions in R)I Direct optimization of the raw data log likelihood for fitting

contrained models (using Rsolnp)

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

Speed-accuracy: data

5.0

5.5

6.0

6.5

7.0

rt0.0

0.4

0.8

corr

0.0

0.2

0.4

0.6

0.8

0 50 100 150

Pacc

Time

speed

I Three blocks with N=168,134,137 trials (first block shown)I Speeded reaction time taskI Speed and accuracy manipulated by reward variableI Question: is there a single (linear) relationship between

responses and covariate or switching between regimes?More data in: [Dutilh et al., 2011]depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

Speed-accuracy: linear model predictions

Model predicted RTs

trial

RT

0 50 100 150

5.0

5.5

6.0

6.5

7.0

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

Speed-accuracy: two-state model

I FG=fast guessingI SC=stimulus controlledI Response times also modeledI Pay-off for accuracy as covariate on the transition

probabilities

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

Speed-accuracy: two-state model

Fitting this in depmixS4:mod1 <- depmix(list(rt~1,corr~1),+ data=speed, transition=~Pacc, nstates=2,+ family=list(gaussian(), multinomial("identity")),+ ntimes=c(168,134,137))

fm1 <- fit(mod1)

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

Speed-accuracy: switching model

Model predicted RTs

trial

RT

0 50 100 150

5.0

5.5

6.0

6.5

7.0

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

Speed-accuracy: transition probabilities

Transition probability functions

Pacc

0.0 0.5 1.0

0.0

0.5

1.0

P

5.5

6.0

6.5

RT

P(switch from FG to SC)P(stay in SC)RTs on increasing PaccRTs on decreasing Pacc

I transition probabilities as function of covariateI hysteresis: asymmetry between switching from FG to SC

and vice versa

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

What is DCCS?

I task 1: sort by colorI task 2: sort by shapeI measures: ability to

switch/flexibility

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

DCCS: data

I data consists of 6 trials(task 2)

I traditional analyses:1. 0/1 correct:

perseveration2. 5/6 correct: switching

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

DCCS: research questions

I can we characterize theremaining group?

I are there children intransition, shifting fromone strategy toanother?

I alternative: are theysimply guessing?

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

DCCS: theory

I cusp model predictsinstability in thetransitional phase

I shifting back and forthbetween ‘strategies’

I hysteresis: assymetryin transitionprobabilities

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

DCCS: results

I P: perseveration stateI S: switch stateI transition P->S much

larger than transitionS->P

I this model better than amodel withouttransitions

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

DCCS: results

I 30 to 40 % of 3/4 yearolds are in thetransitional phase,shifting betweenstrategies

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Speed-accuracy trade-offDynamic Change Card Sorting

Other applications

I Climate change dataI Learning on the Iowa Gambling TaskI Balance scale taskI Categorization learning

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Take home messages

I depmixS4 can be downloaded from:http://r-forge.r-project.org/depmix/ or from CRAN

I Also models with transient and absorbing statesI Easy to add your own favorite distributionI Paper in Journal of Statistical Software: depmixS4: An

R-package for Hidden Markov ModelsI This is not a psychometrics package, rather a

psychodynamics package

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Thanks

I Thanks to Han van der Maas for the speed-accuracy dataI Thanks to Bianca Beersma for the DCCS data (paper in

press!)I Happy mixing!

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Dutilh, G., Wagenmakers, E.-J., Visser, I., and van der Maas, H. L. J. (2011).A phase transition model for the speed–accuracy trade–off in response time experiments.Cognitive Science.

Jansen, B. R. J. and Van der Maas, H. L. J. (2002).The development of children’s rule use on the balance scale task.Journal of Experimental Child Psychology, 81(4):383–416.

Schmittmann, V. D., Visser, I., and Raijmakers, M. E. J. (2006).Multiple learning modes in the development of rule-based category-learning task performance.Neuropsychologia, 44(11):2079–2091.

Visser, I., Raijmakers, M. E. J., and Van der Maas, H. L. J. (2009).Hidden markov models for individual time series.In Valsiner, J., Molenaar, P. C. M., Lyra, M. C. D. P., and Chaudhary, N., editors, Dynamic ProcessMethodology in the Social and Developmental Sciences, chapter 13, pages 269–289. Springer, New York.

Visser, I. and Speekenbrink, M. (2010).depmixS4: An R-package for hidden Markov models.Journal of Statistical Software, 36(7):1–21.R package, current version available from CRAN.

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Future developments

1. richer measurement models, eg factor models, AR modelsetc

2. richer transition models, eg continuous time measurementoccasions

3. explicit state durations4. identifiability of models5. model selection

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Experiment

I categorization learning experimentI 32 learning blocks with feedbackI 5 transfer blocks without feedback

Research questions:1. detect different patterns of generalization (rule-based vs.

exemplar-based)2. study representational shifts in learning: does

representational format change with learning?

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Data: transfer trials

1. 32 possiblegeneralization patterns(for 5 binary items)

2. Rule 1: AABBB3. Rule 2: BBABA4. Exemplar: ABBBA5. Occurrence of

exemplar-basedresponding increases

6. Data fairly noisy

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Model specification

I 3 states representing Rule 1,Rule 2 and Exemplar basedresponding

I models with 2 to 5 states werefitted

I model selection by BIC

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Results (1)

I the selected model hasanother state correspondingto guessing behavior

I the model includes acovariate on the probability’correct’ in the exemplar state

I this tests the assumption thatconsistency of applying thisstrategy increases withtraining

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Results (2)

I 4 states representingRule 1 (AABBB), Rule 2(BBABA), Exemplar(ABBBA), and guessing

I covariate of blocknumber on probability’correct’ in the exemplarstate (increasingconsistency)

Experiment 1

cond

ition

al p

roba

bilit

y

T1 T2 T4 T5 T6

0.0

0.2

0.4

0.6

0.8

1.0

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Results (3)

I Exemplar state isabsorbing

I Transitions occur mostlyfrom Res to R1 and R2and from rule states tothe Exemplar state

Individual differences in category learning 40

Table 4

Estimated transition matrix of LMM 4a of Experiment 1.

Transition probabilities

State E R1 R2 Res

E 1.00 0.00 0.00 0.00

R1 0.02 0.91 0.03 0.04

R2 0.11 0.00 0.89 0.00

Res 0.09 0.06 0.07 0.78

Note: E denotes exemplar-based generalization, R1 denotes rule 1, R2 denotes rule 2,

Res denotes residual state.

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Results (4)

I Posterior assignment ofresponses to states

I Response times for therule based versusexemplar basedstrategies over training

I Results indicate thatexemplar basedresponding is anexpression ofautomatization

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Results (5)

I Automatization shouldresult in a power law oflearning

I Mean and sd haveidentical coefficients

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Data

I 5 distance items on thebalance scale task

I age as covariateI items scored binary

Item sum scores

sum

Frequency

0 1 2 3 4 5

050

100

200

Data provided by BrendaJansen (Jansen & Van derMaas, 2002)

depmix

Hidden Markov ModelsDepmixS4Examples

Conclusions

Categorization learningBalance scale

Model3-class model with age as covariate on the class proportions

6 8 10 12 14 16 18

0.0

0.2

0.4

0.6

0.8

1.0

Class probabilities as function of age

age (years)

prob

abili

ty

incorrectguesscorrect

depmix