Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation...

70
Introduction Score-based methods Bayesian analysis Constraint-based methods Summary and challenges Things I did not even get near References Structure Estimation in Graphical Models Steffen Lauritzen, University of Oxford Wald Lecture, World Meeting on Probability and Statistics Istanbul 2012 Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Transcript of Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation...

Page 1: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure Estimation in Graphical Models

Steffen Lauritzen, University of Oxford

Wald Lecture, World Meeting on Probability and Statistics

Istanbul 2012

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 2: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Advances in computing has set focus on estimation of structure:

I Model selection (e.g. subset selection in regression)

I System identification (engineering)

I Structural learning (AI or machine learning)

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 3: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Advances in computing has set focus on estimation of structure:

I Model selection (e.g. subset selection in regression)

I System identification (engineering)

I Structural learning (AI or machine learning)

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 4: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Advances in computing has set focus on estimation of structure:

I Model selection (e.g. subset selection in regression)

I System identification (engineering)

I Structural learning (AI or machine learning)

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 5: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Advances in computing has set focus on estimation of structure:

I Model selection (e.g. subset selection in regression)

I System identification (engineering)

I Structural learning (AI or machine learning)

Graphical models describe conditional independence structures, sogood case for formal analysis.

Methods must scale well with data size, as many structures andhuge collections of data are to be considered.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 6: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Why estimation of structure?

I Parallel to e.g. density estimation

I Obtain quick overview of relations between variables incomplex systems

I Data mining

I Gene regulatory networks

I Reconstructing family trees from DNA information

I General interest in sparsity.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 7: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Why estimation of structure?

I Parallel to e.g. density estimation

I Obtain quick overview of relations between variables incomplex systems

I Data mining

I Gene regulatory networks

I Reconstructing family trees from DNA information

I General interest in sparsity.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 8: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Why estimation of structure?

I Parallel to e.g. density estimation

I Obtain quick overview of relations between variables incomplex systems

I Data mining

I Gene regulatory networks

I Reconstructing family trees from DNA information

I General interest in sparsity.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 9: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Why estimation of structure?

I Parallel to e.g. density estimation

I Obtain quick overview of relations between variables incomplex systems

I Data mining

I Gene regulatory networks

I Reconstructing family trees from DNA information

I General interest in sparsity.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 10: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Why estimation of structure?

I Parallel to e.g. density estimation

I Obtain quick overview of relations between variables incomplex systems

I Data mining

I Gene regulatory networks

I Reconstructing family trees from DNA information

I General interest in sparsity.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 11: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Why estimation of structure?

I Parallel to e.g. density estimation

I Obtain quick overview of relations between variables incomplex systems

I Data mining

I Gene regulatory networks

I Reconstructing family trees from DNA information

I General interest in sparsity.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 12: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Markov mesh model

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 13: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

PC algorithm

Crudest algorithm (HUGIN), 10000 simulated cases

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 14: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Bayesian GES

Crudest algorithm (WinMine), 10000 simulated cases

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 15: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Tree model

PC algorithm, 10000 cases, correct reconstruction

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 16: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Bayesian GES on tree

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 17: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Chest clinic

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 18: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

PC algorithm

10000 simulated cases

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 19: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Bayesian GES

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 20: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

SNPs and gene expressions

●●

●●

●●

●●●

●●

● ● ● ●

●●●

●●●●

● ●●

●●

●●

●●

●●●

●●●

●●

●●

● ●●

●●

●●

●●

● ●

● ●

●●

● ●

●●

●●●

●●

●●

●●●

●●

● ●

●●

● ●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●● ●

●● ●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

1

2

3

45

6

7

8

9

1011

12

13

14

15

16

17

18

1920

21

2223 24

25 26

2728

29

30

31

32

3334

3536

37

38

39 40

41

4243

44

45

46

4748

49

50

51

52

53

54

55

5657

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

848586

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101102

103

104

105

106

107

108

109 110

111

112

113 114

115116

117

118

119

120 121

122

123

124

125

126127

128

129130

131

132133

134

135

136

137

138

139

140

141

142143

144

145

146147

148

149

150

151

152

153

154

155

156

157

158

159

160161

162

163

164

165

166

167

168169

170

171

172

173

174

175

176

177

178

179

180

181182

183

184

185

186

187

188

189

190

191

192

193194

195

196

197

198 199

200

201

202

203

204

205

206

207

208

209210 211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253254

255

256

257

258

259260

261

262

263

264

265

266

267

268

269

270

271

272

273 274

275

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292293

294

295296

297

298

299

300301

302

303

304

305

306

307

308

309

310

311312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361 362

363

364

365

366367

368369

370

371

372

373374

375

376

377

378

379

380

381

382

383

384

385

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

min BIC forest

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 21: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Methods for structure identification in graphical models can beclassified into three types:

I score-based methods: For example optimizing a penalizedlikelihood by using convex programming e.g. glasso;

I Bayesian methods: Identifying posterior distributions overgraphs; can also use posterior probability as score.

I constraint-based methods: Querying conditionalindependences and identifying compatible independencestructures, for example PC, PC*, NPC, IC, CI, FCI, SIN, QP,. . .

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 22: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Methods for structure identification in graphical models can beclassified into three types:

I score-based methods: For example optimizing a penalizedlikelihood by using convex programming e.g. glasso;

I Bayesian methods: Identifying posterior distributions overgraphs; can also use posterior probability as score.

I constraint-based methods: Querying conditionalindependences and identifying compatible independencestructures, for example PC, PC*, NPC, IC, CI, FCI, SIN, QP,. . .

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 23: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Structure estimationSome examplesGeneral points

Methods for structure identification in graphical models can beclassified into three types:

I score-based methods: For example optimizing a penalizedlikelihood by using convex programming e.g. glasso;

I Bayesian methods: Identifying posterior distributions overgraphs; can also use posterior probability as score.

I constraint-based methods: Querying conditionalindependences and identifying compatible independencestructures, for example PC, PC*, NPC, IC, CI, FCI, SIN, QP,. . .

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 24: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Assume P factorizes w.r.t. an unknown tree τ . Based on a samplex (n) = (x1, . . . , xn) the likelihood function becomes

`(τ, p) = log L(τ) = log p(x (n) | τ)

=∑

e∈E(τ)

log p(x(n)e )−

∑v∈V{deg(v)− 1} log p(x

(n)v ).

Maximizing this over p for a fixed tree τ yields the profile likelihood

l(τ) = l(τ, p) =∑

e∈E(τ)

Hn(e) +∑v∈V

Hn(v)

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 25: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Here Hn(e) is the empirical cross-entropy or mutual informationbetween endpoint variables of the edge e = {u, v}:

Hn(e) =∑ n(xu, xv )

nlog

n(xu, xv )/n

n(xu)n(xv )/n2

and similarly Hn(v) is the empirical entropy of Xv .

Based on this fact, (Chow and Liu, 1968) showed MLE τ of T hasmaximal weight, where the weight of τ is

w(τ) =∑

e∈E(τ)

Hn(e) = l(τ)− l(φ0).

Here φ0 is the graph with all vertices isolated.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 26: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Fast algorithms (Kruskal Jr., 1956) compute maximal weightspanning tree from weights W = (wuv , u, v ∈ V ).

Chow and Wagner (1978) show a.s. consistency in total variationof P: If P factorises w.r.t. τ , then

supx|p(x)− p(x)| → 0 for n→∞,

so if τ is unique for P, τ = τ for all n > N for some N.

If P does not factorize w.r.t. a tree, P converges to closesttree-approximation P to P (Kullback-Leibler distance).

Note that if P is Markov w.r.t. and undirected graph G and wedefine weights on edges as w(e) = H(e) by cross-entropy, P isMarkov w.r.t any maximal weight spanning tree of G.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 27: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Forests

Note that if we consider forests instead of trees, i.e. allow missingbranches, it is still true that

l(φ) = l(φ, p) =∑

e∈E(φ)

Hn(e) +∑v∈V

Hn(v).

Thus if we add a penalty for each edge, e.g. proportional to thenumber of additional parameters qe of introducing the edge e, wecan find the maximum penalised forest by Kruskal’s algorithmusing weights

w(φ) =∑

e∈E(φ)

{Hn(e)− λqe} .

This has been exploited in the package gRapHD (Edwards et al.,2010), see also Panayidou (2011) and Højsgaard et al. (2012).

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 28: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

SNPs and gene expressions

●●

●●

●●

●●●

●●

● ● ● ●

●●●

●●●●

● ●●

●●

●●

●●

●●●

●●●

●●

●●

● ●●

●●

●●

●●

● ●

● ●

●●

● ●

●●

●●●

●●

●●

●●●

●●

● ●

●●

● ●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●● ●

●● ●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

●●

●●

●●

●●

●●

1

2

3

45

6

7

8

9

1011

12

13

14

15

16

17

18

1920

21

2223 24

25 26

2728

29

30

31

32

3334

3536

37

38

39 40

41

4243

44

45

46

4748

49

50

51

52

53

54

55

5657

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

848586

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101102

103

104

105

106

107

108

109 110

111

112

113 114

115116

117

118

119

120 121

122

123

124

125

126127

128

129130

131

132133

134

135

136

137

138

139

140

141

142143

144

145

146147

148

149

150

151

152

153

154

155

156

157

158

159

160161

162

163

164

165

166

167

168169

170

171

172

173

174

175

176

177

178

179

180

181182

183

184

185

186

187

188

189

190

191

192

193194

195

196

197

198 199

200

201

202

203

204

205

206

207

208

209210 211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253254

255

256

257

258

259260

261

262

263

264

265

266

267

268

269

270

271

272

273 274

275

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292293

294

295296

297

298

299

300301

302

303

304

305

306

307

308

309

310

311312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361 362

363

364

365

366367

368369

370

371

372

373374

375

376

377

378

379

380

381

382

383

384

385

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

min BIC forest

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 29: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Gaussian Trees

If X = (Xv , v ∈ V ) is regular multivariate Gaussian, it factorizesw.r.t. an undirected graph if and only if its concentration matrixK = Σ−1 satisfies

kuv = 0 ⇐⇒ u 6∼ v .

Results of Chow et al. are easily extended to Gaussian trees, withthe weight of a tree determined as

w(τ) =∑

e∈E(τ)

− log(1− r2e ),

with r2e being the empirical correlation coeffient between Xu and

Xv .

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 30: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

I Direct likelihood methods (ignoring penalty terms) lead tosensible results.

I (Boootstrap) sampling distribution of tree MLE can besimulated

I Penalty terms additive along branches, so highest AIC or BICscoring tree (forest) also available using a MWST algorithm.

I Tree methods scale extremely well with both sample size andnumber of variables;

I Pairwise marginal counts are sufficient statistics for the treeproblem (empirical covariance matrix in the Gaussian case).

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 31: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

I Direct likelihood methods (ignoring penalty terms) lead tosensible results.

I (Boootstrap) sampling distribution of tree MLE can besimulated

I Penalty terms additive along branches, so highest AIC or BICscoring tree (forest) also available using a MWST algorithm.

I Tree methods scale extremely well with both sample size andnumber of variables;

I Pairwise marginal counts are sufficient statistics for the treeproblem (empirical covariance matrix in the Gaussian case).

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 32: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

I Direct likelihood methods (ignoring penalty terms) lead tosensible results.

I (Boootstrap) sampling distribution of tree MLE can besimulated

I Penalty terms additive along branches, so highest AIC or BICscoring tree (forest) also available using a MWST algorithm.

I Tree methods scale extremely well with both sample size andnumber of variables;

I Pairwise marginal counts are sufficient statistics for the treeproblem (empirical covariance matrix in the Gaussian case).

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 33: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

I Direct likelihood methods (ignoring penalty terms) lead tosensible results.

I (Boootstrap) sampling distribution of tree MLE can besimulated

I Penalty terms additive along branches, so highest AIC or BICscoring tree (forest) also available using a MWST algorithm.

I Tree methods scale extremely well with both sample size andnumber of variables;

I Pairwise marginal counts are sufficient statistics for the treeproblem (empirical covariance matrix in the Gaussian case).

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 34: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

I Direct likelihood methods (ignoring penalty terms) lead tosensible results.

I (Boootstrap) sampling distribution of tree MLE can besimulated

I Penalty terms additive along branches, so highest AIC or BICscoring tree (forest) also available using a MWST algorithm.

I Tree methods scale extremely well with both sample size andnumber of variables;

I Pairwise marginal counts are sufficient statistics for the treeproblem (empirical covariance matrix in the Gaussian case).

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 35: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

I Direct likelihood methods (ignoring penalty terms) lead tosensible results.

I (Boootstrap) sampling distribution of tree MLE can besimulated

I Penalty terms additive along branches, so highest AIC or BICscoring tree (forest) also available using a MWST algorithm.

I Tree methods scale extremely well with both sample size andnumber of variables;

I Pairwise marginal counts are sufficient statistics for the treeproblem (empirical covariance matrix in the Gaussian case).

Note sufficiency holds despite parameter space very different fromopen subset of Rk .

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 36: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Consider an undirected Gaussian graphical model and thel1-penalized log-likelihood function

2`pen(K ) = log detK − tr(KW )− λ||K ||∗1.

The penalty ||K ||∗1 =∑

i 6=j |kij | is essentially a convex relaxation ofthe number of edges in the graph and optimization of thepenalized likelihood will typically lead to several kij = 0 and thus ineffect estimate a particular graph.

This penalized likelihood can be maximized efficiently (Banerjeeet al., 2008) as implemented in the graphical lasso (Friedmanet al., 2008).

Beware: not scale-invariant!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 37: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

glasso for bodyfat

●Weight

●Knee

●Thigh

●Hip

●Abdomen

●Ankle ●Biceps

●Forearm

●Height

●Wrist

●Neck

●Chest

●Age

●BodyFat

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 38: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Markov equivalence

D and D′ are equivalent if and only if:

1. D and D′ have same skeleton (ignoring directions)

2. D and D′ have same unmarried parents

so

s - ss@@R? s ≡ s - s s

s?@@I

but

s - s -

s@@R? s 6≡ s - s - s

s6@

@R

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 39: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Equivalence class searches

Searches directly in equivalence classes of DAGs.

Define score function σ(D), with the property that

D ≡ D′ ⇒ σ(D) = σ(D′).

This holds e.g. if score function is AIC or BIC or full Bayesianposterior with strong hyper Markov prior (based upon Dirichlet orinverse Wishart distributions).

Equivalence class with maximal score is sought.

dlasso? problems with invariance over equivalence classes!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 40: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Estimating trees and forestsGraphical lassoScore-based methods for DAGs

Greedy equivalence class search

1. Initialize with empty DAG

2. Repeatedly search among equivalence classes with a singleadditional edge and go to class with highest score - until noimprovement.

3. Repeatedly search among equivalence classes with a singleedge less and move to one with highest score - until noimprovement.

For BIC, for example, this algorithm yields consistent estimate ofequivalence class for P. (Chickering, 2002)

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 41: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Basic Bayesian model estimationDecomposable graphical modelsBayesian methods for DAGs

For g in specified set of graphs, Θg is associated parameter spaceso that P factorizes w.r.t. g if and only if P = Pθ for some θ ∈ Θg .

πg is prior on Θg . Prior p(g) is uniform for simplicity.

Based on x (n), posterior distribution of G is

p∗(g) = p(g | x (n)) ∝ p(x (n) | g) =

∫Θg

p(x (n) | g , θ)πg (dθ).

Bayesian analysis looks for MAP estimate g∗ maximizing p∗(g) orattempts to sample from posterior, using e.g. MCMC.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 42: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Basic Bayesian model estimationDecomposable graphical modelsBayesian methods for DAGs

For connected decomposable graphs and strong hyper Markovpriors Dawid and Lauritzen (1993) show

p(x (n) | g) =

∏C∈C p(x

(n)C )∏

S∈S p(x(n)S )

,

where each factor has explicit form. C are the cliques of g and Sthe separators (mininal cutsets).

Hence, if the prior distributions over a class of graphs is uniform,the posterior distribution has the form

p(g | x (n)) ∝∏

C∈C p(x(n)C )∏

S∈S p(x(n)S )

=

∏C∈C w(C )∏S∈S w(S)

.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 43: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Basic Bayesian model estimationDecomposable graphical modelsBayesian methods for DAGs

Byrne (2011) shows that a distribution over decomposable graphshaving the form

p(g) ∝∏

C∈C w(C )∏S∈S w(S)

,

satisfies a structural Markov property so that for subsets A and Bwith V = A ∪ B, it holds that

GA⊥⊥GB | (A,B) is a decomposition of G.

In words, the finer structure of the graph bits in two componentsof any decomposition are independent.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 44: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Basic Bayesian model estimationDecomposable graphical modelsBayesian methods for DAGs

Trees are decomposable, so for trees we get

p(τ | x (n)) ∝∏

e∈E(τ)

p(x(n)e )

which more illuminating can be expressed as

p(τ | x (n)) ∝∏

e∈E(τ)

Be

where

Buv =p(x

(n)uv )

p(x(n)u )p(x

(n)v )

is the Bayes factor for independence along the edge uv .

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 45: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Basic Bayesian model estimationDecomposable graphical modelsBayesian methods for DAGs

MAP estimates of trees can be computed (also in Gaussian case)

Good direct algorithms exist for generating random spanning trees(Guenoche, 1983; Broder, 1989; Aldous, 1990; Propp and Wilson,1998) so full posterior analysis is possible for trees.

MCMC methods for exploring posteriors of undirected graphs havebeen developed.

Even for forests, it seems complicated to sample from the posteriordistribution (Dai, 2008).

p(φ | x (n)) ∝∏

e∈E(φ)

Be

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 46: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Basic Bayesian model estimationDecomposable graphical modelsBayesian methods for DAGs

Some challenges for undirected graphs

I Find feasible algorithm for (perfect) simulation from adistribution over decomposable graphs as

p(g) ∝∏

C∈C w(C )∏S∈S w(S)

,

where w(A),A ⊆ V are a prescribed set of positive weights.

I Find feasible algorithm for obtaining MAP in decomposablecase. This may not be universally possible as problem isNP-complete, even for bounded maximum clique size.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 47: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Basic Bayesian model estimationDecomposable graphical modelsBayesian methods for DAGs

Posterior distribution for DAG

For strong directed hyper Markov priors it holds that

p(x (n) | d) =∏v∈V

p(x(n)v | x (n)

pa(v))

sop(d | x (n)) ∝

∏v∈V

p(x(n)v | x (n)

pa(v)),

see e.g. Spiegelhalter and Lauritzen (1990); Cooper and Herskovits(1992); Heckerman et al. (1995)Challenge: Find good algorithm for sampling from this fullposterior.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 48: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

PC algorithm

First step of constraint-based methods (eg PC-algorithm) is toidentify skeleton of G, which is the undirected graph with α 6∼ β ifand only if there exists S ⊆ V \ {α, β} with α⊥g β |S .

The skeleton stage of any such algorithm is thus correct for anytype of graph which represents the independence structure!

In particular, the the PC algorithm can easily be adapted to UGsand CHGs (Meek, 1996).

Challenge: the properties of these algorithms when translated tosampling situations are not sufficiently well understood.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 49: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

PC algorithm

Step 1: Identify skeleton using that, for P faithful,

u 6∼ v ⇐⇒ ∃S ⊆ V \ {u, v} : Xu ⊥⊥Xv | XS .

Begin with complete graph, check for S = ∅ andremove edges when independence holds. Thencontinue for increasing |S |.PC-algorithm (Spirtes et al., 1993) exploits that onlyS with S ⊆ ne(u) or S ⊆ ne(v) needs checking, nerefers to current skeleton.

Step 2: Identify directions to be consistent withindependence relations found in Step 1.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 50: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

PC algorithm

pcalg for bodyfat

BodyFat Age

Weight

Height

Neck

Chest

Abdomen

Hip

Thigh

Knee

Ankle

Biceps

Forearm

Wrist

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 51: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

PC algorithm

pcalg for carcass

Fat11

Meat11

Fat12

Meat12

Fat13

Meat13

LeanMeat

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 52: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

PC algorithm

Faithfulness

A given distribution P is in general compatible with a variety ofstructures, i.e. if P corresponds to complete independence. Toidentify a structure G something like the following must hold

P is said to be faithful to G if

A⊥⊥B | XS ⇐⇒ A⊥g B | S .

Most distributions are faithful. More precisely, for DAGs it holdsthat the non-faithful distributions form a Lebesgue null-set inparameter space associated with a DAG.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 53: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

PC algorithm

Exact properties of PC-algorithm

If P is faithful to DAG D, PC-algorithm finds D′ equivalent to D.It uses N independence checks where N is at most

N ≤ 2

(|V |2

) d∑i=0

(|V | − 1

i

)≤ |V |

d+1

(d − 1)!,

where d is the maximal degree of any vertex in D. f So worst casecomplexity is exponential, but algorithm fast for sparse graphs.Sampling properties are less well understood although consistencyresults exist.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 54: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

PC algorithm

Constraint based methods establish fundamentally two lists:

1. An independence list I of triplets (α, β, S) with α⊥⊥β |S ;identifies skeleton;

2. A dependence list D of triplets (α, β, S) with ¬(α⊥⊥β | S).

They get established recursively, for increasing size of S , and thelist I is most likely to have errors. The lists may or may not beconsistent with a DAG model, but methods exist for checking this.Question: Assume a current pair of input lists is consistent withcompositional graphoid axioms and a new triplet (α, β, S) isconsidered. Can it be verified whether it can be consistently addedto any of the two lists? Graph representable? Compositional?

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 55: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Seems out of hand to extend Bayesian and score basedmethods to more general graphical models;

I Fully Bayesian methods do typically not scale well withnumber of variables;

I Constraint based methods have less clear formal inferencebasis; challenge to improve this.

I Constraint based methods have been developed to work incases where P is faithful and conditional independence queriescan be resolved without error.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 56: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Seems out of hand to extend Bayesian and score basedmethods to more general graphical models;

I Fully Bayesian methods do typically not scale well withnumber of variables;

I Constraint based methods have less clear formal inferencebasis; challenge to improve this.

I Constraint based methods have been developed to work incases where P is faithful and conditional independence queriescan be resolved without error.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 57: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Seems out of hand to extend Bayesian and score basedmethods to more general graphical models;

I Fully Bayesian methods do typically not scale well withnumber of variables;

I Constraint based methods have less clear formal inferencebasis; challenge to improve this.

I Constraint based methods have been developed to work incases where P is faithful and conditional independence queriescan be resolved without error.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 58: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Seems out of hand to extend Bayesian and score basedmethods to more general graphical models;

I Fully Bayesian methods do typically not scale well withnumber of variables;

I Constraint based methods have less clear formal inferencebasis; challenge to improve this.

I Constraint based methods have been developed to work incases where P is faithful and conditional independence queriescan be resolved without error.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 59: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Factorisation properties for Markov distributions;

I Local computation algorithms for speeding up any relevantcomputation;

I Causality and causal interpretations;

I Non-parametric graphical models for large scale data analysis;

I Probabilistic expert systems, for example in forensicidentification problems;

I Markov theory for infinite graphs (huge graphs).

I THANK YOU FOR LISTENING!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 60: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Factorisation properties for Markov distributions;

I Local computation algorithms for speeding up any relevantcomputation;

I Causality and causal interpretations;

I Non-parametric graphical models for large scale data analysis;

I Probabilistic expert systems, for example in forensicidentification problems;

I Markov theory for infinite graphs (huge graphs).

I THANK YOU FOR LISTENING!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 61: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Factorisation properties for Markov distributions;

I Local computation algorithms for speeding up any relevantcomputation;

I Causality and causal interpretations;

I Non-parametric graphical models for large scale data analysis;

I Probabilistic expert systems, for example in forensicidentification problems;

I Markov theory for infinite graphs (huge graphs).

I THANK YOU FOR LISTENING!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 62: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Factorisation properties for Markov distributions;

I Local computation algorithms for speeding up any relevantcomputation;

I Causality and causal interpretations;

I Non-parametric graphical models for large scale data analysis;

I Probabilistic expert systems, for example in forensicidentification problems;

I Markov theory for infinite graphs (huge graphs).

I THANK YOU FOR LISTENING!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 63: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Factorisation properties for Markov distributions;

I Local computation algorithms for speeding up any relevantcomputation;

I Causality and causal interpretations;

I Non-parametric graphical models for large scale data analysis;

I Probabilistic expert systems, for example in forensicidentification problems;

I Markov theory for infinite graphs (huge graphs).

I THANK YOU FOR LISTENING!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 64: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Factorisation properties for Markov distributions;

I Local computation algorithms for speeding up any relevantcomputation;

I Causality and causal interpretations;

I Non-parametric graphical models for large scale data analysis;

I Probabilistic expert systems, for example in forensicidentification problems;

I Markov theory for infinite graphs (huge graphs).

I THANK YOU FOR LISTENING!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 65: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

I Factorisation properties for Markov distributions;

I Local computation algorithms for speeding up any relevantcomputation;

I Causality and causal interpretations;

I Non-parametric graphical models for large scale data analysis;

I Probabilistic expert systems, for example in forensicidentification problems;

I Markov theory for infinite graphs (huge graphs).

I THANK YOU FOR LISTENING!

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 66: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Aldous, D. (1990). A random walk construction of uniformspanning trees and uniform labelled trees. SIAM Journal onDiscrete Mathematics, 3(4):450–465.

Banerjee, O., Ghaoui, L. E., and dAspremont, A. (2008). Modelselection through sparse maximum likelihood estimation formultivariate Gaussian or binary data. 9:485–216.

Broder, A. (1989). Generating random spanning trees. In ThirtiethAnnual Symposium of Computer Science, pages 442–447.

Byrne, S. (2011). Hyper and Structural Markov Laws for GraphicalModels. PhD thesis, Statistical Laboratory, University ofCambridge.

Chickering, D. M. (2002). Optimal structure identification withgreedy search. Journal of Machine Learning Research,3:507–554.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 67: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Chow, C. K. and Liu, C. N. (1968). Approximating discreteprobability distributions with dependence trees. IEEETransactions on Information Theory, 14:462–467.

Chow, C. K. and Wagner, T. J. (1978). Consistency of an estimateof tree-dependent probability distributions. IEEE Transactionson Information Theory, 19:369–371.

Cooper, G. F. and Herskovits, E. (1992). A Bayesian method forthe induction of probabilistic networks from data. MachineLearning, 9:309–347.

Dai, H. (2008). Perfect sampling methods for random forests.Advances in Applied Probability, 40:897–917.

Dawid, A. P. and Lauritzen, S. L. (1993). Hyper Markov laws inthe statistical analysis of decomposable graphical models. TheAnnals of Statistics, 21:1272–1317.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 68: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Edwards, D., de Abreu, G., and Labouriau, R. (2010). Selectinghigh-dimensional mixed graphical models using minimal AIC orBIC forests. BMC Bioinformatics, 11(1):18.

Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inversecovariance estimation with the graphical lasso. Biostatistics,9(3):432–441.

Guenoche, A. (1983). Random spanning tree. Journal ofAlgorithms, 4(3):214 –220.

Heckerman, D., Geiger, D., and Chickering, D. M. (1995).Learning Bayesian networks: The combination of knowledge andstatistical data. Machine Learning, 20:197–243.

Højsgaard, S., Edwards, D., and Lauritzen, S. (2012). GraphicalModels with R. Springer-Verlag, New York.

Kruskal Jr., J. B. (1956). On the shortest spanning subtree of aSteffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 69: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

graph and the travelling salesman problem. Proceedings of theAmerican Mathematical Society, 7:48–50.

Meek, C. (1996). Selecting Graphical Models: Causal andStatistical Reasoning. PhD thesis, Carnegie–Mellon University.

Panayidou, K. (2011). Estimation of Tree Structure for VariableSelection. PhD thesis, Department of Statistics, University ofOxford.

Propp, J. G. and Wilson, D. B. (1998). How to get a perfectlyrandom sample from a generic Markov chain and generate arandom spanning tree of a directed graph. Journal ofAlgorithms, 27(2):170 –217.

Spiegelhalter, D. J. and Lauritzen, S. L. (1990). Sequentialupdating of conditional probabilities on directed graphicalstructures. Networks, 20:579–605.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models

Page 70: Structure Estimation in Graphical Modelssteffen/seminars/waldstructure.pdf · Structure estimation Some examples General points Advances in computing has set focus on estimation of

IntroductionScore-based methods

Bayesian analysisConstraint-based methods

Summary and challengesThings I did not even get near

References

Spirtes, P., Glymour, C., and Scheines, R. (1993). Causation,Prediction and Search. Springer-Verlag, New York. Reprinted byMIT Press.

Steffen Lauritzen, University of Oxford Structure Estimation in Graphical Models