Page 1:

Tales (and heads) of statistics in large genetic studies

Ken Rice

Associate Professor

Analysis Committee Chair, CHARGE consortium

http://faculty.washington.edu/kenrice

Page 2:

Q. What do you do?

Like most faculty, my time is split;

• Teaching courses

• Advising students (Training Grant)

• Developing new statistical methods

• … and Cardiovascular disease research

Page 3:

Page 4:

Q. What do you do in GWAS?

[Figure: Y plotted against G; is p < 5x10^-8?]

Basically, it’s embarrassingly simple…

Page 5:

Q. What does 5x10^-8 mean?

5x10^-8 is 0.00000005; a 1-in-20-million chance, or 5 millionths of 1 percent. Which of these are more/less likely?

A. You are struck by lightning, this year

B. Your 1 ticket wins WA’s Lottery Jackpot

C. You (born today) live to 110 years old

1 in 7 million

1 in a million

1 in 250 million

Page 6:

Q. What does that mean, in familiar terms?

• Someone is tossing coins; who?

Nice, ineffectual

Causes deaths!

Page 7:

Q. Dudley D-R or Snidely W?

Suppose we see;

• 2 heads in a row; p=1/4

• 3 heads in a row; p=1/8

Neither of these would be very suspicious

Page 8:

Q. Dudley or Snidely, in a GWAS?

How many heads in a row gives p < 5x10^-8?

p = (1/2)^n, for n heads in a row

• In GWAS, seeing ‘only’ 24 heads in a row isn’t enough to make us suspicious (!)
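
The threshold can be checked directly. This is a minimal sketch, assuming a fair coin so that n heads in a row has probability (1/2)^n; the name `heads_needed` is made up for illustration and does not come from the slides.

```python
import math

GWAS_THRESHOLD = 5e-8  # the genome-wide significance level quoted on the slides


def heads_needed(threshold: float) -> int:
    """Smallest n for which n heads in a row from a fair coin has p < threshold."""
    n = math.ceil(math.log2(1 / threshold))
    # Guard against the boundary case where (1/2)^n equals the threshold exactly.
    while 0.5 ** n >= threshold:
        n += 1
    return n


n = heads_needed(GWAS_THRESHOLD)
# Compare the probability one head short of the threshold with the one that crosses it.
print(n, 0.5 ** (n - 1), 0.5 ** n)
```

With 24 heads the probability is still a little above 5x10^-8; it takes a 25th head to get below it.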

Page 9:

Q. Dudley or Snidely? (harder)

Suppose, unknown to us and the coin-tosser, the coin was a little biased?

Heads comes up more often than usual; we’d be suspicious too soon

Page 10:

Q. Dudley or Snidely? (harder)

How much does it matter? If the coin actually has a 55% chance of heads;

• 3 heads in a row; 0.55^3 = 16.6%

• but we’d think; 0.5^3 = 12.5%

We’d be 1.33 times too suspicious; about the same as an extra 4/10 of a head, from a fair coin
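
The arithmetic behind these numbers is short enough to write out. The sketch below assumes the 16.6% and 12.5% figures are 0.55^3 and 0.5^3, and measures the excess suspicion in fair-coin heads by taking log base 2 of their ratio; the variable names are illustrative only.

```python
import math

p_heads = 0.55  # the slightly biased coin from the slide
n = 3           # heads in a row

true_prob = p_heads ** n   # what actually happens with the biased coin
assumed_prob = 0.5 ** n    # what we compute, believing the coin is fair

ratio = true_prob / assumed_prob  # how many times too suspicious we are
extra_heads = math.log2(ratio)    # the same factor, measured in fair-coin heads

print(f"{true_prob:.3f} {assumed_prob:.3f} {ratio:.2f} {extra_heads:.2f}")
```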

Page 11:

Q. How does this affect GWAS?

We thought 3+0.4 heads, not 3

We’d think 29.0 heads, not 25 (!)

We’d think 26.7 heads, not 23 (!!!)
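
One reading that reproduces the 23-versus-26.7 comparison: a run of heads from the 55% coin is stretched by a factor of log(0.5)/log(0.55) relative to the run of fair-coin heads with the same true probability. The sketch below assumes that reading; the function name `naive_heads` is hypothetical.

```python
import math


def naive_heads(true_heads: float, p_heads: float = 0.55) -> float:
    """Run length from a biased coin that is exactly as improbable as
    `true_heads` heads in a row from a fair coin.

    Solves p_heads**m == 0.5**true_heads for m. Someone who wrongly assumes
    the coin is fair would count all m heads as evidence.
    """
    return true_heads * math.log(0.5) / math.log(p_heads)


for h in (3, 23, 25):
    print(h, round(naive_heads(h), 1))
```

On this reading, evidence truly worth 23 heads (short of the 25 needed) gets counted as more than 25, which is how a spurious signal crosses the GWAS threshold.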

Page 12:

Q. How does this affect GWAS?

Inflation exactly like this happens in GWAS;

• If many tests are only slightly ‘wrong’, there will be many spurious signals

• E.g. some variants are more common in Scots…

• We can fix it, by ‘angling down’ the line so it behaves correctly at p=0.5 (i.e. at 1 head)
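
The ‘angling down’ fix resembles what is usually called genomic control: rescale every test statistic so that the median lands where a correct test would put it, at p = 0.5. The slides do not name the method, so treat that identification as an assumption. The statistics below are simulated with a made-up 10% inflation; they are not real GWAS results.

```python
import math
import random
import statistics

# Median of a 1-df chi-squared distribution, i.e. the statistic at p = 0.5.
CHI2_1DF_MEDIAN = 0.4549364


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail p-value for a 1-df chi-squared statistic."""
    return math.erfc(math.sqrt(stat / 2))


random.seed(1)
# Simulated null test statistics that are all slightly 'wrong': inflated by 10%.
inflated = [1.1 * random.gauss(0, 1) ** 2 for _ in range(100_000)]

# Inflation factor: observed median over the median expected under the null.
lam = statistics.median(inflated) / CHI2_1DF_MEDIAN

# 'Angle down' the line: divide every statistic by the inflation factor.
corrected = [s / lam for s in inflated]

before = sum(chi2_1df_pvalue(s) < 0.001 for s in inflated)
after = sum(chi2_1df_pvalue(s) < 0.001 for s in corrected)
print(f"lambda = {lam:.3f}; p < 0.001: {before} before, {after} after")
```

Dividing by a single factor only works when every test is off by roughly the same amount, which is the situation this slide describes.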

Page 13:

Q. Is that the only problem? (no)

Back to our cartoon – and a fair coin;

Computers work out p; …actually, they* just work out the approximate value of p

*…even the cool stylish ones

Page 14:

Q. What’s the right answer?

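[Figure: bar chart of the exact probabilities for 0 to 5 heads in 5 tosses of a fair coin; 1/32, 5/32, 10/32, 10/32, 5/32, 1/32]

p for 5/5 heads = 1/32 = 0.031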
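
The exact answer can be computed by counting. This sketch assumes the fractions on the slide are the probabilities of 0 to 5 heads in 5 tosses of a fair coin (1/32, 5/32, 10/32, 10/32, 5/32, 1/32), which is a reading of the garbled slide text rather than something it states outright.

```python
from fractions import Fraction
from math import comb

n = 5  # tosses of a fair coin

# Exact probability of each possible number of heads: C(n, k) / 2^n.
# Printed unreduced so the denominators all match.
for k in range(n + 1):
    print(f"{k} heads: {comb(n, k)}/{2 ** n}")

# Exact p-value for the most extreme outcome, 5 heads out of 5.
p_exact = Fraction(comb(n, n), 2 ** n)
print(float(p_exact))
```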

Page 15:

Q. What’s the approximate answer?

p = 0.031

Area = 0.033

(i.e. 4.9 heads)
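
One way to arrive at an area close to 0.033 is to lay a normal curve with the same mean and variance over the exact bars and take the area above the ‘5 heads’ bar, from 4.5 to 5.5. Whether the slide's curve was drawn this way is an assumption; the sketch only shows that this construction gives a similar number.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2)))


n = 5                  # tosses of a fair coin
mean = n / 2           # expected number of heads
sd = math.sqrt(n) / 2  # standard deviation of the number of heads

exact = 0.5 ** n       # exact probability of 5 heads out of 5

# Area under the normal curve over the bar for '5 heads' (4.5 to 5.5).
area = normal_cdf(5.5, mean, sd) - normal_cdf(4.5, mean, sd)

# Express the approximate answer as an equivalent number of fair-coin heads.
heads_equivalent = -math.log2(area)

print(f"exact = {exact:.3f}, area = {area:.3f}, heads = {heads_equivalent:.1f}")
```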

Page 16:

Q. What happens in GWAS?

For 25/25 heads; p = 3x10^-8

Area = ???

Page 17:

Q. What happens in GWAS?

At 25/25 heads; p = 3x10^-8

Area = 1.3x10^-12

i.e. 39.5 heads (!)

Claiming 25 H’s worth of suspicion when we should claim 18 (!!!)

No problem, at 5 H’s
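
The ‘heads’ scale used throughout is just minus log base 2 of the probability. The sketch below converts the two numbers quoted on this slide; it does not attempt to reproduce the approximation that produced the 1.3x10^-12 area, and the name `heads_worth` is hypothetical.

```python
import math


def heads_worth(p: float) -> float:
    """Number of fair-coin heads in a row that would have probability p."""
    return -math.log2(p)


exact_p = 0.5 ** 25      # exact probability of 25 heads out of 25
approx_area = 1.3e-12    # the approximate value quoted on the slide

print(f"{exact_p:.1e}", heads_worth(exact_p))
print(round(heads_worth(approx_area), 1))
```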

Page 18:

Q. How does this affect GWAS?

Inflation exactly like this happens in GWAS;

• The data is fine, but the approximate calculations are too approximate

• The ‘angling down’ fix doesn’t work, here

• In GWAS we can’t do perfect calculations, but are now using better approximations

• More accurate results & better science

Page 19:

Q. Are you going to stop now?

In summary;

• “Omics” data are a huge statistical challenge… even to do familiar stuff

• We want people who are;

– Smart

– Inquisitive about statistics

– Keen to do good science