Regression-based linkage analysis

Regression-based linkage Regression-based linkage analysisanalysis

Shaun Purcell, Pak Sham.

• Penrose (1938)• quantitative trait locus linkage for sib pair data

• Simple regression-based method• squared pair trait difference

• proportion of alleles shared identical by descent

(HE-SD)(X – Y)2 = 2(1 – r) – 2Q( – 0.5) + ^

Haseman-Elston regressionHaseman-Elston regression

IBD

(X - Y)2

= -2Q

210

Expected sibpair allele Expected sibpair allele sharingsharing

+++ + + +---- -- Sib 2

Sib 1

Squared differences (SD)Squared differences (SD)

+++ + + +---- -- Sib 2

Sib 1

Sums versus differencesSums versus differences

• Wright (1997), Drigalenko (1998)

• phenotypic difference discards sib-pair QTL linkage information

• squared pair trait sum provides extra information for linkage

• independent of information from HE-SD

(HE-SS)(X + Y)2 = 2(1 + r) + 2Q( – 0.5) + ^

Squared sums (SS)Squared sums (SS)

+++ + + +---- -- Sib 2

Sib 1

SD and SSSD and SS

+++ + + +---- -- Sib 2

Sib 1

• New dependent variable to increase power• mean corrected cross-product (HE-CP)

XY 2241 )()( YXYX

• other extensions• > 2 sibs in a sibship multiple trait loci and epistasis

• multivariate multiple markers

• binary traits other relative classes

SD + SS ( = CP)SD + SS ( = CP)

+++ + + +---- -- Sib 2

Sib 1

Xu et al Xu et al

• With residual sibling correlation• HE-CP in power, HE-SD in power

2

ˆˆˆ SDSS

• HE-CP

SDSS ww ˆ)1(ˆˆ • Propose a weighting scheme

Variance of SDVariance of SD

Variance of SSVariance of SS

Low sibling correlationLow sibling correlation

Increased sibling correlationIncreased sibling correlation

• Clarify the relative efficiencies of existing HE methods

• Demonstrate equivalence between a new HE method and variance components methods

• Show application to the selection and analysis of extreme, selected samples

Haseman-Elston regressionsHaseman-Elston regressions

HE-SD(X – Y)2 = 2(1 – r) – 2Q( – 0.5) +

HE-SS(X + Y)2 = 2(1 + r) + 2Q( – 0.5) +

HE-CPXY = r + Q( – 0.5) +

NCPs for H-E regressionsNCPs for H-E regressions

(X – Y)2

(X + Y)2

XY

Dependent

8(1 – r )2

8(1 + r )2

1 + r2

Variance ofDependent

2

2

)1(2

)ˆ(

r

VarQ

2

2

)1(2

)ˆ(

r

VarQ

2

2

1

)ˆ(

r

VarQ

NCP persibpair

Weighted H-EWeighted H-E

22

22

)1(1

)1(1

)1(1

)1(1 ˆˆ

ˆ

rr

SDrSSrQQ

Q

• Squared-sums and squared-differences • orthogonal components in the population

• Optimal weighting• inverse of their variances

Weighted H-EWeighted H-E

22

22

)1(

)1()ˆ(

r

rVarQNCP

• A function of• square of QTL variance

• marker informativeness

• complete information = 0.0125

• sibling correlation

• Equivalent to variance components• to second-order approximation

• Rijsdijk et al (2000)

Combining into one Combining into one regressionregression

• New dependent variable :• a linear combination of

• squared-sum

• and squared-difference

• weighted by the population sibling correlation:

)5.0ˆ()1(

)1(4

1

2

)1(

)(

)1(

)(22

2

22

2

2

2

Qr

r

r

r

r

YX

r

YX

)5.0ˆ()1(

)1(4

1

2

)1(

)(

)1(

)(22

2

22

2

2

2

Qr

r

r

r

r

YX

r

YX21

4

r

r

HE-COMHE-COM

+++ +---- -- Sib 2

Sib 1

SimulationSimulation

• Single QTL simulated• accounts for 10% of trait variance

• 2 equifrequent alleles; additive gene action

• assume complete IBD information at QTL

• Residual variance• shared and nonshared components

• residual sibling correlation : 0 to 0.5

• 10,000 sibling pairs • 100 replicates

• 1000 under the null

Unselected samplesUnselected samples

0

2

4

6

8

10

0 0.1 0.2 0.3 0.4 0.5

Residual sib correlation

Ave

rag

e ch

i-sq

uar

ed

HE-SD

HE-SSHE-CP

HE-COMVC

Sample selectionSample selection

• A sib-pairs’ squared mean-corrected DV is proportional to its expected NCP

• Equivalent to variance-components based selection scheme• Purcell et al, (2000)

2

22

2

2

2

1

2

)1(

)(

)1(

)()|(

r

r

r

YX

r

YXtraitNCPE 21

4

r

r

Sample selectionSample selection

-4 -3 -2 -1 0 1 2 3 4Sib 1 trait -4

-3-2

-10

12

34

Sib 2 trait

00.20.40.60.8

11.21.41.6

Sibship NCP

• 500 (5%) most informative pairs selected

r = 0.05 r = 0.60

Analysis of selected samplesAnalysis of selected samples

Selected samples : HSelected samples : H00

0

0.5

1

1.5

2

0 0.1 0.2 0.3 0.4 0.5


Av

era

ge

ch

i-s

qu

are

d

VC

VC-C

HE-COM

Selected samples : HSelected samples : HAA

0

2

4

6

8

10

0 0.1 0.2 0.3 0.4 0.5


Av

era

ge

ch

i-s

qu

are

d

VC-C

HE-COM

• Variance-based weighting scheme • SD and SS weighted in proportion to the inverse of

their variances

• Implemented as an iterative estimation procedure• loses simple regression-based framework

• Product of pair values corrected for the family mean• for sibs 1 and 2 from the j th family,

• Adjustment for high shared residual variance

• For pairs, reduces to HE-SD

))(( 21 jjjj XX

ConclusionsConclusions

• Advantages• Efficient

• Robust

• Easy to implement

• Future directions• Weight by marker informativeness

• Extension to general pedigrees

The EndThe End

Regression-based linkage analysis

Documents

Transcript of Regression-based linkage analysis