Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem


Page 1: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Authors: Lan Liu, Xi Chen, Jing Xiao & Tao Jiang

Page 2: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Outline
- Introduction and problem definition
- Deciding the complexity of binary-tree-MRHC
- Approximation of MRHC with missing data
- Approximation of MRHC without missing data
- Approximation of bounded MRHC
- Conclusion

Page 3: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Introduction

Basic concepts

[Figure: example genotypes and haplotypes over several loci, with the labels Genotype, Haplotype and Locus, and the annotations PS value=1 and PS value=0.]

Mendelian Law: one haplotype comes from the mother and the other comes from the father.

Example: Mendelian experiment
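A minimal sketch of a Mendelian law check on genotypes, one locus at a time. The representation (a genotype as a list of unordered allele pairs, one pair per locus) and the function name `mendelian_consistent` are assumptions made for illustration, not notation from the slides. The check ignores recombination and only asks whether each child allele pair can be split into one allele from each parent.

```python
def mendelian_consistent(father, mother, child):
    """Return True if, at every locus, the child's two alleles can be
    explained as one allele from the father and one from the mother.

    Each genotype is a list of (allele, allele) pairs, one per locus
    (hypothetical representation chosen for this sketch).
    """
    for (f1, f2), (m1, m2), (c1, c2) in zip(father, mother, child):
        from_father_then_mother = c1 in (f1, f2) and c2 in (m1, m2)
        from_mother_then_father = c2 in (f1, f2) and c1 in (m1, m2)
        if not (from_father_then_mother or from_mother_then_father):
            return False
    return True
```

For instance, a father with alleles (1, 2) and a mother with (1, 1) at a locus can have a child with (1, 2), but not a child with (2, 2), because the mother has no allele 2 to pass on.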

Page 4: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

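Notations and Recombinant

[Figure: two father-mother-child examples built from the haplotypes 1111 and 2222. In the first, the child inherits the haplotype 1111: 0 recombinant. In the second, the child inherits the haplotype 1122: 1 recombinant.]

[Figure: a genotype (1122 2222) and a haplotype configuration for it (1222, 2122).]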
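The recombinant count in the example above can be sketched as a small dynamic program. This is an illustration under assumptions: haplotypes are strings or tuples of alleles, a recombinant is counted each time the transmitted haplotype switches from one parental haplotype to the other between consecutive loci, and the function name `min_recombinants` is hypothetical.

```python
INF = float("inf")

def min_recombinants(parent_haplotypes, child_haplotype):
    """Fewest switches between the parent's two haplotypes that
    reproduce child_haplotype, or None if no transmission is possible."""
    # cost[k] = fewest recombinants so far if the current locus is copied
    # from parental haplotype k
    cost = [0, 0]
    for locus, allele in enumerate(child_haplotype):
        cost = [
            min(cost[k], cost[1 - k] + 1)
            if parent_haplotypes[k][locus] == allele
            else INF
            for k in (0, 1)
        ]
    best = min(cost)
    return None if best == INF else best
```

With a parent carrying 1111 and 2222, the haplotype 1111 needs no recombinant and 1122 needs one, matching the two panels of the figure.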

Page 5: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Pedigree

An example: British Royal Family

[Figure: pedigree diagram with the members Elizabeth II of the United Kingdom; Prince Philip, Duke of Edinburgh; Prince Charles, Prince of Wales; Diana, Princess of Wales; Camilla, Duchess of Cornwall; Prince William of Wales; Prince Henry of Wales; Princess Anne, Princess Royal; Captain Mark Phillips; Commander Timothy Laurence; Peter Phillips; Zara Phillips; Prince Andrew, Duke of York; Sarah Margaret Ferguson; Princess Beatrice of York; Princess Eugenie of York; Prince Edward, Earl of Wessex; Sophie Rhys-Jones; Lady Louise Windsor.]

Page 6: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Haplotype Reconstruction
- Haplotype: useful, expensive
- Genotype: cheaper

[Figure: a mother M and a child C, each with genotype 1 2 at both loci, and two candidate haplotype configurations, panels (a) and (b).]

Reconstruct haplotypes from genotypes
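To see why reconstruction is ambiguous, the sketch below enumerates every haplotype configuration consistent with a genotype. The representation (a list of allele pairs, one per locus) and the name `haplotype_configurations` are assumptions for illustration. A genotype with h heterozygous loci has 2^(h-1) distinct configurations when h is at least 1, so the count grows quickly.

```python
from itertools import product

def haplotype_configurations(genotype):
    """Return the sorted list of unordered haplotype pairs that are
    consistent with the genotype (a list of (allele, allele) pairs)."""
    configs = set()
    # At each locus, either allele may sit on the first haplotype.
    for combo in product(*[((a, b), (b, a)) for a, b in genotype]):
        h1, h2 = zip(*combo)
        configs.add(tuple(sorted((h1, h2))))
    return sorted(configs)
```

For the genotype 1 2 at both loci, this yields the two configurations (11, 22) and (12, 21), which is the ambiguity the figure illustrates.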

Page 7: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Problem Definition

MRHC problem: Given a pedigree and the genotype information for each member, find a haplotype configuration for each member that obeys the Mendelian law, such that the number of recombinants is minimized.
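As a toy illustration of the objective, the sketch below solves MRHC by exhaustive search on a single nuclear family (father, mother, children). It is an exponential-time sketch under assumptions, not the method of the talk: genotypes are lists of allele pairs per locus, there is no missing data, and the names `ordered_phasings`, `switches` and `family_mrhc` are hypothetical.

```python
from itertools import product

INF = float("inf")

def ordered_phasings(genotype):
    """All ordered haplotype pairs (h1, h2) consistent with a genotype."""
    per_locus = [{(a, b), (b, a)} for a, b in genotype]
    return [tuple(zip(*combo)) for combo in product(*per_locus)]

def switches(parent, haplotype):
    """Fewest recombinants with which parent (h1, h2) transmits haplotype;
    INF if the haplotype cannot come from this parent."""
    cost = [0, 0]
    for locus, allele in enumerate(haplotype):
        cost = [
            min(cost[k], cost[1 - k] + 1) if parent[k][locus] == allele else INF
            for k in (0, 1)
        ]
    return min(cost)

def family_mrhc(father_gt, mother_gt, children_gts):
    """Minimum total recombinants over all haplotype configurations of
    one nuclear family; INF if the genotypes violate the Mendelian law."""
    best = INF
    for f in ordered_phasings(father_gt):
        for m in ordered_phasings(mother_gt):
            total = 0
            for child_gt in children_gts:
                # The child's pair is (paternal, maternal); try both orders.
                total += min(
                    switches(f, pat) + switches(m, mat)
                    for pat, mat in ordered_phasings(child_gt)
                )
            best = min(best, total)
    return best
```

A single child never forces a recombinant here, because the parents' phases are free. Two children who receive different combinations from a doubly heterozygous father can force one.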

Page 8: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Problem Definition

Variants of MRHC
- Tree-MRHC: no mating loop
- Binary-tree-MRHC: 1 mate, 1 child
- 2-locus-MRHC: 2 loci
- 2-locus-MRHC*: 2 loci with missing data

Page 9: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Previous Work

The known hardness results for Mendelian law checking
- Loop: yes, multi-allelic: yes. Hardness: NP-hard [AHI+03]
- Loop: yes, multi-allelic: no. Hardness: P [AHI+03]
- Loop: no. Hardness: P [AHI+03]

The known hardness results for MRHC
- 2-locus-MRHC: NP-hard [LJ03]
- Tree-MRHC with bounded #members: P [LJ03]
- Tree-MRHC with bounded #loci: P [DLJ03]
- Tree-MRHC: NP-hard [DLJ03]

Page 10: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

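Our hardness and approximation results

- Binary-tree-MRHC. Hardness: NP-hard.
- 2-locus-MRHC*. Lower bound of approx. ratio: any f(n). Assumption: P ≠ NP. The lower bound holds for 2-locus-MRHC*(4,1).
- Binary-tree-MRHC*. Lower bound of approx. ratio: any f(n). Assumption: P ≠ NP. The lower bound holds for Binary-tree-MRHC*(1,1).
- 2-locus-MRHC. Lower bound of approx. ratio: any constant. Assumption: P ≠ NP, the Unique Games Conjecture [Khot02]. Upper bound of approx. ratio: $O(\sqrt{\log n})$. The lower bound holds for 2-locus-MRHC(16,15).
- Tree-MRHC. Lower bound of approx. ratio: any constant. Assumption: P ≠ NP, the Unique Games Conjecture. The lower bound holds for Tree-MRHC(1,u) and Tree-MRHC(u,1).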


Page 12: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Outline
- Introduction and problem definition
- Deciding the complexity of binary-tree-MRHC
- Approximation of MRHC with missing data
- Approximation of MRHC without missing data
- Approximation of bounded MRHC
- Conclusion

Page 13: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

A verifier for ≠3SAT (1)

Given a truth assignment for literals in a 3CNF formula:
- Consistency checking for each variable
- Satisfiability checking for each clause
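The two checks can be sketched as follows. This is a minimal illustration under assumptions: literals are strings such as `"x"` and `"~x"`, the assignment gives a truth value to every literal separately (which is why consistency needs its own check), and a clause counts as satisfied when at least one of its literals is true. The slides do not spell out the exact clause condition of ≠3SAT, so the standard 3CNF condition is used here, and all function names are hypothetical.

```python
def negate(literal):
    """Return the complementary literal: 'x' <-> '~x'."""
    return literal[1:] if literal.startswith("~") else "~" + literal

def consistent(assignment):
    """Consistency checking: each literal and its negation must be
    assigned opposite truth values."""
    return all(
        negate(lit) in assignment and assignment[negate(lit)] != value
        for lit, value in assignment.items()
    )

def satisfied(assignment, clauses):
    """Satisfiability checking: every clause has a true literal."""
    return all(any(assignment[lit] for lit in clause) for clause in clauses)

def verify(assignment, clauses):
    """Accept only if both checks pass."""
    return consistent(assignment) and satisfied(assignment, clauses)
```

Separating the two checks mirrors the structure of the reduction, where one part of the pedigree enforces consistency and another enforces satisfiability.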

Page 14: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Binary-tree-MRHC is NP-hard

[Figure: (A) C's genotype, 1 2 at both loci. (B) Its two haplotype configurations. Panels (a), (b) and (c) show a mother M and a child C under different haplotype configurations.]

C can check whether M has a certain haplotype configuration!

Page 15: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Binary-tree-MRHC is NP-hard

The pedigree

[Figure: a binary-tree pedigree with members O1, O2, A1, ..., At+3m, B1, ..., Bt+3m, M1, ..., Mt+3m and C1, ..., Ct+3m. Part 1 (indices 1 to t) does consistency checking, with #recombinants >= 0. Part 2 (indices t+1 to t+3m) does satisfiability checking, with #recombinants >= #clauses.]

≠3SAT is satisfiable if and only if OPT(MRHC) = #clauses

Page 16: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Outline
- Introduction and problem definition
- Deciding the complexity of binary-tree-MRHC
- Approximation of MRHC with missing data
- Approximation of MRHC without missing data
- Approximation of bounded MRHC
- Conclusion

Page 17: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Inapproximability of 2-locus-MRHC*

Definition: A minimization problem R cannot be approximated if there is no approximation algorithm for R with ratio f(n) unless P=NP, where f(n) is any polynomial-time computable function.

Fact: If it is NP-hard to decide whether OPT(R)=0, R cannot be approximated unless P=NP.
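The Fact follows from a short argument, sketched here under the usual convention (an assumption, since the slide does not state it) that a polynomial-time approximation algorithm $A$ with ratio $f(n)$ returns a feasible value $A(I)$ on every instance $I$ of size $n$:

```latex
\[
\mathrm{OPT}(I) \;\le\; A(I) \;\le\; f(n)\cdot \mathrm{OPT}(I)
\quad\Longrightarrow\quad
\bigl( A(I) = 0 \iff \mathrm{OPT}(I) = 0 \bigr).
\]
```

Running $A$ and testing whether its output is 0 would therefore decide $\mathrm{OPT}(I)=0$ in polynomial time, which is impossible for an NP-hard decision problem unless P=NP.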

Page 18: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Inapproximability of 2-locus-MRHC*

Reduce 3SAT to 2-locus-MRHC*

[Figure: (A) gadget for variable x, whose two haplotype configurations are labeled True and False. (B) gadget for a clause over x, y and z, built from two-locus genotypes with missing data (*).]

3SAT is satisfiable if and only if OPT(2-locus-MRHC*) = 0

2-locus-MRHC* cannot be approximated unless P=NP!

Page 19: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Outline
- Introduction and problem definition
- Deciding the complexity of binary-tree-MRHC
- Approximation of MRHC with missing data
- Approximation of MRHC without missing data
- Approximation of bounded MRHC
- Conclusion

Page 20: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

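Upper Bound of 2-locus-MRHC

Main idea: use a Boolean variable to capture the configuration; use clauses to capture the recombinants.

An example

[Figure: members A and B, each with genotype 1 2 at both loci, whose two haplotype configurations are labeled True and False, together with a member of genotype 1 1 at both loci, and the resulting clause over A and B.]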

Page 21: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Upper Bound of 2-locus-MRHC

The reduction from 2-locus-MRHC to Min 2CNF Deletion

[Table: columns Genotype of the Mother (A), Genotype of the Father (B), Genotype of the Child (C) and 2CNF Constraint. Each row lists a combination of two-locus genotypes for a mother-father-child trio and the 2CNF clauses over A, B and C that capture its recombinants.]
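For reference, Min 2CNF Deletion asks for the fewest clauses to delete from a 2CNF formula so that the rest becomes satisfiable. That number equals the fewest clauses left unsatisfied by any truth assignment, which the brute-force sketch below computes. It is exponential in the number of variables and only illustrates the target problem of the reduction. The literal encoding (`+i` for variable i, `-i` for its negation) and the function name are assumptions.

```python
from itertools import product

def min_2cnf_deletion(num_vars, clauses):
    """Fewest clauses whose deletion makes the 2CNF formula satisfiable.

    clauses: list of (lit, lit) pairs; literal +i is variable i and
    -i is its negation, with variables numbered from 1.
    """
    best = len(clauses)
    for bits in product([False, True], repeat=num_vars):
        def value(lit):
            v = bits[abs(lit) - 1]
            return v if lit > 0 else not v
        # Clauses this assignment fails are exactly the ones to delete.
        unsatisfied = sum(1 for a, b in clauses if not (value(a) or value(b)))
        best = min(best, unsatisfied)
    return best
```

Under the main idea stated earlier, each Boolean variable would stand for a member's haplotype configuration and each violated clause for a recombinant, so a small deletion set corresponds to a configuration with few recombinants.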

Page 22: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Upper Bound of 2-locus-MRHC

Recently, Agarwal et al. [STOC05] presented an $O(\sqrt{\log n})$ randomized approximation algorithm for Min 2CNF Deletion.

Hence 2-locus-MRHC has an $O(\sqrt{\log n})$ approximation algorithm.

Page 23: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Outline
- Introduction and problem definition
- Deciding the complexity of binary-tree-MRHC
- Approximation of MRHC with missing data
- Approximation of MRHC without missing data
- Approximation of bounded MRHC
- Conclusion

Page 24: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Approximation Hardness of bounded MRHC

Bound (#mates, #children)
- 2-locus-MRHC: (16,15)
- 2-locus-MRHC*: (4,1)
- Tree-MRHC: (u,1) or (1,u)
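A small sketch of how the pair (#mates, #children) could be measured on a concrete pedigree. It assumes the pedigree is given as a list of (father, mother, child) triples and that both bounds are counted per member. The function name `pedigree_bounds` is hypothetical.

```python
from collections import defaultdict

def pedigree_bounds(trios):
    """Return (max #mates, max #children) over all members of a pedigree
    given as (father, mother, child) triples."""
    mates = defaultdict(set)
    children = defaultdict(set)
    for father, mother, child in trios:
        mates[father].add(mother)
        mates[mother].add(father)
        children[father].add(child)
        children[mother].add(child)
    max_mates = max((len(s) for s in mates.values()), default=0)
    max_children = max((len(s) for s in children.values()), default=0)
    return max_mates, max_children
```

A pedigree in which every member has at most one mate and one child gives (1, 1), the binary-tree case.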

Page 25: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

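Conclusion

Our hardness and approximation results

- Binary-tree-MRHC. Hardness: NP-hard.
- 2-locus-MRHC*. Lower bound of approx. ratio: any f(n). Assumption: P ≠ NP. The lower bound holds for 2-locus-MRHC*(4,1).
- Binary-tree-MRHC*. Lower bound of approx. ratio: any f(n). Assumption: P ≠ NP. The lower bound holds for Binary-tree-MRHC*(1,1).
- 2-locus-MRHC. Lower bound of approx. ratio: any constant. Assumption: P ≠ NP, the Unique Games Conjecture. Upper bound of approx. ratio: $O(\sqrt{\log n})$. The lower bound holds for 2-locus-MRHC(16,15).
- Tree-MRHC. Lower bound of approx. ratio: any constant. Assumption: P ≠ NP, the Unique Games Conjecture. The lower bound holds for Tree-MRHC(1,u) and Tree-MRHC(u,1).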

Page 26: Complexity and Approximation of the Minimum Recombinant Haplotype Configuration Problem

Thanks for your time and attention!