Deal or Deceit: Detecting Cheating in Distribution Channels

Kai Shu 1,2, Ping Luo 1, Li Wan 2, Peifeng Yin 3, Linpeng Tang 4

Transcript of Deal or Deceit: Detecting Cheating in Distribution Channels

Page 1: Deal or Deceit: Detecting Cheating in Distribution Channels

Deal or Deceit: Detecting Cheating in Distribution Channels

Kai Shu 1,2, Ping Luo 1, Li Wan 2, Peifeng Yin 3, Linpeng Tang 4

Page 2: Deal or Deceit: Detecting Cheating in Distribution Channels

Kai Shu et al. Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

• Introduction (current section)

• Motivation

• Problem Formulation

• Detection Framework

• Experimental results

• Conclusion

Page 3: Deal or Deceit: Detecting Cheating in Distribution Channels

Introduction

[Figure: distribution channel. Manufacturer → Partner → End user, alongside direct Manufacturer → End user sales. Labels: low price, high price, price difference. Panel titles: Distribution channel; Price difference.]

Page 4: Deal or Deceit: Detecting Cheating in Distribution Channels

Introduction

Goal: detect cheating in the distribution channel.

[Figure: a typical cheating scenario with a cheating seller and a cheating buyer; two links in the diagram are marked ×.]

Page 5: Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

• Introduction

• Motivation (current section)

• Problem Formulation

• Detection Framework

• Experimental results

• Conclusion

Page 6: Deal or Deceit: Detecting Cheating in Distribution Channels

Motivation

Data: from the manufacturer's perspective, a time series of purchase quantities for each partner.

Page 7: Deal or Deceit: Detecting Cheating in Distribution Channels

Motivation (2)

[Figure: purchase quantity (0 to 220) by month (0 to 15) for partners 𝑣1 and 𝑣2; the two series show a negative correlation.]

Observation: the purchase quantities of 𝑣1 and 𝑣2 change collectively.

Page 8: Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

• Introduction

• Motivation

• Problem Formulation (current section)

• Detection Framework

• Experimental results

• Conclusion

Page 9: Deal or Deceit: Detecting Cheating in Distribution Channels

Problem Formulation

Input: the purchase-quantity time series 𝑥1, 𝑥2, …, 𝑥𝑛 of the partners 𝑣1, 𝑣2, …, 𝑣𝑛.

Output: a degree of cheating for each partner, for review by audit staff.

Page 10: Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

• Introduction

• Motivation

• Problem Formulation

• Detection Framework (current section)

• Experimental results

• Conclusion

Page 11: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework

Input: all purchase volume sequences (𝑥𝑖 | 𝑖 = 1, 2, …, 𝑛)

Step 1: Building the Partner Correlation Graph (PCG)

Step 2: Partitioning the PCG into two groups, A and B

Step 3: Ranking the partners by a probabilistic model

Output: ranking of cheating partners

Page 12: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 1)

Pearson correlation

• Tick-to-tick correspondence: comparing 𝑣1 and 𝑣2 month by month shows no negative correlation.

• Symmetric measure: Pearson(𝑣1, 𝑣2) = Pearson(𝑣2, 𝑣1), so it cannot tell which partner is the cheating seller and which is the cheating buyer.

[Figure: purchase quantity (0 to 220) by month (0 to 15) for 𝑣1 and 𝑣2.]
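The two limitations of plain Pearson correlation can be seen on a small made-up example. The sketch below is only an illustration: the series `v1` and `v2` are hypothetical numbers (not data from the talk), chosen so that `v1` drops one month before `v2` rises.

```python
from statistics import fmean, pstdev

def pearson(x, y):
    """Population Pearson correlation of two equal-length series."""
    mx, my = fmean(x), fmean(y)
    sx, sy = pstdev(x), pstdev(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) * sx * sy)

# Hypothetical monthly purchase quantities (illustrative only).
v1 = [5, 5, 1, 5, 5, 1, 5, 5]   # drops in months 2 and 5
v2 = [3, 3, 3, 9, 3, 3, 9, 3]   # rises one month later

r_same_month = pearson(v1, v2)           # tick-to-tick comparison
r_shifted = pearson(v1[:-1], v2[1:])     # v2 shifted back by one month
r_swapped = pearson(v2, v1)              # same value as r_same_month

print(r_same_month, r_shifted, r_swapped)
```

On these numbers the month-by-month correlation is positive, the negative relation only appears once the one-month lag is taken into account, and swapping the arguments changes nothing.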

Page 13: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 1)

Dynamic time warping: learn the correspondence between the elements of two time series that minimizes the Pearson correlation.

[Figure: purchase quantity (0 to 220) by month (0 to 15) for 𝑣1 and 𝑣2, with the warping direction marked.]

Directed Pearson Correlation (DPC)

$$r_{dpc}(x_1, x_2) := \min_{P} \left\{ \frac{1}{L} \sum_{l=1}^{L} \left( \frac{x_{1,i_l} - \bar{x}_1}{\sigma_{x_1}} \right) \left( \frac{x_{2,j_l} - \bar{x}_2}{\sigma_{x_2}} \right) \right\}$$

where (𝑖𝑙, 𝑗𝑙) is the 𝑙-th element of a warping path 𝑃 and 𝐿 = |𝑃|.

Because the warping is directed, 𝑟𝑑𝑝𝑐(𝑥1, 𝑥2) ≠ 𝑟𝑑𝑝𝑐(𝑥2, 𝑥1) in general, which makes it possible to tell the cheating seller from the cheating buyer.
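A minimal sketch of a directed, warped Pearson correlation, under simplifying assumptions that are not taken from the slides: each element 𝑖 of the first series is matched to exactly one element 𝑗 of the second series with 𝑖 ≤ 𝑗 ≤ 𝑖 + `max_lag`, matches never move backwards in time, and the path length therefore equals the series length. The function name `directed_pearson`, the parameter `max_lag`, and the example data are hypothetical; the authors' warping constraints may differ.

```python
from statistics import fmean, pstdev

def zscores(xs):
    """Standardize a series with its population mean and standard deviation."""
    mu, sigma = fmean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]

def directed_pearson(x1, x2, max_lag=1):
    """Smallest mean product of z-scores over forward-only warping paths.

    Assumption (illustrative): element i of x1 is paired with one element j
    of x2, where i <= j <= i + max_lag and j never decreases along the path.
    """
    n = len(x1)
    z1, z2 = zscores(x1), zscores(x2)
    inf = float("inf")
    cost = [inf] * n  # cost[j]: best sum for a path whose last pair is (i, j)
    for i in range(n):
        new_cost = [inf] * n
        for j in range(i, min(n - 1, i + max_lag) + 1):
            # Best predecessor: any pair (i - 1, j') with j' <= j.
            prev = 0.0 if i == 0 else min(cost[: j + 1])
            new_cost[j] = z1[i] * z2[j] + prev
        cost = new_cost
    return min(cost) / n

# Hypothetical monthly purchase quantities (illustrative only).
v1 = [5, 5, 1, 5, 5, 1, 5, 5]   # drops in months 2 and 5
v2 = [3, 3, 3, 9, 3, 3, 9, 3]   # rises one month later

print(directed_pearson(v1, v2), directed_pearson(v2, v1))
```

With `max_lag=0` the function reduces to the ordinary Pearson correlation and is symmetric. With a positive lag the two directions differ on this example: warping `v1` forward onto `v2` exposes a strong negative correlation, while the reverse direction does not.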

Page 14: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 1)

Building the PCG from all purchase volume sequences (𝑥𝑖 | 𝑖 = 1, 2, …, 𝑛):

• For each ordered pair of partners (𝑣𝑖, 𝑣𝑗), set the edge weight 𝑤𝑖𝑗 = −𝑟𝑑𝑝𝑐(𝑥𝑖, 𝑥𝑗).

• If 𝑤𝑖𝑗 ≥ 𝜂, keep the edge.

• If 𝑤𝑖𝑗 < 𝜂, remove the edge.
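The thresholding rule can be sketched in a few lines. This is an illustration only: the function name `build_pcg`, the dictionary layout, and the sample correlation values are assumptions, and the directed correlations are taken as already computed.

```python
def build_pcg(r_dpc, eta):
    """Build the directed edge set of a partner correlation graph.

    r_dpc maps ordered pairs (i, j) to a directed correlation value.
    An edge i -> j gets weight w_ij = -r_dpc[(i, j)] and is kept only
    when w_ij >= eta.
    """
    edges = {}
    for (i, j), r in r_dpc.items():
        w = -r
        if i != j and w >= eta:
            edges[(i, j)] = w
    return edges

# Hypothetical directed correlations between three partners.
r_values = {(1, 2): -0.8, (2, 1): 0.1, (1, 3): -0.2, (3, 1): -0.6}
pcg = build_pcg(r_values, eta=0.5)
print(pcg)
```

Only strongly negative directed correlations survive, so a larger 𝜂 gives a sparser graph.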

Page 15: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 2)

Partition the PCG into a bipartite graph with two groups: A (cheating sellers) and B (cheating buyers).

Assumption: a partner can only be a cheating seller or a cheating buyer. Being both a cheating seller and a cheating buyer is ruled out.

Page 16: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 2)

Classical graph cut problem (MAX DICUT): maximize 𝑤𝐴𝐵, the weight of the edges from group A (cheating sellers) to group B (cheating buyers).

New graph cut problem: maximize 3𝑤𝐴𝐵 − 𝑤𝐵𝐴, where 𝑤𝐵𝐴 is the weight of the edges from B to A.

Solution algorithm: we propose a greedy algorithm to solve this new NP-hard problem.
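The slides do not spell out the greedy algorithm, so the sketch below is a generic single-node-flip local search for the stated objective, not the authors' procedure. The coefficient `c=3.0` is taken as printed on the slide; the function names and the sample graph are hypothetical.

```python
def cut_objective(edges, side, c=3.0):
    """Return c * w_AB - w_BA for a given assignment of nodes to "A" or "B"."""
    w_ab = sum(w for (i, j), w in edges.items() if side[i] == "A" and side[j] == "B")
    w_ba = sum(w for (i, j), w in edges.items() if side[i] == "B" and side[j] == "A")
    return c * w_ab - w_ba

def greedy_partition(nodes, edges, c=3.0):
    """Flip one node at a time while the objective strictly improves."""
    side = {v: "A" for v in nodes}
    best = cut_objective(edges, side, c)
    improved = True
    while improved:
        improved = False
        for v in nodes:
            side[v] = "B" if side[v] == "A" else "A"      # tentative flip
            score = cut_objective(edges, side, c)
            if score > best + 1e-12:
                best, improved = score, True
            else:
                side[v] = "B" if side[v] == "A" else "A"  # undo the flip
    return side, best

# Hypothetical PCG with four partners.
nodes = [1, 2, 3, 4]
edges = {(1, 3): 0.9, (2, 3): 0.8, (2, 4): 0.7, (4, 1): 0.1}
side, best = greedy_partition(nodes, edges)
print(side, best)
```

The search stops at a local optimum, which is all a heuristic of this kind can promise for an NP-hard objective.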

Page 17: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 3)

Probabilistic model on the bipartite graph: each partner 𝑣𝑖 carries parameters (𝛼𝑖, 𝛽𝑖). The model generates the edges and the weights in the resultant bipartite graph, with the weight 𝑤𝑖𝑗 of the edge between 𝑣𝑖 and 𝑣𝑗 following Beta(𝑤𝑖𝑗; 𝛼𝑖, 𝛽𝑗).

Maximize log 𝑝(𝑤𝑖𝑗, 𝜽) + 𝜆 ∙ 𝐶(𝜽), where 𝜽 is the set of all parameters.

Page 18: Deal or Deceit: Detecting Cheating in Distribution Channels

Detection Framework (Step 3)

Probability weight 𝜋𝑖𝑗:

$$\pi_{ij} = \frac{\alpha_i}{\alpha_i + \beta_j}$$

𝜋𝑖𝑗 is the expected value of the beta distribution with parameters (𝛼𝑖, 𝛽𝑗).

Ranking score function: applied to the bipartite graph (groups A and B) to produce the ranking of cheating partners.
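As a quick check of the probability weight, the sketch below compares the closed-form mean 𝛼/(𝛼 + 𝛽) with the average of samples drawn from the same beta distribution. The function names and the parameter values (2, 6) are illustrative assumptions; the ranking score function itself is not reproduced here because its formula does not appear in the transcript.

```python
import random

def probability_weight(alpha_i, beta_j):
    """Expected value of Beta(alpha_i, beta_j)."""
    return alpha_i / (alpha_i + beta_j)

def sampled_mean(alpha_i, beta_j, n_samples=20000, seed=0):
    """Monte Carlo estimate of the same expected value."""
    rng = random.Random(seed)
    total = sum(rng.betavariate(alpha_i, beta_j) for _ in range(n_samples))
    return total / n_samples

pi_ij = probability_weight(2.0, 6.0)
estimate = sampled_mean(2.0, 6.0)
print(pi_ij, estimate)
```

A large 𝛼𝑖 relative to 𝛽𝑗 pushes 𝜋𝑖𝑗 toward 1, and the reverse pushes it toward 0.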

Page 19: Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

• Introduction

• Motivation

• Problem Formulation

• Detection Framework

• Experimental results (current section)

• Conclusion

Page 20: Deal or Deceit: Detecting Cheating in Distribution Channels

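Experimental results

Dataset

                       Gold   Silver   All
# Total partners       104    424      528
# Cheating partners    17     85       102

Evaluation measure

Given a number 𝑘, we count the number of hits in the top-𝑘 ranking list, where a hit is a listed partner that appears in the ground truth.

$$\mathrm{Precision@}k = \frac{\#\text{hits in top-}k\text{ list}}{k}$$

$$\mathrm{Recall@}k = \frac{\#\text{hits in top-}k\text{ list}}{M}$$

$$F_1@k = \frac{2 \cdot \mathrm{Precision@}k \cdot \mathrm{Recall@}k}{\mathrm{Precision@}k + \mathrm{Recall@}k}$$

where 𝑀 is the number of cheating partners in the ground truth.

[Figure: the top-𝑘 list (positions 1, 2, 3, …, 𝑘−1, 𝑘) is compared with the ground truth to count hits; Precision, Recall and F1 score are evaluated over 𝑘 and summarized by AUC.]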
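The three measures are straightforward to compute from a ranked list and a ground-truth set. The sketch below uses made-up partner identifiers; the function name `metrics_at_k` is an assumption, and the AUC summary is left out because the transcript does not say how it is computed.

```python
def metrics_at_k(ranking, ground_truth, k):
    """Return (Precision@k, Recall@k, F1@k) for a ranked list of partners."""
    hits = sum(1 for partner in ranking[:k] if partner in ground_truth)
    precision = hits / k
    recall = hits / len(ground_truth)   # M = size of the ground truth
    if hits == 0:
        return precision, recall, 0.0   # avoid dividing by zero in F1
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical ranking and ground truth.
ranking = ["p3", "p7", "p1", "p9", "p4"]
ground_truth = {"p3", "p1", "p8"}
precision, recall, f1 = metrics_at_k(ranking, ground_truth, k=4)
print(precision, recall, f1)
```

Here two of the top four partners are true cheaters, so precision is 2/4 and recall is 2/3.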

Page 21: Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (1)

[Figure: AUC_F1 (axis 0 to 40) for Pearson_Rank, Noncut_Rank, Noncut_ProbRank, Cut_Rank and Cut_ProbRank.]

Method             PCG                   Partition   Ranking
Pearson_Rank       Pearson correlation   No          Edge weight
Noncut_Rank        DPC                   No          Edge weight
Noncut_ProbRank    DPC                   No          Probability
Cut_Rank           DPC                   Yes         Edge weight
Cut_ProbRank       DPC                   Yes         Probability

• DPC improves performance

Page 22: Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (2)

[Figure: AUC_F1 for the five methods, highlighting Noncut_Rank vs. Cut_Rank and Noncut_ProbRank vs. Cut_ProbRank.]

• DPC improves performance

• Partitioning improves performance

Page 23: Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (3)

[Figure: AUC_F1 for the five methods, highlighting Noncut_Rank vs. Noncut_ProbRank and Cut_Rank vs. Cut_ProbRank.]

• DPC improves performance

• Partitioning improves performance

• Probability ranking improves performance

Page 24: Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (4)

[Figure: AUC_F1 for the five methods, highlighting Cut_ProbRank.]

• DPC improves performance

• Partitioning improves performance

• Probability ranking improves performance

• Cut_ProbRank performs best

Page 25: Deal or Deceit: Detecting Cheating in Distribution Channels

Experimental results (5)

[Figure: three panels showing Precision, Recall and F1 score.]

Cut_ProbRank always performs the best on Precision, Recall and F1 score.

Page 26: Deal or Deceit: Detecting Cheating in Distribution Channels

Outline

• Introduction

• Motivation

• Problem Formulation

• Detection Framework

• Experimental results

• Conclusion (current section)

Page 27: Deal or Deceit: Detecting Cheating in Distribution Channels

Conclusion

The first quantitative work to detect cheating in distribution channels.

• A new correlation method (i.e., DPC) that introduces the dynamic time warping technique.

• A new graph cut problem for graph partitioning.

• A probability model for node ranking.

Page 28: Deal or Deceit: Detecting Cheating in Distribution Channels

Thanks