Delayed-Dynamic-Selective (DDS) Prediction for Reducing Extreme Tail Latency in Web Search

Saehoon Kim§, Yuxiong He*, Seung-won Hwang§, Sameh Elnikety*, Seungjin Choi§

Web Search Engine Requirement

Queries

High quality + Low latency

This talk focuses on how to achieve low latency without compromising the quality

Low Latency for All Users

• Reduce tail latency (high-percentile response time)

• Reducing average latency is not sufficient

Latency

Commercial search engine reduces 99th-percentile latency

Reducing End-to-End Latency

Long(-running) query

Aggregator

ISN ISN ISN ISN

40 Index Server Nodes (ISNs)

Aggregator: the 99th-percentile response time < 120ms

Each ISN: the 99.99th-percentile response time < 120ms
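The gap between the 99th-percentile and 99.99th-percentile targets follows from fan-out: the aggregator waits for all 40 ISNs, so one slow ISN makes the whole query slow. The sketch below works through that arithmetic. It assumes ISN latencies are independent, which the slides do not state, and `aggregator_hit_rate` is a hypothetical helper name.

```python
def aggregator_hit_rate(isn_hit_rate: float, num_isns: int = 40) -> float:
    """Probability that every ISN answers within the deadline.

    Assumes ISN latencies are independent (an illustrative assumption).
    """
    return isn_hit_rate ** num_isns


# Compare a 99th-percentile and a 99.99th-percentile guarantee per ISN.
for p in (0.99, 0.9999):
    rate = aggregator_hit_rate(p)
    print(f"ISN hit rate {p:.2%} -> aggregator hit rate {rate:.2%}")
```

Under this assumption, a 99th-percentile guarantee at each ISN leaves only about two thirds of aggregated queries within the deadline, while a 99.99th-percentile guarantee keeps more than 99% of them within it.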

Reducing Tail Latency by Parallelization

Opportunity of Parallelization

1. Available idle cores

2. CPU-intensive workloads

Resource     Latency
Network      4.26 ms
Queueing     0.15 ms
I/O          4.70 ms
CPU          194.95 ms

Challenges of Exploiting Parallelism

• Parallelizing all queries
  – Inefficient under medium to high load

• Parallelizing short queries
  – No speedup

• Parallelizing long queries
  – Good speedup

Parallelize only long(-running) queries

Prior Work - PREDictive Parallelization

• Predict the query execution time

• Parallelize the predicted long queries only

• Execute the predicted short queries sequentially

“WSDM” Long

Feature extraction

Regression function

Prediction model

Predictive Parallelization: Taming Tail Latencies in Web Search, [M. Jeon, SIGIR’14]

Requirements

• 99th tail latency at aggregator <= 120ms

• Reduce 99.99th tail latency at each ISN <=

               Recall                              Precision
Requirements   >= 98.9%                            Should be high
Reason         To optimize 99.99th tail latency    Fewer queries to be parallelized
PRED           98.9%                               1.1%

PRED cannot effectively reduce 99.99th tail latency
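Recall here is the fraction of truly long queries that the predictor flags as long. Precision is the fraction of flagged queries that are truly long. The sketch below computes both for a long-query classifier. The 100ms cutoff for "long", the function name, and the sample data are illustrative assumptions, not values taken from the evaluation.

```python
def precision_recall(actual_ms, predicted_long, long_threshold_ms=100.0):
    """Return (precision, recall) of a long-query predictor.

    actual_ms: measured execution times in milliseconds.
    predicted_long: booleans, True if the predictor flagged the query as long.
    long_threshold_ms: cutoff for a truly long query (assumed value).
    """
    tp = fp = fn = 0
    for elapsed, flagged in zip(actual_ms, predicted_long):
        is_long = elapsed >= long_threshold_ms
        if flagged and is_long:
            tp += 1          # long query correctly parallelized
        elif flagged:
            fp += 1          # short query parallelized needlessly
        elif is_long:
            fn += 1          # long query left to run sequentially
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical sample: two truly long queries, four queries flagged as long.
actual = [5, 8, 150, 12, 300, 40]
flagged = [True, False, True, True, True, False]
print(precision_recall(actual, flagged))
```

A missed long query (false negative) lands directly in the tail, which is why the recall requirement is so strict. A false positive only wastes cores on a short query, which matters mostly under load.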

Contributions

• Key contributions:
  1. Proposes DDS (Delayed-Dynamic-Selective) prediction to achieve very high recall and good precision
  2. Uses DDS prediction to effectively reduce extreme tail latency

Overview of DDS

[Figure: DDS flow. Queries < 10ms: finished (delayed prediction). Queries > 10ms: predictor for execution time and predictor for confidence level (dynamic prediction). Not confident: selective prediction.]
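The flow on this slide can be sketched as a small decision function: a query that finishes within the 10ms delay needs no prediction, and a query still running is parallelized if it is predicted long or if the predictor is not confident. The 10ms delay comes from the slides. The names `Prediction` and `dds_decision` and both threshold defaults are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

DELAY_MS = 10.0  # sequential run before predicting (from the slides)


@dataclass
class Prediction:
    exec_time_ms: float  # output of the execution-time predictor
    error_ms: float      # output of the confidence (predicted error) predictor


def dds_decision(finished_within_delay: bool,
                 prediction: Optional[Prediction],
                 long_threshold_ms: float = 100.0,   # assumed cutoff
                 max_error_ms: float = 20.0) -> str:  # assumed cutoff
    """Decide how to continue a query after the initial sequential delay."""
    if finished_within_delay:
        return "finished"        # short query: no prediction needed
    if prediction.exec_time_ms >= long_threshold_ms:
        return "parallelize"     # predicted long
    if prediction.error_ms > max_error_ms:
        return "parallelize"     # not confident: treat as potentially long
    return "sequential"          # confidently predicted short


print(dds_decision(False, Prediction(exec_time_ms=30.0, error_ms=50.0)))
```

The second branch is what separates this from a purely time-based rule: a query predicted short is still parallelized when its predicted error is large.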

Delayed Prediction

1) Complete many short queries sequentially

2) Collect dynamic features

Dynamic Features

• What are dynamic features?
  – Features that can only be collected at runtime

• Two categories
  – NumEstMatchDocs: to estimate the total # matched docs
  – DynScores: to predict early termination

Primary Factors for Execution Time

1. # total matched documents

[Figure: web documents Doc 1 ... Doc N sorted by static score, highest to lowest, with the inverted indexes for “WSDM” and “2015”; the processed portion is marked "Processing".]

Primary Factors for Execution Time

1. # total matched documents

2. Early termination

[Figure: inverted indexes for “WSDM” and “2015”; the leading documents are marked "Processing" and the remaining documents "Not evaluated".]

Early Termination

[Figure: inverted index for “WSDM”; the leading documents are marked "Processing" and the rest "Not evaluated".]

Top-3 Results: if min. dynamic score > threshold, then stop.

Doc ID   Dynamic Score
Doc 3    -4.01
Doc 8    -4.10
Doc 1    -4.11

(Built up as documents are processed: Doc 1 (-4.11); then Doc 3 (-4.01), Doc 1; then Doc 3, Doc 1, Doc 5 (-4.23); finally Doc 8 (-4.10) replaces Doc 5.)

To predict early termination, consider the dynamic score distribution
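The stopping rule on this slide can be sketched with a min-heap that holds the current top-k dynamic scores: once the heap is full and its smallest score exceeds the threshold, the remaining documents are not evaluated. The fixed `threshold` parameter and the function name are assumptions made for illustration, and the slides do not say how the threshold is derived. The first four scores are the ones shown on the slide, and "Doc 9" is a made-up trailing document.

```python
import heapq


def top_k_with_early_termination(scored_docs, k, threshold):
    """Scan docs in index order, keep the top-k by dynamic score, stop early.

    scored_docs: iterable of (doc_id, dynamic_score) in processing order.
    Returns (top-k list sorted best first, number of documents evaluated).
    """
    heap = []       # min-heap of (score, doc_id); heap[0] is the weakest result
    evaluated = 0
    for doc_id, score in scored_docs:
        evaluated += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc_id))  # evict the weakest result
        # Stop once the weakest of the k results already beats the threshold.
        if len(heap) == k and heap[0][0] > threshold:
            break
    ranked = sorted(heap, reverse=True)
    return [(doc_id, score) for score, doc_id in ranked], evaluated


docs = [("Doc 1", -4.11), ("Doc 3", -4.01), ("Doc 5", -4.23),
        ("Doc 8", -4.10), ("Doc 9", -3.90)]
results, evaluated = top_k_with_early_termination(docs, k=3, threshold=-4.2)
print(results, evaluated)
```

With the assumed threshold of -4.2, the scan stops after Doc 8 replaces Doc 5, so Doc 9 is never evaluated. How soon this happens depends on the score distribution, which is why the DynScores features help predict execution time.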

Importance of Dynamic Features

• Top-10 feature importance by boosted regression tree

• NumEstMatchDocs helps to predict # total matched docs

• DynScores helps to predict early termination

Selective Prediction

• Find almost all long queries with good precision

• Identify the outliers (long query predicted as short)

[Figure: axes "Predicted execution time" and "Predicted error"; legend: long queries, short queries.]

Evaluations of Predictor Accuracy (1/3)

• Baseline (PRED)
  – Static features with no delayed prediction
  – IDF, static score (e.g. PageRank), etc.

• Proposed method (DDS)
  – Dynamic (+static) features with Delayed and

• 69,010 Bing queries at production workload
  – 14,565 queries >= 10ms
  – 635 queries >= 100ms

• Boosted regression tree with 10-fold cross validation
  – For PRED, we use 69,010 queries
  – For DDS, we use 14,565 queries
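As a reminder of what 10-fold cross validation does to the query set, the sketch below splits sample indices into ten folds, each used once as the test set while the other nine train the model. It is a plain-Python illustration with a hypothetical function name, not the authors' tooling, and it omits the shuffling that is usually applied first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


# The DDS predictor is evaluated on the 14,565 queries that run >= 10ms.
folds = list(k_fold_indices(14565, k=10))
print(len(folds), len(folds[0][1]), len(folds[-1][1]))
```

Every query appears in exactly one test fold, so each reported prediction comes from a model that never saw that query during training.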

957% Improvement over PRED

Delayed

Dynamic features

Selective features

Simulation Results on Tail Latency Reduction

• Baseline (PRED)
  – Predict query execution time before running it
  – Parallelize the long queries with 4-way parallelism

• Proposed method (DDS)
  – Run a query for 10ms sequentially
  – Parallelize the long or unpredictable queries with 4-way parallelism

ISN Response Time

70% throughput increase

Aggregator Response Time

DDS can optimize 99th-percentile tail latency at aggregator under high QPS

Conclusion

• Proposes a novel prediction framework
  – Delayed prediction / Dynamic features / Selective prediction
  – Achieves higher precision and recall than PRED

• Reduces 99th-percentile aggregator response time to <= 120ms under high load!
