20110620 amst rdam_kpb

Introduction

Computing in databases

Conclusion

Computing near the data: let someone else do the heavy lifting for you

Konrad Banachewicz

AmstRdam, June 20th 2011

“We’re drowning in data and starving for information”

Data coming in from the market:

1 liquid instrument (front month DAX Future), 1 day, 1 exchange → 400 MB in pure ASCII

different parameters → “clones” of the same instrument

{exchanges} × {instruments} × {days} × ... = A LOT

Problems:

memory

bandwidth

Model 1: regression

Model 2: correlation

Model 3: VaR

Typical approach

read the data to memory

analyze there

save the results

But is it really necessary?

In many cases what we really need is aggregate info.

Example: linear regression

classic estimator: $\hat{\beta} = (X^T X)^{-1} X^T y$

come to think of it, what we really need are sums, sums of squares and cross-products
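
To see why sums are enough, it may help to write out the two matrix products for the simplest case. The block below assumes a single regressor with an intercept, so each row of $X$ is $(1, x_i)$; the slide itself only gives the general matrix form.

```latex
X^T X =
\begin{pmatrix}
  n          & \sum_i x_i   \\
  \sum_i x_i & \sum_i x_i^2
\end{pmatrix},
\qquad
X^T y =
\begin{pmatrix}
  \sum_i y_i \\
  \sum_i x_i y_i
\end{pmatrix}
```

Under that assumption the estimator depends on the data only through five aggregates: $n$, $\sum x_i$, $\sum x_i^2$, $\sum y_i$ and $\sum x_i y_i$.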

Two possible approaches:

1. Ripley and Chen: extra interface, pure R

2. R + SQL

Ripley and Chen

R(user) → CORBA → R(servant) → DB

Alternative

R(user) ↔ DB

Two scenarios:

1. pure R processing

2. computations partially in DB

base model: $Y_t = \beta_1 + \beta_2 X_t + \varepsilon_t$

estimator: $\hat{\beta} = (X^T X)^{-1} X^T Y$

in the DB: arithmetic operations on a limited set of columns
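
A minimal sketch of the "computations partially in DB" scenario for this model. The talk is about R talking to a DBMS; here Python's built-in `sqlite3` stands in for both, purely so the example is self-contained. The table name `obs`, its columns `x` and `y`, and the function name are hypothetical.

```python
import sqlite3

def regression_in_db(conn, table="obs"):
    """Estimate Y = b1 + b2*X + e from aggregates computed by the database."""
    # The database scans the rows once; only five numbers come back to the client.
    n, sx, sy, sxx, sxy = conn.execute(
        f"SELECT COUNT(*), SUM(x), SUM(y), SUM(x * x), SUM(x * y) FROM {table}"
    ).fetchone()
    # Closed-form solution of the 2x2 normal equations (X'X) b = X'Y.
    b2 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b1 = (sy - b2 * sx) / n
    return b1, b2

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (x REAL, y REAL)")
# Toy data lying exactly on y = 1 + 2x, so the true coefficients are known.
conn.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [(float(i), 1.0 + 2.0 * i) for i in range(10)],
)
b1, b2 = regression_in_db(conn)
print(b1, b2)
```

The "pure R processing" scenario would instead fetch every row with a plain `SELECT x, y` and do the same arithmetic on the client side.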

Pure R processing

[Figure: Case study 1, method 1. x-axis: dataset size (number of rows), 200,000 to 1,000,000. Legend: Ingres VW, Ingres, MySQL, PostgreSQL, DBMS X.]

Computations partially in DB

[Figure: x-axis ticks 200,000 to 1,000,000.]

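base model: $\mathrm{Cov}(X, Y) = E[XY] - E[X]\,E[Y]$

estimator: $\widehat{\mathrm{Cov}}(X, Y) = \frac{1}{n}\sum_{i=1}^{n} X_i Y_i - \left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right)$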

in the DB: large queries
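
One way to read "large queries": a single SELECT that asks the database for every sum and cross-product at once, with the client only combining them. The sketch below assumes that reading and again uses Python's `sqlite3` as a stand-in for R plus a DBMS; the table `rets`, its columns `a` and `b`, and the function name are hypothetical.

```python
import sqlite3
from itertools import combinations_with_replacement

def covariance_in_db(conn, table, columns):
    """Plug-in covariance for every column pair, from one aggregate query."""
    pairs = list(combinations_with_replacement(columns, 2))
    # table and columns are pasted into the SQL text, so they must be trusted identifiers.
    select = ["COUNT(*)"]
    select += [f"SUM({c})" for c in columns]
    select += [f"SUM({a} * {b})" for a, b in pairs]
    row = conn.execute(f"SELECT {', '.join(select)} FROM {table}").fetchone()
    n = row[0]
    k = len(columns)
    means = {c: s / n for c, s in zip(columns, row[1:1 + k])}
    cov = {}
    for (a, b), s_ab in zip(pairs, row[1 + k:]):
        # (1/n) * sum(a*b) - mean(a) * mean(b), stored for both orderings
        cov[a, b] = cov[b, a] = s_ab / n - means[a] * means[b]
    return cov

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rets (a REAL, b REAL)")
conn.executemany(
    "INSERT INTO rets VALUES (?, ?)",
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
)
cov = covariance_in_db(conn, "rets", ["a", "b"])
print(cov["a", "b"])
```

The select list grows quadratically with the number of columns: for 35 columns it has 1 + 35 + 630 = 666 terms, which is where the query becomes large.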

Pure R processing

[Figure: x-axis: dataset size (columns), 15 to 35.]

[Figure: x-axis: dataset size (columns), 15 to 35.]

calculate a quantile of the portfolio PnL

$V_p = \inf\{u : F(u) \ge 1 - p\}$

estimator: $\hat{V}_p = X_{[n(1-p)]+1}$

in the DB: sorting
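
A minimal sketch of letting the database do the sorting. It assumes that $X_{k}$ in the estimator denotes the $k$-th smallest observation and that $[\cdot]$ is the integer part, so the estimate is the row at zero-based offset $[n(1-p)]$ in ascending order. Python's `sqlite3` stands in for R plus a DBMS; the table `pnl`, its column `value`, and the function name are hypothetical.

```python
import math
import sqlite3

def var_in_db(conn, p, table="pnl", column="value"):
    """Return the ([n(1-p)] + 1)-th smallest value, sorted by the database."""
    n = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    # Round first so binary floating-point noise in 1 - p does not shift the index.
    k = math.floor(round(n * (1 - p), 9))
    # Only one value travels back to the client; the sort happens in the database.
    row = conn.execute(
        f"SELECT {column} FROM {table} ORDER BY {column} LIMIT 1 OFFSET ?",
        (k,),
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pnl (value REAL)")
conn.executemany(
    "INSERT INTO pnl VALUES (?)",
    [(v,) for v in [3.0, -5.0, 0.0, 4.0, -2.0, 1.0, -4.0, 2.0, -1.0, -3.0]],
)
# n = 10 and p = 0.75 give [10 * 0.25] + 1 = 3, i.e. the third smallest value.
estimate = var_in_db(conn, p=0.75)
print(estimate)
```

The function assumes 0 < p < 1, which keeps the offset inside the table.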

Pure R processing

[Figure: x-axis ticks 2,000,000 to 10,000,000.]

[Figure: x-axis ticks 200,000 to 1,000,000.]

Conclusion

1. with minimal effort, significant speedups are possible

2. ODBC as minimal requirement

3. extensions: parallel computing...
