Applied Bayesian Inference with PyMC
Uploaded by marco-santoni
![Page 1: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/1.jpg)
Applied Bayesian Inference with PyMC
@MrSantoni
![Page 2: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/2.jpg)
Which color will sell more?
[Figure: mock-ups of Page A and Page B, each a product page titled "A Tea Pot" with placeholder text and a BUY button]
![Page 3: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/3.jpg)
[Figure: the same Page A and Page B mock-ups, each annotated with its observed rate]
Page A: #buy / N    Page B: #buy / N
![Page 4: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/4.jpg)
• What if N is small?
• How large must N be for 90% confidence?
• What if N is different on A and B?
![Page 5: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/5.jpg)
Bayesian Inference
![Page 6: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/6.jpg)
Probability:
Frequentist: frequency
Bayesian: belief
Claim: we think Bayesian
![Page 7: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/7.jpg)
Claim: we think Bayesian
[Figure: no-bugs confidence after test 1, test 2, test 3]
![Page 8: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/8.jpg)
Bayesian Inference =
update your beliefs
new evidence
prior belief
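To make the update concrete, here is a minimal sketch in plain Python (standard library only). It tracks the belief that a program has no bugs as successive tests pass. The prior of 0.2 and the 0.5 chance that buggy code still passes a test are illustrative assumptions, not values from the slides.

```python
def update_no_bugs_belief(prior, p_pass_if_buggy=0.5):
    """One application of Bayes' rule after observing a passing test.

    Assumes bug-free code always passes; p_pass_if_buggy is a hypothetical
    chance that buggy code passes the same test anyway.
    """
    evidence = 1.0 * prior + p_pass_if_buggy * (1.0 - prior)
    return 1.0 * prior / evidence


belief = 0.2  # hypothetical prior belief that the code has no bugs
history = [belief]
for test_number in (1, 2, 3):
    belief = update_no_bugs_belief(belief)
    history.append(belief)
    print("after test %d: P(no bugs) = %.3f" % (test_number, belief))
```

Each passing test is new evidence, and the posterior from one step becomes the prior for the next.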
![Page 9: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/9.jpg)
The Developer View
Statistical Problem
def frequentist(): return 80%
def bayesian(): return [Figure: a probability distribution on an axis from 0% to 100%]
![Page 10: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/10.jpg)
How to?
[Figure: distribution on an axis from 0% to 100%]
![Page 11: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/11.jpg)
How to?
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Closed-form solution:
Realistic Cases
Toy Examples
[Figure: distribution on an axis from 0% to 100%]
![Page 12: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/12.jpg)
PyMC
![Page 13: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/13.jpg)
PyMC
• Perform Bayesian Inference
• Markov Chain Monte Carlo techniques
• A.k.a. Probabilistic Programming
![Page 14: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/14.jpg)
Show me the code!
![Page 15: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/15.jpg)
Example A/B test
![Page 16: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/16.jpg)
Only one difference between A and B
[Figure: mock-ups of Page A and Page B, each a product page titled "A Tea Pot" with placeholder text and a BUY button]
![Page 17: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/17.jpg)
[Figure: mock-ups of Page A and Page B, each with a BUY button]
![Page 18: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/18.jpg)
Assume there is:
• p_a: probability of clicking BUY when landing on A
• p_b: probability of clicking BUY when landing on B
How to compute p_a and p_b?
![Page 19: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/19.jpg)
Page A
– N_a visitors
– C_a BUY-clicks on page A
Page B
– N_b visitors
– C_b BUY-clicks on page B
![Page 20: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/20.jpg)
Frequentist: C_a / N_a
BUT: the observed frequency does not necessarily equal p_a
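A quick simulation illustrates the gap between observed frequency and true probability. This is a sketch using only the Python standard library; the true probability of 0.05 and the number of repeated experiments (200) are illustrative assumptions.

```python
import random


def observed_frequency(p_true, n_visitors, rng):
    """Simulate n_visitors Bernoulli(p_true) visits and return clicks / visitors."""
    clicks = sum(1 for _ in range(n_visitors) if rng.random() < p_true)
    return clicks / n_visitors


rng = random.Random(0)  # fixed seed so the run is repeatable
p_true = 0.05           # hypothetical true click probability

spread = {}
for n_visitors in (50, 1500):
    freqs = [observed_frequency(p_true, n_visitors, rng) for _ in range(200)]
    spread[n_visitors] = max(freqs) - min(freqs)
    print("N=%4d: observed frequency ranges from %.3f to %.3f"
          % (n_visitors, min(freqs), max(freqs)))
```

The same true probability produces a much wider range of observed frequencies when N is small, which is why C_a / N_a alone can mislead.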
![Page 21: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/21.jpg)
Bayesian: infer the true frequency from the observed data
![Page 22: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/22.jpg)
[Figure: mock-up of Page A, a product page titled "A Tea Pot" with placeholder text and a BUY button]
![Page 23: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/23.jpg)
Bayesian Workflow
1. Define prior
2. Fit to observations
3. Get posteriors
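The three steps can be sketched without any library by evaluating the posterior on a grid of candidate values. This is a grid approximation, not the MCMC approach PyMC uses, and the counts (2 clicks out of 50 visitors) are made up for illustration.

```python
def grid_posterior(clicks, visitors, grid_size=1001):
    """Posterior over p on an evenly spaced grid, starting from a flat prior."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # 1. Define prior: uniform over [0, 1].
    prior = [1.0] * grid_size
    # 2. Fit to observations: weight each p by the Bernoulli likelihood.
    unnormalized = [
        pr * (p ** clicks) * ((1.0 - p) ** (visitors - clicks))
        for p, pr in zip(grid, prior)
    ]
    # 3. Get posterior: normalize so the weights sum to 1.
    total = sum(unnormalized)
    return grid, [w / total for w in unnormalized]


grid, posterior = grid_posterior(clicks=2, visitors=50)  # made-up counts
best_weight, best_p = max(zip(posterior, grid))
print("most probable p on the grid: %.3f" % best_p)
```

A grid works for one parameter but scales poorly as parameters are added, which is one reason to reach for sampling instead.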
![Page 24: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/24.jpg)
    import numpy as np
    from pymc import Uniform, rbernoulli, Bernoulli, MCMC
    from matplotlib import pyplot as plt

    # simulate N visits with a known true click probability
    p_A_true = 0.05
    N = 1500
    occurrences = rbernoulli(p_A_true, N)

    print 'Click-BUY:'
    print occurrences.sum()
    print 'Observed frequency:'
    print occurrences.sum() / float(N)

Output:

    Click-BUY:
    68
    Observed frequency:
    0.0453333333333
![Page 25: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/25.jpg)
Clicking BUY
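Bernoulli distribution
$$P(\text{click}) = \begin{cases} p & \text{click} = 1 \\ 1 - p & \text{click} = 0 \end{cases}$$
[Figure: bar chart of P(click) for click=1 and click=0, vertical axis from 0 to 0.8, annotated with p]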
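As a small illustration of the Bernoulli model, the sketch below defines the probability mass function and draws simulated clicks with the standard library. The function names are hypothetical helpers, not part of PyMC.

```python
import random


def bernoulli_pmf(click, p):
    """P(click) for click in {0, 1}: p if click == 1, else 1 - p."""
    if click not in (0, 1):
        raise ValueError("click must be 0 or 1")
    return p if click == 1 else 1.0 - p


def bernoulli_sample(p, n, rng):
    """Draw n independent 0/1 outcomes with success probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


rng = random.Random(42)
draws = bernoulli_sample(0.05, 1500, rng)  # same p and N as the slide example
print("clicks:", sum(draws), "observed frequency:", sum(draws) / len(draws))
```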
![Page 26: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/26.jpg)
    p_A = Uniform('p_A', lower=0, upper=1)

[Figure: uniform prior density of p_A on the interval from 0 to 1]

    print p_A.random()
    print p_A.value

Output:

    array(0.906086144982998)
    array(0.906086144982998)

    print p_A.random()
    print p_A.value

Output:

    array(0.285313846133313)
    array(0.285313846133313)
![Page 27: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/27.jpg)
p_A = Uniform('p_A', lower=0, upper=1)
obs = Bernoulli('obs', p_A, value=occurrences, observed=True)
![Page 28: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/28.jpg)
    p_A = Uniform('p_A', lower=0, upper=1)
    obs = Bernoulli('obs', p_A, value=occurrences, observed=True)

    mcmc = MCMC([p_A, obs])
    mcmc.sample(20000, 1000)
    print mcmc.trace('p_A')[:]

Output:

    [------- 20% ] 4053 of 20000 complete in 0.5 sec
    [------------- 36% ] 7315 of 20000 complete in 1.0 sec
    [-----------------53% ] 10627 of 20000 complete in 1.5 sec
    [-----------------69%------ ] 13939 of 20000 complete in 2.0 sec
    [-----------------81%----------- ] 16376 of 20000 complete in 2.5 sec
    [-----------------96%---------------- ] 19342 of 20000 complete in 3.0 sec
    [-----------------100%-----------------] 20000 of 20000 complete in 3.1 sec
    [ 0.04656576 0.04656576 0.04656576 ..., 0.03803667 0.03803667 0.03803667]
![Page 29: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/29.jpg)
    plt.figure(figsize=(8, 7))
    # newer matplotlib releases replace the normed argument with density
    plt.hist(mcmc.trace('p_A')[:], bins=35, histtype='stepfilled', normed=True)
    plt.xlabel('Probability of clicking BUY')
    plt.ylabel('Density')
    plt.vlines(p_A_true, 0, 90, linestyle='--', label='True p_A')
    plt.legend()
    plt.savefig('p_A_hist_N_%s.png' % N)
    plt.show()
![Page 30: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/30.jpg)
90% confident that p_A is between X and Y?

    p_A_samples = mcmc.trace('p_A')[:]
    # the 5th and 95th percentiles of the samples bound the central 90%
    lower_bound = np.percentile(p_A_samples, 5)
    upper_bound = np.percentile(p_A_samples, 95)
    print 'There is 90%% probability that p_A is between %s and %s' % (lower_bound, upper_bound)

Output:

    There is 90% probability that p_A is between 0.0373019596856 and 0.0548052806892
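For this particular model the interval can be cross-checked without MCMC. With a Uniform(0, 1) prior and Bernoulli observations, the posterior is a Beta distribution, so a sketch like the one below (standard library only) should land close to the sampled bounds. The helper name, draw count and seed are assumptions made for illustration.

```python
import random


def beta_credible_interval(clicks, visitors, mass=0.90, draws=50000, seed=1):
    """Central credible interval for p under a Uniform(0, 1) prior.

    With a uniform prior and Bernoulli observations, the posterior is
    Beta(clicks + 1, visitors - clicks + 1); we sample it and read off percentiles.
    """
    rng = random.Random(seed)
    alpha, beta = clicks + 1, visitors - clicks + 1
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - mass) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


lower, upper = beta_credible_interval(clicks=68, visitors=1500)
print("90%% interval for p_A: %.4f to %.4f" % (lower, upper))
```

Agreement between the two approaches is a useful sanity check on the MCMC run. The shortcut only exists because this prior and likelihood are conjugate.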
![Page 31: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/31.jpg)
What if N_a is lower?
![Page 32: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/32.jpg)
    import numpy as np
    from pymc import Uniform, rbernoulli, Bernoulli, MCMC
    from matplotlib import pyplot as plt

    # same experiment, but with only 50 visitors
    p_A_true = 0.05
    N = 50
    occurrences = rbernoulli(p_A_true, N)

    print 'Click-BUY:'
    print occurrences.sum()
    print 'Observed frequency:'
    print occurrences.sum() / float(N)

Output:

    Click-BUY:
    2
    Observed frequency:
    0.04
![Page 33: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/33.jpg)
    p_A = Uniform('p_A', lower=0, upper=1)
    obs = Bernoulli('obs', p_A, value=occurrences, observed=True)

    mcmc = MCMC([p_A, obs])
    mcmc.sample(20000, 1000)
    print mcmc.trace('p_A')[:]

Output:

    [----- 14% ] 2874 of 20000 complete in 0.5 sec
    [----------- 30% ] 6035 of 20000 complete in 1.0 sec
    [-----------------47% ] 9440 of 20000 complete in 1.5 sec
    [-----------------63%---- ] 12775 of 20000 complete in 2.0 sec
    [-----------------81%---------- ] 16203 of 20000 complete in 2.5 sec
    [-----------------100%-----------------] 20000 of 20000 complete in 3.0 sec
    [ 0.06240723 0.06240723 0.06240723 ..., 0.01864419 0.01864419 0.01864419]
![Page 34: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/34.jpg)
    plt.figure(figsize=(8, 7))
    plt.hist(mcmc.trace('p_A')[:], bins=35, histtype='stepfilled', normed=True)
    plt.xlabel('Probability of clicking BUY')
    plt.ylabel('Density')
    plt.vlines(p_A_true, 0, 90, linestyle='--', label='True p_A')
    plt.legend()
    plt.savefig('p_A_hist_N_%s.png' % N)
    plt.show()
![Page 35: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/35.jpg)
90% confident that p_A is between X and Y?

    p_A_samples = mcmc.trace('p_A')[:]
    lower_bound = np.percentile(p_A_samples, 5)
    upper_bound = np.percentile(p_A_samples, 95)
    print 'There is 90%% probability that p_A is between %s and %s' % (lower_bound, upper_bound)

Output:

    There is 90% probability that p_A is between 0.0160966147705 and 0.114655284797
![Page 36: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/36.jpg)
[Figure: posterior of p_A for N_a = 1500 and for N_a = 50]
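The difference in width between the two posteriors can also be quantified in closed form. Assuming the Uniform(0, 1) prior used above, the posterior is Beta(clicks + 1, visitors - clicks + 1), and its standard deviation has a simple formula. The sketch below uses the click counts reported in the two runs (68 of 1500 and 2 of 50).

```python
import math


def posterior_sd(clicks, visitors):
    """Standard deviation of the Beta(clicks + 1, visitors - clicks + 1) posterior."""
    a = clicks + 1
    b = visitors - clicks + 1
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))


sd_large = posterior_sd(clicks=68, visitors=1500)  # counts from the N = 1500 run
sd_small = posterior_sd(clicks=2, visitors=50)     # counts from the N = 50 run
print("posterior sd with N=1500: %.4f" % sd_large)
print("posterior sd with N=50:   %.4f" % sd_small)
```

With 30 times fewer visitors the posterior is several times wider, in line with the two credible intervals printed by the MCMC runs.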
![Page 37: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/37.jpg)
Does the red have a larger probability of being clicked?
[Figure: mock-ups of Page A and Page B, each with a BUY button]
![Page 38: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/38.jpg)
    from pymc import Uniform, rbernoulli, Bernoulli, MCMC, deterministic
    from matplotlib import pyplot as plt

    p_A_true = 0.05
    p_B_true = 0.04
    N_A = 1500
    N_B = 750

    occurrences_A = rbernoulli(p_A_true, N_A)
    occurrences_B = rbernoulli(p_B_true, N_B)

    print 'Observed frequency:'
    print 'A'
    print occurrences_A.sum() / float(N_A)
    print 'B'
    print occurrences_B.sum() / float(N_B)

Output:

    Observed frequency:
    A
    0.0533333333333
    B
    0.0413333333333
![Page 39: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/39.jpg)
    p_A = Uniform('p_A', lower=0, upper=1)
    p_B = Uniform('p_B', lower=0, upper=1)

    # delta is fully determined by p_A and p_B, so it gets its own trace
    @deterministic
    def delta(p_A=p_A, p_B=p_B):
        return p_A - p_B

    obs_A = Bernoulli('obs_A', p_A, value=occurrences_A, observed=True)
    obs_B = Bernoulli('obs_B', p_B, value=occurrences_B, observed=True)

    mcmc = MCMC([p_A, p_B, obs_A, obs_B, delta])
    mcmc.sample(25000, 5000)

Output:

    [----- 14% ] 3561 of 25000 complete in 0.5 sec
    [--------- 25% ] 6332 of 25000 complete in 1.0 sec
    [------------ 33% ] 8454 of 25000 complete in 1.5 sec
    [--------------- 41% ] 10499 of 25000 complete in 2.0 sec
    [-----------------50% ] 12602 of 25000 complete in 2.5 sec
    [-----------------59%-- ] 14780 of 25000 complete in 3.0 sec
    [-----------------67%----- ] 16883 of 25000 complete in 3.5 sec
    [-----------------75%-------- ] 18954 of 25000 complete in 4.0 sec
    [-----------------83%----------- ] 20877 of 25000 complete in 4.5 sec
    [-----------------91%-------------- ] 22924 of 25000 complete in 5.0 sec
    [-----------------100%-----------------] 25000 of 25000 complete in 5.5 sec
![Page 40: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/40.jpg)
    p_A_samples = mcmc.trace('p_A')[:]
    p_B_samples = mcmc.trace('p_B')[:]
    delta_samples = mcmc.trace('delta')[:]
![Page 41: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/41.jpg)
    # posterior of p_A
    plt.subplot(3,1,1)
    plt.xlim(0, 0.1)
    plt.hist(p_A_samples, bins=35, histtype='stepfilled', normed=True,
             color='blue', label='Posterior of p_A')
    plt.vlines(p_A_true, 0, 90, linestyle='--', label='True p_A (unknown)')
    plt.xlabel('Probability of clicking BUY via A')
    plt.legend()

    # posterior of p_B
    plt.subplot(3,1,2)
    plt.xlim(0, 0.1)
    plt.hist(p_B_samples, bins=35, histtype='stepfilled', normed=True,
             color='green', label='Posterior of p_B')
    plt.vlines(p_B_true, 0, 90, linestyle='--', label='True p_B (unknown)')
    plt.xlabel('Probability of clicking BUY via B')
    plt.legend()

    # posterior of delta = p_A - p_B
    plt.subplot(3,1,3)
    plt.xlim(0, 0.1)
    plt.hist(delta_samples, bins=35, histtype='stepfilled', normed=True,
             color='red', label='Posterior of delta')
    plt.vlines(p_A_true - p_B_true, 0, 90, linestyle='--', label='True delta (unknown)')
    plt.xlabel('p_A - p_B')
    plt.legend()

    plt.savefig('A_and_B.png')
    plt.show()
![Page 42: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/42.jpg)
![Page 43: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/43.jpg)
p_A > p_B
How confident are we?

    print 'Probability that p_A > p_B:'
    print (delta_samples > 0).mean()

Output:

    Probability that p_A > p_B:
    0.8919
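The same probability can be approximated without PyMC by sampling the two Beta posteriors directly, assuming independent Uniform(0, 1) priors as in the model above. The sketch uses only the standard library. The click counts (80 and 31) are back-computed from the observed frequencies printed earlier, and the helper name, draw count and seed are illustrative.

```python
import random


def prob_a_beats_b(clicks_a, n_a, clicks_b, n_b, draws=20000, seed=7):
    """Monte Carlo estimate of P(p_A > p_B) under independent Uniform(0, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(clicks_a + 1, n_a - clicks_a + 1)
        p_b = rng.betavariate(clicks_b + 1, n_b - clicks_b + 1)
        if p_a > p_b:
            wins += 1
    return wins / draws


# 80 and 31 clicks are back-computed from the observed frequencies on the slide
# (0.0533 * 1500 and 0.0413 * 750).
estimate = prob_a_beats_b(clicks_a=80, n_a=1500, clicks_b=31, n_b=750)
print("P(p_A > p_B) is roughly %.3f" % estimate)
```

The estimate should fall near the value obtained from the MCMC trace, up to sampling noise in both methods.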
![Page 44: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/44.jpg)
[Figure: posteriors for N_A = 1500, N_B = 750 compared with N_A = 1500, N_B = 200]
![Page 45: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/45.jpg)
    print 'Probability that p_A > p_B:'
    print (delta_samples > 0).mean()

Output:

    Probability that p_A > p_B:
    0.73455
![Page 46: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/46.jpg)
MCMC
![Page 47: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/47.jpg)
    mcmc = MCMC([p_A, p_B, obs_A, obs_B, delta])
    mcmc.sample(25000, 5000)

Posterior P(p_A, p_B, delta | obs_A, obs_B) as samples
25000 iterations
5000 burn-in
Metropolis-Hastings algorithm
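To show what such a sampler does, here is a minimal random-walk Metropolis sketch for the single-page model (Uniform prior on p, Bernoulli clicks). It is not PyMC's implementation: the proposal width, starting point and seed are arbitrary assumptions. Because the Gaussian proposal is symmetric, the Metropolis-Hastings acceptance ratio reduces to the plain posterior ratio.

```python
import math
import random


def log_posterior(p, clicks, visitors):
    """Log of the unnormalized posterior: flat prior times Bernoulli likelihood."""
    if p <= 0.0 or p >= 1.0:
        return float("-inf")  # outside the support of the Uniform(0, 1) prior
    return clicks * math.log(p) + (visitors - clicks) * math.log(1.0 - p)


def metropolis(clicks, visitors, iterations=25000, burn_in=5000, step=0.01, seed=3):
    """Random-walk Metropolis sampler for p; returns the post-burn-in samples."""
    rng = random.Random(seed)
    current = 0.5
    current_lp = log_posterior(current, clicks, visitors)
    trace = []
    for i in range(iterations):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, clicks, visitors)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            trace.append(current)
    return trace


trace = metropolis(clicks=68, visitors=1500)
print("posterior mean of p_A: %.4f" % (sum(trace) / len(trace)))
```

Rejected proposals repeat the current value, which is why consecutive entries in a trace are often identical. The burn-in discards the early steps taken while the chain moves from its starting point towards the bulk of the posterior.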
![Page 48: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/48.jpg)
Open the black box
    mcmc = MCMC([p_A, p_B, obs_A, obs_B, delta])
    mcmc.sample(25000, 5000)

    from pymc.Matplot import plot as mcplot
    mcplot(mcmc)
![Page 49: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/49.jpg)
![Page 50: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/50.jpg)
![Page 51: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/51.jpg)
![Page 52: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/52.jpg)
PyMC
• Easy to interpret results
  – confidence, no p-values!
• No crazy math
• Computationally expensive
![Page 53: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/53.jpg)
![Page 55: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/55.jpg)
Back
![Page 56: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/56.jpg)
Serie A 13/14
![Page 57: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/57.jpg)
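| Date | HomeTeam | AwayTeam | FTHG | FTAG | FTR | HTHG | HTAG | HTR |
|---|---|---|---|---|---|---|---|---|
| 24/08/2013 | Sampdoria | Juventus | 0 | 1 | A | 0 | 0 | D |
| 24/08/2013 | Verona | Milan | 2 | 1 | H | 1 | 1 | D |
| 25/08/2013 | Cagliari | Atalanta | 2 | 1 | H | 1 | 1 | D |
| 25/08/2013 | Inter | Genoa | 2 | 0 | H | 0 | 0 | D |
| 25/08/2013 | Lazio | Udinese | 2 | 1 | H | 2 | 0 | H |
| 25/08/2013 | Livorno | Roma | 0 | 2 | A | 0 | 0 | D |
| 25/08/2013 | Napoli | Bologna | 3 | 0 | H | 2 | 0 | H |
| 25/08/2013 | Parma | Chievo | 0 | 0 | D | 0 | 0 | D |
| 25/08/2013 | Torino | Sassuolo | 2 | 0 | H | 1 | 0 | H |
| 26/08/2013 | Fiorentina | Catania | 2 | 1 | H | 2 | 1 | H |
| 31/08/2013 | Chievo | Napoli | 2 | 4 | A | 2 | 2 | D |
| 31/08/2013 | Juventus | Lazio | 4 | 1 | H | 2 | 1 | H |
| 01/09/2013 | Atalanta | Torino | 2 | 0 | H | 0 | 0 | D |
| 01/09/2013 | Bologna | Sampdoria | 2 | 2 | D | 1 | 1 | D |
| 01/09/2013 | Catania | Inter | 0 | 3 | A | 0 | 1 | A |
| 01/09/2013 | Genoa | Fiorentina | 2 | 5 | A | 0 | 3 | A |
| 01/09/2013 | Milan | Cagliari | 3 | 1 | H | 2 | 1 | H |
| 01/09/2013 | Roma | Verona | 3 | 0 | H | 0 | 0 | D |
| 01/09/2013 | Sassuolo | Livorno | 1 | 4 | A | 0 | 1 | A |
| 01/09/2013 | Udinese | Parma | 3 | 1 | H | 1 | 0 | H |
| 14/09/2013 | Inter | Juventus | 1 | 1 | D | 0 | 0 | D |
| 14/09/2013 | Napoli | Atalanta | 2 | 0 | H | 0 | 0 | D |
| 14/09/2013 | Torino | Milan | 2 | 2 | D | 0 | 0 | D |
| 15/09/2013 | Fiorentina | Cagliari | 1 | 1 | D | 0 | 0 | D |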
https://datahub.io/dataset/italian-football-data-serie-a-b
![Page 58: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/58.jpg)
Win-rate
Did it change?
![Page 59: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/59.jpg)
Bayesian Workflow
1. Define Prior
2. Fit to observations
3. Get Posteriors
![Page 60: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/60.jpg)
Winning a Match
Bernoulli distribution
$$P(w) = \begin{cases} p & w = 1 \\ 1 - p & w = 0 \end{cases}$$
[Figure: bar chart of P(w) for Win (w=1) and Lose (w=0), vertical axis from 0 to 0.8, annotated with p]
![Page 61: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/61.jpg)
𝑝 : switchpoint?
![Page 62: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/62.jpg)
Model the switchpoint
$$p = \begin{cases} p_1 & t < \tau \\ p_2 & t \geq \tau \end{cases}$$
Goal -> infer $p_1$, $p_2$, $\tau$
![Page 63: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/63.jpg)
Bayesian Workflow
1. Define Prior
2. Fit to observations
3. Get Posteriors
![Page 64: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/64.jpg)
Let’s model this
• Goal: infer the unknown p1, p2, TAU
• First step of Bayesian inference: assign a prior probability to the different possible values of p
• What would be a good prior for p1, p2? Use uniform:
  – p1 ~ Uniform(0,1)
  – p2 ~ Uniform(0,1)
  – TAU ~ DiscreteUniform(1, 38)
• P(TAU=k) = 1/38 for every k from 1 to 38
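With these flat priors, the posterior over the switchpoint can be computed exactly, because p1 and p2 integrate out to Beta functions of the win and loss counts in each segment. The sketch below is a cross-check under that assumption, not the MCMC approach taken in the slides. The win sequence is a made-up season, and the function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def tau_posterior(wins):
    """Posterior P(tau | wins) for tau = 1..len(wins) with flat priors.

    Matches 0..tau-1 use p1 and the remaining ones use p2. With Uniform(0, 1)
    priors, p1 and p2 integrate out to Beta functions of the win/loss counts.
    """
    n = len(wins)
    log_weights = []
    for tau in range(1, n + 1):
        w1, n1 = sum(wins[:tau]), tau
        w2, n2 = sum(wins[tau:]), n - tau
        log_weights.append(log_beta(w1 + 1, n1 - w1 + 1)
                           + log_beta(w2 + 1, n2 - w2 + 1))
    # Subtract the peak before exponentiating to avoid underflow.
    peak = max(log_weights)
    weights = [math.exp(lw - peak) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical season: 19 matches without a win, then 19 wins.
wins = [0] * 19 + [1] * 19
posterior = tau_posterior(wins)
most_likely_tau = 1 + posterior.index(max(posterior))
print("most likely tau:", most_likely_tau)
```

On real results the posterior over tau is usually spread across many matches rather than concentrated on one, which is the uncertainty the sampled histogram of tau displays.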
![Page 65: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/65.jpg)
    import numpy as np
    from pymc import Uniform, DiscreteUniform, deterministic, Bernoulli, Model, MCMC

    p_1 = Uniform('p_1', lower=0, upper=1)
    p_2 = Uniform('p_2', lower=0, upper=1)
    tau = DiscreteUniform('tau', lower=1, upper=38)

    print 'Random output: ', tau.random(), tau.random(), tau.random()

Output:

    Random output: 14 24 33

    @deterministic
    def p_(tau=tau, p_1=p_1, p_2=p_2, num_matches=38):
        # concatenate p_1 and p_2 based on tau
        out = np.empty(num_matches)
        out[:tau] = p_1   # matches before the switchpoint
        out[tau:] = p_2   # matches from the switchpoint on
        return out
![Page 66: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/66.jpg)
Load Data

    import pandas as pd

    # parse_date and compute_extra_columns are helpers not shown in the slides
    df = pd.read_csv('serie_a.csv', parse_dates=['Date'], date_parser=parse_date)

    matches = df[(df.HomeTeam == 'Milan') | (df.AwayTeam == 'Milan')]
    matches = matches.set_index(['Date'])
    matches = compute_extra_columns(matches, team)
    # some pandas manipulations occur here
    matches['Win'] = …  # 1 if Milan won, 0 otherwise
![Page 67: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/67.jpg)
Fit the Model

    observed_matches = Bernoulli('obs', p=p_, value=matches[['Win']], observed=True)

    model = Model([observed_matches, p_1, p_2, tau])
    mcmc = MCMC(model)
    mcmc.sample(40000, 10000)

    p_1_samples = mcmc.trace('p_1')[:]
    p_2_samples = mcmc.trace('p_2')[:]
    tau_samples = mcmc.trace('tau')[:]

    print p_1_samples[:10]
    print p_2_samples[:10]
    print tau_samples[:10]

Output:

    [ 0.42067236 0.42067236 0.42067236 0.43900391 0.43900391 0.43900391 0.43900391 0.43900391 0.43900391 0.43900391]
    [ 0.49213381 0.49213381 0.49213381 0.56072562 0.79863176 0.79863176 0.67416932 0.68382528 0.6069458 0.60062698]
    [10 10 24 35 35 35 35 27 27 27]
![Page 68: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/68.jpg)
    plt.figure(figsize=(14.5, 10))

    ax = plt.subplot(311)
    ax.set_autoscaley_on(False)
    plt.hist(p_1_samples, histtype='stepfilled', alpha=0.85,
             label='posterior of p_1', color='#A60628', normed=True, bins=30)
    plt.legend(loc='upper left')

    ax = plt.subplot(312)
    plt.hist(p_2_samples, histtype='stepfilled', alpha=0.85,
             label='posterior of p_2', color='#7A68A6', normed=True, bins=30)
    plt.legend(loc='upper left')

    ax = plt.subplot(313)
    plt.hist(tau_samples, histtype='stepfilled', alpha=0.85,
             label='posterior of tau', color='#467821', normed=True, bins=30)
    plt.legend(loc='upper left')

    plt.show()
![Page 69: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/69.jpg)
![Page 70: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/70.jpg)
Expected Win Probability
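    num_matches = 38
    N = tau_samples.shape[0]
    expected_p_per_match = np.zeros(num_matches)
    for match in range(num_matches):
        # samples in which this match falls before the switchpoint use p_1
        ix = match < tau_samples
        p_samples_match = np.concatenate([p_1_samples[ix], p_2_samples[~ix]])
        # the median of the pooled samples is the point estimate for this match
        expected_p_per_match[match] = np.percentile(p_samples_match, 50)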
![Page 71: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/71.jpg)
![Page 72: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/72.jpg)
Compute Confidence Bounds
    lower_p_per_match = np.zeros(num_matches)
    upper_p_per_match = np.zeros(num_matches)
    for match in range(num_matches):
        ix = match < tau_samples
        p_samples_match = np.concatenate([p_1_samples[ix], p_2_samples[~ix]])
        # 5th and 95th percentiles bound the central 90% for this match
        lower_p_per_match[match] = np.percentile(p_samples_match, 5)
        upper_p_per_match[match] = np.percentile(p_samples_match, 95)
![Page 73: Applied Bayesian Inference with PyMC](https://reader034.fdocuments.net/reader034/viewer/2022051503/589b3c811a28ab22038b6265/html5/thumbnails/73.jpg)
Bayesian inference returns a distribution. What have we gained? We see the uncertainty in our estimates: the wider the distribution, the less certain our posterior belief should be.