Post on 24-Feb-2016
• Started in January 1996 as a research project by Larry Page, who was soon joined by Sergey Brin; both were students at Stanford University
• Originally hosted on the Stanford University website; google.com was registered on September 15th, 1997
Background
• Search Google for random words of different lengths
• Compare number of letters vs. search time
• Compare number of results vs. search time
What We Did
• Picked random words from Wordox dictionary
• Searched for just that word on Google home page
• Copied down number of results and time it took for the search
Procedure
Assumptions
• 2 independent SRS (assumed)
• Actual relationship is linear (assumed)
Word             Google results    Search time (s)
cub              18,100,000        0.2
cab              105,000,000       0.38
mib              23,300,000        0.22
hap              15,400,000        0.12
oke              14,900,000        0.35
rya              308,000           0.13
tic              70,000,000        0.06
ben              561,000,000       0.22
rev              211,000,000       0.08
mat              158,000,000       0.35
ting             53,900,000        0.22
jeux             38,200,000        0.28
shah             40,100,000        0.23
hype             34,500,000        0.11
nobs             1,990,000         0.22
pong             33,000,000        0.06
thae             3,500,000         0.28
balm             14,400,000        0.24
span             118,000,000       0.18
deme             36,200,000        0.15
carex            1,870,000         0.07
bends            12,600,000        0.18
jisms            4,770             0.21
doles            1,060,000         0.19
tawse            338,000           0.26
spahi            160,000           0.24
ashes            26,900,000        0.1
chare            97,400,000        0.24
borax            2,320,000         0.05
prang            2,480,000         0.07
pontificals      43,700            0.19
trichromats      26,500            0.21
supplanters      2,620,000         0.29
prebiologic      221,000           0.37
regenerated      3,610,000         0.34
cheesecloth      169,000           0.11
memorialize      661,000           0.31
preinvasion      576,000           0.25
binucleated      51,100            0.31
deceitfully      360,000           0.29
usufructuary     181,000           0.24
appropriates     1,100,000         0.48
circumfluent     50,000            0.22
forgeability     41,700            0.19
misericordes     212,000           0.23
retrogresses     583,000           0.26
futurologies     784,000           0.3
consultative     11,700,000        0.13
temperaments     1,790,000         0.22
prefabricate     5,400,000         0.28
subliterature    60,600,000        0.37
archegoniates    64,300            0.24
antidiarrheal    377,000           0.2
misconceiving    945,000           0.4
preparatively    843,000           0.35
chalcogenides    339,000           0.2
interpleading    85,300            0.26
nonelementary    32,800,000        0.31
stratocracies    1,740             0.2
concentrating    24,000,000        0.22
Data
Results Part 1
[Scatterplot: search time (seconds) vs. number of results, with fitted line]
f(x) = −5.422 × 10^-11 x + 0.2293
R² = 0.0022
• Pretty scattered
• Very weak relationship
Ho: β = 0
Ha: β > 0
t = b/SEb = -.3576
p(t > -.3576) = .6390
We fail to reject Ho because our p-value of .6390 is greater than alpha = .05. We do not have sufficient evidence that the slope of the population regression line for search results and time on Google is greater than zero. Therefore, we cannot conclude that search time increases as the number of results increases.
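As a cross-check on the reported t statistic, the slope t value of a simple linear regression can be recovered from R² alone. The sketch below assumes n = 60 (the number of words in the data table); the function name is illustrative, not something from the original slides.

```python
import math

def t_from_r_squared(r_squared, n, slope_sign=1):
    """Slope t statistic for simple linear regression, recovered from R^2.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2), where r carries the sign
    of the fitted slope.
    """
    magnitude = math.sqrt(r_squared * (n - 2) / (1.0 - r_squared))
    return math.copysign(magnitude, slope_sign)

# R^2 is taken from the fitted line; n = 60 is an assumption based on
# counting the rows of the data table. The slope is negative here.
t_results = t_from_r_squared(0.00220005112689536, n=60, slope_sign=-1)
print(round(t_results, 4))  # -0.3576
```

The result agrees with the reported t = -.3576. A one-sided p-value would additionally need the t distribution with n − 2 = 58 degrees of freedom, for example from a statistics table or calculator.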
Line Test
Confidence Interval
Confidence level: 95%
b ± t* SEb
(-4 × 10^-10, 2 × 10^-10)
We are 95% confident that the slope of the population regression line lies between -4 × 10^-10 and 2 × 10^-10 seconds per result. Because zero is within the interval, we cannot conclude that there is a linear relationship between number of results and search time.
Results Part 2
[Scatterplot: search time (seconds) vs. number of letters, with fitted line]
f(x) = 0.0087 x + 0.1581
R² = 0.1474
• Moderate strength
• Positive
• Linear
Ho: β = 0
Ha: β > 0
t = b/SEb = 3.1667
p(t > 3.1667) = .00123
We reject Ho in favor of Ha because our p-value of .00123 is less than alpha = .05. We have sufficient evidence that the slope of the population regression line for number of letters and search time on Google is greater than zero. Therefore, as the number of letters in the search increases, the search time tends to increase.
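The slope test for letters vs. search time can be reproduced from the data table. This is a minimal sketch using only the Python standard library; the search times are copied from the table and grouped by word length, and the function and variable names are illustrative rather than part of the original analysis.

```python
import math

# Search times (seconds) copied from the data table, grouped by word length.
times_by_length = {
    3: [0.2, 0.38, 0.22, 0.12, 0.35, 0.13, 0.06, 0.22, 0.08, 0.35],
    4: [0.22, 0.28, 0.23, 0.11, 0.22, 0.06, 0.28, 0.24, 0.18, 0.15],
    5: [0.07, 0.18, 0.21, 0.19, 0.26, 0.24, 0.1, 0.24, 0.05, 0.07],
    11: [0.19, 0.21, 0.29, 0.37, 0.34, 0.11, 0.31, 0.25, 0.31, 0.29],
    12: [0.24, 0.48, 0.22, 0.19, 0.23, 0.26, 0.3, 0.13, 0.22, 0.28],
    13: [0.37, 0.24, 0.2, 0.4, 0.35, 0.2, 0.26, 0.31, 0.2, 0.22],
}

def slope_t_test(xs, ys):
    """Least-squares fit of y on x plus the t statistic for the slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx                          # slope
    a = y_bar - b * x_bar                  # intercept
    sse = syy - b * sxy                    # residual sum of squares
    se_b = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return {
        "slope": b,
        "intercept": a,
        "r_squared": sxy ** 2 / (sxx * syy),
        "se_slope": se_b,
        "t": b / se_b,
        "df": n - 2,
    }

# Flatten the grouped data into one (length, time) pair per word.
lengths = [length for length, ts in times_by_length.items() for _ in ts]
times = [t for ts in times_by_length.values() for t in ts]

fit = slope_t_test(lengths, times)
print({name: round(value, 4) for name, value in fit.items()})
```

The fitted slope, intercept, R² and t statistic should agree with the values reported above (0.0087, 0.1581, 0.1474 and 3.1667) up to rounding.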
Line Test
Confidence level: 95%
b ± t* SEb
(.0032, .0142)
We are 95% confident that the slope of the population regression line lies between .0032 and .0142 seconds per character.
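The interval b ± t* SEb can be checked with a few lines of arithmetic. In this sketch SEb is recovered from the reported t statistic (t = b/SEb), and t* = 2.0017 is an assumed table value for 95% confidence with 58 degrees of freedom (n = 60 words minus 2); neither number is stated on the slides.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (lower, upper) for the interval b +/- t_star * se_b."""
    margin = t_star * se_b
    return b - margin, b + margin

b = 0.0087            # fitted slope, seconds per character
se_b = b / 3.1667     # standard error recovered from t = b / SEb
t_star = 2.0017       # assumed critical value for 95% confidence, df = 58

low, high = slope_confidence_interval(b, se_b, t_star)
print(round(low, 4), round(high, 4))  # 0.0032 0.0142
```

The same function applies to the results vs. search time slope by substituting that slope and its t statistic.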
Confidence Interval
Google vs. Yahoo
• Compare number of search results on Google and Yahoo
• 1-sample t test for matched pairs
Data
Word             Google results minus Yahoo results
pontificals      -800
trichromats      -13,900
supplanters      1,647,000
prebiologic      219,960
regenerated      -5,130,000
cheesecloth      -2,751,000
memorialize      -3,379,000
preinvasion      518,900
binucleated      -36,900
deceitfully      -760,000
usufructuary     16,000
appropriates     -1,680,000
circumfluent     11,600
forgeability     -16,100
misericordes     -16,000
retrogresses     557,600
futurologies     780,120
consultative     -27,100,000
temperaments     -3,520,000
prefabricate     4,437,000
subliterature    60,583,000
archegoniates    61,170
antidiarrheal    -432,000
misconceiving    875,000
preparatively    827,900
chalcogenides    101,000
interpleading    73,900
nonelementary    32,770,700
stratocracies    -580
concentrating    -36,100,000
Results Part 3
Graph of difference in Google results vs. Yahoo results (Google – Yahoo)
Ho: µd = 0
Ha: µd > 0
t = .2806
p(t > .2806 | df = 29) = .3905
We fail to reject Ho because our p-value of .3905 is greater than alpha = .05. We do not have sufficient evidence that the mean difference in number of results (Google minus Yahoo) is greater than zero.
1-samp t test (matched pairs)
$t = \dfrac{\bar{x}_d}{s_d / \sqrt{n}}$
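The matched-pairs t statistic is the mean of the differences divided by its standard error. The sketch below shows that calculation with the Python standard library; the function name is illustrative, and the five values are only the first rows of the difference table, used to show the call rather than to reproduce the reported result.

```python
import math
import statistics

def paired_t_statistic(differences):
    """Return (t, df) for a matched-pairs t test of Ho: mean difference = 0."""
    n = len(differences)
    mean_d = statistics.mean(differences)
    sd_d = statistics.stdev(differences)   # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# First five Google-minus-Yahoo differences from the table (illustration only).
sample = [-800, -13900, 1647000, 219960, -5130000]
t_stat, df = paired_t_statistic(sample)
print(t_stat, df)
```

Passing all 30 differences gives df = 29, matching the test above.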
Searching for the same word does not always give the same search time.
The Internet connection may not have been the same during all tests.
The number of people searching at the same time may affect the search time.
Sources of error
The number of characters you search for in Google can slightly affect the search time
The number of results found on Google may have no effect on the search time
Google and Yahoo do not have a significant difference in number of results
Conclusions