Started in January 1996 by Larry Page, and soon after by Sergey Brin


Background

• Started in January 1996 by Larry Page, joined soon after by Sergey Brin
  – Both were students at Stanford University, in California
• Originally hosted on the Stanford University website
  – Registered google.com on September 15, 1997

What We Did

• Search Google for random words of different lengths
• Compare number of letters vs. search time
• Compare number of results vs. search time

Procedure

• Picked random words from Wordox dictionary
• Searched for just that word on the Google home page
• Copied down the number of results and the time the search took
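The word-picking step can be illustrated with a short Python sketch. It assumes the dictionary is available as a plain list of words; the `WORDS` list and the `pick_random_words` helper are hypothetical stand-ins, not the tool actually used for the project.

```python
import random

# Hypothetical stand-in for the dictionary; the project drew from a much larger word list.
WORDS = ["cub", "cab", "tic", "ting", "jeux", "pong", "carex", "bends", "ashes",
         "cheesecloth", "regenerated", "consultative", "temperaments", "concentrating"]

def pick_random_words(words, lengths, per_length, seed=None):
    """Return {length: [words]} with up to `per_length` random words of each length."""
    rng = random.Random(seed)
    picks = {}
    for n in lengths:
        # Keep only words with exactly n letters, then sample without replacement.
        pool = [w for w in words if len(w) == n]
        picks[n] = rng.sample(pool, min(per_length, len(pool)))
    return picks

sample = pick_random_words(WORDS, lengths=[3, 4, 5], per_length=2, seed=1)
for length, chosen in sample.items():
    print(length, chosen)
```

Sampling without replacement within each length keeps a word from being searched twice.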

Assumptions

• 2 independent SRS (assumed)
• Actual relationship is linear (assumed)

Data

Word           Letters        Results   Time (s)

cub                  3     18,100,000       0.20
cab                  3    105,000,000       0.38
mib                  3     23,300,000       0.22
hap                  3     15,400,000       0.12
oke                  3     14,900,000       0.35
rya                  3        308,000       0.13
tic                  3     70,000,000       0.06
ben                  3    561,000,000       0.22
rev                  3    211,000,000       0.08
mat                  3    158,000,000       0.35

ting                 4     53,900,000       0.22
jeux                 4     38,200,000       0.28
shah                 4     40,100,000       0.23
hype                 4     34,500,000       0.11
nobs                 4      1,990,000       0.22
pong                 4     33,000,000       0.06
thae                 4      3,500,000       0.28
balm                 4     14,400,000       0.24
span                 4    118,000,000       0.18
deme                 4     36,200,000       0.15

carex                5      1,870,000       0.07
bends                5     12,600,000       0.18
jisms                5          4,770       0.21
doles                5      1,060,000       0.19
tawse                5        338,000       0.26
spahi                5        160,000       0.24
ashes                5     26,900,000       0.10
chare                5     97,400,000       0.24
borax                5      2,320,000       0.05
prang                5      2,480,000       0.07

pontificals         11         43,700       0.19
trichromats         11         26,500       0.21
supplanters         11      2,620,000       0.29
prebiologic         11        221,000       0.37
regenerated         11      3,610,000       0.34
cheesecloth         11        169,000       0.11
memorialize         11        661,000       0.31
preinvasion         11        576,000       0.25
binucleated         11         51,100       0.31
deceitfully         11        360,000       0.29

usufructuary        12        181,000       0.24
appropriates        12      1,100,000       0.48
circumfluent        12         50,000       0.22
forgeability        12         41,700       0.19
misericordes        12        212,000       0.23
retrogresses        12        583,000       0.26
futurologies        12        784,000       0.30
consultative        12     11,700,000       0.13
temperaments        12      1,790,000       0.22
prefabricate        12      5,400,000       0.28

subliterature       13     60,600,000       0.37
archegoniates       13         64,300       0.24
antidiarrheal       13        377,000       0.20
misconceiving       13        945,000       0.40
preparatively       13        843,000       0.35
chalcogenides       13        339,000       0.20
interpleading       13         85,300       0.26
nonelementary       13     32,800,000       0.31
stratocracies       13          1,740       0.20
concentrating       13     24,000,000       0.22

Results Part 1

[Scatterplot: search time (seconds) vs. number of results, with fitted line]

f(x) = -5.4219777638384E-11 x + 0.229334978599462
R² = 0.00220005112689536

• Pretty scattered
• Very weak relationship

Line Test

Ho: β = 0
Ha: β > 0

t = b/SEb = -.3576
p(t > -.3576) = .6390

We fail to reject Ho because our p-value of .6390 is greater than alpha = .05. We do not have sufficient evidence that the slope of the population regression line for search results and time on Google is greater than zero. Therefore, we cannot conclude that search time increases as the number of results increases.
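As a cross-check on the test statistic, the t value for a regression slope can be recovered from R² alone, using the identity t = r·√(n − 2) / √(1 − r²). The sketch below applies it to the fitted line for results vs. time; n = 60 is an assumption based on counting the words in the data table, and `slope_t_from_r2` is an illustrative helper name.

```python
import math

def slope_t_from_r2(r_squared, n, slope_sign):
    """t statistic for H0: beta = 0, using t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    # r takes the sign of the fitted slope.
    r = math.copysign(math.sqrt(r_squared), slope_sign)
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r_squared)

# R^2 and slope sign come from the fitted line; n = 60 is an assumed sample size.
t_results = slope_t_from_r2(0.00220005112689536, n=60, slope_sign=-1)
print(round(t_results, 4))
```

The value should agree with the reported t = -.3576 to about four decimal places.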

Confidence Interval

Confidence level: 95%

b ± t* SEb

(-4 × 10^-10, 2 × 10^-10)

We are 95% confident that the slope of the population regression line lies between -4 × 10^-10 and 2 × 10^-10 seconds per result. Because zero is within the interval, we cannot conclude that there is a relationship between number of results and search time.

Results Part 2

[Scatterplot: search time (seconds) vs. number of letters, with fitted line]

f(x) = 0.0087 x + 0.158066666666667
R² = 0.147407849807196

• Moderate strength
• Positive
• Linear
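The fitted line for letters vs. time can be reproduced from the recorded search times with an ordinary least-squares calculation. This is a minimal sketch in plain Python, not necessarily the tool used to produce the chart; the `TIMES_BY_LENGTH` layout and the `least_squares` helper are illustrative choices.

```python
# Search times in seconds, grouped by word length, as recorded in the data table.
TIMES_BY_LENGTH = {
    3:  [0.20, 0.38, 0.22, 0.12, 0.35, 0.13, 0.06, 0.22, 0.08, 0.35],
    4:  [0.22, 0.28, 0.23, 0.11, 0.22, 0.06, 0.28, 0.24, 0.18, 0.15],
    5:  [0.07, 0.18, 0.21, 0.19, 0.26, 0.24, 0.10, 0.24, 0.05, 0.07],
    11: [0.19, 0.21, 0.29, 0.37, 0.34, 0.11, 0.31, 0.25, 0.31, 0.29],
    12: [0.24, 0.48, 0.22, 0.19, 0.23, 0.26, 0.30, 0.13, 0.22, 0.28],
    13: [0.37, 0.24, 0.20, 0.40, 0.35, 0.20, 0.26, 0.31, 0.20, 0.22],
}

def least_squares(xs, ys):
    """Return (slope, intercept, r_squared) for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# One (letters, time) pair per searched word.
xs = [length for length, times in TIMES_BY_LENGTH.items() for _ in times]
ys = [t for times in TIMES_BY_LENGTH.values() for t in times]

slope, intercept, r_squared = least_squares(xs, ys)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r_squared:.4f}")
```

The printed values can be compared with the equation and R² shown for the fitted line.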

Line Test

Ho: β = 0
Ha: β > 0

t = b/SEb = 3.1667
p(t > 3.1667) = .00123

We reject Ho in favor of Ha because our p-value of .00123 is less than alpha = .05. We have sufficient evidence that the slope of the population regression line for number of letters and search time on Google is greater than zero. Therefore, as the number of letters in the search increases, the search time increases.
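The one-sided p-value can be approximated without a statistics package by numerically integrating the Student t density. The sketch below uses Simpson's rule; df = 58 (n − 2 with n = 60 words) is an assumption, and `t_pdf` and `upper_tail_p` are illustrative helper names.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    # The density is symmetric about zero, so each half carries probability 0.5.
    return 0.5 - area if t >= 0 else 0.5 + area

# df = 58 is assumed (n - 2 for n = 60 words).
p_letters = upper_tail_p(3.1667, df=58)
print(round(p_letters, 5))
```

The same helper works for a negative test statistic, where the upper-tail probability exceeds one half.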

Confidence Interval

Confidence level: 95%

b ± t* SEb

(.0032, .0142)

We are 95% confident that the slope of the population regression line lies between .0032 and .0142 seconds per character.
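The interval b ± t* SEb can be rebuilt from the reported slope and t statistic, since t = b/SEb gives SEb = b/t. In this sketch the critical value t* = 2.0017 is an assumption (95% confidence, df = 58, read from a t table), and `slope_confidence_interval` is an illustrative helper name.

```python
def slope_confidence_interval(b, t_stat, t_star):
    """Return (low, high) for b ± t* · SE_b, recovering SE_b from t = b / SE_b."""
    se_b = abs(b / t_stat)
    margin = t_star * se_b
    return b - margin, b + margin

T_STAR = 2.0017  # assumed critical value: 95% confidence, df = 58

# Slope and t for number of letters vs. search time.
letters_ci = slope_confidence_interval(0.0087, 3.1667, T_STAR)
# Slope and t for number of results vs. search time.
results_ci = slope_confidence_interval(-5.4219777638384e-11, -0.3576, T_STAR)

print(letters_ci)
print(results_ci)
```

The letters interval lies entirely above zero, while the results interval straddles zero, which matches the outcomes of the two line tests.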

Google vs. Yahoo

• Compare number of search results on Google and Yahoo
• 1-sample t test for matched pairs

Data

Word             Google results – Yahoo results

pontificals                -800
trichromats             -13,900
supplanters           1,647,000
prebiologic             219,960
regenerated          -5,130,000
cheesecloth          -2,751,000
memorialize          -3,379,000
preinvasion             518,900
binucleated             -36,900
deceitfully            -760,000

usufructuary             16,000
appropriates         -1,680,000
circumfluent             11,600
forgeability            -16,100
misericordes            -16,000
retrogresses            557,600
futurologies            780,120
consultative        -27,100,000
temperaments         -3,520,000
prefabricate          4,437,000

subliterature        60,583,000
archegoniates            61,170
antidiarrheal          -432,000
misconceiving           875,000
preparatively           827,900
chalcogenides           101,000
interpleading            73,900
nonelementary        32,770,700
stratocracies              -580
concentrating       -36,100,000

Results Part 3

[Graph: difference in Google results vs. Yahoo results (Google – Yahoo) for each word]

1-samp t test (matched pairs)

Ho: µd = 0
Ha: µd > 0

t = x̄d / (Sd / √n) = .2806
p(t > .2806 | df = 29) = .3905

We fail to reject Ho because our p-value of .3905 is greater than alpha = .05. We do not have sufficient evidence that the mean difference in number of results, Google minus Yahoo, is greater than zero.
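The matched-pairs statistic divides the mean difference by its standard error, s_d / √n. The sketch below applies that to the 30 Google minus Yahoo differences as transcribed in the data table; `paired_t` is an illustrative helper name. Because the table may be rounded, the result should land near the reported t = .2806 but may not match it to every digit.

```python
import math

# Google results minus Yahoo results for each word, as transcribed in the data table.
DIFFERENCES = [
    -800, -13_900, 1_647_000, 219_960, -5_130_000,
    -2_751_000, -3_379_000, 518_900, -36_900, -760_000,
    16_000, -1_680_000, 11_600, -16_100, -16_000,
    557_600, 780_120, -27_100_000, -3_520_000, 4_437_000,
    60_583_000, 61_170, -432_000, 875_000, 827_900,
    101_000, 73_900, 32_770_700, -580, -36_100_000,
]

def paired_t(differences):
    """Return (t, df) for a one-sample t test on paired differences with H0: mean = 0."""
    n = len(differences)
    mean_d = sum(differences) / n
    # Sample standard deviation of the differences (n - 1 in the denominator).
    s_d = math.sqrt(sum((d - mean_d) ** 2 for d in differences) / (n - 1))
    return mean_d / (s_d / math.sqrt(n)), n - 1

t_stat, df = paired_t(DIFFERENCES)
print(f"t = {t_stat:.4f}, df = {df}")
```

A few very large differences dominate the standard deviation, which is why the t statistic stays small even though the mean difference is in the hundreds of thousands.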

Sources of error

• You don't always get the same search time when you search for the same word.
• The Internet connection may not be the same during all tests.
• More people searching at the same time may affect the search time.

Conclusions

• The number of characters you search for in Google can slightly affect the search time
• The number of results found on Google may have no effect on the search time
• Google and Yahoo do not have a significant difference in number of results