"PageRank Uncovered"
Formerly PageRank Explained
© 2002 All Rights Reserved
Written and theorised by Chris Ridings and Mike Shishigin
Edited by Jill Whalen
Technical Editing by Yuri Baranov
Profiles
Chris Ridings is the original author of PageRank Explained. A former programmer of custom search engines, search engine expert, creator of the search engine optimization support forums (http://www.supportforums.org) and author of marketing software tools (http://www.iebar.com), Chris uses his programming experience and knowledge of algorithms to get an insight into how search engines work "under the cover".
Mike Shishigin, PhD, is a brilliant search engine expert whose philosophy is to always back up all his words with hard facts derived from deep knowledge of applied mathematics. As Director of Technology at Radiocom Ltd, Mike uses methods of sampling analysis to get into the nature of things, providing users of WebSite CEO with an outstanding Web site promotion suite (http://www.websiteceo.com) and thus with the best possible results in the search engines. Mike controls all R&D and search engine studies within WebSite CEO. His latest hobby is the PageRank technology, because it is similar to graph theory, his favourite university course. His strongest dislike is search engine spam, which he considers the deadly sin of the Internet age. Mike mercilessly fights it by enlightening the minds of WebSite CEO users and making regular raids into spammers' camps.
Edited by Jill Whalen, owner of High Rankings and moderator of the High Rankings Advisor free weekly email newsletter (http://www.highrankings.com/advisor.htm). Jill has been practicing successful search engine optimisation since 1995 for a wide variety of clients. Rather than worry about PageRank, Jill prefers to help sites be the best they can be, which in turn garners them lots of inbound links (and thus high PageRank).
Technical Editing by Yuri Baranov. Yuri is Director of Marketing at Radiocom Ltd. He concentrates on general Web site promotion and search engine optimization. Yuri has a groundbreaking insight into search engine marketing and does all in his power to make it more consistent and comprehensible to ordinary people. He currently coordinates all search engine studies, sets goals and evaluates results. He is also interested in human engineering and usability issues. Yuri agreed to become the technical editor of this paper because he is head over heels in love with Google, which he considers the best search engine ever.
Reproduction of this document in whole or in part is permitted solely with the prior consent of the authors. "Fair Use" reproduction is allowed; however, you should think carefully if you need to reproduce more than a few paragraphs. Educational institutions are allowed to copy this document freely.
VERSION 3.0. Last amended: September 2002
Introduction
What is PageRank?
How is PageRank determined?
How can you tell what a page's PageRank is?
How accurate is the Google toolbar?
How significant is PageRank?
Is PageRank a good determination of the quality of a page?
Google's First 1000 Results
The difference between PageRank and other factors
Non-PageRank Factor Threshold
Using the threshold to derive the worth of two ranking strategies
Really heavy competition
How PageRank is calculated
A word on convergence
Derived specifics of how Google calculates PageRank
PageRank Feedback: how linking sometimes helps
Influencing the results
Controlling PageRank
Links to Your Site
Links out of your site
Internal Structures and Linkages
What Google says
Final word to non mathematicians
Advanced PageRank topics
Speeding Up PageRank
How PageRank could be more than it seems
Favouring pages in directories
Examples
Mathematical aspects of PageRank for advanced readers
Further Reading and Resources
Technical Proof Readers credits
Introduction
The original PageRank Explained document has existed for a long time. It was the first of its kind to present many of the ideas in a way that the average person can understand. You may be wondering why there is a need for a new version of this document. The reasoning is simple: everything in the original document was theoretical and subject to change. These changes have occurred through the authors' further learning on how Google reacts and how we think about the subject. To this end, you will note that an additional author and a technical editor have been added to the paper. Mike Shishigin's extensive mathematical ability enables us to explore some new areas of PageRank not mentioned before, as well as provide a section at the end that contains more advanced information. We hope to present PageRank in a way that gets across everything you could possibly need to know, without baffling you. However, if you want to know the extremely technical stuff, then we'll give you that at the end.
We're pleased that the principles in the original PageRank Explained document seem to have been accepted universally within the search engine optimization field. The one document that appears to criticise it later explains why each of the points it makes is valid. We're also pleased that, since its publication, a vast number of additional articles have been written about PageRank.
Despite all this, things have changed and there are still a lot of misunderstandings regarding the importance (or unimportance) of PageRank. We've added new sections to address these, and new sections to address some new things. We've also tried to make the existing sections more understandable. This is a long document; it's written so that a broad range of people can understand it. As it's also used as a major reference in several university courses, we will do our best to detail as much as possible. If you are "Joe Average", then you can simply limit your concern to understanding the principles until you're ready to dig deeper.
Finally, we thought it was time to dedicate some space to Google's thoughts about PageRank. Obviously, Google is not going to comment on exact ranking issues, but as it is their technology, what they say about it gives us lots of insight. We asked Barry Schnitt, spokesperson for Google, a few questions about various PageRank related topics, and he was kind enough to give us some answers; you'll find those answers later in the document.
What is PageRank?
PageRank is Google's method of measuring a page's importance. When all other factors such as Title tag and keywords are taken into account, Google uses PageRank to adjust results so that sites that are deemed more important will move up in the results page of a user's search accordingly.
A basic overview of how Google ranks pages in their search engine results pages (SERPs) follows:
1) Find all pages matching the keywords of the search.
2) Rank accordingly using "on the page" factors such as keywords.
3) Calculate in the inbound anchor text.
4) Adjust the results by PageRank scores.
In reality it's slightly more complex, and we'll discuss this in more depth later, but for now the above description serves our purposes. It's worth noting that PageRank is a multiplier and is not just simply added to the score. Thus, if your page had a PageRank of zero, it would rank at the very end of the SERPs.
How is PageRank determined?
The Google theory goes that if Page A links to Page B, then Page A is saying that Page B is an important page. PageRank also factors in the importance of the links pointing to a page. If a page has important links pointing to it, then its links to other pages also become important. The actual text of the link is irrelevant when discussing PageRank.
How can you tell what a page's PageRank is?
To learn what a page's PageRank is, you can download a toolbar for Internet Explorer from http://toolbar.google.com. Once installed, there will be a bar graph at the top of the browser showing a version of PageRank for the page you're browsing. When you hold the mouse over the bar, you see a number from zero to ten. (If you don't see the number, you may have an older version of the toolbar installed. You will need to completely uninstall it, reboot your computer and reinstall the latest version. Once this is done, you should be able to see the PageRank number.)
How accurate is the Google toolbar?
The Google toolbar is not very accurate in showing you the actual PageRank of a site, but it's the only thing right now that can give you any idea. As long as you know the toolbar's limitations, then at least you know what you are viewing.
There are two limitations to the Google toolbar:
1. The toolbar sometimes guesses. If you enter a page which is not in Google's index, but where there is a page that is very close to it in the index, then the toolbar will provide a guesstimate of the PageRank. This guesstimate is worthless for our purposes because it isn't featured in any of the PageRank calculations. The only way to tell if the toolbar is showing a guesstimate is to type the URL into the Google search box and see if the page shows up in the SERPs. If it doesn't, then the toolbar is guessing.
2. The toolbar is just a representation of actual PageRank. Whilst PageRank is linear, Google has chosen to use a non-linear graph to portray it. So on the toolbar, to move from a PageRank of 2 to a PageRank of 3 takes less of an increase than to move from a PageRank of 3 to a PageRank of 4. A comparison table best illustrates this phenomenon. The actual figures are kept secret, so we'll just use arbitrary figures for demonstration purposes.
If the actual PageRank is between    The Toolbar shows
0.00000001 and 5                     1
6 and 25                             2
26 and 125                           3
126 and 625                          4
626 and 3125                         5
3126 and 15625                       6
15626 and 78125                      7
78126 and 390625                     8
390626 and 1953125                   9
1953126 and infinity                 10
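The demonstration table above follows a simple pattern: each toolbar step covers a band five times wider than the one before. A minimal sketch of that mapping follows. The function name `toolbar_value` and the base of 5 are our own illustrative assumptions taken from the made-up figures in the table, not Google's real (secret) scale.

```python
def toolbar_value(actual_pagerank: float, base: int = 5) -> int:
    """Map a linear PageRank figure onto the 1-10 scale of the demo table.

    Assumption: toolbar value k covers actual PageRank up to base**k,
    which matches the illustrative table (5, 25, 125, ...), nothing more.
    """
    for shown in range(1, 10):
        if actual_pagerank <= base ** shown:
            return shown
    return 10  # anything above base**9 (1953125) shows as 10


# Each step up the toolbar needs a larger jump in actual PageRank.
for pr in (5, 6, 25, 26, 1953125, 1953126):
    print(pr, "->", toolbar_value(pr))
```

Under these assumed figures, moving from a toolbar 2 to a toolbar 3 means passing 25, whilst moving from 3 to 4 means passing 125, which is the non-linearity described above.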
The PageRank shown in the Google directory (http://directory.google.com) suffers from the same problems. The PageRank shown in the directory is also on a different scale. There have been attempts to cross-reference these two scales, but because they are non-linear, the results really do not tell you anything more than you already know.
Also of note is that a programmer managed to generate a tool to look up PageRank without using Internet Explorer. This tool has since been withdrawn, but whilst the numbers given by this software and Google's toolbar originally matched, querying with such software now sometimes produces different numbers than querying with the toolbar. It is Google's right to protect their data, but this is the strongest indication that sometimes what you see on the Google toolbar may not be related to actual PageRank at all. (Google can and does assign whatever toolbar PageRank value they want to assign to a page.)
How significant is PageRank?
The significance of any one factor in search engine algorithms depends on the quality of the information it supplies. A factor's importance is known as its weight. To demonstrate how weighting is arrived at, it's easiest to move away from PageRank for a second and look at Meta tags. Originally, when the Meta keyword tag was new, you could write something like this in your document:
    <meta name="keywords" content="pagerank, pagerank uncovered, algorithm, algorithms">
In theory, the Meta keyword tag was a very good indicator of what the page was about. However, as most are well aware, the weighting for the keywords tag is fast approaching nothing. Two things have contributed to this:
1. The ease with which Webmasters can manipulate it.
2. The level of manipulation by Webmasters.
These two things are separate factors, but with human nature being what it is, the easier something is to influence, the more it is manipulated. The combination of these factors determines the "weighting", i.e. how much we trust the information provided by that factor.
So it makes sense to look at these factors in relation to PageRank first.
PageRank is without doubt one of the hardest things for a Webmaster to manipulate ethically. However, it is possible to generate links to your site from other sites fairly simply through the use of link farms and guestbooks. Google frowns upon this kind of abuse, and many sites that have tried this have had their PageRank influence blocked. But it must be said that the abuse is still rampant and that it can have an influence on PageRank. So whilst not easy to do, PageRank is still subject to manipulation.
The extent to which PageRank is manipulated has also changed. Most people no longer believe Google's old line that people cannot influence PageRank and the results based on it. There is more information about PageRank available than ever, and people are more aware of manipulation techniques.
So whilst PageRank is valuable, you should be careful not to over-estimate its usage and capabilities. Your final ranking in Google is due to a mix of factors, of which PageRank is only one. We'll get into more details later by discussing how PageRank is different from the other ranking factors, and thus when it applies and when it doesn't. Ironically enough, PageRank's weighting factor is undeniably declining. Since the original version of this paper gave out detailed information about PageRank, in all likelihood it may also have contributed in some small way to the decline in weighting of the very subject it talks about.
Is PageRank a good determination of the quality of a page?
To examine the worth of PageRank, we need to first look at its premise and how accurate it is. Basically, PageRank says:
1. If a page links to another page, it is casting a vote which indicates that the other page is good.
2. If lots of pages link to a page, then it has more votes and its worth should be higher.
The basic implication here is: people only link to pages they think are good.
It shouldn't be hard to convince you that this premise is wrong. A few of the reasons people link to pages other than ones they think are good are:
1) Reciprocal links: "Link to me and I'll link to you."
2) Link requirements: "Using our script requires you to put a link to our page" or "We'll give you an award solely because you link to our page."
3) Friends and family: "This is my friend Pete's site" or "My mum's site is here, my dad's site is here. My dog's site is here."
4) Free page add-ons: "This counter was provided by www.linktocountersite.com"
Furthermore, anybody who has a top-ranking site will tell you that it tends to get links from new sites. This is not necessarily because it's good (although such sites generally are). Assume a Webmaster is setting up a new site and is looking for some outbound links. Nowadays, one of the first things they do is a Google search for similar sites. The links they end up with may not necessarily be the best sites, but merely the easiest ones to find. If PageRank influences rankings, and if they subsequently link to those pages, the new Webmaster will be adding to the inaccuracies in the judging of the quality of a page. The same is true when these new Webmasters use the Google Toolbar PageRank indicator to choose whom to link to.
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results. For example, let's say that number is 2000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the result count that actually gets shown.) Then the engine applies all the factors to those 2000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000. PageRank is almost certainly not one of those 2 or 3 factors. Notice how we earlier stressed the word "related" in creating the subset of 2000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
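The two-stage process described above can be sketched in a few lines. This is a toy illustration only: the function `two_stage_search`, the `relevance` and `pagerank` fields and all the numbers are hypothetical, and we are not claiming this is how Google actually implements it.

```python
def two_stage_search(docs, cheap_score, full_score, subset_size=2000, shown=1000):
    """Pick a subset using a few cheap factors, then fully rank only that subset."""
    # Stage 1: query everything with the cheap "is it on-topic?" factors.
    subset = sorted(docs, key=cheap_score, reverse=True)[:subset_size]
    # Stage 2: apply all the factors (including PageRank) to the subset only.
    ranked = sorted(subset, key=full_score, reverse=True)
    return ranked[:shown]


# Hypothetical pages: "c" has a huge PageRank but is barely on-topic.
docs = [
    {"url": "a", "relevance": 0.9, "pagerank": 1.0},
    {"url": "b", "relevance": 0.8, "pagerank": 3.0},
    {"url": "c", "relevance": 0.1, "pagerank": 50.0},
    {"url": "d", "relevance": 0.7, "pagerank": 0.5},
]
results = two_stage_search(
    docs,
    cheap_score=lambda doc: doc["relevance"],
    full_score=lambda doc: doc["relevance"] * doc["pagerank"],
    subset_size=3,
    shown=2,
)
print([doc["url"] for doc in results])  # ['b', 'a']
```

Page "c" would have had the highest combined score, yet it never reaches the second stage because it did not make the on-topic subset.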
Why this is critical:
You must do enough "on the page" work and/or anchor text work to get into that subset of 2000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:
Factor                   How it adds to the ranking score
Title Tag                Can only be listed once.
Keywords in Body Text    Each successive repetition is less important. Proximity is important.
Anchor Text              Highly weighted but, like keywords in body text, there is a cut-off point where further anchor text is no longer worthwhile.
PageRank                 Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut-off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut-off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which any factor adds to this score so slowly that further effort is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure on it of 1000.
If we have a query where the results are Page A and Page B, then Pages A and B have scores for that query which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously, Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold; thus, without any change in PageRank, it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive, you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores are. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant". They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they don't care at all about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important". We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors

Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies are using PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
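To make the multiplier and the "Minimum PageRank level" concrete, here is a small sketch. The function names and every number in it (the cap of 1000 borrowed from our hypothetical threshold, the competitor's score of 5000) are invented for illustration; the real scores are unknown.

```python
def final_rank_score(non_pagerank_score: float, pagerank: float) -> float:
    """Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank)."""
    return non_pagerank_score * pagerank


def minimum_pagerank(competitor_score: float, non_pagerank_cap: float) -> float:
    """PageRank you must exceed to beat a competitor once your
    Non-PageRank score has hit its (assumed) maximum benefit."""
    return competitor_score / non_pagerank_cap


# A perfect page with zero PageRank still scores zero.
print(final_rank_score(1000, 0))      # 0

# With Non-PageRank factors capped at 1000, beating a competitor on 5000
# is only possible with an actual PageRank above 5.
print(minimum_pagerank(5000, 1000))   # 5.0
```

The lower the competitor's score for a query, the lower this minimum becomes, which is why careful keyword choice pays off.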
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the over-riding principles of how PageRank works, it is as accurate today as the day it was written.
PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Where:
PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link, or two, or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split and Page A would only gain half the PageRank that it did before.
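The vote splitting can be sketched directly from the published formula, where each link passes on d x PR(T)/C(T). The helper name `vote_passed` is ours, and the values are the made-up ones from the example above.

```python
def vote_passed(page_rank: float, links_on_page: int, d: float = 0.85) -> float:
    """PageRank passed along each link of a page: d * PR(T) / C(T)."""
    if links_on_page < 1:
        raise ValueError("a page must have at least one link to pass a vote")
    return d * page_rank / links_on_page


# Page B has a PageRank of 5.
print(round(vote_passed(5, 1), 4))  # 4.25   (one link: Page A gets it all)
print(round(vote_passed(5, 2), 4))  # 2.125  (two links: the vote is halved)
```

Page B's total voting power (0.85 x 5 = 4.25 here) stays the same however many links it has; only the share per link changes.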
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below.
[Diagram: Page A links to Pages B, C and D; Page B links to Page C; Page C links to Page A; Page D links to Page C.]
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next, we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 x a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 x 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to Pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 x 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to Page C. Page B's PageRank is 0.15, so it will add 0.85 x 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to Page C, Page C will get it all.
Page C links to Page A; all 0.1275 passes to Page A.
Page D links to Page C. Again, all 0.1275 passes to Page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
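The two passes worked through above can be reproduced with a short script. This is a sketch of the three rules as we have stated them, using the link structure of our example; the names `LINKS` and `pagerank_pass` are our own, and it assumes every page has at least one outbound link.

```python
# Link structure of the example: page -> pages it links to.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def pagerank_pass(ranks, links, d=0.85):
    """Run one pass of the calculation and return the new totals."""
    new_totals = {page: 1 - d for page in links}       # rule 3: the 0.15 base
    for page, outlinks in links.items():
        share = d * ranks[page] / len(outlinks)        # rule 1: split the vote
        for target in outlinks:
            new_totals[target] += share                # rule 2: pass it on
    return new_totals


start = {page: 0.0 for page in LINKS}                  # we start at zero
first = pagerank_pass(start, LINKS)
second = pagerank_pass(first, LINKS)

print({page: round(value, 4) for page, value in first.items()})
# {'A': 0.15, 'B': 0.15, 'C': 0.15, 'D': 0.15}
print({page: round(value, 4) for page, value in second.items()})
# {'A': 0.2775, 'B': 0.1925, 'C': 0.4475, 'D': 0.1925}
```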
So we've got:
[Diagram: the four example pages labelled with the totals above.]
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach final values of:
[Diagram: the four example pages labelled with their final PageRank values.]
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
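For readers who would rather let a computer carry on the calculation, here is a hedged sketch that repeats the pass until no value changes by more than a small tolerance. The function name, the tolerance and the iteration cap are our own choices, so the number of passes it takes need not match the count quoted above.

```python
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def iterate_until_stable(links, start=0.0, d=0.85, tolerance=1e-10, max_passes=1000):
    """Repeat the PageRank pass until every value moves by less than `tolerance`."""
    ranks = {page: start for page in links}
    for passes in range(1, max_passes + 1):
        new_ranks = {page: 1 - d for page in links}
        for page, outlinks in links.items():
            for target in outlinks:
                new_ranks[target] += d * ranks[page] / len(outlinks)
        biggest_change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if biggest_change < tolerance:
            return ranks, passes
    raise RuntimeError("did not settle within max_passes")


final_ranks, passes_needed = iterate_until_stable(LINKS)
for page in sorted(final_ranks, key=final_ranks.get, reverse=True):
    print(page, round(final_ranks[page], 4))
```

Running it lists Page C first and Page A a close second, with Pages B and D level some way behind, and the four values add up to 4 (the number of pages).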
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is “convergence”. Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration

Iteration  A             B             C             D
Start      1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
...
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
...
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
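The stopping rule described here can be sketched in Python as follows. The function name `converge` and the starting values are illustrative, and the link structure is an assumption inferred from the first rows of the table (A links to B and C; B, C and D have one outbound link each, to C, A and C respectively), since the diagram itself is not reproduced in this text.

```python
def converge(links, start, d=0.85, tol=1e-10, max_iter=10_000):
    """Repeat the calculation until no page changes by tol or more."""
    pr = dict(start)
    for iteration in range(1, max_iter + 1):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * pr[page] / len(targets)
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tol:
            return pr, iteration
    raise RuntimeError("did not converge within max_iter iterations")


# Assumed structure for the convergence example (inferred from the table).
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

# Three different sets of starting values.
from_ones, n_ones = converge(LINKS, {page: 1.0 for page in LINKS})
from_zero, n_zero = converge(LINKS, {page: 0.0 for page in LINKS})
from_odd, n_odd = converge(LINKS, {"A": 40.0, "B": 0.0, "C": 7.5, "D": 3.0})

for result in (from_ones, from_zero, from_odd):
    print({page: round(value, 6) for page, value in result.items()})
```

All three runs print the same rounded values, which is the point of the section: the limiting values do not depend on where the calculation starts, only the number of iterations does.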
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
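The experiment itself is easy to repeat in code. The sketch below is illustrative only: the modified structure is not reproduced in this text, so `ORIGINAL` and `MODIFIED` are hypothetical stand-ins (a four-page graph, then the same graph with a Page E added and one link changed), and the iteration counts it prints should not be expected to match the 75 and 78 quoted above.

```python
def converge(links, start, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate until no page changes by tol or more; return values and count."""
    pr = dict(start)
    for iteration in range(1, max_iter + 1):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * pr[page] / len(targets)
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tol:
            return pr, iteration
    raise RuntimeError("did not converge within max_iter iterations")


# Hypothetical graphs, chosen only to illustrate the experiment.
ORIGINAL = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
MODIFIED = {"A": ["B", "D"], "B": ["C"], "C": ["A"], "D": ["C", "E"], "E": ["D"]}

old_final, _ = converge(ORIGINAL, {page: 1.0 for page in ORIGINAL})

# Cold start: every page in the new structure begins at 1.
cold, n_cold = converge(MODIFIED, {page: 1.0 for page in MODIFIED})

# Warm start: reuse the old final values and give the new Page E a value of 1.
warm, n_warm = converge(MODIFIED, {**old_final, "E": 1.0})

print("cold start iterations:", n_cold)
print("warm start iterations:", n_warm)
```

Both runs reach the same final values; only the number of iterations can differ, which is what the comparison in the text is measuring.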
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank but because it is taking it from the system as a whole. But if we view this very small system “Page A” (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
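The three Page A figures quoted above (0.15, 1 and 1.4594594595) can be checked with a short script. This is a sketch under the same formula used earlier in the paper, with d = 0.85; the function name `final_pagerank` and the choice to start every page at 1 are assumptions made for illustration.

```python
def final_pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate the PageRank calculation until the values stop changing."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * pr[page] / len(targets)
        if max(abs(new[page] - pr[page]) for page in links) < tol:
            return new
        pr = new
    raise RuntimeError("did not converge within max_iter iterations")


# Page A on its own: no links in, no links out.
alone = final_pagerank({"A": []})

# Page A links to Page B, and Page B links back.
pair = final_pagerank({"A": ["B"], "B": ["A"]})

# Page A also links to Page C, and Page C links back.
triple = final_pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})

for name, result in (("alone", alone), ("pair", pair), ("triple", triple)):
    print(name, round(result["A"], 10))
```

Each extra page that links back raises Page A's own value, even though Page A is the one doing the linking out.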
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this set up, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors, multiplied by your PageRank score of zero, will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect:
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                  Without the Review Pages  With the Review Pages
The HomePage PR   0.9536152797              2.439718935
B, C, D PR        0.4201909959              0.8412536982
Total             2.2141882674              4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about “linking out” to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or to put it in very basic terms:
The best “internal” thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure:
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
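The effect of the three closed structures can be sketched in Python. The diagrams are not reproduced in this text, so the graphs below are assumptions: `hierarchical` is a home page A linking to B, C and D, each of which links back to A; `looping` is a single ring A to B to C to D to A; `interlinked` has every page linking to every other page. The function name `final_pagerank` is likewise illustrative.

```python
def final_pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate the PageRank calculation until the values stop changing."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * pr[page] / len(targets)
        if max(abs(new[page] - pr[page]) for page in links) < tol:
            return new
        pr = new
    raise RuntimeError("did not converge within max_iter iterations")


pages = "ABCD"

# Assumed four-page versions of the three structures.
hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
looping = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
interlinked = {p: [q for q in pages if q != p] for p in pages}

results = {
    "hierarchical": final_pagerank(hierarchical),
    "looping": final_pagerank(looping),
    "interlinked": final_pagerank(interlinked),
}

for name, pr in results.items():
    rounded = {page: round(value, 4) for page, value in pr.items()}
    print(name, rounded, "total:", round(sum(pr.values()), 4))
```

Under these assumptions every structure holds a total of 4, but only the hierarchical one moves PageRank towards a chosen page: the home page ends up near 1.9189 while the looping and fully interlinked versions leave every page at 1.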
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link:
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: “Can Google calculate an approximation of the final values without using as many iterations?” The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the final value of the previous computation, using the formula:
$(1-d) \cdot PR_i(A)$
That is, we are replacing the $(1-d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR deduced by the previous iteration.
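One literal reading of this substitution can be sketched in Python. This is an assumption about what is meant rather than a confirmed account of the authors' spreadsheet: the helper names `run`, `fixed` and `scaled` are invented here, and the four-page graph is taken to be a home page A linking to B, C and D, each of which links back to A.

```python
def run(links, base_term, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate until no page changes by tol or more; return values and count."""
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iter + 1):
        # base_term supplies the part of the score that does not come from links.
        new = {page: base_term(pr[page], d) for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * pr[page] / len(targets)
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tol:
            return pr, iteration
    raise RuntimeError("did not converge within max_iter iterations")


def fixed(previous, d):
    """The original co-efficient: a constant (1 - d) for every page."""
    return 1 - d


def scaled(previous, d):
    """The substitution: (1 - d) times the page's value from the last iteration."""
    return (1 - d) * previous


# Assumed hierarchical structure: home page A and three child pages.
HIERARCHY = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

standard, n_standard = run(HIERARCHY, fixed)
variant, n_variant = run(HIERARCHY, scaled)

print("fixed co-efficient: ", n_standard, "iterations, A =", round(standard["A"], 6))
print("scaled co-efficient:", n_variant, "iterations, A =", round(variant["A"], 6))
```

In this sketch the scaled version meets the same stopping rule in fewer iterations, but it settles at A = 2 rather than roughly 1.9189, so under this reading it approximates the standard values rather than reproducing them.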
Let's do this by example. Assume we have the hierarchical structure mentioned earlier:
With this very simple structure we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
40
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A  B  C  D  Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1  2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2  1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3  2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4  1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5  2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6  1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7  2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8  1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9  2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10  1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11  2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12  1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13  2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14  1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15  2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16  1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17  2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18  1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19  2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20  1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21  2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22  1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23  2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24  1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25  2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26  1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27  2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28  1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29  2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30  1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31  2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32  1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33  2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34  1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35  2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36  1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37  2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38  1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39  2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40  1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41  2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42  1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43  2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44  1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45  2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46  1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47  2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48  1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49  2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50  1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51  2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52  1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53  2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54  1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55  2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56  1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57  2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58  1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59  2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60  1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61  2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62  1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63  2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65  2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67  2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
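The comparison between the two tables can be reproduced with a short script. The following is a minimal Python sketch, not the authors' spreadsheet: the link structure (A links to B, C and D, and each of those links back to A) is an assumption inferred from the first rows of the tables, and the names LINKS, step and run are purely illustrative.

```python
# Assumed link structure, inferred from the first rows of the tables:
# A links to B, C and D; B, C and D each link back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def step(pr, recalculate):
    """One iteration over all pages.

    If `recalculate` is True the normalization co-efficient of a page is
    (1 - d) times its value from the previous iteration; otherwise it is
    the constant (1 - d).
    """
    new = {}
    for page in pr:
        inbound = sum(pr[src] / len(targets)
                      for src, targets in LINKS.items() if page in targets)
        coeff = (1 - D) * pr[page] if recalculate else (1 - D)
        new[page] = coeff + D * inbound
    return new


def run(recalculate, tol=1e-10, max_iter=1000):
    """Iterate from PR = 1 until no value moves by more than `tol`."""
    pr = {page: 1.0 for page in LINKS}
    for i in range(1, max_iter + 1):
        nxt = step(pr, recalculate)
        if max(abs(nxt[p] - pr[p]) for p in pr) < tol:
            return nxt, i
        pr = nxt
    return pr, max_iter


fixed, n_fixed = run(recalculate=False)
recalc, n_recalc = run(recalculate=True)
print(n_fixed, fixed)
print(n_recalc, recalc)
```

Under these assumptions the recalculated scheme stops after noticeably fewer iterations, and it settles on slightly different values (A near 2 rather than near 1.9189), which is why the text speaks of an approximation.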
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to
$0.6 \times 0.15$
i.e. $(0.4n + 0.6) + 0.6\,(n - 1) = n$, where $n$ is the number of pages in the site, so the co-efficients still total $0.15n$.
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
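Both re-weightings described in this part of the document (favouring Page B, and pushing Page B down as a list of links) can be tried out with a few lines of code. This is a minimal Python sketch rather than the authors' spreadsheets: the four-page link structure is an assumption (A links to B, C and D, each of which links back to A), and the names pagerank, boost_b and demote_b are illustrative only.

```python
# Assumed link structure for the example site: A links to B, C and D,
# and each of those pages links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def pagerank(coeffs, iterations=500):
    """Iterate PR(p) = coeffs[p] + d * sum(PR(q) / C(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        pr = {
            page: coeffs[page] + D * sum(
                pr[src] / len(targets)
                for src, targets in LINKS.items() if page in targets)
            for page in pr
        }
    return pr


n = len(LINKS)
# The usual case: every page gets the same co-efficient (1 - d).
uniform = {page: 0.15 for page in LINKS}
# Page B favoured: (0.4n + 0.6) * 0.15 for B, 0.6 * 0.15 for the others.
boost_b = {page: 0.6 * 0.15 for page in LINKS}
boost_b["B"] = (0.4 * n + 0.6) * 0.15
# Page B treated as a list of links: 0.06 for B, 0.18 for the others.
demote_b = {page: 0.18 for page in LINKS}
demote_b["B"] = 0.06

for name, coeffs in [("uniform", uniform), ("boost B", boost_b), ("demote B", demote_b)]:
    result = pagerank(coeffs)
    print(name, {p: round(v, 4) for p, v in result.items()})
```

In each case the co-efficients add up to 0.15n, so the total PageRank of the site is unchanged; only its distribution between the pages moves.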
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ $(k = 1, \dots, n;\ m = 1, \dots, n)$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages ($n$ is the number of pages).
The components $a_{km}$, which are on the intersection of the $k$-th row and the $m$-th column of the matrix of links, give the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
        T1     T2     ...    Tn     Number of links off the page
T1      a11    a12    ...    a1n    Cr(T1)
T2      a21    a22    ...    a2n    Cr(T2)
Tn      an1    an2    ...    ann    Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the $(I+1)$-th iteration (the estimating value of the page $T_m$, $m = 1, \dots, n$);
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the $I$-th iteration (the estimating value of the page $T_k$, $k = 1, \dots, n$);
$Cr(T_k)$ is the total number of links off the page $T_k$;
$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ $(k = 1, \dots, n;\ m = 1, \dots, n)$;
$d$ is a dampening factor (usually 0.85);
$(1 - d)$ is a normalization coefficient;
$n$ is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the possible complexity of the links between its pages.
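One iteration of the scheme above can be written directly from a matrix of links. The following is a minimal Python sketch under stated assumptions: the three-page site is hypothetical and chosen only to show a page linking twice to the same target (which is what makes the graph a multigraph), and the function name pagerank_step is illustrative.

```python
def pagerank_step(a, pr, d=0.85):
    """Apply PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off T_k
    return [
        # Pages with no outgoing links (Cr = 0) pass nothing on and are skipped.
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical three-page site: T1 links twice to T2 and once to T3;
# T2 and T3 each link once to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
pr = [1.0, 1.0, 1.0]  # starting estimating values
pr = pagerank_step(a, pr)
print(pr)
```

Because T1 spends two of its three links on T2, T2 receives twice the share that T3 does after the first iteration.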
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Diagram: the site drawn as two subgraphs, subgraph 1 containing pages A, B, C, D and subgraph 2 containing pages E, F, G, H, I]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on the $i$-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on the $(i+1)$-th iteration) can be calculated by the formula $(1 - d)\, PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A  B  C  D  E  F  G  H  I  Amount
Start  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1  1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2  1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3  1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4  1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5  1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6  1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7  1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8  1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9  1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10  1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11  1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12  1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13  1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14  1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15  1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16  1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17  1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18  1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19  1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20  1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21  1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22  1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23  1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24  1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25  1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26  1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27  1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28  1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29  1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30  1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31  1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32  1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33  1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34  1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35  1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36  1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
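The recognition of external pages can be checked with a few lines of code. This is a minimal Python sketch, assuming the matrix of links (2) as printed and a dampening factor of 0.85; the names A_KM, iterate and the 1e-6 cut-off used to decide that a value has reached zero are illustrative choices, not part of the original method. The accelerated scheme is taken to be the classic scheme with the constant (1 - d) replaced by (1 - d) times the page's previous value.

```python
PAGES = "ABCDEFGHI"
# Matrix of links (2): row k holds the number of links off page k to each page m.
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85  # dampening factor


def iterate(accelerated, iterations=300):
    """Run the classic or the accelerated scheme from PR = 1 for every page."""
    n = len(PAGES)
    cr = [sum(row) for row in A_KM]  # Cr(T_k): links off each page
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            # Normalization coefficient: (1 - d) * PR_i(T) when accelerated,
            # the constant (1 - d) otherwise.
            ((1 - D) * pr[m] if accelerated else (1 - D))
            + D * sum(A_KM[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return dict(zip(PAGES, pr))


fast = iterate(accelerated=True)
# Pages whose limiting value is (numerically) zero are the external pages.
external = [page for page, value in fast.items() if value < 1e-6]
print(external)
```

Calling iterate(accelerated=False) gives the classic values for the same site, where every page keeps a non-zero estimating value.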
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient $(1 - d) = 0.15$, $d = 0.85$.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A  B  C  D  E  F  G  H  I  Amount
Start  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1  1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2  1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3  1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4  1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5  1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6  1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7  1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8  1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9  1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10  1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11  1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12  1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13  1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14  1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15  1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16  1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17  1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18  1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19  1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20  1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21  1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22  1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23  1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24  1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25  1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26  1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27  1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28  1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29  1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30  1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31  1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32  1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33  1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34  1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35  1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36  1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37  1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38  1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39  1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40  1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41  1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42  1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43  1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44  1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45  1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46  1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47  1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48  1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49  1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50  1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51  1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52  1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53  1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54  1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55  1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56  1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57  1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58  1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59  1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60  1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61  1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62  1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63  1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64  1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65  1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66  1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67  1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68  1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69  1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70  1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71  1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72  1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73  1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80  1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final rows of the table.
Dividing the estimating values by their total amount, which does not change during the iteration process, we obtain the set of probability values (the sum of all probability values is equal to one):
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we'd like to use the famous classical Shannon's formula, which is normally used to calculate the volume of information contained in an experiment that has $n$ outcomes with the set probabilities:
Experiment outcome: $A_1, A_2, \dots, A_n$
Probabilities: $p_1, p_2, \dots, p_n$
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let's calculate the terms of this formula for the set of probability values which correspond to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
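The two tables above follow from the limiting estimating values by simple arithmetic. This is a minimal Python sketch of that arithmetic; the variable names are illustrative, and the input values are the limiting values of the classic computation as printed, rounded to six decimals.

```python
import math

# Limiting estimating values from the classic computation (pages A to I).
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

# Divide each value by the total to obtain probabilities that sum to one.
total = sum(limits)
probabilities = [value / total for value in limits]

# One term of Shannon's formula, -p * log2(p), per page.
terms = [-p * math.log2(p) for p in probabilities]

for page, p, term in zip("ABCDEFGHI", probabilities, terms):
    print(page, round(p, 6), round(term, 6))
print("Amount", round(sum(terms), 3))
```

Summing the per-page terms gives the value of H for the whole site, which is the figure in the Amount column.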
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator $Z^+$:
$$P_{n+1} = Z^+(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
($P$ is the estimating value).
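Unrolling the operator shows where repeated encouragement leads. The short derivation below is an added illustration that uses only the definition of the operator given above; $P_0$ denotes an assumed starting estimating value.

```latex
\begin{aligned}
P_{n+1} - 1 &= d\,P_n + (1 - d) - 1 = d\,(P_n - 1), \\
P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1
      \quad (n \to \infty,\ 0 < d < 1).
\end{aligned}
```

Each application moves the estimating value a fixed fraction of the way towards 1, so a smaller $d$ means faster learning. The operator has the same shape as the PageRank computation scheme, a constant $(1 - d)$ plus $d$ times a previous value.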
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale HyperTextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Introduction
What is PageRank?
How is PageRank determined?
How can you tell what a page's PageRank is?
How accurate is the Google toolbar?
How significant is PageRank?
Is PageRank a good determination of the quality of a page?
Google's First 1000 Results
The difference between PageRank and other factors
Non-PageRank Factor Threshold
Using the threshold to derive the worth of two ranking strategies
Really heavy competition
How PageRank is calculated
A word on convergence
Derived specifics of how Google calculates PageRank
PageRank Feedback: how linking sometimes helps
Influencing the results
Controlling PageRank
Links to Your Site
Links out of your site
Internal Structures and Linkages
What Google says
Final word to non mathematicians
Advanced PageRank topics
Speeding Up PageRank
How PageRank could be more than it seems
Favouring pages in directories
Examples
Mathematical aspects of PageRank for advanced readers
Further Reading and Resources
Technical Proof Readers credits
Introduction
The original PageRank Explained document has existed for a long time. It was the first of its kind to present many of the ideas in a way that the average person can understand. You may be wondering why there is a need for a new version of this document. The reasoning is simple: everything in the original document was theoretical and subject to change. These changes have occurred through the authors' further learning on how Google reacts and how we think about the subject. To this end, you will note that an additional author and a technical editor have been added to the paper. Mike Shishigin's extensive mathematical ability enables us to explore some new areas of PageRank not mentioned before, as well as provide a section at the end that contains more advanced information. We hope to present PageRank in a way that gets across everything you could possibly need to know, without baffling you. However, if you want to know the extremely technical stuff, then we'll give you that at the end.
We're pleased that the principles in the original PageRank Explained document seem to have been accepted universally within the search engine optimization field. The one document that appears to criticise it later explains why each of the points it makes is valid. We're also pleased that since its publication a vast number of additional articles have been written about PageRank.
Despite all this, things have changed and there are still a lot of misunderstandings regarding the importance (or unimportance) of PageRank. We've added new sections to address these and new sections to address some new things. We've also tried to make the existing sections more understandable. This is a long document; it's written so that a broad range of people can understand it. As it's also used as a major reference in several university courses, we will do our best to detail as much as possible. If you are "Joe Average" then you can simply limit your concern to understanding the principles until you're ready to dig deeper.
Finally, we thought it was time to dedicate some space to Google's thoughts about PageRank. Obviously Google is not going to comment on exact ranking issues, but as it is their technology, what they say about it gives us lots of insight. We asked Barry Schnitt, spokesperson for Google, a few questions about various PageRank related topics and he was kind enough to give us some answers; you'll find those answers later in the document.
What is PageRank?
PageRank is Google's method of measuring a page's importance. When all other factors, such as Title tag and keywords, are taken into account, Google uses PageRank to adjust results so that sites that are deemed more important will move up in the results page of a user's search accordingly.
A basic overview of how Google ranks pages in their search engine results pages (SERPs) follows:
1) Find all pages matching the keywords of the search.
2) Rank accordingly using "on the page" factors such as keywords.
3) Calculate in the inbound anchor text.
4) Adjust the results by PageRank scores.
In reality it's slightly more complex, and we'll discuss this in more depth later, but for now the above description serves our purposes. It's worth noting that PageRank is a multiplier and is not just simply added to the score. Thus, if your page had a PageRank of zero, it would rank at the very end of the SERPs.
How is PageRank determined?
The Google theory goes that if Page A links to Page B, then Page A is saying that Page B is an important page. PageRank also factors in the importance of the links pointing to a page. If a page has important links pointing to it, then its links to other pages also become important. The actual text of the link is irrelevant when discussing PageRank.
How can you tell what a page's PageRank is?
To learn what a page's PageRank is, you can download a toolbar for Internet Explorer from http://toolbar.google.com. Once installed, there will be a bar graph at the top of the browser showing a version of PageRank for the page you're browsing. When you hold the mouse over the bar, you see a number from zero to ten. (If you don't see the number, you may have an older version of the toolbar installed. You will need to completely uninstall it, reboot your computer and reinstall the latest version. Once this is done, you should be able to see the PageRank number.)
How accurate is the Google toolbar?
The Google toolbar is not very accurate in showing you the actual PageRank of a site, but it's the only thing right now that can give you any idea. As long as you know the toolbar's limitations, then at least you know what you are viewing.
There are two limitations to the Google toolbar:
1. The toolbar sometimes guesses. If you enter a page which is not in its index, but where there is a page that is very close to it in Google's index, then it will provide a guesstimate of the PageRank. This guesstimate is worthless for our purposes because it isn't featured in any of the PageRank calculations. The only way to tell if the toolbar is showing a guesstimate is to type the URL into the Google search box and see if the page shows up in the SERPs. If it doesn't, then the toolbar is guessing.
2. The toolbar is just a representation of actual PageRank. Whilst PageRank is linear, Google has chosen to use a non-linear graph to portray it. So on the toolbar, to move from a PageRank of 2 to a PageRank of 3 takes less of an increase than to move from a PageRank of 3 to a PageRank of 4. A comparison table best illustrates this phenomenon. The actual figures are kept secret, so we'll just use any figures for demonstration purposes.
If the actual PageRank is between    The Toolbar Shows
0.00000001 and 5                     1
6 and 25                             2
26 and 125                           3
126 and 625                          4
626 and 3125                         5
3126 and 15625                       6
15626 and 78125                      7
78126 and 390625                     8
390626 and 1953125                   9
1953126 and infinity                 10
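The table can be read as a base-5 step function. The sketch below is only an illustration of that idea: the boundaries are the made-up demonstration figures from the table (not Google's real ones), and the function name `toolbar_value` is our own.

```python
def toolbar_value(actual_pagerank: float, base: int = 5) -> int:
    """Map a hypothetical 'actual' PageRank onto the 1-10 toolbar scale.

    Assumption: the toolbar shows n while the actual PageRank is at most
    base**n (the demonstration boundaries 5, 25, 125, ...), capped at 10.
    """
    if actual_pagerank <= 0:
        raise ValueError("actual PageRank must be positive")
    value = 1
    # Step up one toolbar point each time a boundary is exceeded.
    while value < 10 and actual_pagerank > base ** value:
        value += 1
    return value


if __name__ == "__main__":
    for sample in (3, 20, 100, 500, 2_000_000):
        print(sample, "->", toolbar_value(sample))
```

With these figures, moving from toolbar 2 to toolbar 3 needs an actual PageRank above 25, whereas moving from 3 to 4 needs one above 125, which is the non-linearity described in limitation 2.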
The PageRank shown in the Google directory (http://directory.google.com) suffers from the same problems. The PageRank shown in the directory is also on a different scale. There have been attempts to cross-reference these two scales, but because they are non-linear, the results really do not tell you anything more than you already know.
Also of note is that a programmer managed to create a tool to look up PageRank without using Internet Explorer. This tool has since been withdrawn, but whilst originally the numbers given by this software and Google's toolbar matched, presently querying with such software sometimes produces different numbers than querying with the toolbar. It is Google's right to protect their data, but this is the strongest indication that:
Sometimes what you see on the Google toolbar may not be related to actual PageRank at all. (Google can and does assign whatever toolbar PageRank value they want to assign to a page.)
How significant is PageRank?
The significance of any one factor in search engine algorithms depends on the quality of the information it supplies. A factor's importance is known as its weight. To demonstrate how weighting is arrived at, it's easiest to move away from PageRank for a second and look at Meta tags. Originally, when the Meta keyword tag was new, you could write something like this in your document:
<meta name="keywords" content="pagerank, pagerank uncovered, algorithm, algorithms">
In theory, the Meta keyword tag was a very good indicator of what the page was about. However, as most are well aware, the weighting for the keywords tag is fast approaching nothing. Two things have contributed to this:
1. The ease with which Webmasters can manipulate it
2. The level of manipulation by Webmasters
These two things are separate factors, but with human nature being what it is, the easier something is to influence, the more it is manipulated. The combination of these factors determines the "weighting", i.e. how much we trust the information provided by that factor.
So it makes sense to look at these factors in relation to PageRank first.
PageRank is without doubt one of the hardest things for a Webmaster to manipulate ethically. However, it is possible to generate links to your site from other sites fairly simply through the use of link farms and guestbooks. Google frowns upon this kind of abuse, and many sites that have tried this have had their PageRank influence blocked. But it must be said that the abuse is still rampant and that it can have an influence on PageRank. So whilst it is not easy to do, PageRank is still subject to manipulation.
The extent to which PageRank is manipulated has also changed. Most people no longer believe Google's old line that people are not able to influence PageRank and the results based on it. There is more information about PageRank available than ever, and people are more aware of manipulation techniques.
So whilst PageRank is valuable, you should be careful not to over-estimate its usage and capabilities. Your final ranking in Google is due to a mix of factors, of which PageRank is only one. We'll get into more details later by discussing how PageRank is different from the other ranking factors, and thus when it applies and when it doesn't. Ironically enough, PageRank's weighting factor is undeniably declining. Since the original version of this paper gave out detailed information about PageRank, in all likelihood it may also have contributed in some small way to the decline in weighting of the very subject it talks about.
Is PageRank a good determination of the quality of a page?
To examine the worth of PageRank, we need to first look at its premise and how accurate it is. Basically, PageRank says:
1. If a page links to another page, it is casting a vote which indicates that the other page is good.
2. If lots of pages link to a page, then it has more votes and its worth should be higher.
The basic implication here is: People only link to pages they think are good.
It shouldn't be hard to convince you that this premise is wrong. A few of the reasons people link to pages other than ones they think are good are:
1) Reciprocal links: "Link to me and I'll link to you"
2) Link Requirements: "Using our script requires you to put a link to our page" or "We'll give you an award solely because you link to our page"
3) Friends and Family: "This is my friend Pete's site" or "My mum's site is here, my dad's site is here. My dog's site is here"
4) Free Page Add-ons: "This counter was provided by www.linktocountersite.com"
Furthermore, anybody who has a top-ranking site will tell you that it tends to get links from new sites. This is not necessarily because it's good (although such sites generally are). Assume a Webmaster is setting up a new site and is looking for some outbound links. Nowadays, one of the first things they do is a Google search for similar sites. The links they end up with may not necessarily be the best sites, but merely the easiest ones to find. If PageRank influences rankings, and if they subsequently link to those pages, the new Webmaster will be adding to the inaccuracies in the judging of the quality of a page. The same is true when these new Webmasters use the Google Toolbar PageRank indicator to choose whom to link to.
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the measured quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results shown. For example, let's say that number is 2,000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2,000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the number reported as the result count.) Then the engine applies all the factors to those 2,000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000. PageRank is almost certainly not one of those initial factors. Notice how before we highlighted the word "related" in creating the subset of 2,000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
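The two-pass idea described above can be sketched as follows. This is a toy model under our own assumptions, not Google's implementation: the field names (`topical`, `other`, `pagerank`), the way the non-PageRank scores are summed, and the function name `two_stage_rank` are all hypothetical.

```python
import heapq


def two_stage_rank(pages, subset_size=2000, shown=1000):
    """Rank pages in two passes.

    `pages` is a list of dicts with hypothetical keys:
      'url'      - page identifier
      'topical'  - cheap on-topic score used for the first pass
      'other'    - score for the remaining non-PageRank factors
      'pagerank' - actual PageRank, applied as a multiplier
    """
    # Pass 1: pick the most on-topic pages using cheap factors only.
    # PageRank is deliberately left out of this pass.
    subset = heapq.nlargest(subset_size, pages, key=lambda p: p["topical"])

    # Pass 2: apply every factor to the subset, PageRank as a multiplier.
    def full_score(page):
        return (page["topical"] + page["other"]) * page["pagerank"]

    ranked = sorted(subset, key=full_score, reverse=True)
    return [page["url"] for page in ranked[:shown]]


if __name__ == "__main__":
    pages = [
        {"url": "on-topic-low-pr", "topical": 9, "other": 1, "pagerank": 1.0},
        {"url": "on-topic-high-pr", "topical": 8, "other": 1, "pagerank": 3.0},
        {"url": "off-topic-huge-pr", "topical": 1, "other": 1, "pagerank": 1000.0},
    ]
    print(two_stage_rank(pages, subset_size=2, shown=2))
```

In the small demo, the off-topic page never reaches the second pass, so its very large PageRank cannot help it.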
Why this is critical:
You must do enough "on the page" work and/or anchor text work to get into that subset of 2,000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:
Title Tag                Can only be listed once.
Keywords in Body Text    Each successive repetition is less important. Proximity is important.
Anchor Text              Highly weighted, but like keywords in body text there is a cut-off point where further anchor text is no longer worthwhile.
PageRank                 Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut-off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut-off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which each factor adds so little to the total that pursuing it further is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure of 1000 on it.
If we have a query where the results are Page A and Page B, then Pages A and B have scores for that query which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold; thus, without any change in PageRank, it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive, you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores were. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant". They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care at all about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important". We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors

Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies are using PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, "web hosting"). In such situations, it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
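To make the "Minimum PageRank level" concrete, here is a minimal sketch of the Final Rank Score equation. It rests on two simplifying assumptions of ours: the Non-PageRank score is modelled as a hard cap at the example threshold of 1000 (the text only says the benefit tails off), and the function names are hypothetical.

```python
# Example figure used earlier in the text; the real value is unknown.
NON_PAGERANK_THRESHOLD = 1000.0


def final_rank_score(non_pagerank_score: float, pagerank: float) -> float:
    """Final Rank Score = (Non-PageRank score) x (actual PageRank score).

    Assumption: anything above the threshold adds nothing (a hard cap).
    """
    capped = min(non_pagerank_score, NON_PAGERANK_THRESHOLD)
    return capped * pagerank


def minimum_pagerank_to_match(competitor_score: float,
                              non_pagerank_score: float) -> float:
    """PageRank needed to match a competitor's final score, for a page
    with the given (possibly capped) Non-PageRank score."""
    capped = min(non_pagerank_score, NON_PAGERANK_THRESHOLD)
    return competitor_score / capped


if __name__ == "__main__":
    # However much on-page work is done, the Non-PageRank side stops at 1000,
    # so matching a competitor scoring 3000 needs a PageRank of 3.
    print(minimum_pagerank_to_match(3000.0, 5000.0))
```

Under this model, once a page has reached the threshold, the only remaining way to raise its final score is to raise its PageRank.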
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the over-riding principles of how PageRank works, it is as accurate today as the day it was written.
$$PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)$$
Where:
$PR(A)$ is the PageRank of Page A (the one we want to work out).
$d$ is a dampening factor. Nominally this is set to 0.85.
$PR(T_1)$ is the PageRank of a page pointing to Page A.
$C(T_1)$ is the number of links off that page.
$PR(T_n)/C(T_n)$ means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevance to any particular page). Say Page B has a PageRank value of 5 and has a single link on it, pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
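Plugging these made-up numbers into the formula, and assuming the nominal dampening factor $d = 0.85$, the amount Page B contributes to Page A works out as follows (one outbound link first, then two):

$$d \cdot \frac{PR(B)}{C(B)} = 0.85 \cdot \frac{5}{1} = 4.25$$

$$d \cdot \frac{PR(B)}{C(B)} = 0.85 \cdot \frac{5}{2} = 2.125$$

The second figure is exactly half of the first, and Page B's own value of 5 is unchanged in both cases.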
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below:
[Figure: link diagram. Page A links to Pages B, C and D; Page B links to Page C; Page C links to Page A; Page D links to Page C.]
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to Pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to Page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to Page C, Page C will get it all.
Page C links to Page A; all 0.1275 passes to Page A.
Page D links to Page C. Again, all 0.1275 passes to Page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
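The three rules translate almost directly into code. The sketch below is our own illustration (the names `LINKS` and `iterate_once` are hypothetical); it assumes every page has at least one outbound link, which holds for this example.

```python
DAMPING = 0.85

# Outbound links of the four-page example: page -> pages it links to.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def iterate_once(ranks, links=LINKS, d=DAMPING):
    """Apply rules 1-3 once and return the new PageRank of every page."""
    # Rule 3: every page starts its new total with the 0.15 base.
    new_ranks = {page: 1 - d for page in links}
    for page, targets in links.items():
        # Rule 1: 0.85 * the page's PageRank, divided by its number of links.
        share = d * ranks[page] / len(targets)
        # Rule 2: add that share to each page being linked to.
        for target in targets:
            new_ranks[target] += share
    return new_ranks


ranks = {page: 0.0 for page in LINKS}  # start every page at zero
ranks = iterate_once(ranks)            # first pass: every page becomes 0.15
ranks = iterate_once(ranks)            # second pass: the totals worked out above

if __name__ == "__main__":
    for page, value in sorted(ranks.items()):
        print(page, round(value, 4))
```

All new totals are computed from the previous pass's values before any of them is overwritten, which matches the hand calculation above.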
So we've got:
[Figure: the four-page diagram annotated with the totals above]
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1], and we'd reach final values of:
[Figure: the four-page diagram annotated with the final PageRank values]
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, then convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
[Figure: link structure of Pages A, B, C and D]
PageRank for Pages A, B, C and D at various stages of iteration:
Iteration   A              B              C              D
(start)     1              1              1              1
1           1.0000000000   0.5750000000   2.2750000000   0.1500000000
2           2.0837500000   0.5750000000   1.1912500000   0.1500000000
3           1.1625625000   1.0355937500   1.6518437500   0.1500000000
...
19          1.4900124031   0.7833688246   1.5766187723   0.1500000000
20          1.4901259564   0.7832552713   1.5766187723   0.1500000000
21          1.4901259564   0.7833035315   1.5765705121   0.1500000000
...
46          1.4901074052   0.7832956473   1.5765969475   0.1500000000
47          1.4901074054   0.7832956472   1.5765969474   0.1500000000
48          1.4901074053   0.7832956473   1.5765969474   0.1500000000
49          1.4901074053   0.7832956473   1.5765969474   0.1500000000
50          1.4901074053   0.7832956473   1.5765969474   0.1500000000
51          1.4901074053   0.7832956473   1.5765969474   0.1500000000
52          1.4901074053   0.7832956473   1.5765969474   0.1500000000
53          1.4901074053   0.7832956473   1.5765969474   0.1500000000
54          1.4901074053   0.7832956473   1.5765969474   0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above, we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
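The stopping rule can be sketched as a loop that repeats the calculation until no page changes by the tolerance or more. Two things here are our assumptions rather than anything stated in the paper: the link structure (A links to B and C, B links to C, C links to A, D links to C) is inferred from the first rows of the table, since the diagram itself is not reproduced, and the function name `pagerank` and its parameters are hypothetical.

```python
def pagerank(links, start=1.0, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate the PageRank formula until no value changes by `tol` or more.

    Returns (ranks, iterations). Assumes every page in `links` appears as
    a key; pages with no outbound links simply pass nothing on.
    """
    ranks = {page: start for page in links}
    for iteration in range(1, max_iter + 1):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        # Largest change of any page in this pass.
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks, iteration
    raise RuntimeError("PageRank did not converge within max_iter passes")


# Assumed structure, inferred from the table above (not given in the text).
STRUCTURE = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

from_ones, passes_from_ones = pagerank(STRUCTURE, start=1.0)
from_zero, passes_from_zero = pagerank(STRUCTURE, start=0.0)

if __name__ == "__main__":
    print(from_ones, passes_from_ones)
    print(from_zero, passes_from_zero)
```

Starting from 1 or from 0 leads to the same limiting values, which is the point of convergence; only the number of passes needed differs.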
Derived specifics of how Google calculates PageRank
We started our values from zero, and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions by a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
[Figure: the modified link structure, with Page E added]
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank, by definition, will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
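The three figures quoted for Page A can be reproduced with a short script. This is a minimal sketch, assuming the usual iteration PR(p) = (1 - d) + d × (sum of PR(q)/C(q) over the pages q linking to p) with d = 0.85 and a starting value of 1 for every page; the `pagerank` helper and the dictionary layout are our own illustrative choices.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, outs in links.items():
            for target in outs:
                new[target] += d * pr[source] / len(outs)
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new
        pr = new
    return pr


alone = pagerank({"A": []})                                     # Page A on its own
one_loop = pagerank({"A": ["B"], "B": ["A"]})                   # A <-> B
two_loops = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B and A <-> C

print(round(alone["A"], 10), round(one_loop["A"], 10), round(two_loops["A"], 10))
```

Each extra loop raises Page A's value, even though the only thing Page A itself has done is link out.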
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this set up, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will, of course, vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop (between C -> D and E), then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to reciprocal link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can, or will, Google influence the results of the PageRank calculation? In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs:
Your score from other factors, multiplied by your PageRank score of zero, will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations; would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
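Both points can be illustrated on a small closed system. The sketch below is an assumption-laden illustration, not a description of Google's penalty mechanism: it uses a hypothetical four-page site in which A links to B, C and D and each of them links back, with d = 0.85. First it starts Page A at zero and shows that the iteration ends up at the same values anyway; then it removes A's outgoing links from the link table and shows that the pages A used to vote for fall back to the minimum.

```python
def pagerank(links, start=None, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) from the given start values."""
    pr = dict(start) if start else {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, outs in links.items():
            for target in outs:
                new[target] += d * pr[source] / len(outs)
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new
        pr = new
    return pr


links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

normal = pagerank(links)
# "Penalty" attempt 1: set A to zero before the calculation starts.
zero_start = pagerank(links, start={"A": 0.0, "B": 1.0, "C": 1.0, "D": 1.0})
# Penalty 2: remove A's link entries so that A can no longer vote.
no_votes = pagerank({**links, "A": []})

print(round(normal["A"], 6), round(zero_start["A"], 6))  # same final value for A
print(round(no_votes["B"], 6))                           # B no longer receives A's vote
```

The starting value washes out during the iterations, which is why a zero would have to be imposed after the calculation, while removing the links changes the result for every page A pointed to.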
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
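As a rough illustration of the sharing-out, the amount a single link passes on in the PageRank formula is d × PR(page) / (number of links on the page). The numbers below are purely hypothetical actual PageRank values, not Toolbar values (how the Toolbar figure relates to the real value is not known here), and `link_value` is a name of our own choosing.

```python
def link_value(page_rank, total_links, d=0.85):
    """PageRank passed along ONE link from a page with `total_links` outgoing links."""
    return d * page_rank / total_links


# Hypothetical figures: a stronger page with a crowded links page,
# and a weaker page that links out to only a handful of sites.
crowded = link_value(page_rank=6.0, total_links=100)
sparse = link_value(page_rank=4.0, total_links=10)

print(crowded, sparse)
```

With these made-up figures the link from the weaker page is worth several times more, simply because it is shared between fewer links.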
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
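The effect of a review page can be sketched on a deliberately tiny, hypothetical system; it is not the structure used in the Excel examples. Assume your site has a home page and a links page, the links page links out to one external page, and that external page links back to your home page. We then add a review page that is linked from the links page and links only to the home page. The page names and the `pagerank` helper are our own, and d = 0.85 as elsewhere in this paper.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, outs in links.items():
            for target in outs:
                new[target] += d * pr[source] / len(outs)
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new
        pr = new
    return pr


without_review = pagerank({
    "Home": ["Links"],
    "Links": ["Home", "External"],
    "External": ["Home"],
})
with_review = pagerank({
    "Home": ["Links"],
    "Links": ["Home", "External", "Review"],
    "Review": ["Home"],  # the review page links back to the home page only
    "External": ["Home"],
})

for name, result in (("without", without_review), ("with", with_review)):
    print(name, {page: round(value, 4) for page, value in result.items()})
```

In this toy system the home page gains and the external page receives a smaller share once the review page is added, which is the direction of the effect described above; the size of the effect will depend entirely on the real link structure.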
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect:
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
Without the Review Pages:
The Home Page PR is 0.9536152797
B, C and D PR is 0.4201909959 each
Total is 2.2141882674
With the Review Pages:
The Home Page PR is 2.439718935
B, C and D PR is 0.8412536982 each
Total is 4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it to go to is the Webmaster's easiest way to manipulate PageRank.
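The three closed structures are easy to compare in code. The sketch below assumes four-page versions of each: a hierarchy in which A links to B, C and D and each links back to A (this matches the hierarchical values used later in the paper), a loop A -> B -> C -> D -> A, and a site where every page links to every other page. The exact diagrams for the looping and interlinked cases are our assumption, as is the `pagerank` helper; d = 0.85.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, outs in links.items():
            for target in outs:
                new[target] += d * pr[source] / len(outs)
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new
        pr = new
    return pr


pages = ["A", "B", "C", "D"]
structures = {
    "Hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "Looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "Extensive Interlinking": {p: [q for q in pages if q != p] for p in pages},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    values = ", ".join(f"{p}={pr[p]:.4f}" for p in pages)
    print(f"{name}: {values}  total={sum(pr.values()):.4f}")
```

Every structure holds the same total, but only the hierarchy concentrates it on one page.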
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link:
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar; how does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good Titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
(1 - d) × PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PR of Page A deduced by the previous iteration, i.
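Written out in full, the replacement appears to give the update rule below. This is our reading rather than a formula stated elsewhere in this paper, and the symbols T_1 ... T_n (the pages linking to A) and C(T_k) (the number of links leaving T_k) are assumed notation.

```latex
PR_{i+1}(A) = (1 - d)\, PR_i(A) + d \sum_{k=1}^{n} \frac{PR_i(T_k)}{C(T_k)}
```

As a check against the second iteration of the recalculated example (Excel Example 18), with d = 0.85, PR_1(A) = 2.7 and PR_1(B) = PR_1(C) = PR_1(D) = 0.4333333333, this gives PR_2(A) = 0.15 × 2.7 + 0.85 × 3 × 0.4333333333 = 1.51 and PR_2(B) = 0.15 × 0.4333333333 + 0.85 × 2.7 / 3 = 0.83.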
Let's do this by example. Assume we have the hierarchical structure mentioned earlier:
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the “accelerated technology of the computation of the PageRank estimating values”. So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
How PageRank could be more than it seems
There’s a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site’s ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors, like “quantity of text”, fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn’t seem to fit into the traditional factors, but it can be worked into PageRank. Let’s explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the “About Us” page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) * 0.15
as long as we set the other pages to
0.6 * 0.15
i.e. [0.1n + 0.9 + 0.9(n - 1) = n, where n is the number of pages in the site]
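As a quick check of the arithmetic above, the following sketch computes the co-efficients for a favoured page and confirms that their total is still n * 0.15. It assumes a four-page site with Page B at index 1; the function name `favoured_coefficients` is purely illustrative and does not come from the Excel examples.

```python
def favoured_coefficients(n, favoured_index, base=0.15):
    """Return one normalization co-efficient per page, favouring a single page.

    The favoured page gets (0.4n + 0.6) * base and every other page gets
    0.6 * base, following the values quoted in the text.
    """
    coeffs = [0.6 * base] * n
    coeffs[favoured_index] = (0.4 * n + 0.6) * base
    return coeffs


n = 4  # assumption: a four-page site (A, B, C, D)
coeffs = favoured_coefficients(n, favoured_index=1)  # index 1 stands for Page B

print([round(c, 4) for c in coeffs])
print("total:", round(sum(coeffs), 4), "expected:", round(n * 0.15, 4))
```

The total is unchanged because 0.4n + 0.6 + 0.6(n - 1) = n, so the favoured page gains exactly what the other pages give up.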
This strongly favours Page B. Let’s give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we’ve boosted the PageRank of the About Us page at the expense of all other pages, simply because we’ve introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page’s PageRank; it is just as easy to boost an entire site’s PageRank based on any factor we like. Examples of possible factors are “it’s popular” or “we like it”.
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let’s pretend that Page B is such a list, say a category page with links in ODP.
We’ll set Page B’s normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
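The effect can be reproduced with a small sketch. The site structure used here (home page A linking to B, C and D, each of which links back to A) is an assumption made for illustration, and the helper `pagerank` is not the spreadsheet behind the Excel example; only the co-efficients 0.06 and 0.18 come from the text.

```python
def pagerank(links, coeffs, d=0.85, iterations=300):
    """Iterate PR(m) = coeffs[m] + d * sum(PR(k) / outdegree(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            inbound = sum(
                pr[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_pr[page] = coeffs[page] + d * inbound
        pr = new_pr
    return pr


# Assumed structure: A is the home page; B, C and D each link back to A.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

normal = pagerank(links, {page: 0.15 for page in links})
directory = pagerank(links, {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for page in links:
    print(page, round(normal[page], 6), round(directory[page], 6))
```

Under these assumptions Page B ends up lower than it does with the uniform co-efficient of 0.15, the other pages end up higher, and the total over the four pages stays at 4.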
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won’t be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you’ve seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google’s spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let’s write down the elements of the matrix of links, $a_{km}$ ($k = 1, \dots, n$; $m = 1, \dots, n$).
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to the Web site pages ($n$ is the number of pages).
The components $a_{km}$, which are at the intersection of row $k$ and column $m$ of the matrix of links, determine the number of links off $T_k$ to $T_m$.
$\mathrm{Cr}(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1     T2     …     Tn     Number of links off the page
T1    a11    a12    …     a1n    Cr(T1)
T2    a21    a22    …     a2n    Cr(T2)
…
Tn    an1    an2    …     ann    Cr(Tn)
Let’s write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{\mathrm{Cr}(T_k)}$$
Where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \dots, n$)
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \dots, n$)
$\mathrm{Cr}(T_k)$ is the total number of links off the page $T_k$
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$)
$d$ is a dampening factor (usually 0.85)
$(1 - d)$ is a normalization coefficient
$n$ is the number of pages on the site
This recursive computational scheme makes it possible to compute the estimating values for the pages of a Web site while taking into account the potentially complex pattern of links between its pages.
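One pass of this scheme can be sketched in a few lines of Python. The function name `pagerank_iteration` and the three-page link matrix below are illustrative assumptions rather than anything taken from the paper; the matrix deliberately contains a double link to show how a multigraph entry a_km > 1 is handled.

```python
def pagerank_iteration(a, pr, d=0.85):
    """Apply PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k) once."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical multigraph: page 0 links twice to page 1 and once to page 2;
# pages 1 and 2 each link back to page 0.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

pr = pagerank_iteration(a, [1.0, 1.0, 1.0])
print([round(value, 6) for value in pr])
```

Because page 0 splits its value over three links, page 1 (two of those links) receives twice the share that page 2 does, and the total over all pages stays equal to n.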
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let’s apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let’s construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I    Number of links off the page
A                 0  1  0  0  0  0  0  0  0    Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0    Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0    Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0    Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0    Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0    Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1    Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0    Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0    Cr(I) = 5
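The matrix above can be written down directly as data. The sketch below (the variable names are illustrative) recomputes the Cr column as row sums and checks the structural property mentioned earlier: pages A to D never link to pages E to I, while every one of E to I links to all of A to D.

```python
PAGES = "ABCDEFGHI"

# Row k, column m: number of links off page k to page m (matrix of links (2)).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T_k) is the sum of row k.
cr = {page: sum(row) for page, row in zip(PAGES, LINKS)}
print(cr)

first, second = range(0, 4), range(4, 9)  # subgraph 1 = A..D, subgraph 2 = E..I
no_links_out = all(LINKS[k][m] == 0 for k in first for m in second)
all_links_in = all(LINKS[k][m] == 1 for k in second for m in first)
print(no_links_out, all_links_in)
```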
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term “base pages” to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let’s apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on iteration $i$. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration $i+1$) can be calculated by the formula $(1 - d)\, PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
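The accelerated scheme tabulated above can be sketched as follows. This is a minimal reading of the description in the text, in which the constant (1 - d) is replaced by (1 - d) * PR_i(T); the function name and the number of iterations are assumptions, and the link matrix is the one from the diagram of the matrix of links (2).

```python
# Link matrix (2): row k, column m holds the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_iteration(pr, d=0.85):
    """One pass with the normalization coefficient (1 - d) * PR_i(T) for each page T."""
    n = len(LINKS)
    cr = [sum(row) for row in LINKS]  # every page here has at least one outgoing link
    return [
        (1 - d) * pr[m] + d * sum(LINKS[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


first = accelerated_iteration([1.0] * 9)
second = accelerated_iteration(first)

limit = second
for _ in range(300):  # assumed iteration count, far more than the 39 rows shown
    limit = accelerated_iteration(limit)

print([round(value, 6) for value in first])
print([round(value, 6) for value in limit])
```

Run this way, the values for E to I shrink towards zero while A, C and D settle at 1.5 and B at 4.5, which is how the external pages are recognised.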
Let’s calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.
Having divided the estimating values by their total amount, which is left unchanged by the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one).
      A         B         C         D         E         F         G         H         I         Amount
P     0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
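The classic computation and the division by the total can be sketched together. As before, this is an illustrative reading of the scheme rather than the authors’ spreadsheet: the function name and the iteration count are assumptions, while the link matrix, d = 0.85 and the constant coefficient 0.15 come from the text.

```python
# Link matrix (2): row k, column m holds the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_iteration(pr, d=0.85):
    """One pass of the classic scheme with the constant coefficient (1 - d)."""
    n = len(LINKS)
    cr = [sum(row) for row in LINKS]  # every page here has at least one outgoing link
    return [
        (1 - d) + d * sum(LINKS[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


values = [1.0] * 9
for _ in range(400):  # assumed iteration count, well past the 85 rows tabulated
    values = classic_iteration(values)

total = sum(values)
probabilities = [value / total for value in values]

print([round(value, 6) for value in values])
print([round(p, 6) for p in probabilities])
```

The total stays at 9 throughout, so dividing by it turns the limiting values into the probability row shown above.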
Running a parallel demonstration, we’d like to use Shannon’s famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities.
Experiment outcome    A1    A2    …    An
Probabilities         p1    p2    …    pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let’s calculate the terms of this formula for the set of probability values which correspond to the site pages. We receive the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                     A         B         C         D         E         F         G         H         I         Amount
PR                   0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows    1         1         1         1         1         1         1         1         1         1
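The PR row above can be reproduced from the probability row by evaluating each term -p log2 p of Shannon’s formula separately. The sketch below does exactly that; the variable names are illustrative, and the probabilities are copied from the table of P values.

```python
import math

PAGES = "ABCDEFGHI"

# Probability values P for pages A..I, taken from the table above.
P = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]

# One term of Shannon's formula per page, with logarithms to base 2.
terms = [-p * math.log2(p) for p in P]
entropy = sum(terms)  # H(p1, ..., pn), the "Amount" column

for page, term in zip(PAGES, terms):
    print(page, round(term, 6))
print("Amount", round(entropy, 3))
```

Small differences in the last decimal place relative to the table are to be expected, since the probabilities used here are already rounded to six places.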
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let’s assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$$ ($P$ is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
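To see what repeated encouragement does, the operator can be applied in a loop. In this sketch the value d = 0.85 and the starting estimate 0.2 are assumptions chosen for illustration; the text only requires 0 < d < 1.

```python
def z_plus(p, d=0.85):
    """Encouragement operator Z+: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


start = 0.2  # assumed starting estimate
history = [start]
for _ in range(50):
    history.append(z_plus(history[-1]))

print(round(history[1], 6), round(history[10], 6), round(history[-1], 6))
```

Each application moves the estimate a fixed fraction (1 - d) of the way towards 1, so after n steps the value is 1 - d^n * (1 - P_0): it rises steadily and approaches 1 without exceeding it.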
We suppose that it was a variation of Shannon’s formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google’s), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you’d like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you’re at in the field of SEO, this is not something you should read; it’s something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Introduction
The original PageRank Explained document has existed for a long time. It was the first of its kind to present many of the ideas in a way that the average person can understand. You may be wondering why there is a need for a new version of this document. The reasoning is simple: everything in the original document was theoretical and subject to change. These changes have occurred through the authors’ further learning on how Google reacts and how we think about the subject. To this end, you will note that an additional author and a technical editor have been added to the paper. Mike Shishigin’s extensive mathematical ability enables us to explore some new areas of PageRank not mentioned before, as well as provide a section at the end that contains more advanced information. We hope to present PageRank in a way that gets across everything you could possibly need to know, without baffling you. However, if you want to know the extremely technical stuff, then we’ll give you that at the end.
We’re pleased that the principles in the original PageRank Explained document seem to have been accepted universally within the search engine optimization field. The one document that appears to criticise it later explains why each of the points it makes is valid. We’re also pleased that since its publication a vast number of additional articles have been written about PageRank.
Despite all this, things have changed and there are still a lot of misunderstandings regarding the importance (or unimportance) of PageRank. We’ve added new sections to address these, and new sections to address some new things. We’ve also tried to make the existing sections more understandable. This is a long document; it’s written so that a broad range of people can understand it. As it’s also used as a major reference in several university courses, we will do our best to detail as much as possible. If you are “Joe Average”, then you can simply limit your concern to understanding the principles until you’re ready to dig deeper.
Finally, we thought it was time to dedicate some space to Google’s thoughts about PageRank. Obviously, Google is not going to comment on exact ranking issues, but as it is their technology, what they say about it gives us lots of insight. We asked Barry Schnitt, spokesperson for Google, a few questions about various PageRank-related topics and he was kind enough to give us some answers; you’ll find those answers later in the document.
What is PageRank?
PageRank is Google’s method of measuring a page’s importance. When all other factors, such as Title tag and keywords, are taken into account, Google uses PageRank to adjust results so that sites that are deemed more important will move up in the results page of a user’s search accordingly.
A basic overview of how Google ranks pages in their search engine results pages (SERPS) follows:
1) Find all pages matching the keywords of the search.
2) Rank accordingly using “on the page” factors such as keywords.
3) Calculate in the inbound anchor text.
4) Adjust the results by PageRank scores.
In reality it’s slightly more complex, and we’ll discuss this in more depth later, but for now the above description serves our purposes. It’s worth noting that PageRank is a multiplier and is not just simply added to the score. Thus, if your page had a PageRank of zero, it would rank at the very end of the SERPS.
How is PageRank determined?
The Google theory goes that if Page A links to Page B, then Page A is saying that Page B is an important page. PageRank also factors in the importance of the links pointing to a page. If a page has important links pointing to it, then its links to other pages also become important. The actual text of the link is irrelevant when discussing PageRank.
How can you tell what a page’s PageRank is?
To learn what a page’s PageRank is, you can download a toolbar for Internet Explorer from http://toolbar.google.com. Once installed, there will be a bar graph at the top of the browser showing a version of PageRank for the page you’re browsing. When you hold the mouse over the bar, you see a number from zero to ten. (If you don’t see the number, you may have an older version of the toolbar installed. You will need to completely uninstall it, reboot your computer and reinstall the latest version. Once this is done, you should be able to see the PageRank number.)
How accurate is the Google toolbar?
The Google toolbar is not very accurate in showing you the actual PageRank of a site, but it’s the only thing right now that can give you any idea. As long as you know the toolbar’s limitations, then at least you know what you are viewing.
There are two limitations to the Google toolbar:
1. The toolbar sometimes guesses. If you enter a page which is not in its index, but where there is a page that is very close to it in Google’s index, then it will provide a guesstimate of the PageRank. This guesstimate is worthless for our purposes because it isn’t featured in any of the PageRank calculations. The only way to tell if the toolbar is a guesstimate is to type the URL into the Google search box and see if the page shows up in the SERPS. If it doesn’t, then the toolbar is guessing.
2. The toolbar is just a representation of actual PageRank. Whilst PageRank is linear, Google has chosen to use a non-linear graph to portray it. So on the toolbar, to move from a PageRank of 2 to a PageRank of 3 takes less of an increase than to move from a PageRank of 3 to a PageRank of 4. A comparison table best illustrates this phenomenon. The actual figures are kept secret, so we’ll just use any figures for demonstration purposes.
If the actual PageRank is between    The Toolbar Shows
0.00000001 and 5                     1
6 and 25                             2
26 and 125                           3
126 and 625                          4
626 and 3125                         5
3126 and 15625                       6
15626 and 78125                      7
78126 and 390625                     8
390626 and 1953125                   9
1953126 and infinity                 10
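The table follows a simple pattern: each toolbar step covers a range five times wider than the one before. The sketch below encodes that pattern. The base of 5 and the cut-offs are the made-up demonstration figures from the table, not Google’s real values, and the function name `toolbar_value` is illustrative.

```python
def toolbar_value(actual_pagerank, base=5, top=10):
    """Map an actual PageRank to a toolbar value using the demonstration table.

    Level 1 covers values up to base, level 2 up to base**2, and so on;
    everything above base**(top - 1) shows as the top value.
    """
    level, upper_bound = 1, base
    while actual_pagerank > upper_bound and level < top:
        upper_bound *= base
        level += 1
    return level


for sample in (0.00000001, 5, 6, 125, 126, 1953125, 1953126):
    print(sample, "->", toolbar_value(sample))
```

Integer bounds are multiplied up step by step rather than computed with a floating-point logarithm, so values that sit exactly on a boundary (such as 125) land in the row the table gives them.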
The PageRank shown in the Google directory (http://directory.google.com) suffers from the same problems. The PageRank shown in the directory is also on a different scale. There have been attempts to cross-reference these two scales, but because they are non-linear, the results really do not tell you anything more than you already know.
Also of note is that a programmer managed to generate a tool to look up PageRank without using Internet Explorer. This tool has since been withdrawn, but whilst originally the numbers given by this software and Google’s toolbar matched, presently querying with such software sometimes produces different numbers than querying with the toolbar. It is Google’s right to protect their data, but this is the strongest indication that:
Sometimes what you see on the Google toolbar may not be related to actual PageRank at all. (Google can, and does, assign whatever toolbar PageRank value they want to assign to a page.)
How significant is PageRank?
The significance of any one factor in search engine algorithms depends on the quality of the information it supplies. A factor’s importance is known as its weight. To demonstrate how weighting is arrived at, it’s easiest to move away from PageRank for a second and look at Meta tags. Originally, when the Meta keyword tag was new, you could write something like this in your document:
<meta name="keywords" content="pagerank, pagerank uncovered, algorithm, algorithms">
In theory, the Meta keyword tag was a very good indicator of what the page was about. However, as most are well aware, the weighting for the keywords tag is fast approaching nothing. Two things have contributed to this:
1. The ease with which Webmasters can manipulate it
2. The level of manipulation by Webmasters
These two things are separate factors, but with human nature being what it is, the easier something is to influence, the more it is manipulated. The combination of these factors determines the “weighting”, i.e. how much we trust the information provided by that factor.
So it makes sense to look at these factors in relation to PageRank first.
PageRank is without doubt one of the hardest things for a Webmaster to manipulate ethically. However, it is possible to generate links to your site from other sites fairly simply through the use of link farms and guestbooks. Google frowns upon this kind of abuse, and many sites that have tried this have had their PageRank influence blocked. But it must be said that the abuse is still rampant and that it can have an influence on PageRank. So whilst it is not easy to do, PageRank is still subject to manipulation.
The extent to which PageRank is manipulated has also changed. Most people no longer believe Google's old line of people not being able to influence PageRank and the results based on it. What's more, there is more information about PageRank available than ever, and people are more aware of manipulation techniques.
So whilst PageRank is valuable, you should be careful not to over-estimate its usage and capabilities. Your final ranking in Google is due to a mix of factors, of which PageRank is only one. We'll get into more details later by discussing how PageRank is different from the other ranking factors, and thus when it applies and when it doesn't. Ironically enough, PageRank's weighting factor is undeniably declining. Since the original version of this paper gave out detailed information about PageRank, in all likelihood it may also have contributed in some small way to the decline in weighting of the very subject it talks about.
Is PageRank a good determination of the quality of a page?
To examine the worth of PageRank, we need to first look at its premise and how accurate it is. Basically, PageRank says:
1. If a page links to another page, it is casting a vote, which indicates that the other page is good.
2. If lots of pages link to a page, then it has more votes and its worth should be higher.
The basic implication here is: People only link to pages they think are good.
It shouldn't be hard to convince you that this premise is wrong. A few of the reasons people link to pages other than ones they think are good are:
1) Reciprocal links: "Link to me and I'll link to you."
2) Link Requirements: "Using our script requires you to put a link to our page" or "We'll give you an award solely because you link to our page."
3) Friends and Family: "This is my friend Pete's site" or "My mum's site is here, my dad's site is here. My dog's site is here."
4) Free Page Add-ons: "This counter was provided by www.linktocountersite.com."
Furthermore, anybody who has a top-ranking site will tell you that it tends to get links from new sites. This is not necessarily because it's good (although such sites generally are). Assume a Webmaster is setting up a new site and is looking for some outbound links. Nowadays, one of the first things they do is a Google search for similar sites. The links they end up with may not necessarily be the best sites, but merely the easiest ones to find. If PageRank influences rankings, and if they subsequently link to those pages, the new Webmaster will be adding to the inaccuracies in the judging of the quality of a page. The same is true when these new Webmasters use the Google Toolbar PageRank indicator to choose whom to link to.
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the judged quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results. For example, let's say that number is 2000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the result count that actually gets displayed.) Then the engine applies all the factors to those 2000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000. PageRank is almost certainly not one of those 2 or 3 factors. Notice how before we highlighted the word "related" in creating the subset of 2000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
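The two-stage process described above can be sketched in a few lines of Python. Everything in this sketch is hypothetical: the page records, the `topical_score` and `pagerank` fields and the subset sizes are invented for illustration, and it is not a description of Google's actual implementation.

```python
# Hypothetical two-stage ranking sketch: a cheap on-topic pre-filter picks a
# subset, then the full score (all factors, PageRank included) ranks only
# that subset. Field names and numbers are invented for illustration.

def two_stage_rank(pages, subset_size, shown):
    # Stage 1: query the whole index on a few cheap, on-topic factors only.
    subset = sorted(pages, key=lambda p: p["topical_score"], reverse=True)[:subset_size]
    # Stage 2: apply every factor to the subset; PageRank acts as a multiplier.
    ranked = sorted(subset, key=lambda p: p["topical_score"] * p["pagerank"], reverse=True)
    # Only the top of the subset is shown to the searcher.
    return [p["name"] for p in ranked[:shown]]

pages = [
    {"name": "on_topic_low_pr", "topical_score": 9.0, "pagerank": 1.0},
    {"name": "on_topic_high_pr", "topical_score": 8.0, "pagerank": 3.0},
    {"name": "somewhat_on_topic", "topical_score": 5.0, "pagerank": 2.0},
    {"name": "off_topic_huge_pr", "topical_score": 0.5, "pagerank": 50.0},
]

results = two_stage_rank(pages, subset_size=3, shown=2)
print(results)
```

In this made-up data the off-topic page would have the highest full score of all (0.5 x 50 = 25), yet it never reaches the second stage because it does not make the on-topic subset. That is the point of the section: a high PageRank is in vain for a page that misses the first cut.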
Why this is critical
You must do enough "on the page" work and/or anchor text work to get into that subset of 2000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:
Title Tag: Can only be listed once.
Keywords in Body Text: Each successive repetition is less important. Proximity is important.
Anchor Text: Highly weighted but, like keywords in body text, there is a cut off point where further anchor text is no longer worthwhile.
PageRank: Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which each factor adds so little to the total that further effort is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure on it of 1000.
If we have a query where the results are Page A and Page B, then Pages A and B have scores for that query, which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold; thus, without any change in PageRank, it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability, Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive, you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores were. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant." They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care less about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important." We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use these to extrapolate the advantages and disadvantages of each approach:
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors
Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies are using PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, "web hosting"). In such situations, it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRankscore)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
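To make the interplay between the multiplier and the threshold concrete, here is a small Python sketch. It is a simplification under stated assumptions: the hard cap `NON_PR_CAP` is a hypothetical stand-in for the Non-PageRank Factor Threshold (the text describes a gradual flattening rather than a hard limit), and all of the scores are invented.

```python
# Sketch of "Final Rank Score = non-PageRank score x PageRank score".
# NON_PR_CAP is a hypothetical stand-in for the Non-PageRank Factor Threshold.
NON_PR_CAP = 1000.0

def final_rank_score(non_pr_score, pagerank):
    # Non-PageRank factors stop helping beyond the cap; PageRank has no cap.
    return min(non_pr_score, NON_PR_CAP) * pagerank

competitor = final_rank_score(1000.0, 3.0)     # maxed "on the page" work, PageRank 3
more_on_page = final_rank_score(5000.0, 2.0)   # extra "on the page" work is capped
more_pagerank = final_rank_score(1000.0, 4.0)  # only more PageRank overtakes

print(competitor, more_on_page, more_pagerank)
```

With these invented numbers, piling on more "on the page" work cannot overtake the competitor, whereas raising PageRank can. This is the "Minimum PageRank level" in miniature.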
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the over-riding principles of how PageRank works, it is as accurate today as the day it was written.
$$PR(A) = (1-d) + d\left(\frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)}\right)$$
Where:
$PR(A)$ is the PageRank of Page A (the one we want to work out).
$d$ is a dampening factor. Nominally this is set to 0.85.
$PR(T_1)$ is the PageRank of a page pointing to Page A.
$C(T_1)$ is the number of links off that page.
$PR(T_n)/C(T_n)$ means we do that for each page pointing to Page A.
Source: "The Anatomy of a Large-Scale Hypertextual Web Search Engine", Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages, imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below:
[Diagram: Page A links to Pages B, C and D; Page B links to Page C; Page C links to Page A; Page D links to Page C.]
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 x a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 x 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to Pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 x 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to Page C. Page B's PageRank is 0.15, so it will add 0.85 x 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to Page C, Page C will get it all.
Page C links to Page A; all 0.1275 passes to Page A.
Page D links to Page C. Again, all 0.1275 passes to Page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
So we've got Page A at 0.2775, Page B at 0.1925, Page C at 0.4475 and Page D at 0.1925.
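The two passes worked through above can be reproduced with a short Python sketch. The link structure is the one given in the walkthrough; the names `one_pass`, `LINKS` and `D` are our own, and the sketch assumes every page has at least one outbound link.

```python
# One pass of the three rules: share 0.85 x PageRank across a page's links,
# add the shares to each target's new total, then add 0.15 to every total.
D = 0.85

# Link structure taken from the walkthrough above.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}

def one_pass(ranks, links, d=D):
    new = {page: 1 - d for page in links}       # rule 3: the 0.15 base
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)  # rule 1: split the vote
        for target in targets:
            new[target] += share                # rule 2: pass it on
    return new

start = {page: 0.0 for page in LINKS}  # start every page at zero
first = one_pass(start, LINKS)         # every page ends up at 0.15
second = one_pass(first, LINKS)        # the totals listed above
print({page: round(value, 4) for page, value in second.items()})
```

Calling `one_pass` repeatedly on its own output is all that the rest of the calculation amounts to.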
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1], and we'd reach the final values:
[Figure: final PageRank values for Pages A to D]
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank, which allows Google to provide unprecedented search quality at comparably low costs. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, then convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for Pages A, B, C, D at various stages of iteration
Iteration   A              B              C              D
Start       1              1              1              1
1           1.0000000000   0.5750000000   2.2750000000   0.1500000000
2           2.0837500000   0.5750000000   1.1912500000   0.1500000000
3           1.1625625000   1.0355937500   1.6518437500   0.1500000000
...
19          1.4900124031   0.7833688246   1.5766187723   0.1500000000
20          1.4901259564   0.7832552713   1.5766187723   0.1500000000
21          1.4901259564   0.7833035315   1.5765705121   0.1500000000
...
46          1.4901074052   0.7832956473   1.5765969475   0.1500000000
47          1.4901074054   0.7832956472   1.5765969474   0.1500000000
48          1.4901074053   0.7832956473   1.5765969474   0.1500000000
49          1.4901074053   0.7832956473   1.5765969474   0.1500000000
50          1.4901074053   0.7832956473   1.5765969474   0.1500000000
51          1.4901074053   0.7832956473   1.5765969474   0.1500000000
52          1.4901074053   0.7832956473   1.5765969474   0.1500000000
53          1.4901074053   0.7832956473   1.5765969474   0.1500000000
54          1.4901074053   0.7832956473   1.5765969474   0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change; they have converged at the limiting values.
In practice, there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above, we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
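The stopping rule can be sketched in Python as follows. Two things here are assumptions rather than anything stated in the paper. First, the link structure (A links to B and C; B links to C; C links to A; D links to C) is inferred from the first rows of the table above, which it reproduces. Second, the sketch stops when no value changes by `tol` or more between two consecutive passes, which is a slightly different test from comparing iterations 48 and 50.

```python
# Repeat the pass until no page's value changes by tol or more.
def one_pass(ranks, links, d=0.85):
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new

def iterate(links, start, d=0.85, tol=1e-10, max_passes=1000):
    ranks = {page: start for page in links}
    for passes in range(1, max_passes + 1):
        new = one_pass(ranks, links, d)
        change = max(abs(new[p] - ranks[p]) for p in links)
        ranks = new
        if change < tol:
            return ranks, passes
    raise RuntimeError("did not converge within max_passes")

# Structure inferred from the table (an assumption, see the note above).
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

from_one, passes_one = iterate(LINKS, start=1.0)
from_zero, passes_zero = iterate(LINKS, start=0.0)
for page in LINKS:
    print(page, round(from_one[page], 10), round(from_zero[page], 10))
```

Starting every page at one or at zero ends at the same limiting values, which is what convergence promises. The number of passes needed depends on the starting values and on the exact stopping test, so it need not match the table.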
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions by a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore, it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
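The three figures quoted above (0.15, 1 and 1.4594594595) can be checked with a small Python sketch. The function name `converge` and its tolerance are our own choices; the calculation itself is just the formula from the Stanford paper applied repeatedly.

```python
# Check the PageRank Feedback figures by iterating the formula to convergence.
def converge(links, d=0.85, tol=1e-12, max_passes=10000):
    ranks = {page: 1.0 for page in links}
    for _ in range(max_passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        if max(abs(new[p] - ranks[p]) for p in links) < tol:
            return new
        ranks = new
    raise RuntimeError("did not converge within max_passes")

alone = converge({"A": []})                                    # Page A on its own
pair = converge({"A": ["B"], "B": ["A"]})                      # A and B link to each other
triple = converge({"A": ["B", "C"], "B": ["A"], "C": ["A"]})   # A also swaps links with C

print(round(alone["A"], 10), round(pair["A"], 10), round(triple["A"], 10))
print(round(sum(triple.values()), 10))
```

In the three-page case the total across all pages is still 3, so Page A's gain is matched by Pages B and C each sitting below 1. This is the sense in which feedback redistributes PageRank rather than creating it.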
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this set up, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to reciprocal link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations; would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
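The arithmetic behind both points can be illustrated with a Python sketch. It reuses the four-page structure from the calculation section and treats Page D as the hypothetically penalized page. How Google actually applies penalties is speculation, as noted above; the sketch only shows what the formula itself does.

```python
# Why a zero starting value is not a penalty, and what removing links does.
def converge(links, start, d=0.85, tol=1e-12, max_passes=10000):
    ranks = dict(start)
    for _ in range(max_passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        if max(abs(new[p] - ranks[p]) for p in links) < tol:
            return new
        ranks = new
    raise RuntimeError("did not converge within max_passes")

LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}
ones = {page: 1.0 for page in LINKS}

normal = converge(LINKS, ones)

# Attempt 1: set Page D to zero BEFORE the calculation. It simply recovers.
preset = converge(LINKS, {**ones, "D": 0.0})

# Attempt 2: set Page D to zero AFTER the calculation. Page D now shows zero,
# but Page C's value still includes the vote Page D cast during the process.
zeroed_after = {**normal, "D": 0.0}

# Second penalty: remove Page D's link entry so that its vote never counts.
no_vote = converge({**LINKS, "D": []}, ones)

print(round(preset["D"], 6), round(normal["D"], 6))
print(round(zeroed_after["C"], 6), round(no_vote["C"], 6))
```

Zeroing the value beforehand changes nothing, zeroing it afterwards hides the page but leaves its vote in place, and only removing the link entry reduces what Page C receives.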
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank. (Whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page or whatever page the actual link will exist on.) However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
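As a rough illustration of the PR 4 versus PR 6 comparison, here is the per-link share from the formula in Python. Two caveats: the numbers 4 and 6 are treated as actual PageRank values purely for the sake of the example (toolbar figures may not map onto the formula this directly), and the link counts of 10 and 60 are invented.

```python
# Per-link share from the formula: d x PageRank / number of links on the page.
D = 0.85

def link_value(pagerank, links_on_page, d=D):
    # Share of the linking page's PageRank passed along one of its links.
    return d * pagerank / links_on_page

pr4_few_links = link_value(4.0, 10)    # hypothetical PR 4 page with 10 links
pr6_many_links = link_value(6.0, 60)   # hypothetical PR 6 page with 60 links

print(pr4_few_links, pr6_many_links)
```

Under these assumptions the link from the lower-ranked page passes on four times as much, simply because it is shared between fewer links.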
Now lets factor in feedback Letrsquos say for example that there are two separatepages on other peoplersquos sites which both have a PageRank of 4 Both of thesehave ten links to other pages But the page on your site that you want them tolink to already has a link to the page on the second site By getting a link fromthe second site yoursquore generating feedback and getting more PageRank than ifyou had gotten a link from the first site Thatrsquos an over simplification in factfeedback loops can get even more complicated Remember the number of linkson the page linking to you will alter the amount of feedback etc
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now, or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values: [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                        Without the Review Pages    With the Review Pages
The Home Page PR        0.9536152797                2.439718935
B, C, D PR (each)       0.4201909959                0.8412536982
Total                   2.2141882674                4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4.
Looping [Excel Example 12]
Total PageRank in this site is 4.
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4.
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
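The three closed structures can be reproduced with a short Python sketch. The link layouts below are assumptions, since the diagrams are not reproduced in this text: a home page A linking to and from B, C and D for the hierarchy, a one-way ring for the loop, and every page linking to every other page for extensive interlinking. The function name `pagerank` and the fixed iteration count are illustrative choices, with d = 0.85.

```python
def pagerank(links, d=0.85, iterations=200):
    """Plain iterative PageRank; `links` maps each page to the pages it links to."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[src] / len(out)
                                    for src, out in links.items() if page in out)
            for page in links
        }
    return pr


# Assumed layouts of the three four-page example sites.
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "interlinked": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    values = ", ".join(f"{p}={v:.4f}" for p, v in pr.items())
    print(f"{name:13s} {values}  total={sum(pr.values()):.4f}")
```

Under these assumptions every structure keeps a total of 4, but only the hierarchy concentrates PageRank on one page (A, the home page).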
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
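The effect of adding pages to a hierarchy that carries an outbound link can be sketched in Python. This is a simplified, hypothetical model rather than the site used in the Excel examples: a home page links to each child page and to one external page, every child links back to the home page, and there is no inbound link. The names `site_average`, `home`, `external` and `pageN` are invented for the illustration.

```python
def pagerank(links, d=0.85, iterations=300):
    """Iterative PageRank; pages that are only linked TO are included as well."""
    pages = set(links) | {target for out in links.values() for target in out}
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        pr = {
            p: (1 - d) + d * sum(pr[src] / len(out)
                                 for src, out in links.items() if p in out)
            for p in pages
        }
    return pr


def site_average(children):
    """Average PageRank per page of OUR site (home page plus its children)."""
    kids = [f"page{i}" for i in range(children)]
    links = {"home": kids + ["external"]}        # the home page holds the outbound link
    links.update({kid: ["home"] for kid in kids})
    pr = pagerank(links)
    site_total = pr["home"] + sum(pr[kid] for kid in kids)
    return site_total / (children + 1)


for n in (3, 10, 50):
    print(f"{n:3d} child pages: average PageRank per site page = {site_average(n):.4f}")
```

In this toy model the external page receives a 1/(n+1) share of whatever the home page passes on, so the more child pages share the home page's links, the more PageRank stays inside the site.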
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not doing so. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the value from the previous computation, using the formula:

$$(1 - d)\, PR_i(A)$$

That is, we are replacing the $(1 - d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 15100000000 08300000000 08300000000 08300000000 400000000003 23430000000 05523333333 05523333333 05523333333 400000000004 17599000000 07467000000 07467000000 07467000000 400000000005 21680700000 06106433333 06106433333 06106433333 400000000006 18823510000 07058830000 07058830000 07058830000 400000000007 20823543000 06392152333 06392152333 06392152333 400000000008 19423519900 06858826700 06858826700 06858826700 400000000009 20403536070 06532154643 06532154643 06532154643 40000000000
10 19717524751 06760825083 06760825083 06760825083 4000000000011 20197732674 06600755775 06600755775 06600755775 4000000000012 19861587128 06712804291 06712804291 06712804291 40000000000
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
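The two runs tabulated above can be sketched in Python as follows. The link layout (home page A linking to and from B, C and D), the function name `iterate`, and the stopping rule (stop once no page changes by more than a small tolerance) are assumptions. The paper does not state its stopping rule, so the exact iteration counts may differ somewhat from 148 and 67.

```python
def iterate(links, d=0.85, adaptive=False, tol=1e-10, max_iter=1000):
    """Iterate PageRank until the largest per-page change drops below `tol`.

    adaptive=False: fixed co-efficient (1 - d).
    adaptive=True:  co-efficient (1 - d) * PR_i(page), recalculated every iteration.
    Returns (final values, number of iterations used).
    """
    pr = {page: 1.0 for page in links}
    for step in range(1, max_iter + 1):
        new = {}
        for page in links:
            incoming = sum(pr[src] / len(out)
                           for src, out in links.items() if page in out)
            base = (1 - d) * (pr[page] if adaptive else 1.0)
            new[page] = base + d * incoming
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:
            return pr, step
    return pr, max_iter


# Assumed layout of the hierarchical example: A is the home page.
hierarchy = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

fixed, fixed_steps = iterate(hierarchy)
varied, varied_steps = iterate(hierarchy, adaptive=True)
print(f"fixed co-efficient:  A={fixed['A']:.10f} B={fixed['B']:.10f} after {fixed_steps} iterations")
print(f"varied co-efficient: A={varied['A']:.10f} B={varied['B']:.10f} after {varied_steps} iterations")
```

As in the tables, the varied co-efficient settles in fewer iterations, and it settles on slightly different values (A = 2 and B = 0.6667 rather than A = 1.9189 and B = 0.6937), which is why the result is described as an approximation.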
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [0.4n + 0.6 + 0.6(n-1) = n, where n is the number of pages in the site]
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough it does exactly what we want it to do
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links, and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
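Both the "favoured page" and the "directory page" ideas amount to giving each page its own normalization co-efficient while keeping the total fixed. The Python sketch below tries this on an assumed four-page hierarchy (home page A linking to and from B, C and D), which may not match the structure used in Excel Examples 19 and 20. The function name `pagerank_with_coefficients` is invented for the illustration.

```python
def pagerank_with_coefficients(links, coeff, d=0.85, iterations=300):
    """PageRank where each page has its own co-efficient in place of (1 - d)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coeff[page] + d * sum(pr[src] / len(out)
                                        for src, out in links.items() if page in out)
            for page in links
        }
    return pr


# Assumed site structure; Page B is the page we favour or demote.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(site)

uniform = {page: 0.15 for page in site}            # the usual (1 - d)
favour_b = {page: 0.6 * 0.15 for page in site}     # everyone else lowered...
favour_b["B"] = (0.4 * n + 0.6) * 0.15             # ...so that B can be raised
demote_b = {page: 0.18 for page in site}           # directory-style page:
demote_b["B"] = 0.06                               # B lowered, the rest raised

normal = pagerank_with_coefficients(site, uniform)
favoured = pagerank_with_coefficients(site, favour_b)
demoted = pagerank_with_coefficients(site, demote_b)

for label, pr in (("normal", normal), ("favour B", favoured), ("demote B", demoted)):
    values = ", ".join(f"{p}={v:.4f}" for p, v in pr.items())
    print(f"{label:9s} {values}  total={sum(pr.values()):.4f}")
```

In every run the co-efficients add up to n × 0.15, so the site total stays the same and only the distribution between pages changes.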
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$, $(k = 1, \dots, n)$, $(m = 1, \dots, n)$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages ($n$ is the number of pages).
Components $a_{km}$, which are on the intersection of the $k$-th row and $m$-th column in the matrix of links, determine the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$ (the sum of row $k$).
DIAGRAM OF THE MATRIX OF LINKS (1)
       T1     T2     …     Tn     Number of links off the page
T1     a11    a12    …     a1n    Cr(T1)
T2     a21    a22    …     a2n    Cr(T2)
…
Tn     an1    an2    …     ann    Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
Where:
$PR_{I+1}(T_m)$ is the PageRank of the page Tm on iteration I+1 (the estimating value of the page Tm, m = 1, …, n).
$PR_I(T_k)$ is the PageRank of the page Tk on iteration I (the estimating value of the page Tk, k = 1, …, n).
$Cr(T_k)$ is the total number of links off the page Tk.
$C_m(T_k) = a_{km}$ is the number of individual links off the page Tk to the page Tm (k = 1, …, n; m = 1, …, n).
d is a dampening factor (usually 0.85).
(1-d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme lets us compute the estimating values for the pages of a Web site while taking into account the links between its pages, however complex they may be.
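As a minimal sketch, one iteration of this scheme can be written directly from the matrix of links. The function name `pagerank_step` and the three-page matrix are assumptions made for illustration; note that T1 has two links to T2, which is what makes the site a multigraph.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_{I+1}(Tm) = (1 - d) + d * sum_k a[k][m] * PR_I(Tk) / Cr(Tk)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk): total number of links off page Tk
    return [
        # pages with no links off them (Cr = 0) are skipped to avoid dividing by zero
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]

# Hypothetical matrix of links: a[k][m] is the number of links off T(k+1) to T(m+1).
a = [
    [0, 2, 1],  # T1 links twice to T2 and once to T3
    [1, 0, 0],  # T2 links once to T1
    [1, 1, 0],  # T3 links once to T1 and once to T2
]

pr = [1.0] * 3           # every page starts with an estimating value of 1
pr = pagerank_step(a, pr)
print([round(x, 6) for x in pr])
```

Repeating the step until the values stop changing gives the limiting estimating values. Because every page in this sketch has at least one outgoing link, the total of the values stays equal to the number of pages.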
Take a look at the following example. In general terms it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I]
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration i+1) can then be calculated by the formula $(1-d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology is applied to the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
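The accelerated computation can be sketched in Python using the matrix of links for the example site. The function name `accelerated_step` and the cut-off of 0.000001 used to call a value "zero" are our own assumptions; the only change from the classic scheme is that the constant (1-d) is replaced by (1-d) times the page's previous estimating value.

```python
PAGES = "ABCDEFGHI"

# Matrix of links of the example site: LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def accelerated_step(a, pr, d=0.85):
    """PR_{i+1}(T) = (1 - d) * PR_i(T) + d * sum_k a[k][T] * PR_i(k) / Cr(k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # every page here has at least one link off it
    return [
        (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

pr = [1.0] * len(PAGES)
history = [pr]
for _ in range(39):               # the same number of iterations as in the table
    pr = accelerated_step(LINKS, pr)
    history.append(pr)

# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page, value in zip(PAGES, pr) if value < 1e-6]
print(external)
print([round(value, 6) for value in pr])
```

Under these assumptions the sketch follows the rows of the table: the values for A, C and D settle at 1.5, B settles at 4.5, and the remaining pages fall towards zero.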
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final rows of the table (darkened in the original).
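The classic computation can be sketched in the same way. Again the function name `classic_step` is our own and the code is only an illustration of the scheme described above, with the constant normalization coefficient (1-d) = 0.15.

```python
# Matrix of links of the example site: LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def classic_step(a, pr, d=0.85):
    """PR_{i+1}(T) = (1 - d) + d * sum_k a[k][T] * PR_i(k) / Cr(k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # every page here has at least one link off it
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

pr = [1.0] * 9
history = [pr]
for _ in range(85):               # the same number of iterations as in the table
    pr = classic_step(LINKS, pr)
    history.append(pr)

total = sum(pr)                   # stays at 9 throughout the iteration process
probabilities = [value / total for value in pr]
print([round(value, 6) for value in pr])
print([round(p, 6) for p in probabilities])
```

Compared with the accelerated scheme, the external pages E to I keep small non-zero values here, and the values for the base pages take many more iterations to settle.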
Dividing the estimating values by their total amount, which does not change during the iteration process, gives a set of probability values (the sum of all the probability values is equal to one):
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:
Experiment outcome: A1, A2, …, An
Probabilities: p1, p2, …, pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages (each value is $-p \log_2 p$ for the corresponding page; logarithms are calculated to base 2). We receive the following set of values of PageRank for the pages of the site examined:
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
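The values in the PR row can be checked with a short sketch. It takes the probability values from the table above as its input and computes one term of Shannon's formula per page; the variable names are our own.

```python
import math

PAGES = "ABCDEFGHI"

# Probability values for the pages, as listed in the table above.
p = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]

# One term of Shannon's formula per page, with logarithms to base 2.
terms = [-x * math.log2(x) for x in p]
entropy = sum(terms)  # H(p1, ..., pn), the "Amount" column

for page, term in zip(PAGES, terms):
    print(page, round(term, 6))
print("Amount", round(entropy, 3))
```

Because the inputs are already rounded to six decimal places, the last digit of a term may differ slightly from the table.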
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems that model the behavior of animals.
Let's assume that every time a stimulus is applied, an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^+$:
$$P_{n+1} = Z^+(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$$ (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
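To see what repeated encouragement does to the estimating value, the operator can be applied over and over in a small sketch. The starting value of 0.2 and the name `z_plus` are assumptions chosen purely for illustration.

```python
def z_plus(p, d=0.85):
    """Apply the encouragement operator once: Z+(P) = d * P + (1 - d)."""
    return d * p + (1 - d)

p = 0.2              # hypothetical starting estimating value
trace = [p]
for _ in range(50):
    p = z_plus(p)
    trace.append(p)

print([round(value, 4) for value in trace[:5]])
print(round(trace[-1], 4))
```

Each application moves the value a fixed fraction (1 - d) of the remaining distance towards 1, so after n applications the value is 1 - d^n (1 - P0), where P0 is the starting value. The value P = 1 is the fixed point of the operator.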
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing Order to the Web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
A basic overview of how Google ranks pages in their search engine results pages (SERPs) follows:
1) Find all pages matching the keywords of the search.
2) Rank accordingly using "on the page" factors such as keywords.
3) Calculate in the inbound anchor text.
4) Adjust the results by PageRank scores.
In reality it's slightly more complex, and we'll discuss this in more depth later, but for now the above description serves our purposes. It's worth noting that PageRank is a multiplier and is not simply added to the score. Thus, if your page had a PageRank of zero, it would rank at the very end of the SERPs.
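The four steps above can be sketched as a toy ranking function. Everything in it is an assumption made for illustration: the page records, the way keywords are counted and the simple sum of on-page and anchor-text scores are stand-ins, not Google's actual factors. The only point is to show PageRank acting as a multiplier.

```python
def rank(pages, query_terms):
    """Toy version of the four-step overview; not Google's real algorithm."""
    results = []
    for page in pages:
        words = page["text"].lower().split()
        # 1) keep only pages matching all the keywords of the search
        if not all(term in words for term in query_terms):
            continue
        # 2) "on the page" score: here simply the keyword count
        on_page = sum(words.count(term) for term in query_terms)
        # 3) inbound anchor text score
        anchor = sum(page["anchors"].count(term) for term in query_terms)
        # 4) PageRank multiplies the score rather than being added to it
        results.append(((on_page + anchor) * page["pagerank"], page["url"]))
    return [url for score, url in sorted(results, reverse=True)]

# Hypothetical pages: a.example is well optimised but has a PageRank of zero.
pages = [
    {"url": "a.example", "text": "pagerank explained pagerank",
     "anchors": ["pagerank"], "pagerank": 0.0},
    {"url": "b.example", "text": "pagerank basics",
     "anchors": [], "pagerank": 2.0},
    {"url": "c.example", "text": "cooking recipes",
     "anchors": [], "pagerank": 9.0},
]

ordering = rank(pages, ["pagerank"])
print(ordering)
```

In this sketch a.example has the best on-page and anchor-text score, yet it is listed last among the matching pages because its score is multiplied by a PageRank of zero.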
How is PageRank determined?
The Google theory goes that if Page A links to Page B, then Page A is saying that Page B is an important page. PageRank also factors in the importance of the links pointing to a page. If a page has important links pointing to it, then its links to other pages also become important. The actual text of the link is irrelevant when discussing PageRank.
How can you tell what a page's PageRank is?
To learn what a page's PageRank is, you can download a toolbar for Internet Explorer from http://toolbar.google.com. Once installed, there will be a bar graph at the top of the browser showing a version of PageRank for the page you're browsing. When you hold the mouse over the bar, you see a number from zero to ten. (If you don't see the number, you may have an older version of the toolbar installed. You will need to completely uninstall it, reboot your computer and reinstall the latest version. Once this is done, you should be able to see the PageRank number.)
How accurate is the Google toolbar?
The Google toolbar is not very accurate in showing you the actual PageRank of a site, but it's the only thing right now that can give you any idea. As long as you know the toolbar's limitations, then at least you know what you are viewing.
There are two limitations to the Google toolbar:
1. The toolbar sometimes guesses. If you enter a page which is not in its index, but where there is a page that is very close to it in Google's index, then it will provide a guesstimate of the PageRank. This guesstimate is worthless for our purposes because it isn't featured in any of the PageRank calculations. The only way to tell if the toolbar is showing a guesstimate is to type the URL into the Google search box and see if the page shows up in the SERPs. If it doesn't, then the toolbar is guessing.
2. The toolbar is just a representation of actual PageRank. Whilst PageRank is linear, Google has chosen to use a non-linear graph to portray it. So on the toolbar, to move from a PageRank of 2 to a PageRank of 3 takes less of an increase than to move from a PageRank of 3 to a PageRank of 4. A comparison table best illustrates this phenomenon. The actual figures are kept secret, so we'll just use any figures for demonstration purposes.
If the actual PageRank is between    The Toolbar Shows
0.00000001 and 5                     1
6 and 25                             2
26 and 125                           3
126 and 625                          4
626 and 3125                         5
3126 and 15625                       6
15626 and 78125                      7
78126 and 390625                     8
390626 and 1953125                   9
1953126 and infinity                 10
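The demonstration table can be turned into a small lookup. As stressed above, the figures are made up for demonstration purposes, so the powers of 5 used here are an assumption and not Google's real scale; the function name `toolbar_value` is also our own.

```python
from bisect import bisect_left

# Upper bounds of the first nine bands in the demonstration table: 5, 25, ..., 1953125.
UPPER_BOUNDS = [5 ** k for k in range(1, 10)]

def toolbar_value(actual_pagerank):
    """Map an 'actual' PageRank onto the 1 to 10 toolbar scale of the demonstration table."""
    # bisect_left counts how many band upper bounds lie strictly below the value
    return bisect_left(UPPER_BOUNDS, actual_pagerank) + 1

for value in (3, 25, 26, 700, 2000000):
    print(value, toolbar_value(value))
```

On this made-up scale each extra point on the toolbar needs roughly five times the actual PageRank of the previous one, which is the non-linear behaviour the table is meant to show.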
The PageRank shown in the Google directory (http://directory.google.com) suffers from the same problems. The PageRank shown in the directory is also on a different scale. There have been attempts to cross-reference these two scales, but because they are non-linear the results really do not tell you anything more than you already know.
Also of note is that a programmer managed to create a tool to look up PageRank without using Internet Explorer. This tool has since been withdrawn, but whilst the numbers given by this software and Google's toolbar originally matched, querying with such software now sometimes produces different numbers than querying with the toolbar. It is Google's right to protect their data, but this is the strongest indication that:
Sometimes what you see on the Google toolbar may not be related to actual PageRank at all (Google can and does assign whatever toolbar PageRank value they want to assign to a page).
How significant is PageRank?
The significance of any one factor in search engine algorithms depends on the quality of the information it supplies. A factor's importance is known as its weight. To demonstrate how weighting is arrived at, it's easiest to move away from PageRank for a second and look at Meta tags. Originally, when the Meta keyword tag was new, you could write something like this in your document:
<meta name="keywords" content="pagerank, pagerank uncovered, algorithm, algorithms">
In theory, the Meta keyword tag was a very good indicator of what the page was about. However, as most are well aware, the weighting for the keywords tag is fast approaching nothing. Two things have contributed to this:
1. The ease with which Webmasters can manipulate it
2. The level of manipulation by Webmasters
These two things are separate factors, but with human nature being what it is, the easier something is to influence, the more it is manipulated. The combination of these factors determines the "weighting", i.e. how much we trust the information provided by that factor.
So it makes sense to look at these factors in relation to PageRank first.
PageRank is without doubt one of the hardest things for a Webmaster to manipulate ethically. However, it is possible to generate links to your site from other sites fairly simply through the use of link farms and guestbooks. Google frowns upon this kind of abuse, and many sites that have tried this have had their PageRank influence blocked. But it must be said that the abuse is still rampant and that it can have an influence on PageRank. So whilst it is not easy to do, PageRank is still subject to manipulation.
The extent to which PageRank is manipulated has also changed. Most people no longer believe Google's old line that people are not able to influence PageRank and the results based on it. There is also more information about PageRank available than ever, and people are more aware of manipulation techniques.
So whilst PageRank is valuable, you should be careful not to over-estimate its usage and capabilities. Your final ranking in Google is due to a mix of factors, of which PageRank is only one. We'll get into more details later by discussing how PageRank is different from the other ranking factors, and thus when it applies and when it doesn't. Ironically enough, PageRank's weighting factor is undeniably declining. Since the original version of this paper gave out detailed information about PageRank, in all likelihood it may also have contributed in some small way to the decline in weighting of the very subject it talks about.
Is PageRank a good determination of the quality of a page?
To examine the worth of PageRank, we need to first look at its premise and how accurate it is. Basically, PageRank says:
1. If a page links to another page, it is casting a vote which indicates that the other page is good.
2. If lots of pages link to a page, then it has more votes and its worth should be higher.
The basic implication here is: people only link to pages they think are good.
It shouldn't be hard to convince you that this premise is wrong. A few of the reasons people link to pages other than ones they think are good are:
1) Reciprocal links: "Link to me and I'll link to you"
2) Link Requirements: "Using our script requires you to put a link to our page" or "We'll give you an award solely because you link to our page"
3) Friends and Family: "This is my friend Pete's site" or "My mum's site is here, my dad's site is here. My dog's site is here"
4) Free Page Add-ons: "This counter was provided by www.linktocountersite.com"
Furthermore, anybody who has a top-ranking site will tell you that it tends to get links from new sites. This is not necessarily because it's good (although top-ranking sites generally are). Assume a Webmaster is setting up a new site and is looking for some outbound links. Nowadays, one of the first things they do is a Google search for similar sites. The links they end up with may not necessarily be the best sites, but merely the easiest ones to find. If PageRank influences rankings, and if they subsequently link to those pages, the new Webmaster will be adding to the inaccuracies in the judging of the quality of a page. The same is true when these new Webmasters use the Google Toolbar PageRank indicator to choose whom to link to.
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results. For example, let's say that number is 2000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the number that actually gets shown.) Then the engine applies all the factors to those 2000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000. PageRank is almost certainly not one of those factors. Notice how we stressed the word "related" when describing the subset of 2000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
Why this is critical:
You must do enough "on the page" work and/or anchor text work to get into that subset of 2000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
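The two-stage process described above can be sketched as follows. This is a hypothetical illustration, not a description of Google's implementation: a single keyword count stands in for the "2 or 3 factors" used to build the subset, and the page records and the function name `search` are our own.

```python
def search(pages, query_terms, shortlist_size=2000, shown=1000):
    """Toy two-stage search: build an on-topic shortlist, then apply all factors to it."""
    def topical(page):
        # stand-in for the few on-topic factors used to build the subset
        words = page["text"].lower().split()
        return sum(words.count(term) for term in query_terms)

    matches = [page for page in pages if topical(page) > 0]
    # Stage 1: keep the most on-topic documents; PageRank plays no part here
    shortlist = sorted(matches, key=topical, reverse=True)[:shortlist_size]
    # Stage 2: apply all factors to the shortlist, with PageRank as a multiplier
    ranked = sorted(shortlist, key=lambda page: topical(page) * page["pagerank"],
                    reverse=True)
    return [page["url"] for page in ranked[:shown]]

# Hypothetical pages: big.example has a huge PageRank but is only slightly on-topic.
PAGES = [
    {"url": "big.example", "text": "news sport pagerank", "pagerank": 100.0},
    {"url": "x.example", "text": "pagerank pagerank pagerank guide", "pagerank": 2.0},
    {"url": "y.example", "text": "pagerank pagerank tips", "pagerank": 5.0},
]

result = search(PAGES, ["pagerank"], shortlist_size=2, shown=2)
print(result)
```

With a shortlist of two, big.example never reaches the second stage, so its high PageRank is never taken into account. Among the pages that do make the shortlist, PageRank then decides the order.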
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:
Factor                   How it adds to the ranking score
Title Tag                Can only be listed once.
Keywords in Body Text    Each successive repetition is less important. Proximity is important.
Anchor Text              Highly weighted but, like keywords in body text, there is a cut-off point where further anchor text is no longer worthwhile.
PageRank                 Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut-off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut-off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which the rate at which any factor adds to the score slows so much that further effort is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure of 1000 on it.
If we have a query where the results are Page A and Page B, then Page A and Page B have scores for that query which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold, thus without any change in PageRank it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive, you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores were. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant". They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care less about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important". We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
  Advantages:
    • Quick entry to the results pages
    • Self-generating links limit the amount of work needed
  Disadvantages:
    • Securing the results to make it more difficult for competitors to compete takes more work
    • Slower to react to new competitors

Person B
  Advantages:
    • Solid position. Can easily modify "on the page" factors for a quick boost if required
    • Probably higher traffic from non-search-engine sources
  Disadvantages:
    • Slower entry to results pages
    • Difficult to do well
    • Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies are using PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees, depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up at a later time.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
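The multiplication above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function name, the hard cap and the sample scores are ours, and the cap of 1000 is simply the hypothetical threshold figure used earlier. Google's real scoring is not known to us.

```python
def final_rank_score(non_pagerank_score, pagerank, threshold=1000.0):
    """Hypothetical model: capped non-PageRank score times PageRank score."""
    # Assumption: benefit from non-PageRank factors stops at the threshold.
    capped = min(non_pagerank_score, threshold)
    return capped * pagerank


# A page that piles on "on the page" factors but has little PageRank...
on_page_only = final_rank_score(non_pagerank_score=5000.0, pagerank=1.0)
# ...against a page sitting at the threshold with twice the PageRank.
balanced = final_rank_score(non_pagerank_score=1000.0, pagerank=2.0)

print(on_page_only, balanced)
```

Under this toy model the second page wins, however much more on-page work the first page does, which is the principle the Minimum PageRank level describes.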
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because, when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the overriding principles of how PageRank works, it is as accurate today as the day it was written.
PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Where:
PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a site pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.

Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it, pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
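To make the vote-splitting concrete, here is a minimal Python sketch of the d x PR(T)/C(T) term from the formula. The function name is our own, and we assume the nominal dampening factor of 0.85; the PageRank value of 5 is the made-up figure from the paragraph above.

```python
DAMPING = 0.85  # nominal dampening factor d from the published formula


def vote_per_link(pagerank, outbound_links, damping=DAMPING):
    """PageRank passed along each outbound link: d * PR(T) / C(T)."""
    return damping * pagerank / outbound_links


one_link = vote_per_link(5.0, 1)   # Page B links only to Page A
two_links = vote_per_link(5.0, 2)  # Page B links to Page A and one other page

print(one_link, two_links)
```

With two links on Page B, Page A receives exactly half of what it received when it was the only page linked to.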
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages, imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below:
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to Pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to Page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to Page C, Page C will get it all.
Page C links to Page A; all 0.1275 passes to Page A.
Page D links to Page C. Again, all 0.1275 passes to Page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
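The three rules and the two passes just described can be reproduced with a short Python sketch. The dictionary layout and function name are our own choices for illustration; the link structure and the 0.85/0.15 constants are the ones used in the walk-through.

```python
DAMPING = 0.85

# Link structure from the worked example: page -> pages it links to.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def iterate_once(ranks, links=LINKS, damping=DAMPING):
    """Run one pass of the three rules and return the new totals."""
    # Rule 3: every page starts its new total at 0.15 (that is, 1 - d).
    new_ranks = {page: 1.0 - damping for page in links}
    for page, targets in links.items():
        # Rule 1: 0.85 * the page's PageRank, divided by its number of links.
        share = damping * ranks[page] / len(targets)
        # Rule 2: add that amount to each page it is being passed to.
        for target in targets:
            new_ranks[target] += share
    return new_ranks


start = {page: 0.0 for page in LINKS}   # starting values of zero
first = iterate_once(start)             # every page becomes 0.15
second = iterate_once(first)            # the totals listed above

for page in sorted(second):
    print(page, round(first[page], 4), round(second[page], 4))
```

The second pass gives 0.2775, 0.1925, 0.4475 and 0.1925 for Pages A to D, matching the totals worked out by hand.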
So we've got:
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach final values of:
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration

Iteration   A              B              C              D
Start       1              1              1              1
1           1.0000000000   0.5750000000   2.2750000000   0.1500000000
2           2.0837500000   0.5750000000   1.1912500000   0.1500000000
3           1.1625625000   1.0355937500   1.6518437500   0.1500000000
...
19          1.4900124031   0.7833688246   1.5766187723   0.1500000000
20          1.4901259564   0.7832552713   1.5766187723   0.1500000000
21          1.4901259564   0.7833035315   1.5765705121   0.1500000000
...
46          1.4901074052   0.7832956473   1.5765969475   0.1500000000
47          1.4901074054   0.7832956472   1.5765969474   0.1500000000
48          1.4901074053   0.7832956473   1.5765969474   0.1500000000
49          1.4901074053   0.7832956473   1.5765969474   0.1500000000
50          1.4901074053   0.7832956473   1.5765969474   0.1500000000
51          1.4901074053   0.7832956473   1.5765969474   0.1500000000
52          1.4901074053   0.7832956473   1.5765969474   0.1500000000
53          1.4901074053   0.7832956473   1.5765969474   0.1500000000
54          1.4901074053   0.7832956473   1.5765969474   0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above, we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, |PR48(A) - PR50(A)| < 10^-10).
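The stopping rule can be sketched in Python as follows. One loud assumption: the link structure below (A links to B and C; B, C and D each have a single link) is inferred by us from the first rows of the table, and the function names are ours. The sketch stops when no page changes by 10^-10 or more, and it is run from two different starting points to show that both arrive at the same limiting values.

```python
DAMPING = 0.85
TOLERANCE = 1e-10  # stop once no page changes by this much or more

# ASSUMED structure, inferred from the table values.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def iterate_once(ranks, links=LINKS, damping=DAMPING):
    """One pass of the PageRank calculation over every page."""
    new_ranks = {page: 1.0 - damping for page in links}
    for page, targets in links.items():
        for target in targets:
            new_ranks[target] += damping * ranks[page] / len(targets)
    return new_ranks


def run_until_converged(ranks, tolerance=TOLERANCE, max_iterations=1000):
    """Repeat the calculation until the largest change is below tolerance."""
    for iteration in range(1, max_iterations + 1):
        new_ranks = iterate_once(ranks)
        change = max(abs(new_ranks[p] - ranks[p]) for p in ranks)
        ranks = new_ranks
        if change < tolerance:
            return ranks, iteration
    raise RuntimeError("did not converge within max_iterations")


from_ones, count_ones = run_until_converged({p: 1.0 for p in LINKS})
from_zeros, count_zeros = run_until_converged({p: 0.0 for p in LINKS})

for page in sorted(LINKS):
    print(page, round(from_ones[page], 8), round(from_zeros[page], 8))
```

The exact number of passes depends on the starting values and on how the stopping test is phrased, so the counts returned here need not match the spreadsheet's; the limiting values are what agree.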
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how it is sometimes good to link to outside pages. Consider that we have a single page, Page A. Its PageRank, by definition, will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
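The three figures quoted for Page A (0.15, then 1, then 1.4594594595) can be checked with a small Python sketch. The `pagerank` helper is our own illustrative function, not anything Google publishes; it simply repeats the published formula with d = 0.85 until the values stop changing.

```python
DAMPING = 0.85


def pagerank(links, damping=DAMPING, tolerance=1e-12, max_iterations=10_000):
    """Iterate the published formula until no page changes by tolerance."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iterations):
        new_ranks = {page: 1.0 - damping for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += damping * ranks[page] / len(targets)
        if max(abs(new_ranks[p] - ranks[p]) for p in links) < tolerance:
            return new_ranks
        ranks = new_ranks
    raise RuntimeError("did not converge within max_iterations")


alone = pagerank({"A": []})                                    # no links at all
with_b = pagerank({"A": ["B"], "B": ["A"]})                    # A <-> B
with_b_and_c = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})

print(round(alone["A"], 10))
print(round(with_b["A"], 10))
print(round(with_b_and_c["A"], 10))
```

Each new page that links back adds its own base 0.15 to the small system, and the reciprocal links route most of it to Page A.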
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will, of course, vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations; would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a>
<a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect:
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                Without the Review Pages   With the Review Pages
Home page PR    0.9536152797               2.439718935
B, C, D PR      0.4201909959               0.8412536982
Total           2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure:
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
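A short Python sketch illustrates the closed-system behaviour. The three four-page structures below are our own assumptions, chosen only to match the names of the structures (a home page with three child pages that link back, a simple ring, and every page linking to every other page); the paper's actual diagrams may differ. The `pagerank` helper is likewise ours.

```python
DAMPING = 0.85


def pagerank(links, damping=DAMPING, tolerance=1e-12, max_iterations=10_000):
    """Iterate the published formula until no page changes by tolerance."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iterations):
        new_ranks = {page: 1.0 - damping for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += damping * ranks[page] / len(targets)
        if max(abs(new_ranks[p] - ranks[p]) for p in links) < tolerance:
            return new_ranks
        ranks = new_ranks
    raise RuntimeError("did not converge within max_iterations")


PAGES = ["Home", "B", "C", "D"]

# ASSUMED structures for illustration only.
STRUCTURES = {
    "hierarchical": {"Home": ["B", "C", "D"],
                     "B": ["Home"], "C": ["Home"], "D": ["Home"]},
    "looping": {"Home": ["B"], "B": ["C"], "C": ["D"], "D": ["Home"]},
    "extensive": {p: [q for q in PAGES if q != p] for p in PAGES},
}

results = {name: pagerank(links) for name, links in STRUCTURES.items()}

for name, ranks in results.items():
    total = round(sum(ranks.values()), 6)
    print(name, total, {p: round(r, 4) for p, r in ranks.items()})
```

Under these assumed structures every total comes out at 4, but only the hierarchical layout concentrates PageRank on one page (the home page) at the expense of the others.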
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing but letrsquos takea look what happens when we add an inbound and outbound link
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
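The parenthetical remark above can be put into numbers. The sketch below assumes that a page splits the PageRank it passes on equally between all of its links, as in the calculations earlier in the document. The function name `outbound_share` and the link counts are purely illustrative.

```python
def outbound_share(internal_links, outbound_links=1):
    """Fraction of the PageRank a page passes on that follows links leaving the site."""
    return outbound_links / (internal_links + outbound_links)


# One outbound link on a page that also carries 3, 10 or 100 internal links.
for internal in (3, 10, 100):
    print(internal, "internal links:", round(outbound_share(internal), 4), "leaves the site")
```

The more internal links sit alongside the outbound link, the smaller the share that leaks out of the site.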
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula

\[ (1-d)\,PR_i(A) \]

That is, we are replacing the (1-d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3      2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4      1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5      2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6      1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7      2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8      1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9      2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10     1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11     2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12     1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13     2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14     1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15     1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16     1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17     1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18     1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19     1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20     1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21     1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22     1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23     1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24     1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25     1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26     1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27     1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28     1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29     1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30     1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31     1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32     1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33     1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34     1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35     1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36     1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37     1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38     1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39     1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40     1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41     1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42     1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43     1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44     1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45     1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46     1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47     1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48     1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49     1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50     1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51     1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52     1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53     1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54     1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55     1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56     1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57     1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58     1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59     1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60     1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61     1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62     1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63     1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64     1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65     1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66     1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67     1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68     1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69     1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70     1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71     1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72     1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73     1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74     1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75     1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76     1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77     1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78     1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79     1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80     1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81     1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82     1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83     1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84     1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85     1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86     1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87     1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88     1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89     1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90     1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91     1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92     1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93     1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94     1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95     1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96     1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97     1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98     1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99     1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100    1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101    1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102    1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103    1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104    1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105    1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106    1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107    1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108    1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109    1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110    1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111    1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112    1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113    1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114    1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115    1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116    1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117    1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118    1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119    1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120    1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121    1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122    1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123    1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124    1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125    1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126    1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127    1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128    1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129    1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130    1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131    1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132    1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133    1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134    1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135    1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136    1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
This gives a very good approximation of the values with only 67 iterations (the accelerated values settle at 2.0 and 0.6667, close to the 1.9189 and 0.6937 of the standard calculation). This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
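The two calculations can be compared side by side with a short Python sketch. It assumes the same hierarchical layout as the tables (A links to B, C and D, each of which links back to A). The stopping rule, which ends the loop once no value changes by more than 1e-10, is an assumption of this sketch, so the iteration counts it prints need not match the 148 and 67 above exactly.

```python
def iterate(links, accelerated, d=0.85, tol=1e-10, max_iter=1000):
    """Return (final values, iterations used) for the classic or accelerated scheme."""
    pr = {page: 1.0 for page in links}
    for step in range(1, max_iter + 1):
        new_pr = {}
        for page in links:
            incoming = sum(pr[q] / len(links[q]) for q in links if page in links[q])
            # Classic: fixed (1 - d). Accelerated: (1 - d) * PR_i(page).
            base = (1 - d) * pr[page] if accelerated else (1 - d)
            new_pr[page] = base + d * incoming
        change = max(abs(new_pr[page] - pr[page]) for page in links)
        pr = new_pr
        if change < tol:
            return pr, step
    return pr, max_iter


hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic, classic_steps = iterate(hierarchical, accelerated=False)
fast, fast_steps = iterate(hierarchical, accelerated=True)

print("classic    ", classic_steps, {p: round(v, 4) for p, v in classic.items()})
print("accelerated", fast_steps, {p: round(v, 4) for p, v in fast.items()})
```

Both runs keep the total at 4, and the accelerated run settles in well under half the iterations.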
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text, and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

\[ (0.4n + 0.6) \times 0.15 \]

as long as we set the other pages to

\[ 0.6 \times 0.15 \]

i.e. \( (0.4n + 0.6) + 0.6(n-1) = n \), where n is the number of pages in the site, so the co-efficients still total \( 0.15n \).

This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
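A small Python sketch shows the effect of uneven co-efficients. The site layout is an assumption: the example site's diagram is not reproduced here, so the sketch uses a four-page hierarchy with home page A linking to B, C and D, each of which links back to A. The function name `pagerank_with_coefficients` is illustrative; the co-efficients are the ones given in the text, (0.4n + 0.6) × 0.15 for Page B and 0.6 × 0.15 for the rest.

```python
def pagerank_with_coefficients(links, coefficients, d=0.85, iterations=200):
    """PageRank where each page has its own normalization co-efficient."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


# Assumed layout: A is the home page, B stands in for the "About Us" page.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(site)

uniform = {page: 0.15 for page in site}
favoured = {page: 0.6 * 0.15 for page in site}
favoured["B"] = (0.4 * n + 0.6) * 0.15  # Page B meets our arbitrary criteria

plain = pagerank_with_coefficients(site, uniform)
boosted = pagerank_with_coefficients(site, favoured)

print("co-efficient total:", round(sum(favoured.values()), 4))
print("uniform :", {p: round(v, 4) for p, v in plain.items()})
print("favoured:", {p: round(v, 4) for p, v in boosted.items()})
```

Under these assumptions Page B rises while every other page falls, and the site total is unchanged.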
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
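The same idea can be tried in Python with the co-efficients from the text (0.06 for Page B, 0.18 for the rest). As before, the layout is an assumption made for illustration: a four-page hierarchy in which A links to B, C and D and each of those links back to A, so that Page B's only link points to A. The function name `pagerank_with_coefficients` is hypothetical.

```python
def pagerank_with_coefficients(links, coefficients, d=0.85, iterations=200):
    """PageRank where each page has its own normalization co-efficient."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


# Assumed layout; Page B plays the role of the list-of-links page.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

uniform = {page: 0.15 for page in site}
directory = {page: 0.18 for page in site}
directory["B"] = 0.06  # push the list page down

plain = pagerank_with_coefficients(site, uniform)
demoted = pagerank_with_coefficients(site, directory)

print("uniform  :", {p: round(v, 4) for p, v in plain.items()})
print("directory:", {p: round(v, 4) for p, v in demoted.items()})
```

With these assumed links, Page B drops and the page it links to gains, while the site total stays at 4.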
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: D = (a_km), (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T_1, T_2, …, T_n correspond to Web site pages (n is the number of pages).
Components a_km, which are on the intersection of the k-th row and m-th column in the matrix of links, determine the number of links from T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1    T2    ...   Tn    Number of links off the page
T1    a11   a12   ...   a1n   Cr(T1)
T2    a21   a22   ...   a2n   Cr(T2)
Tn    an1   an2   ...   ann   Cr(Tn)
Let's write down the equation for the computation scheme:

\[ PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)} \]

where
PR_{I+1}(T_m) is the PageRank of the page T_m on the (I+1)-th iteration (the estimating value of the page T_m, m = 1…n);
PR_I(T_k) is the PageRank of the page T_k on the I-th iteration (the estimating value of the page T_k, k = 1…n);
Cr(T_k) is the total number of links off the page T_k;
Cm(T_k) = a_km is the number of individual links off the page T_k to the page T_m (k = 1…n), (m = 1…n);
d is a dampening factor (usually 0.85);
(1-d) is a normalization coefficient;
n is the number of pages on the site.
This recursive computation scheme lets us compute the estimating values of a Web site's pages while taking into account the possible complexity of the links between pages.
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, with pages A, B, C, D forming subgraph 1 and pages E, F, G, H, I forming subgraph 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
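The matrix can be fed straight into the computation scheme given earlier. The Python sketch below is illustrative only (the names `A_KM`, `CR` and `step` are not from the original): it derives Cr(T_k) as the row sums of the matrix and performs one iteration from a starting value of 1 for every page.

```python
PAGES = "ABCDEFGHI"

# A_KM[k][m] is the number of links from page k to page m (rows and columns A..I).
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T_k): total number of links off each page, i.e. the row sums.
CR = [sum(row) for row in A_KM]


def step(pr, d=0.85):
    """One iteration of PR(T_m) = (1 - d) + d * sum_k a_km * PR(T_k) / Cr(T_k)."""
    n = len(pr)
    return [
        (1 - d) + d * sum(A_KM[k][m] * pr[k] / CR[k] for k in range(n))
        for m in range(n)
    ]


first = step([1.0] * len(PAGES))
print(dict(zip(PAGES, CR)))
print({page: round(value, 6) for page, value in zip(PAGES, first)})
```

The row sums reproduce the Cr column of the diagram, and the first iteration gives the values in the first row of the PageRank tables for this site.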
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the PR_{i+1}(T) computation (on the (i+1)-th iteration) can be calculated by the formula (1-d) PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
           1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
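The definition can be applied mechanically. The following Python sketch repeats the matrix of links for this site and runs the accelerated scheme for 100 iterations. The names `LINKS`, `OUT` and `accelerated_step`, and the threshold of 1e-6 used to decide that a limiting value is "zero", are assumptions of this sketch rather than part of the original method.

```python
PAGES = "ABCDEFGHI"

# LINKS[k][m] is the number of links from page k to page m (rows and columns A..I).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
OUT = [sum(row) for row in LINKS]  # Cr(T_k) for each page


def accelerated_step(pr, d=0.85):
    """One accelerated iteration: the coefficient (1 - d) is scaled by PR_i(T)."""
    n = len(pr)
    return [
        (1 - d) * pr[m] + d * sum(LINKS[k][m] * pr[k] / OUT[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(100):
    pr = accelerated_step(pr)

limits = dict(zip(PAGES, pr))
external = [page for page in PAGES if limits[page] < 1e-6]
base = [page for page in PAGES if page not in external]

print("base pages    :", {page: round(limits[page], 6) for page in base})
print("external pages:", external)
```

Pages E to I drain to zero while A, C and D settle at 1.5 and B at 4.5, with the total remaining 9.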
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened.
Having divided the estimating values by their total amount, which is unchanged by the iteration process, we receive the set of probability values (the sum of all probability values is equal to one):
    A        B        C        D        E        F        G        H        I        Amount
P   0.154451 0.427246 0.154451 0.154451 0.019489 0.023237 0.026827 0.019924 0.019924 1.000
Running a parallel demonstration, we'd like to use the famous classical Shannon's formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome: A1, A2, ..., An
Probabilities: p1, p2, ..., pn
$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$
Let's calculate the individual terms of this formula for the set of probability values which correspond to the site pages (logarithms are to be calculated to base 2). We receive the following set of PageRank values for the pages of the site examined:
                  A        B        C        D        E        F        G        H        I        Amount
PR                0.416211 0.524171 0.416211 0.416211 0.110722 0.126119 0.140040 0.112558 0.112558 2.375
The Toolbar Shows 1        1        1        1        1        1        1        1        1        1
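The two tables above can be reproduced with a few lines of Python. This is only a sketch: the dictionary name `limits` and the variable names are our own, and the input figures are the limiting values from iteration 85 of the classic computation.

```python
import math

# Limiting estimating values from the last row of the classic computation.
limits = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441,
    "H": 0.179318, "I": 0.179318,
}

# Dividing by the total turns the values into probabilities that sum to one.
total = sum(limits.values())
probabilities = {page: value / total for page, value in limits.items()}

# Each page's term of Shannon's formula: -p * log2(p).
terms = {page: -p * math.log2(p) for page, p in probabilities.items()}

# The sum of the terms is the entropy H(p1, ..., pn).
entropy = sum(terms.values())

for page in limits:
    print(f"{page}: P = {probabilities[page]:.6f}, PR = {terms[page]:.6f}")
print(f"Amount: {entropy:.3f}")
```

The per-page terms match the PR row of the table, and their sum matches the Amount column.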
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$ (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
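To get a feel for the operator Z+, here is a minimal sketch that applies it repeatedly. The function name `reinforce`, the starting value of 0 and the choice of 100 repetitions are our own assumptions for illustration; only the formula itself comes from the text.

```python
def reinforce(p, d=0.85):
    """Apply the operator Z+ once: P(n+1) = d * P(n) + (1 - d)."""
    return d * p + (1 - d)

# Start from an arbitrary estimating value and apply the operator repeatedly.
p = 0.0
for _ in range(100):
    p = reinforce(p)

# The value approaches the fixed point p = 1, where d * p + (1 - d) = p.
print(round(p, 6))
```

Whatever the starting value, the distance to the fixed point shrinks by a factor of d at each step, which is the same kind of behaviour the PageRank iteration relies on.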
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
guesstimate is worthless for our purposes because it isn't featured in any of the PageRank calculations. The only way to tell if the toolbar is a guesstimate is to type the URL into the Google search box and see if the page shows up in the SERPs. If it doesn't, then the toolbar is guessing.
2. The toolbar is just a representation of actual PageRank. Whilst PageRank is linear, Google has chosen to use a non-linear graph to portray it. So on the toolbar, to move from a PageRank of 2 to a PageRank of 3 takes less of an increase than to move from a PageRank of 3 to a PageRank of 4. A comparison table best illustrates this phenomenon. The actual figures are kept secret, so we'll just use any figures for demonstration purposes:
If the actual PageRank is between    The Toolbar Shows
0.00000001 and 5                     1
6 and 25                             2
25 and 125                           3
126 and 625                          4
626 and 3125                         5
3126 and 15625                       6
15626 and 78125                      7
78126 and 390625                     8
390626 and 1953125                   9
1953126 and infinity                 10
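The table can be expressed as a small lookup. This is a sketch built on the made-up demonstration figures above, not on Google's real (secret) scale. The names `UPPER_BOUNDS` and `toolbar_value` are hypothetical, and treating each range's upper figure as an inclusive bound is our own assumption.

```python
# Upper end of each range in the demonstration table (illustrative figures only).
UPPER_BOUNDS = [5, 25, 125, 625, 3125, 15625, 78125, 390625, 1953125]

def toolbar_value(actual_pagerank):
    """Map a hypothetical actual PageRank onto the 1 to 10 toolbar scale."""
    for shown, upper in enumerate(UPPER_BOUNDS, start=1):
        if actual_pagerank <= upper:
            return shown
    return 10  # anything above the last bound shows as 10

# Each toolbar step covers a range five times wider than the one before it.
for value in (3, 20, 100, 500, 2_000_000):
    print(value, "->", toolbar_value(value))
```

Because each band is five times wider than the previous one, the same absolute gain in actual PageRank moves the toolbar less and less as a page climbs the scale.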
The PageRank shown in the Google directory (http://directory.google.com) suffers from the same problems. The PageRank shown in the directory is also on a different scale. There have been attempts to cross-reference these two scales, but because they are non-linear, the results really do not tell you anything more than you already know.
Also of note is that a programmer managed to generate a tool to look up PageRank without using Internet Explorer. This tool has since been withdrawn. Whilst the numbers given by this software originally matched those of Google's toolbar, querying with such software now sometimes produces different numbers than querying with the toolbar. It is Google's right to protect their data, but this is the strongest indication that:
Sometimes what you see on the Google toolbar may not be related to actual PageRank at all. (Google can and does assign whatever toolbar PageRank value they want to assign to a page.)
How significant is PageRank?
The significance of any one factor in search engine algorithms depends on the quality of the information it supplies. A factor's importance is known as its weight. To demonstrate how weighting is arrived at, it's easiest to move away from PageRank for a second and look at Meta tags. Originally, when the Meta keyword tag was new, you could write something like this in your document:
<meta name="keywords" content="pagerank, pagerank uncovered, algorithm, algorithms">
In theory, the Meta keyword tag was a very good indicator of what the page was about. However, as most are well aware, the weighting for the keywords tag is fast approaching nothing. Two things have contributed to this:
1. The ease with which Webmasters can manipulate it
2. The level of manipulation by Webmasters
These two things are separate factors, but with human nature being what it is, the easier something is to influence, the more it is manipulated. The combination of these factors determines the "weighting", i.e. how much we trust the information provided by that factor.
So it makes sense to look at these factors in relation to PageRank first.
PageRank is without doubt one of the hardest things for a Webmaster to manipulate ethically. However, it is possible to generate links to your site from other sites fairly simply through the use of link farms and guestbooks. Google frowns upon this kind of abuse, and many sites that have tried this have had their PageRank influence blocked. But it must be said that the abuse is still rampant and that it can have an influence on PageRank. So whilst not easy to do, PageRank is still subject to manipulation.
The extent to which PageRank is manipulated has also changed. Most people no longer believe Google's old line that people are not able to influence PageRank and the results based on it. There is also more information about PageRank available than ever, and people are more aware of manipulation techniques.
So whilst PageRank is valuable, you should be careful not to over-estimate its usage and capabilities. Your final ranking in Google is due to a mix of factors, of which PageRank is only one. We'll get into more details later by discussing how PageRank is different from the other ranking factors, and thus when it applies and when it doesn't. Ironically enough, PageRank's weighting factor is undeniably declining. Since the original version of this paper gave out detailed information about PageRank, in all likelihood it may also have contributed in some small way to the decline in weighting of the very subject it talks about.
Is PageRank a good determination of the quality of a page?
To examine the worth of PageRank, we need to first look at its premise and how accurate it is. Basically, PageRank says:
1. If a page links to another page, it is casting a vote, which indicates that the other page is good.
2. If lots of pages link to a page, then it has more votes and its worth should be higher.
The basic implication here is: people only link to pages they think are good.
It shouldn't be hard to convince you that this premise is wrong. A few of the reasons people link to pages other than ones they think are good are:
1) Reciprocal links: "Link to me and I'll link to you."
2) Link Requirements: "Using our script requires you to put a link to our page" or "We'll give you an award solely because you link to our page."
3) Friends and Family: "This is my friend Pete's site" or "My mum's site is here, my dad's site is here. My dog's site is here."
4) Free Page Add-ons: "This counter was provided by www.linktocountersite.com"
Furthermore, anybody who has a top-ranking site will tell you that it tends to get links from new sites. This is not necessarily because it's good (although they generally are). Assume a Webmaster is setting up a new site and they are looking for some outbound links. Nowadays, one of the first things they do is a Google search for similar sites. The links they end up with may not necessarily be the best sites, but merely the easiest ones to find. If PageRank influences rankings, and if they subsequently link to those pages, the new Webmaster will be adding to the inaccuracies in the judging of the quality of a page. The same is true when these new Webmasters use the Google Toolbar PageRank indicator to choose whom to link to.
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results. For example, let's say that number is 2,000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2,000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the number that actually gets shown.) Then the engine applies all the factors to those 2,000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000. PageRank is almost certainly not one of those factors. Notice how before we highlighted the word "related" in creating the subset of 2,000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
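The two-stage process described above can be sketched as follows. Everything in this example is an assumption made for illustration: the function name `search`, the document fields `relevance` and `pagerank`, and the two scoring functions are hypothetical and say nothing about how Google actually implements its index.

```python
def search(documents, cheap_score, full_score, subset_size=2000, shown=1000):
    """Two-stage ranking sketch: cheap on-topic filter first, full scoring second."""
    # Stage 1: rank the whole collection using only a few cheap, on-topic factors.
    candidates = sorted(documents, key=cheap_score, reverse=True)[:subset_size]
    # Stage 2: apply every factor (including PageRank) to the candidates only.
    ranked = sorted(candidates, key=full_score, reverse=True)
    return ranked[:shown]

# Hypothetical documents: "relevance" stands in for on-the-page and anchor text factors.
documents = [
    {"name": "on_topic_low_pr", "relevance": 0.9, "pagerank": 1.0},
    {"name": "on_topic_high_pr", "relevance": 0.8, "pagerank": 5.0},
    {"name": "off_topic_huge_pr", "relevance": 0.1, "pagerank": 1000.0},
]

results = search(
    documents,
    cheap_score=lambda doc: doc["relevance"],
    full_score=lambda doc: doc["relevance"] * doc["pagerank"],
    subset_size=2,
    shown=2,
)
print([doc["name"] for doc in results])
```

In this toy run the page with the huge PageRank never reaches the second stage, because it did not score well enough on the on-topic factors to enter the subset.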
Why this is critical
You must do enough "on the page" work and/or anchor text work to get into that subset of 2,000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:
Title Tag: Can only be listed once.
Keywords in Body Text: Each successive repetition is less important. Proximity is important.
Anchor Text: Highly weighted, but like keywords in body text, there is a cut-off point where further anchor text is no longer worthwhile.
PageRank: Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut-off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut-off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which the slowdown in the rate that any factor adds to this score is so great that further effort is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure on this of 1000.
If we have a query where the results are Page A and Page B, then Page A and B have scores for that query which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold, thus without any change in PageRank it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive, you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores are. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant." They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care less about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important." We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors
Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies are using PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use them to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up at a later time.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
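The Final Rank Score formula above can be tried out with a toy model. This sketch makes a strong simplifying assumption that is ours alone: the "restricted maximum benefit" of the Non-PageRank factors is modelled as a hard cap at a made-up threshold of 1000. The function name `final_rank_score` and all figures are hypothetical.

```python
def final_rank_score(non_pagerank_score, actual_pagerank, threshold=1000.0):
    """Toy model: Non-PageRank factors stop helping beyond a hypothetical threshold."""
    effective = min(non_pagerank_score, threshold)
    return effective * actual_pagerank

# PageRank is a multiplier: with no score from other factors, the result is 0.
print(final_rank_score(0, 20_000_000_000))

# Beyond the threshold, more on-the-page work changes nothing...
print(final_rank_score(1000, 2.0), final_rank_score(5000, 2.0))

# ...whereas more PageRank always raises the score.
print(final_rank_score(1000, 3.0))
```

Under these assumptions, a page that has already reached the threshold can only climb further by raising its actual PageRank.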
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the overriding principles of how PageRank works, it is as accurate today as the day it was written.
$PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \ldots + \frac{PR(T_n)}{C(T_n)} \right)$
Where:
PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
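As a minimal sketch, the formula translates almost word for word into Python. The function name `pagerank_of` and the example figures (a single linking page T1 with a PageRank of 5) are made up for demonstration; only the formula itself is from the source paper.

```python
def pagerank_of(inbound_pages, d=0.85):
    """PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over every page T linking to A.

    inbound_pages is a list of (PR(T), C(T)) pairs: the PageRank of each
    linking page and the number of links on that page.
    """
    return (1 - d) + d * sum(pr / links for pr, links in inbound_pages)

# One page T1 with PageRank 5 and a single link, pointing at Page A.
one_link = pagerank_of([(5.0, 1)])

# The same page T1, but now with two links on it: Page A gets half the vote.
two_links = pagerank_of([(5.0, 2)])

print(round(one_link, 4), round(two_links, 4))
```

Note that this is a single application of the formula. It assumes the PageRank of the linking page is already known, which is exactly the circular dependency discussed next.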
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that however you slice it, and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages, imaginatively titled Page A, Page B, Page C and Page D. They link to each other as follows: Page A links to Pages B, C and D; Page B links to Page C; Page C links to Page A; Page D links to Page C.
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.
Page C links to Page A; all 0.1275 passes to page A.
Page D links to Page C. Again, all 0.1275 passes to page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
So we've got:
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1], and we'd reach final values of:
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
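To see this independence from the starting values in action, here is a small Python sketch. It reuses the four-page structure from the earlier worked example (A links to B, C and D; B and D link to C; C links to A); the helper names and the choice of 200 rounds are ours, picked only to be comfortably more than enough for this tiny graph.

```python
def pagerank_step(links, ranks, d=0.85):
    """One round of the calculation: base amount plus the shares passed along links."""
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def iterate(links, start_value, rounds=200, d=0.85):
    """Run a fixed number of rounds with every page starting at start_value."""
    ranks = {page: float(start_value) for page in links}
    for _ in range(rounds):
        ranks = pagerank_step(links, ranks, d)
    return ranks


links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}

from_zero = iterate(links, 0)
from_one = iterate(links, 1)
from_hundred = iterate(links, 100)

# Largest disagreement between the three runs, page by page.
largest_gap = max(
    max(abs(from_zero[p] - from_one[p]), abs(from_one[p] - from_hundred[p]))
    for p in links
)
```

Whatever the starting value, the three runs end up agreeing to many decimal places, which is exactly what convergence to the limiting values means.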
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration

Iteration  A             B             C             D
Start      1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
...
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
...
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice, there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
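The stopping rule can be sketched as follows. One caveat: the link structure below (A links to B and C; B links to C; C links to A; D links to C) is an assumption on our part, inferred from the values in the table rather than taken from the original diagram. The function names, the tolerance argument and the round limit are likewise illustrative.

```python
def pagerank_step(links, ranks, d=0.85):
    """One round of the calculation."""
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def run_until_stable(links, ranks, tol=1e-10, max_rounds=1000, d=0.85):
    """Iterate until no page changes by tol or more; return (ranks, rounds used)."""
    for rounds in range(1, max_rounds + 1):
        new = pagerank_step(links, ranks, d)
        change = max(abs(new[p] - ranks[p]) for p in links)
        ranks = new
        if change < tol:
            return ranks, rounds
    raise RuntimeError("no convergence within max_rounds")


# ASSUMED structure, inferred from the table values (the diagram is not available).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ones = {page: 1.0 for page in links}

first_row = pagerank_step(links, ones)         # compare with iteration 1 in the table
final, rounds = run_until_stable(links, ones)  # compare with the limiting values
```

With this assumed structure, the first round reproduces the table's first row (1, 0.575, 2.275, 0.15) and the loop stops near the limiting values listed above. The exact round at which it stops depends on how "no longer changing" is measured, so it need not match the table's count exactly.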
Derived specifics of how Google calculates PageRank
We started our values from zero, and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with the others. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank, by definition, will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
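These three figures are easy to check with a short Python sketch. The helper name `converge` and the fixed count of 500 rounds are our own illustrative choices; the three tiny link structures are the ones just described.

```python
def converge(links, d=0.85, rounds=500):
    """Run the PageRank calculation for a fixed number of rounds, starting from 1."""
    ranks = {page: 1.0 for page in links}
    for _ in range(rounds):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        ranks = new
    return ranks


alone = converge({"A": []})                               # Page A with no links at all
pair = converge({"A": ["B"], "B": ["A"]})                 # A and B link to each other
trio = converge({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A also swaps links with C

page_a = (alone["A"], pair["A"], trio["A"])
```

Page A's value moves from roughly 0.15, to roughly 1, to roughly 1.4594594595 as the reciprocal links are added, even though no PageRank has been created from nothing: each new page brings its own base amount into the system, and the links route it back to Page A.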
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors, multiplied by your PageRank score of zero, will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
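A quick Python sketch makes the point about shared links concrete. It is deliberately simplistic: it treats the figures 4 and 6 as raw values in the calculation (Toolbar values may well not be), and the link counts of 10 and 60 are invented purely for illustration.

```python
def value_passed_per_link(page_rank, links_on_page, d=0.85):
    """PageRank handed to each linked page in one round of the calculation."""
    return d * page_rank / links_on_page


# Hypothetical link counts, chosen only to illustrate the comparison.
from_pr4_page = value_passed_per_link(4, 10)  # lower-ranked page with few links
from_pr6_page = value_passed_per_link(6, 60)  # higher-ranked page with many links

better_source = "PR 4 page" if from_pr4_page > from_pr6_page else "PR 6 page"
```

Under these made-up numbers, the link from the lower-ranked page passes on more, simply because it is shared between fewer links.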
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible).
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a>
<a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up in its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect:
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                    Without the Review Pages   With the Review Pages
Home page PR        0.9536152797               2.439718935
B, C, D PR (each)   0.4201909959               0.8412536982
Total               2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank.
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
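The effect can be sketched in Python for three closed four-page sites. The exact link layouts below are our assumptions, since the diagrams are not reproduced here: "hierarchical" has home page A linking to B, C and D, each of which links back to A; "looping" is A to B to C to D and back to A; "interlinked" has every page linking to every other page. The helper name and round count are also illustrative.

```python
def converge(links, d=0.85, rounds=500):
    """Run the PageRank calculation for a fixed number of rounds, starting from 1."""
    ranks = {page: 1.0 for page in links}
    for _ in range(rounds):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        ranks = new
    return ranks


pages = "ABCD"
structures = {
    # ASSUMED layouts; the original diagrams are not available.
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "interlinked": {p: [q for q in pages if q != p] for p in pages},
}

results = {name: converge(links) for name, links in structures.items()}
totals = {name: sum(ranks.values()) for name, ranks in results.items()}

for name, ranks in results.items():
    print(name, {p: round(v, 4) for p, v in ranks.items()}, round(totals[name], 4))
```

Under these assumed layouts, each site's total stays at 4, but only the hierarchical layout concentrates PageRank on one page (the home page) at the expense of the others.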
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1-d) \cdot PR_i(A)$
That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of Page A deduced by the previous iteration.
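One possible reading of this substitution can be sketched in Python. We stress that this is our interpretation, not a confirmed description of what Google does: the base amount (1-d) is simply scaled by each page's previous value. The four-page hierarchical layout (A links to B, C and D, each of which links back to A), the function names and the tolerance are all assumptions made for the sake of the example.

```python
def standard_step(links, ranks, d=0.85):
    """Fixed co-efficient: every page receives (1 - d) plus the shares from its links."""
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def varied_step(links, ranks, d=0.85):
    """Varied co-efficient (our reading): (1 - d) is scaled by the page's previous PR."""
    new = {page: (1 - d) * ranks[page] for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def count_rounds(links, step, tol=1e-10, max_rounds=10_000):
    """Iterate from a start of 1 until no page changes by tol or more."""
    ranks = {page: 1.0 for page in links}
    for rounds in range(1, max_rounds + 1):
        new = step(links, ranks)
        if max(abs(new[p] - ranks[p]) for p in links) < tol:
            return new, rounds
        ranks = new
    raise RuntimeError("no convergence within max_rounds")


# ASSUMED hierarchical layout: home page A and three sub-pages linking back to it.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

standard_ranks, standard_rounds = count_rounds(links, standard_step)
varied_ranks, varied_rounds = count_rounds(links, varied_step)
```

On this small structure the varied co-efficient settles in noticeably fewer rounds, while the site's total PageRank stays at 4. The limiting values it reaches are close to, but not identical with, those of the fixed co-efficient calculation, which fits the idea of trading some accuracy for speed.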
Let's do this by example. Assume we have the hierarchical structure mentioned earlier:
With this very simple structure, we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
38
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration A B C D Sum
Start 1.0000000000 1.0000000000 1.0000000000 1.0000000000 4.0000000000
1 2.7000000000 0.4333333333 0.4333333333 0.4333333333 4.0000000000
2 1.5100000000 0.8300000000 0.8300000000 0.8300000000 4.0000000000
3 2.3430000000 0.5523333333 0.5523333333 0.5523333333 4.0000000000
4 1.7599000000 0.7467000000 0.7467000000 0.7467000000 4.0000000000
5 2.1680700000 0.6106433333 0.6106433333 0.6106433333 4.0000000000
6 1.8823510000 0.7058830000 0.7058830000 0.7058830000 4.0000000000
7 2.0823543000 0.6392152333 0.6392152333 0.6392152333 4.0000000000
8 1.9423519900 0.6858826700 0.6858826700 0.6858826700 4.0000000000
9 2.0403536070 0.6532154643 0.6532154643 0.6532154643 4.0000000000
10 1.9717524751 0.6760825083 0.6760825083 0.6760825083 4.0000000000
11 2.0197732674 0.6600755775 0.6600755775 0.6600755775 4.0000000000
12 1.9861587128 0.6712804291 0.6712804291 0.6712804291 4.0000000000
13 2.0096889010 0.6634370330 0.6634370330 0.6634370330 4.0000000000
14 1.9932177693 0.6689274102 0.6689274102 0.6689274102 4.0000000000
15 2.0047475615 0.6650841462 0.6650841462 0.6650841462 4.0000000000
16 1.9966767069 0.6677744310 0.6677744310 0.6677744310 4.0000000000
17 2.0023263051 0.6658912316 0.6658912316 0.6658912316 4.0000000000
18 1.9983715864 0.6672094712 0.6672094712 0.6672094712 4.0000000000
19 2.0011398895 0.6662867035 0.6662867035 0.6662867035 4.0000000000
20 1.9992020773 0.6669326409 0.6669326409 0.6669326409 4.0000000000
21 2.0005585459 0.6664804847 0.6664804847 0.6664804847 4.0000000000
22 1.9996090179 0.6667969940 0.6667969940 0.6667969940 4.0000000000
23 2.0002736875 0.6665754375 0.6665754375 0.6665754375 4.0000000000
24 1.9998084188 0.6667305271 0.6667305271 0.6667305271 4.0000000000
25 2.0001341069 0.6666219644 0.6666219644 0.6666219644 4.0000000000
26 1.9999061252 0.6666979583 0.6666979583 0.6666979583 4.0000000000
27 2.0000657124 0.6666447625 0.6666447625 0.6666447625 4.0000000000
28 1.9999540013 0.6666819996 0.6666819996 0.6666819996 4.0000000000
29 2.0000321991 0.6666559336 0.6666559336 0.6666559336 4.0000000000
30 1.9999774607 0.6666741798 0.6666741798 0.6666741798 4.0000000000
31 2.0000157775 0.6666614075 0.6666614075 0.6666614075 4.0000000000
32 1.9999889557 0.6666703481 0.6666703481 0.6666703481 4.0000000000
33 2.0000077310 0.6666640897 0.6666640897 0.6666640897 4.0000000000
34 1.9999945883 0.6666684706 0.6666684706 0.6666684706 4.0000000000
35 2.0000037882 0.6666654039 0.6666654039 0.6666654039 4.0000000000
36 1.9999973483 0.6666675506 0.6666675506 0.6666675506 4.0000000000
37 2.0000018562 0.6666660479 0.6666660479 0.6666660479 4.0000000000
38 1.9999987007 0.6666670998 0.6666670998 0.6666670998 4.0000000000
39 2.0000009095 0.6666663635 0.6666663635 0.6666663635 4.0000000000
40 1.9999993633 0.6666668789 0.6666668789 0.6666668789 4.0000000000
41 2.0000004457 0.6666665181 0.6666665181 0.6666665181 4.0000000000
42 1.9999996880 0.6666667707 0.6666667707 0.6666667707 4.0000000000
43 2.0000002184 0.6666665939 0.6666665939 0.6666665939 4.0000000000
44 1.9999998471 0.6666667176 0.6666667176 0.6666667176 4.0000000000
45 2.0000001070 0.6666666310 0.6666666310 0.6666666310 4.0000000000
46 1.9999999251 0.6666666916 0.6666666916 0.6666666916 4.0000000000
47 2.0000000524 0.6666666492 0.6666666492 0.6666666492 4.0000000000
48 1.9999999633 0.6666666789 0.6666666789 0.6666666789 4.0000000000
49 2.0000000257 0.6666666581 0.6666666581 0.6666666581 4.0000000000
50 1.9999999820 0.6666666727 0.6666666727 0.6666666727 4.0000000000
51 2.0000000126 0.6666666625 0.6666666625 0.6666666625 4.0000000000
52 1.9999999912 0.6666666696 0.6666666696 0.6666666696 4.0000000000
53 2.0000000062 0.6666666646 0.6666666646 0.6666666646 4.0000000000
54 1.9999999957 0.6666666681 0.6666666681 0.6666666681 4.0000000000
55 2.0000000030 0.6666666657 0.6666666657 0.6666666657 4.0000000000
56 1.9999999979 0.6666666674 0.6666666674 0.6666666674 4.0000000000
57 2.0000000015 0.6666666662 0.6666666662 0.6666666662 4.0000000000
58 1.9999999990 0.6666666670 0.6666666670 0.6666666670 4.0000000000
59 2.0000000007 0.6666666664 0.6666666664 0.6666666664 4.0000000000
60 1.9999999995 0.6666666668 0.6666666668 0.6666666668 4.0000000000
61 2.0000000004 0.6666666665 0.6666666665 0.6666666665 4.0000000000
62 1.9999999998 0.6666666667 0.6666666667 0.6666666667 4.0000000000
63 2.0000000002 0.6666666666 0.6666666666 0.6666666666 4.0000000000
64 1.9999999999 0.6666666667 0.6666666667 0.6666666667 4.0000000000
65 2.0000000001 0.6666666666 0.6666666666 0.6666666666 4.0000000000
66 1.9999999999 0.6666666667 0.6666666667 0.6666666667 4.0000000000
67 2.0000000000 0.6666666667 0.6666666667 0.6666666667 4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
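The two schemes can be compared side by side in a few lines of Python. This is only a sketch under stated assumptions: the link structure (A links to B, C and D, and each of them links back to A) is inferred from the numbers in the tables rather than given in the text, and the stopping rule (no value changes by more than 1e-10) is our own choice.

```python
# Assumed link structure (inferred from the tables, not stated in the text):
# A links to B, C and D; B, C and D each link back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def iterate(links, accelerated, tol=1e-10, max_iter=1000):
    """Iterate until no value changes by more than tol; return (values, iterations)."""
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iter + 1):
        new = {}
        for page in links:
            # Share of PageRank passed on by every page that links here.
            incoming = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            # Classic: constant (1 - d). Accelerated: (1 - d) times the previous value.
            norm = (1 - D) * pr[page] if accelerated else (1 - D)
            new[page] = norm + D * incoming
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:
            return pr, iteration
    return pr, max_iter


classic, classic_iters = iterate(LINKS, accelerated=False)
fast, fast_iters = iterate(LINKS, accelerated=True)
print(f"classic:     A={classic['A']:.6f} B={classic['B']:.6f} after {classic_iters} iterations")
print(f"accelerated: A={fast['A']:.6f} B={fast['B']:.6f} after {fast_iters} iterations")
```

The exact iteration counts depend on the stopping rule, so they need not match the 148 and 67 quoted above; what the sketch does reproduce is that the accelerated scheme settles in fewer iterations and keeps the sum at 4.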
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

(0.4n + 0.6) × 0.15

as long as we set the other pages to

0.6 × 0.15

i.e. the multipliers still add up to n [0.4n + 0.6 + 0.6(n - 1) = n, where n is the number of pages in the site].
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:

Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".

PageRank not only considers links, but it may also consider certain other factors.
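To make the redistribution concrete, here is a small sketch that runs the classic iteration with one normalization co-efficient per page. It rests on assumptions: the 4-page structure (A links to B, C and D; each links back to A) is a stand-in for the "previous example" site, and the helper name `pagerank` is ours. Page B gets (0.4n + 0.6) × 0.15 and every other page 0.6 × 0.15.

```python
# Assumed 4-page structure: A links to B, C, D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def pagerank(links, coefficients, iterations=500):
    """Classic iteration, but with a separate normalization co-efficient per page."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + D * sum(pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}            # 0.15 for every page
favoured = {page: 0.6 * (1 - D) for page in LINKS}   # 0.09 for A, C and D
favoured["B"] = (0.4 * n + 0.6) * (1 - D)            # 0.33 for Page B

before = pagerank(LINKS, uniform)
after = pagerank(LINKS, favoured)
for page in LINKS:
    print(page, round(before[page], 4), "->", round(after[page], 4))
```

Because the co-efficients still add up to n × 0.15, the total PageRank of the site stays at 4; only its distribution shifts towards Page B.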
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.

We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.

This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
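The same idea can be checked numerically. The sketch below assumes, purely for illustration, a 4-page site in which A links to B, C and D and each of them links back to A; the real structure behind Excel Example 20 may differ. Page B receives the co-efficient 0.06 and the other pages 0.18, as described above.

```python
# Assumed 4-page structure: A links to B, C, D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def pagerank(links, coefficients, iterations=500):
    """Classic iteration with a per-page normalization co-efficient."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + D * sum(pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
    return pr


normal = pagerank(LINKS, {page: 0.15 for page in LINKS})
# Page B (the list of links) is set to 0.06, every other page to 0.18.
demoted = pagerank(LINKS, {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for page in LINKS:
    print(page, round(normal[page], 4), "->", round(demoted[page], 4))
```

The co-efficients again total 0.6, so the site as a whole keeps the same PageRank while Page B gives some of its share to the other pages.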
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).

You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.

Let's write down the elements of the matrix of links D = {a_km} (k = 1, …, n; m = 1, …, n).

What does that mean? Let us explain the notation:

T1, T2, …, Tn correspond to the Web site's pages (n is the number of pages).

The components a_km, which lie at the intersection of row k and column m of the matrix of links, give the number of links off Tk to Tm.

Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …    Tn     Number of links off the page
T1    a11    a12    …    a1n    Cr(T1)
T2    a21    a22    …    a2n    Cr(T2)
Tn    an1    an2    …    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

where:

$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration (I+1), i.e. the estimating value of the page $T_m$ (m = 1, …, n)

$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration I, i.e. the estimating value of the page $T_k$ (k = 1, …, n)

$Cr(T_k)$ is the total number of links off the page $T_k$

$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1, …, n; m = 1, …, n)

d is a dampening factor (usually 0.85)

(1 - d) is a normalization coefficient

n is the number of pages on the site
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the possibly complex structure of links between its pages.
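As an illustration, one iteration of this scheme can be written directly from the matrix of links. The sketch below is ours, not the authors' spreadsheet: the function name `pagerank_step` and the 3-page site are hypothetical, and pages with no outgoing links are simply skipped.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical 3-page site: T1 links to T2 and T3, T2 links to T3 twice,
# T3 links to T1. Row k, column m holds the number of links from T_k to T_m.
a = [
    [0, 1, 1],
    [0, 0, 2],
    [1, 0, 0],
]
pr = [1.0, 1.0, 1.0]
pr = pagerank_step(a, pr)
print(pr)
```

Because T2 has two links and both point at T3, the entry a[1][2] = 2 passes all of T2's share to T3, which is exactly what the multigraph interpretation is meant to capture.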
Take a look at the following example. In general terms, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.

MULTIGRAPH OF A SIMPLE WEB SITE

[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]

You may have already noticed that every page of the second subgraph links to every page of the first subgraph.

Let's construct a matrix of links for this site. The labels A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.

Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.

Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of PR_{i+1}(T) (on iteration i + 1) can be calculated by the formula (1 - d) × PR_i(T).
When the accelerated technology of the computation of the estimating values of the site pages is applied, the limiting values of the estimating values of the pages that are external in relation to the base pages are equal to zero.

This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:

The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
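This definition can be tried out on the matrix of links (2). The following is a minimal sketch rather than the authors' spreadsheet: the accelerated step uses (1 - d) × PR_i(T) as the normalization coefficient, and the cut-off `THRESHOLD` that stands in for "zero limiting value" is an assumption of ours.

```python
PAGES = "ABCDEFGHI"
# Matrix of links (2): row k, column m is the number of links from page k to page m.
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85  # dampening factor


def accelerated_step(pr):
    """PR_{i+1}(T_m) = (1 - d) * PR_i(T_m) + d * sum_k a_km * PR_i(T_k) / Cr(T_k)."""
    cr = [sum(row) for row in A_KM]  # links off each page
    return [
        (1 - D) * pr[m]
        + D * sum(A_KM[k][m] * pr[k] / cr[k] for k in range(len(pr)))
        for m in range(len(pr))
    ]


pr = [1.0] * len(PAGES)
for _ in range(200):
    pr = accelerated_step(pr)

THRESHOLD = 1e-6  # hypothetical cut-off for a "zero" limiting value
external = [page for page, value in zip(PAGES, pr) if value < THRESHOLD]
base = [page for page, value in zip(PAGES, pr) if value >= THRESHOLD]
print("base pages:    ", base)
print("external pages:", external)
```

Under these assumptions the sum of the values stays at 9 throughout, and all of it ends up on the base pages.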
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
Start 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 9.000
1 1.206429 3.473095 1.206429 1.206429 0.291667 0.441429 0.631667 0.271429 0.271429 9.000
2 1.450932 3.883281 1.450932 1.450932 0.082202 0.192500 0.254388 0.117417 0.117417 9.000
3 1.432087 4.396552 1.432087 1.432087 0.028964 0.073739 0.107478 0.048502 0.048502 9.000
4 1.506130 4.356932 1.506130 1.506130 0.011216 0.029036 0.043774 0.020326 0.020326 9.000
5 1.478877 4.512665 1.478877 1.478877 0.004562 0.011577 0.017837 0.008364 0.008364 9.000
6 1.507936 4.455552 1.507936 1.507936 0.001869 0.004678 0.007251 0.003421 0.003421 9.000
7 1.491656 4.516630 1.491656 1.491656 0.000765 0.001900 0.002949 0.001394 0.001394 9.000
8 1.504706 4.482464 1.504706 1.504706 0.000312 0.000773 0.001200 0.000567 0.000567 9.000
9 1.496244 4.509876 1.496244 1.496244 0.000127 0.000315 0.000488 0.000231 0.000231 9.000
10 1.502441 4.492110 1.502441 1.502441 0.000052 0.000128 0.000199 0.000094 0.000094 9.000
11 1.498215 4.505125 1.498215 1.498215 0.000021 0.000052 0.000081 0.000038 0.000038 9.000
12 1.501219 4.496251 1.501219 1.501219 0.000009 0.000021 0.000033 0.000016 0.000016 9.000
13 1.499134 4.502559 1.499134 1.499134 0.000003 0.000009 0.000013 0.000006 0.000006 9.000
14 1.500601 4.498182 1.500601 1.500601 0.000001 0.000004 0.000005 0.000003 0.000003 9.000
15 1.499577 4.501262 1.499577 1.499577 0.000001 0.000001 0.000002 0.000001 0.000001 9.000
16 1.500295 4.499112 1.500295 1.500295 0.000000 0.000001 0.000001 0.000000 0.000000 9.000
17 1.499793 4.500620 1.499793 1.499793 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
18 1.500145 4.499566 1.500145 1.500145 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
19 1.499899 4.500304 1.499899 1.499899 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
20 1.500071 4.499787 1.500071 1.500071 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
21 1.499950 4.500149 1.499950 1.499950 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
22 1.500035 4.499896 1.500035 1.500035 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
23 1.499976 4.500073 1.499976 1.499976 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
24 1.500017 4.499949 1.500017 1.500017 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
25 1.499988 4.500036 1.499988 1.499988 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
26 1.500008 4.499975 1.500008 1.500008 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
27 1.499994 4.500018 1.499994 1.499994 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
28 1.500004 4.499988 1.500004 1.500004 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
29 1.499997 4.500009 1.499997 1.499997 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
30 1.500002 4.499994 1.500002 1.500002 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
31 1.499999 4.500004 1.499999 1.499999 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
32 1.500001 4.499997 1.500001 1.500001 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
33 1.499999 4.500002 1.499999 1.499999 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
34 1.500000 4.499999 1.500000 1.500000 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
35 1.500000 4.500001 1.500000 1.500000 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
36 1.500000 4.499999 1.500000 1.500000 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
37 1.500000 4.500000 1.500000 1.500000 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
38 1.500000 4.500000 1.500000 1.500000 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
39 1.500000 4.500000 1.500000 1.500000 0.000000 0.000000 0.000000 0.000000 0.000000 9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.

Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
Start 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 9.000
1 1.206429 3.473095 1.206429 1.206429 0.291667 0.441429 0.631667 0.271429 0.271429 9.000
2 1.419967 3.512317 1.419967 1.419967 0.188452 0.276286 0.309638 0.226702 0.226702 9.000
3 1.332416 3.958177 1.332416 1.332416 0.182116 0.219636 0.267624 0.187599 0.187599 9.000
4 1.430747 3.706925 1.430747 1.430747 0.176577 0.213457 0.245806 0.182497 0.182497 9.000
5 1.353327 3.951436 1.353327 1.353327 0.175854 0.209866 0.243166 0.179848 0.179848 9.000
6 1.420726 3.752137 1.420726 1.420726 0.175478 0.209422 0.241730 0.179527 0.179527 9.000
7 1.363844 3.923590 1.363844 1.363844 0.175433 0.209184 0.241554 0.179353 0.179353 9.000
8 1.412299 3.778418 1.412299 1.412299 0.175408 0.209155 0.241460 0.179332 0.179332 9.000
9 1.371139 3.901949 1.371139 1.371139 0.175405 0.209140 0.241448 0.179320 0.179320 9.000
10 1.406132 3.796985 1.406132 1.406132 0.175404 0.209138 0.241442 0.179319 0.179319 9.000
11 1.376390 3.886213 1.376390 1.376390 0.175403 0.209137 0.241441 0.179318 0.179318 9.000
12 1.401671 3.810371 1.401671 1.401671 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
13 1.380182 3.874838 1.380182 1.380182 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
14 1.398448 3.820041 1.398448 1.398448 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
15 1.382922 3.866618 1.382922 1.382922 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
16 1.396119 3.827028 1.396119 1.396119 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
17 1.384901 3.860680 1.384901 1.384901 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
18 1.394436 3.832076 1.394436 1.394436 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
19 1.386332 3.856389 1.386332 1.386332 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
20 1.393220 3.835723 1.393220 1.393220 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
21 1.387365 3.853289 1.387365 1.387365 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
22 1.392342 3.838358 1.392342 1.392342 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
23 1.388112 3.851049 1.388112 1.388112 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
24 1.391708 3.840261 1.391708 1.391708 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
25 1.388651 3.849431 1.388651 1.388651 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
26 1.391249 3.841637 1.391249 1.391249 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
27 1.389041 3.848262 1.389041 1.389041 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
28 1.390918 3.842631 1.390918 1.390918 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
29 1.389322 3.847417 1.389322 1.389322 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
30 1.390678 3.843349 1.390678 1.390678 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
31 1.389526 3.846807 1.389526 1.389526 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
32 1.390506 3.843867 1.390506 1.390506 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
33 1.389673 3.846366 1.389673 1.389673 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
34 1.390381 3.844242 1.390381 1.390381 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
35 1.389779 3.846048 1.389779 1.389779 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
36 1.390290 3.844513 1.390290 1.390290 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
37 1.389856 3.845817 1.389856 1.389856 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
38 1.390225 3.844709 1.390225 1.390225 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
39 1.389911 3.845651 1.389911 1.389911 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
40 1.390178 3.844850 1.390178 1.390178 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
41 1.389951 3.845531 1.389951 1.389951 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
42 1.390144 3.844952 1.390144 1.390144 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
43 1.389980 3.845444 1.389980 1.389980 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
44 1.390119 3.845026 1.390119 1.390119 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
45 1.390001 3.845381 1.390001 1.390001 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
46 1.390102 3.845079 1.390102 1.390102 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
47 1.390016 3.845336 1.390016 1.390016 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
48 1.390089 3.845118 1.390089 1.390089 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
49 1.390027 3.845303 1.390027 1.390027 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
50 1.390080 3.845146 1.390080 1.390080 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
51 1.390035 3.845280 1.390035 1.390035 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
52 1.390073 3.845166 1.390073 1.390073 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
53 1.390041 3.845263 1.390041 1.390041 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
54 1.390068 3.845180 1.390068 1.390068 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
55 1.390045 3.845250 1.390045 1.390045 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
56 1.390064 3.845191 1.390064 1.390064 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
57 1.390048 3.845241 1.390048 1.390048 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
58 1.390062 3.845198 1.390062 1.390062 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
59 1.390050 3.845235 1.390050 1.390050 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
60 1.390060 3.845204 1.390060 1.390060 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
61 1.390051 3.845230 1.390051 1.390051 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
62 1.390059 3.845208 1.390059 1.390059 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
63 1.390052 3.845227 1.390052 1.390052 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
64 1.390058 3.845211 1.390058 1.390058 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
65 1.390053 3.845224 1.390053 1.390053 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
66 1.390057 3.845213 1.390057 1.390057 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
67 1.390054 3.845223 1.390054 1.390054 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
68 1.390057 3.845214 1.390057 1.390057 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
69 1.390054 3.845221 1.390054 1.390054 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
70 1.390056 3.845215 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
71 1.390055 3.845220 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
72 1.390056 3.845216 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
73 1.390055 3.845220 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
74 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
75 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
76 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
77 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
78 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
79 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
80 1.390056 3.845218 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
81 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
82 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
83 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
84 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
85 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
The limiting estimating values for the site pages are darkened in the original table (they are the values of the final iterations).

Dividing the estimating values by their total, which does not change during the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one):

   A        B        C        D        E        F        G        H        I        Amount
P  0.154451 0.427246 0.154451 0.154451 0.019489 0.023237 0.026827 0.019924 0.019924 1.000

As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:

Experiment outcome   A1  A2  …  An
Probabilities        p1  p2  …  pn

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values that correspond to the site pages (logarithms are taken to base 2). We obtain the following set of PageRank values for the pages of the site examined:

                   A        B        C        D        E        F        G        H        I        Amount
PR                 0.416211 0.524171 0.416211 0.416211 0.110722 0.126119 0.140040 0.112558 0.112558 2.375
The Toolbar Shows  1        1        1        1        1        1        1        1        1        1
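The two steps just described, normalising the limiting values into probabilities and then evaluating each term of Shannon's formula, can be reproduced as follows. This is a sketch that takes the limiting values from the classic computation table as its input; the variable names are ours.

```python
import math

PAGES = "ABCDEFGHI"
# Limiting estimating values from the classic computation (last rows of the table).
LIMITS = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(LIMITS)
probabilities = [value / total for value in LIMITS]

# One Shannon term per page: -p * log2(p); their sum is H(p1, ..., pn).
terms = [-p * math.log2(p) for p in probabilities]
entropy = sum(terms)

for page, p, term in zip(PAGES, probabilities, terms):
    print(f"{page}  P={p:.6f}  PR={term:.6f}")
print(f"Amount: {entropy:.3f}")
```

Each term is the contribution of one page to H, which is why the "Amount" column for the PR row is the value of Shannon's formula for the whole site.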
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.

Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

where P is the estimating value.

The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
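The behaviour of the operator Z+ is easy to see numerically. In the sketch below the starting estimate 0.2 and the choice d = 0.85 are hypothetical values picked for illustration; the function name `z_plus` is ours.

```python
def z_plus(p, d=0.85):
    """Reinforcement operator from the text: Z+(P) = d * P + (1 - d)."""
    return d * p + (1 - d)


p = 0.2  # hypothetical starting estimate
history = [p]
for _ in range(50):
    p = z_plus(p)
    history.append(p)

print([round(value, 4) for value in history[:5]], "...", round(history[-1], 4))
```

Each application shrinks the gap between P and 1 by the factor d, so repeated encouragement drives the estimate towards 1. The form d × P + (1 - d) is the same combination of a dampened term and a (1 - d) term that appears in the PageRank equation.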
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:

PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a

Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm

The anatomy of a Large-Scale HyperTextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html

The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org

"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."

Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Sometimes what you see on the Google toolbar may not be related to actual PageRank at all (Google can and does assign whatever toolbar PageRank value they want to assign to a page).
How significant is PageRank
The significance of any one factor in search engine algorithms depends on thequality of the information it supplies A factorrsquos importance is known as its weightTo demonstrate how weighting is arrived at its easiest to move away fromPageRank for a second and look at Meta tags Originally when the Metakeyword tag was new you could write something like this in your document
<meta name="keywords" content="pagerank, pagerank uncovered, algorithm, algorithms">
In theory, the Meta keyword tag was a very good indicator of what the page was about. However, as most are well aware, the weighting for the keywords tag is fast approaching nothing. Two things have contributed to this:
1. The ease with which Webmasters can manipulate it
2. The level of manipulation by Webmasters
These two things are separate factors, but with human nature being what it is, the easier something is to influence, the more it is manipulated. The combination of these factors determines the "weighting", i.e. how much we trust the information provided by that factor.
So it makes sense to look at these factors in relation to PageRank first.
PageRank is without doubt one of the hardest things for a Webmaster to manipulate ethically. However, it is possible to generate links to your site from other sites fairly simply through the use of link farms and guestbooks. Google frowns upon this kind of abuse, and many sites that have tried this have had their PageRank influence blocked. But it must be said that the abuse is still rampant and that it can have an influence on PageRank. So whilst it is not easy to do, PageRank is still subject to manipulation.
The extent to which PageRank is manipulated has also changed. Most people no longer believe Google's old line that people cannot influence PageRank and the results based on it. There is also more information about PageRank available than ever, and people are more aware of manipulation techniques.
So whilst PageRank is valuable, you should be careful not to over-estimate its usage and capabilities. Your final ranking in Google is due to a mix of factors, of which PageRank is only one. We'll get into more details later by discussing how PageRank is different from the other ranking factors, and thus when it applies and when it doesn't. Ironically enough, PageRank's weighting factor is undeniably declining. Since the original version of this paper gave out detailed information about PageRank, in all likelihood it may also have contributed in some small way to the decline in weighting of the very subject it talks about.
Is PageRank a good determination of the quality of a page?
To examine the worth of PageRank, we need to first look at its premise and how accurate it is. Basically, PageRank says:
1. If a page links to another page, it is casting a vote which indicates that the other page is good.
2. If lots of pages link to a page, then it has more votes and its worth should be higher.
The basic implication here is: people only link to pages they think are good.
It shouldn't be hard to convince you that this premise is wrong. A few of the reasons people link to pages other than ones they think are good are:
1) Reciprocal links: "Link to me and I'll link to you."
2) Link Requirements: "Using our script requires you to put a link to our page" or "We'll give you an award solely because you link to our page."
3) Friends and Family: "This is my friend Pete's site" or "My mum's site is here, my dad's site is here. My dog's site is here."
4) Free Page Add-ons: "This counter was provided by www.linktocountersite.com."
Furthermore, anybody who has a top-ranking site will tell you that it tends to get links from new sites. This is not necessarily because it's good (although such sites generally are). Assume a Webmaster is setting up a new site and is looking for some outbound links. Nowadays, one of the first things they do is a Google search for similar sites. The links they end up with may not necessarily be the best sites, but merely the easiest ones to find. If PageRank influences rankings, and if they subsequently link to those pages, the new Webmaster will be adding to the inaccuracies in the judging of the quality of a page. The same is true when these new Webmasters use the Google Toolbar PageRank indicator to choose whom to link to.
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results. For example, let's say that number is 2000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the result count that actually gets shown.) Then the engine applies all the factors to those 2000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000. PageRank is almost certainly not one of those factors. Notice how, before, we highlighted the word "related" in creating the subset of 2000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
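The two-stage process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the scoring functions, the example documents and the subset sizes are all hypothetical, and nothing here is claimed to be how Google actually implements its search.

```python
import heapq

def two_stage_search(documents, cheap_score, full_score, subset_size=2000, shown=1000):
    """Select candidates with a cheap relevance score, then fully rank only those."""
    # Stage 1: keep the documents that score highest on a few cheap, on-topic factors.
    candidates = heapq.nlargest(subset_size, documents, key=cheap_score)
    # Stage 2: apply every ranking factor (PageRank included) to the candidates only.
    ranked = sorted(candidates, key=full_score, reverse=True)
    return ranked[:shown]

# Hypothetical documents: (name, on-topic score, PageRank).
docs = [
    ("on_topic_low_pr", 0.9, 1.0),
    ("off_topic_high_pr", 0.0, 9.0),
    ("on_topic_high_pr", 0.8, 5.0),
]

def on_topic(doc):
    return doc[1]

def all_factors(doc):
    # PageRank acts as a multiplier on the other factors.
    return doc[1] * doc[2]

results = two_stage_search(docs, on_topic, all_factors, subset_size=2, shown=2)
print([name for name, _, _ in results])
```

In this sketch the off-topic page never reaches the second stage, however high its PageRank, which is the point the paragraph above makes.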
Why this is critical
You must do enough on-the-page work and/or anchor text work to get into that subset of 2000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:
Title Tag: Can only be listed once.
Keywords in Body Text: Each successive repetition is less important. Proximity is important.
Anchor Text: Highly weighted, but like keywords in body text there is a cut-off point where further anchor text is no longer worthwhile.
PageRank: Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut-off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut-off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which any factor adds so slowly to the total that further effort is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure of 1000 on it.
If we have a query where the results are Page A and Page B, then Pages A and B have scores for that query, which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold, thus without any change in PageRank it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive, you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores were. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people. Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank" is unimportant. They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care less about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank" is important. We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors
Person B
Advantages:
• Solid position. Can easily modify on-the-page factors for a quick boost if required
• Probably higher traffic from non-search-engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies use PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees, depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
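The multiplier relationship and the capped Non-PageRank side can be illustrated with a small Python sketch. The cap value and the example scores are hypothetical (the paper itself says the threshold has no known numerical value), and the function name is an assumption made for this illustration.

```python
# Hypothetical ceiling on what the Non-PageRank factors can contribute.
NON_PAGERANK_CAP = 10.0

def final_rank_score(non_pagerank_score, pagerank):
    """Final Rank Score = (Non-PageRank factors, capped) x (actual PageRank)."""
    capped = min(non_pagerank_score, NON_PAGERANK_CAP)
    return capped * pagerank

# Page A sits at the cap with a higher PageRank.
page_a = final_rank_score(10.0, 2.0)
# Page B over-optimises on the page, but the cap means only PageRank can help further.
page_b = final_rank_score(50.0, 1.5)
# A page with a PageRank of zero scores zero whatever else it does.
zero_pr = final_rank_score(10.0, 0.0)

print(page_a, page_b, zero_pr)
```

Under these made-up numbers Page B cannot overtake Page A by working on the page alone, which is the situation the "Minimum PageRank level" describes.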
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the over-riding principles of how PageRank works, it is as accurate today as the day it was written.
PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Where:
PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us, however you slice it and however the formula may have changed since it was written, is this:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link, or two, or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevance to any particular page). Say Page B has a PageRank value of 5 and has a single link on it, pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
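Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages, imaginatively titled Page A, Page B, Page C and Page D. They link to each other as follows (the original diagram is not reproduced here; these are the links used in the calculations that follow): Page A links to Pages B, C and D; Page B links to Page C; Page C links to Page A; and Page D links to Page C.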
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around, in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to Pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to Page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to Page C, Page C will get it all.
Page C links to Page A; all 0.1275 passes to Page A.
Page D links to Page C. Again, all 0.1275 passes to Page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
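The three rules, applied to the four-page example, can be checked with a short Python sketch. The function and variable names are our own for illustration; the link structure and the 0.85 and 0.15 figures come from the worked example above.

```python
DAMPING = 0.85

# Link structure of the worked example: which pages each page links to.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}

def one_round(ranks, links=LINKS, d=DAMPING):
    """Apply the three rules once and return the new PageRank totals."""
    new_totals = {page: 1 - d for page in links}   # rule 3: the 0.15 base
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)     # rule 1: 0.85 * PR / number of links
        for target in targets:
            new_totals[target] += share            # rule 2: add to each linked page
    return new_totals

first = one_round({page: 0.0 for page in LINKS})   # every page becomes 0.15
second = one_round(first)                          # the totals listed above
for page in sorted(second):
    print(page, round(second[page], 4))
```

Every page's new total is built only from the previous round's values, which is how the hand calculation above proceeds.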
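So we've got the following values (the original diagram is not reproduced here): Page A 0.2775, Page B 0.1925, Page C 0.4475 and Page D 0.1925.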
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1], and we'd reach the final values shown in the original diagram (not reproduced here).
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) that has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
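Repeating the calculation until the values stop changing can be sketched as follows. This is a minimal sketch rather than a reconstruction of the authors' spreadsheet: the stopping tolerance of 1e-10 and the function name are assumptions, so the number of rounds it reports need not match the 143 quoted above.

```python
def pagerank(links, start=0.0, d=0.85, tolerance=1e-10, max_rounds=10_000):
    """Repeat the calculation until no page changes by `tolerance` or more."""
    ranks = {page: start for page in links}
    for rounds in range(1, max_rounds + 1):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tolerance:
            return ranks, rounds
    raise RuntimeError("PageRank did not converge")

# Link structure of the four-page example.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}

from_zero, rounds_from_zero = pagerank(LINKS, start=0.0)
from_one, rounds_from_one = pagerank(LINKS, start=1.0)

for page in sorted(from_zero):
    print(page, f"{from_zero[page]:.6f}", f"{from_one[page]:.6f}")
```

Starting from zero and starting from one settle on the same values, with Page C highest and Pages B and D equal, consistent with the observations above.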
A word on convergence
Convergence is an important mathematical aspect of PageRank, which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C and D at various stages of iteration
Iteration  A             B             C             D
(start)    1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
...
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
...
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change; they have converged at the limiting values.
In practice, there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, |PR_48(A) - PR_50(A)| < 10^-10).
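The table can be reproduced with a short Python sketch. One loud assumption: the diagram of the structure is not available here, so the link structure below (A links to B and C, B to C, C to A, D to C) is inferred from the first rows of the table rather than taken from the original figure. The function name is also our own.

```python
def iterate(links, start, rounds, d=0.85):
    """Return the PageRank values after each round, beginning with the start values."""
    ranks = {page: start for page in links}
    history = [ranks]
    for _ in range(rounds):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        ranks = new_ranks
        history.append(ranks)
    return history

# ASSUMED structure, inferred from the table: D has no inbound links.
ASSUMED_LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

history = iterate(ASSUMED_LINKS, start=1.0, rounds=54)
for n in (1, 2, 3, 48, 54):
    print(n, " ".join(f"{history[n][p]:.10f}" for p in "ABCD"))

# The stopping rule from the text: stop once no value changes by 1e-10 or more.
settled = max(abs(history[54][p] - history[53][p]) for p in "ABCD") < 1e-10
```

With this assumed structure the early rows and the limiting values agree with the table to the precision shown.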
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource-intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into independent chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run their iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how it is sometimes good to link to outside pages. Consider a single page, Page A, with no links in or out. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view this very small system, "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
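The three figures quoted for Page A can be checked with a small Python sketch. The function name and the stopping tolerance are assumptions made for this illustration; the three structures are the ones described in the text (Page A alone, Page A exchanging links with Page B, and Page A exchanging links with both Page B and Page C).

```python
def pagerank(links, d=0.85, tolerance=1e-12, max_rounds=10_000):
    """Iterate from zero until no page changes by `tolerance` or more."""
    ranks = {page: 0.0 for page in links}
    for _ in range(max_rounds):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tolerance:
            return ranks
    raise RuntimeError("PageRank did not converge")

alone = pagerank({"A": []})                                         # no links at all
one_partner = pagerank({"A": ["B"], "B": ["A"]})                    # A <-> B
two_partners = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B, A <-> C

for label, result in (("alone", alone), ("one partner", one_partner),
                      ("two partners", two_partners)):
    print(label, f"{result['A']:.10f}")
```

Each added link exchange raises Page A's value, while the extra PageRank comes from the pages that joined the small system.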
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank Feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations. Would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high-PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high-PageRank sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower-PageRank sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
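To make the sharing concrete, here is a minimal sketch. It assumes the simplified formula from the earlier calculations, in which a page passes d × PR / C along each of its C links, with d = 0.85. The function name `share_per_link` and the example numbers are illustrative assumptions, and the inputs are real PageRank values rather than Toolbar values, whose exact scale is not known.

```python
def share_per_link(page_rank: float, links_on_page: int, d: float = 0.85) -> float:
    """PageRank passed along one link: d * PR(page) / number of links on the page."""
    if links_on_page <= 0:
        raise ValueError("a linking page must carry at least one link")
    return d * page_rank / links_on_page


# Hypothetical pages: the lower-ranked page carries far fewer links.
crowded = share_per_link(page_rank=6.0, links_on_page=100)
sparse = share_per_link(page_rank=4.0, links_on_page=10)
print(f"crowded page passes {crowded:.4f} per link, sparse page passes {sparse:.4f}")
```

Under these assumed numbers the link from the lower-ranked but less crowded page is worth more, which is the point made above.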
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high-quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
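One way to read this rule, assuming the simplified per-link sharing used in the earlier calculations (this is an illustrative expression, not a formula published by Google): a page P with I links to pages on your own site and E links to external sites passes out of the site roughly

$$\text{leak}(P) = d \cdot PR(P) \cdot \frac{E}{I + E}$$

This shrinks when PR(P) is low (condition a) and when I is large (condition b). For example, with the assumed values d = 0.85, PR(P) = 0.5, I = 9 and E = 1, the page leaks 0.85 × 0.5 × 1/10 = 0.0425.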
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up in its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values: [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                     Without the Review Pages    With the Review Pages
Home Page PR         0.9536152797                2.439718935
B, C, D PR (each)    0.4201909959                0.8412536982
Total                2.2141882674                4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure:
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank is spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
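The closed-system totals above can be checked with a short sketch. It assumes four-page versions of the three structures (a home page A linking to and from B, C and D; a ring A to B to C to D and back to A; and every page linking to every other page), a damping factor d = 0.85, and the simplified formula PR = (1 - d) + d × (sum of shares from linking pages). The function name `pagerank` and the dictionary layout are illustrative assumptions.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q linking to p."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new = {p: 1.0 - d for p in pages}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        pr = new
    return pr


# Assumed four-page versions of the three structures (closed systems).
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}
results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    ranks = ", ".join(f"{page}={value:.4f}" for page, value in pr.items())
    print(f"{name:12s} {ranks}  total={sum(pr.values()):.4f}")
```

In this sketch every structure holds a total of 4, but only the hierarchical one concentrates PageRank on page A; the looping and fully interlinked versions leave every page at 1.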
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link:
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
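The ordering above can be reproduced with a small sketch. The exact placement of the inbound and outbound links is not visible in the text, so the placement here is an assumption: an external page X, which has no inbound links of its own, links to page A, and page C carries one extra link to an external page Y. It was chosen because it gives site totals that agree with the three figures quoted above. The names `pagerank` and `open_system` are illustrative.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q linking to p."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new = {p: 1.0 - d for p in pages}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        pr = new
    return pr


SITE = "ABCD"


def open_system(internal_links):
    """Assumed placement: inbound link X -> A, outbound link C -> Y."""
    links = {page: list(targets) for page, targets in internal_links.items()}
    links["X"] = ["A"]               # external page with no inbound links
    links["C"] = links["C"] + ["Y"]  # Y is external and links nowhere
    return links


structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {p: [q for q in SITE if q != p] for p in SITE},
}
totals = {}
for name, internal in structures.items():
    pr = pagerank(open_system(internal))
    totals[name] = sum(pr[page] for page in SITE)  # only the site's own pages
    print(f"{name:12s} total PageRank in site = {totals[name]:.10f}")
```

Under this assumed placement the extensively interlinked site keeps the most PageRank, followed by the hierarchical and then the looping site.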
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the value from the previous computation, using the formula:

$$(1 - d) \cdot PR_i(A)$$

That is, we are replacing the $(1 - d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
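As a rough check on the two tables above, here is a minimal sketch that runs both schemes on the four-page hierarchical structure (A linking to and from B, C and D) with d = 0.85. The function name `iterate`, the `rescale` flag and the stopping tolerance are illustrative assumptions, so the iteration counts it reports depend on that stopping rule and need not equal 148 and 67. Note also that the two schemes settle on slightly different values: A tends to about 1.9189 with the fixed co-efficient and to 2.0 with the recalculated one, in line with the tables.

```python
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def iterate(links, d=0.85, rescale=False, tol=1e-10, max_iter=1000):
    """Return (ranks, iterations used).

    With rescale=True the constant term (1 - d) is replaced by
    (1 - d) * PR_i(page), i.e. it is recalculated at every iteration.
    """
    pr = {p: 1.0 for p in links}
    for iteration in range(1, max_iter + 1):
        new = {p: (1.0 - d) * (pr[p] if rescale else 1.0) for p in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:  # assumed stopping rule: largest change below tol
            return pr, iteration
    return pr, max_iter


standard, n_standard = iterate(LINKS)
fast, n_fast = iterate(LINKS, rescale=True)
print(f"fixed co-efficient:        A={standard['A']:.10f} after {n_standard} iterations")
print(f"recalculated co-efficient: A={fast['A']:.10f} after {n_fast} iterations")
```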
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to:
$$(0.4n + 0.6) \times 0.15$$

as long as we set the other pages to

$$0.6 \times 0.15$$

i.e. $(0.4n + 0.6) + 0.6(n - 1) = n$, where $n$ is the number of pages in the site.
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
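The effect can be sketched as follows, assuming the four-page hierarchical structure, d = 0.85, and the co-efficients read as (0.4n + 0.6) × 0.15 for Page B and 0.6 × 0.15 for every other page. The function name `pagerank_with_coefficients` is an illustrative assumption, and the figures it prints are not taken from the Excel example.

```python
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank_with_coefficients(links, coefficients, d=0.85, iterations=200):
    """Iterate PR(p) = c(p) + d * sum(PR(q) / C(q)) with a per-page constant c(p)."""
    pr = {p: 1.0 for p in links}
    for _ in range(iterations):
        new = dict(coefficients)  # start each page from its own co-efficient
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        pr = new
    return pr


n = len(LINKS)
uniform = {p: 0.15 for p in LINKS}
favour_b = {p: 0.6 * 0.15 for p in LINKS}
favour_b["B"] = (0.4 * n + 0.6) * 0.15  # co-efficients still sum to n * 0.15

base = pagerank_with_coefficients(LINKS, uniform)
boosted = pagerank_with_coefficients(LINKS, favour_b)
for page in LINKS:
    print(f"{page}: uniform={base[page]:.4f}  favouring B={boosted[page]:.4f}")
```

In this sketch the site total stays at 4 while Page B rises and pages A, C and D fall, which is the behaviour described above.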
PageRank not only considers links, but it may also consider certain other factors.
There is of course no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
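As a quick check on the co-efficients used in this directory example, we can confirm that the total is still 0.15 per page. The figures 0.06 and 0.18 only balance for a site of four pages, so we assume here that the example site has n = 4 pages; that is an inference from the numbers, not something stated above.

```latex
\begin{align*}
0.06 + (n - 1) \times 0.18 &= 0.15\,n \\
0.18\,n - 0.12 &= 0.15\,n \\
n &= 4 \\[4pt]
\text{check: } 0.06 + 3 \times 0.18 &= 0.60 = 4 \times 0.15
\end{align*}
```

In other words, Page B gives up 0.09 of its usual 0.15, and each of the three other pages gains 0.03.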
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $D = (a_{km})$, $k = 1, \dots, n$, $m = 1, \dots, n$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages (n is the number of pages).
The components $a_{km}$, which are at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off $T_k$ to $T_m$.
$C_r(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …     Tn     Number of links off the page
T1    a11    a12    …     a1n    Cr(T1)
T2    a21    a22    …     a2n    Cr(T2)
…     …      …      …     …      …
Tn    an1    an2    …     ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{C_r(T_k)}$$

where:
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, m = 1…n);
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, k = 1…n);
$C_r(T_k)$ is the total number of links off the page $T_k$;
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1…n, m = 1…n);
d is a dampening factor (usually 0.85);
(1-d) is a normalization coefficient;
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values for the pages of a Web site while taking into account the potentially complex structure of links between its pages.
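One iteration of this scheme is short enough to write out directly. The sketch below is our own illustration, not code from the spreadsheets: the function name `pagerank_step` and the three-page matrix are hypothetical, chosen so that one page links twice to the same target, which is the situation a multigraph allows.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_new(T_m) = (1 - d) + d * sum_k a[k][m] * PR(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page T_k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]

# Hypothetical three-page multigraph: T1 links twice to T2 and once to T3,
# while T2 and T3 each link once to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

pr = [1.0, 1.0, 1.0]          # starting estimating values
pr = pagerank_step(a, pr)     # values after the first iteration
print([round(value, 6) for value in pr])
```

Because T1 has three links in total, the double link passes two thirds of its share to T2 and the single link passes one third to T3.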
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, drawn as two subgraphs. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I    Number of links off the page
A                 0  1  0  0  0  0  0  0  0    Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0    Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0    Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0    Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0    Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0    Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1    Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0    Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0    Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on the (i+1)-th iteration) can be calculated by the formula $(1-d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
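The accelerated scheme can be sketched in Python as follows. The link matrix is transcribed from the matrix of links (2); the names `LINKS`, `PAGES` and `accelerated_step`, the choice of 200 iterations and the cut-off of 1e-6 for "zero" are our own assumptions for this illustration.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): row = page the links come off, column = page they point to.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def accelerated_step(pr, d=0.85):
    """One iteration where the normalization coefficient of page T is (1 - d) * PR_i(T)."""
    n = len(pr)
    cr = [sum(row) for row in LINKS]  # number of links off each page
    return [
        (1 - d) * pr[m] + d * sum(LINKS[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

pr = [1.0] * len(PAGES)
for _ in range(200):
    pr = accelerated_step(pr)

# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page, value in zip(PAGES, pr) if value < 1e-6]
print({page: round(value, 6) for page, value in zip(PAGES, pr)})
print("external pages:", external)
```

Under these assumptions the values settle at 1.5 for A, C and D and 4.5 for B, while E to I fall to zero, which matches the table of the accelerated computation.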
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
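For comparison, the classic computation with a constant normalization coefficient can be sketched the same way. Again, the matrix is transcribed from the matrix of links (2); the names `LINKS`, `PAGES` and `classic_step` and the choice of 300 iterations are assumptions made for this illustration only.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): row = page the links come off, column = page they point to.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def classic_step(pr, d=0.85):
    """One iteration with the constant normalization coefficient (1 - d)."""
    n = len(pr)
    cr = [sum(row) for row in LINKS]  # number of links off each page
    return [
        (1 - d) + d * sum(LINKS[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

start = [1.0] * len(PAGES)
after_one = classic_step(start)       # compare with the row for iteration 1

limit = start
for _ in range(300):
    limit = classic_step(limit)       # compare with the final rows of the table

print({page: round(value, 6) for page, value in zip(PAGES, after_one)})
print({page: round(value, 6) for page, value in zip(PAGES, limit)})
```

Unlike the accelerated scheme, the external pages E to I keep small non-zero values here, because every page receives the constant 0.15 on each iteration.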
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened in the original table (they are the values in its final rows).
Dividing the estimating values by their total amount, which is unchanged by the iteration process, we obtain the set of probability values (the sum of all probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:

Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn

$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$

Let's calculate the individual terms of this formula for the set of probability values which correspond to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):

                     A         B         C         D         E         F         G         H         I         Amount
PR                   0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows    1         1         1         1         1         1         1         1         1         1
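The two tables above can be checked with a short script. This is a sketch only: the limiting values are copied from the final row of the classic computation, and the variable names are our own. Each "PR" figure is read here as a single term -p log2 p of Shannon's formula, which is our interpretation of how the row was produced; the "Amount" is then the sum of those terms.

```python
import math

# Limiting estimating values from the classic computation (final row of the table).
limiting = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055, "E": 0.175403,
    "F": 0.209136, "G": 0.241441, "H": 0.179318, "I": 0.179318,
}

# Divide by the total amount to get probability values that sum to one.
total = sum(limiting.values())
probabilities = {page: value / total for page, value in limiting.items()}

# One Shannon term per page, with logarithms taken to base 2.
terms = {page: -p * math.log2(p) for page, p in probabilities.items()}
entropy = sum(terms.values())

for page in limiting:
    print(page, round(probabilities[page], 6), round(terms[page], 6))
print("Amount:", round(entropy, 3))
```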
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
(P is the estimating value.)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
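To see what repeated encouragement does, the recurrence can be solved in closed form. The derivation below is our own worked step, using only the operator as defined above and an arbitrary starting value $P_0$.

```latex
\begin{align*}
P_{n+1} - 1 &= d\,P_n + (1 - d) - 1 = d\,(P_n - 1) \\
P_n - 1 &= d^{\,n}\,(P_0 - 1) \\
P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1 \quad \text{as } n \to \infty, \text{ since } 0 < d < 1
\end{align*}
```

So each application of the operator closes a fixed fraction (1 - d) of the remaining gap between the estimating value and 1, whatever the starting value was.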
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
which PageRank is only one. We'll get into more details later by discussing how PageRank is different from the other ranking factors, and thus when it applies and when it doesn't. Ironically enough, PageRank's weighting factor is undeniably declining. Since the original version of this paper gave out detailed information about PageRank, in all likelihood it may also have contributed, in some small way, to the decline in weighting of the very subject it talks about.
Is PageRank a good determination of the quality of a page?
To examine the worth of PageRank, we need to first look at its premise and how accurate it is. Basically, PageRank says:
1. If a page links to another page, it is casting a vote which indicates that the other page is good.
2. If lots of pages link to a page, then it has more votes and its worth should be higher.
The basic implication here is: people only link to pages they think are good.
It shouldn't be hard to convince you that this premise is wrong. A few of the reasons people link to pages other than ones they think are good are:
1) Reciprocal links: "Link to me and I'll link to you"
2) Link Requirements: "Using our script requires you to put a link to our page" or "We'll give you an award solely because you link to our page"
3) Friends and Family: "This is my friend Pete's site" or "My mum's site is here, my dad's site is here. My dog's site is here"
4) Free Page Add-ons: "This counter was provided by www.linktocountersite.com"
Furthermore, anybody who has a top-ranking site will tell you that it tends to get links from new sites. This is not necessarily because it's good (although such sites generally are). Assume a Webmaster is setting up a new site and is looking for some outbound links. Nowadays, one of the first things they do is a Google search for similar sites. The links they end up with may not necessarily be the best sites, but merely the easiest ones to find. If PageRank influences rankings, and if they subsequently link to those pages, the new Webmaster will be adding to the inaccuracies in the judging of the quality of a page. The same is true when these new Webmasters use the Google Toolbar PageRank indicator to choose whom to link to.
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google Toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results. For example, let's say that number is 2000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the number that actually gets shown.) Then the engine applies all the factors to those 2000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000.
PageRank is almost certainly not one of those factors. Notice how before we highlighted the word "related" in creating the subset of 2000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
Why this is critical
You must do enough "on the page" work and/or anchor text work to get into that subset of 2000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
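The two-stage process described above can be sketched as follows. Everything here is hypothetical: we do not know how Google selects its subset, so the function `two_stage_search`, the tuple layout and the scores are invented purely to illustrate why a high PageRank page that misses the first subset never gets ranked at all.

```python
import heapq

def two_stage_search(documents, cheap_score, full_score, subset_size=2000, shown=1000):
    """Select a candidate subset with a cheap score, then fully rank only that subset."""
    candidates = heapq.nlargest(subset_size, documents, key=cheap_score)
    ranked = sorted(candidates, key=full_score, reverse=True)
    return ranked[:shown]

# Hypothetical documents: (name, on-the-page relevance, PageRank)
docs = [
    ("off-topic-giant", 0.1, 9.0),
    ("on-topic-strong", 0.9, 4.0),
    ("on-topic-weak", 0.8, 1.0),
]

results = two_stage_search(
    docs,
    cheap_score=lambda doc: doc[1],          # stage 1: relevance only, no PageRank
    full_score=lambda doc: doc[1] * doc[2],  # stage 2: all factors, PageRank as a multiplier
    subset_size=2,                           # tiny stand-ins for 2000 and 1000
    shown=2,
)
print([doc[0] for doc in results])
```

In this toy data the off-topic page would have outscored the weaker on-topic page once PageRank was multiplied in, but it is never considered because it did not make the first subset.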
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:

Factor                   How it adds to the ranking score
Title Tag                Can only be listed once.
Keywords in Body Text    Each successive repetition is less important. Proximity is important.
Anchor Text              Highly weighted, but like keywords in body text there is a cut off point where further anchor text is no longer worthwhile.
PageRank                 Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.

All other ranking factors have cut off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that, whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which the rate at which any factor adds to the score slows down so much that further effort is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure on it of 1000.
If we have a query where the results are Page A and Page B, then Page A and Page B have scores for that query, which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously, Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold; thus, without any change in PageRank, it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive, you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores were. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant". They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care less about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important". We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use these to extrapolate the advantages and disadvantages of each approach:
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors

Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies use PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees, depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later, in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a finalstatement regarding heavy keyword competition There are some queries wherecompetition is so intense that you must do everything possible to maximize yourranking score (for example ldquoweb hostingrdquo) In such situations it is impossible torank highly through Non-PageRank factors alone (as you will not initially be listedhigh enough to be noticed and linked to) That is not to say that Non-PageRankfactors arent important Consider what your final rank score is
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRankscore)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank factor threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
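The argument can be put in numbers. The following is only a sketch: the ceiling on the Non-PageRank factors and the competitor's scores are hypothetical values invented for illustration, and the function names are our own, not anything Google publishes.

```python
def final_rank_score(non_pagerank_score: float, pagerank_score: float) -> float:
    """Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)."""
    return non_pagerank_score * pagerank_score


def minimum_pagerank(score_to_beat: float, max_non_pagerank_score: float) -> float:
    """Smallest PageRank score that can still reach score_to_beat once the
    Non-PageRank factors are already at their (assumed) ceiling."""
    return score_to_beat / max_non_pagerank_score


# Hypothetical numbers, chosen only to show the shape of the argument.
MAX_NON_PAGERANK = 10.0                   # assumed ceiling on Non-PageRank factors
competitor = final_rank_score(8.0, 5.0)   # a rival scoring 8 x 5 = 40

needed = minimum_pagerank(competitor, MAX_NON_PAGERANK)
print(needed)  # 4.0
```

With these made-up figures, no amount of on-the-page work can match the rival unless the actual PageRank score reaches 4, which is the "Minimum PageRank level" for that query.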
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because, when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the over-riding principles of how PageRank works, it is as accurate today as the day it was written.
$$PR(A) = (1-d) + d\left(\frac{PR(T_1)}{C(T_1)} + \dots + \frac{PR(T_n)}{C(T_n)}\right)$$

Where:
$PR(A)$ is the PageRank of Page A (the one we want to work out).
$d$ is a dampening factor. Nominally this is set to 0.85.
$PR(T_1)$ is the PageRank of a site pointing to Page A.
$C(T_1)$ is the number of links off that page.
$PR(T_n)/C(T_n)$ means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below.
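The vote-splitting in that example can be checked in a couple of lines. This is a minimal sketch that reads the "proportion" as the d × PR / C term of the published formula with d = 0.85; the function name is our own.

```python
D = 0.85  # dampening factor from the published formula


def passed_per_link(pagerank: float, outbound_links: int, d: float = D) -> float:
    """Amount a page adds to each page it links to: d * PR / C."""
    return d * pagerank / outbound_links


one_link = passed_per_link(5.0, 1)    # Page B links only to Page A
two_links = passed_per_link(5.0, 2)   # Page B links to Page A and one other page

print(one_link)   # 4.25
print(two_links)  # 2.125
```

Page B's own value of 5 is untouched in both cases; only the share that reaches Page A is halved.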
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 × a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 × 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 × 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to page C. Page B's PageRank is 0.15, so it will add 0.85 × 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.
Page C links to Page A; all 0.1275 passes to page A.
Page D links to Page C. Again, all 0.1275 passes to page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
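The three rules and the two passes above can be reproduced with a short script. It is a sketch only: the link structure is the one listed in the text (A to B, C and D; B to C; C to A; D to C), and the names `LINKS` and `one_pass` are our own.

```python
D = 0.85  # dampening factor

# Outbound links as listed in the worked example.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def one_pass(ranks, links, d=D):
    """One pass of the calculation, using only the previous pass's values."""
    new = {page: 1 - d for page in links}           # rule 3: the 0.15 base
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)      # rule 1: 0.85 x PR / links
        for target in targets:
            new[target] += share                    # rule 2: add to each target
    return new


ranks = {page: 0.0 for page in LINKS}   # start every page at zero
ranks = one_pass(ranks, LINKS)          # first pass: every page becomes 0.15
ranks = one_pass(ranks, LINKS)          # second pass: the totals in the text

for page in sorted(ranks):
    print(page, round(ranks[page], 4))
# A 0.2775
# B 0.1925
# C 0.4475
# D 0.1925
```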
So we've got:
[Diagram: Pages A to D labelled with the totals above]
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach final values of:
[Diagram: final PageRank values for Pages A to D]
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high-PageRank page (Page C) that has only one outbound link. Then look at Pages B and D: both share links from a high-PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
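To see the ordering for yourself, the passes can simply be repeated until the values stop moving. The sketch below assumes the link structure listed in the worked example and an arbitrary stopping tolerance of our own choosing, so the number of passes it reports need not match the count quoted from the spreadsheet.

```python
D = 0.85
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def one_pass(ranks, links, d=D):
    """One pass of the calculation, using only the previous pass's values."""
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def iterate_until_stable(links, start=0.0, tol=1e-10, max_passes=10_000):
    """Repeat passes until no page changes by tol or more."""
    ranks = {page: start for page in links}
    for passes in range(1, max_passes + 1):
        new = one_pass(ranks, links)
        change = max(abs(new[p] - ranks[p]) for p in links)
        ranks = new
        if change < tol:
            return ranks, passes
    raise RuntimeError("values did not settle within max_passes")


final, passes = iterate_until_stable(LINKS)
for page in sorted(final, key=final.get, reverse=True):
    print(page, round(final[page], 4))
print("passes:", passes)
```

Page C comes out on top, Page A close behind it, and Pages B and D share the lowest value because each receives only a third of Page A's vote.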
A word on convergence
Convergence is an important mathematical aspect of PageRank, which allows Google to provide unprecedented search quality at comparably low costs. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration

Iteration   A              B              C              D
(start)     1              1              1              1
1           1.0000000000   0.5750000000   2.2750000000   0.1500000000
2           2.0837500000   0.5750000000   1.1912500000   0.1500000000
3           1.1625625000   1.0355937500   1.6518437500   0.1500000000
19          1.4900124031   0.7833688246   1.5766187723   0.1500000000
20          1.4901259564   0.7832552713   1.5766187723   0.1500000000
21          1.4901259564   0.7833035315   1.5765705121   0.1500000000
46          1.4901074052   0.7832956473   1.5765969475   0.1500000000
47          1.4901074054   0.7832956472   1.5765969474   0.1500000000
48          1.4901074053   0.7832956473   1.5765969474   0.1500000000
49          1.4901074053   0.7832956473   1.5765969474   0.1500000000
50          1.4901074053   0.7832956473   1.5765969474   0.1500000000
51          1.4901074053   0.7832956473   1.5765969474   0.1500000000
52          1.4901074053   0.7832956473   1.5765969474   0.1500000000
53          1.4901074053   0.7832956473   1.5765969474   0.1500000000
54          1.4901074053   0.7832956473   1.5765969474   0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice, there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
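The stopping rule can be sketched in code. The diagram for this structure did not survive, so the links below (A to B and C; B to C; C to A; D to C) are an assumption inferred from the first rows of the table; with every page started at 1 they reproduce those rows. The names `LINKS`, `one_pass` and `history` are our own.

```python
D = 0.85
TOLERANCE = 1e-10   # stop once no value moves by this much or more

# Assumed structure, inferred from the first rows of the table.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def one_pass(ranks, links, d=D):
    """One pass of the calculation, using only the previous pass's values."""
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


ranks = {page: 1.0 for page in LINKS}   # starting values of 1
history = [ranks]                        # history[n] holds the values after pass n
for _ in range(1000):                    # safety cap on the number of passes
    new = one_pass(ranks, LINKS)
    history.append(new)
    change = max(abs(new[p] - ranks[p]) for p in LINKS)
    ranks = new
    if change < TOLERANCE:
        break

for n in (1, 2, 3):
    print(n, [round(history[n][p], 10) for p in "ABCD"])
print("limit", [round(ranks[p], 10) for p in "ABCD"])
```

Page D has no inbound links in this structure, which is why it sits at 0.15 from the first pass onwards.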
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then, for Pages B, D and E, it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with the others. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider we have one page, Page A. Its PageRank, by definition, will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
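The three figures quoted for Page A (0.15, then 1, then 1.4594594595) can be reproduced by running the calculation on each tiny system in turn. This is a sketch with helper names of our own; the starting value of 1 and the stopping tolerance are arbitrary choices.

```python
def converge(links, d=0.85, tol=1e-12, max_passes=10_000):
    """Repeat the PageRank passes on a closed set of pages until they settle."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        if max(abs(new[p] - ranks[p]) for p in links) < tol:
            return new
        ranks = new
    return ranks


alone = converge({"A": []})                                     # no links at all
pair = converge({"A": ["B"], "B": ["A"]})                       # A <-> B
triple = converge({"A": ["B", "C"], "B": ["A"], "C": ["A"]})    # A <-> B and A <-> C

print(round(alone["A"], 10))    # 0.15
print(round(pair["A"], 10))     # 1.0
print(round(triple["A"], 10))   # 1.4594594595
```

In the last system the total is still 3 for three pages; Page A holds more of it only because Pages B and C each send their whole vote back.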
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations. Would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank. (Whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on.) However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
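A rough comparison shows why the number of links on the linking page matters. This is only a sketch under loud assumptions: it treats the quoted figures of 4 and 6 as actual PageRank values (the Toolbar figure may not map onto them this simply), and the link counts of 5 and 50 are invented for illustration.

```python
D = 0.85  # dampening factor from the published formula


def link_value(pagerank: float, links_on_page: int, d: float = D) -> float:
    """PageRank passed along one link from a page: d * PR / C."""
    return d * pagerank / links_on_page


# Hypothetical link counts, chosen only for illustration.
from_pr4 = link_value(4, links_on_page=5)
from_pr6 = link_value(6, links_on_page=50)

print(round(from_pr4, 3))  # 0.68
print(round(from_pr6, 3))  # 0.102
```

Under these made-up numbers, the link from the lower-ranked but less crowded page passes several times as much.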
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                          Without the Review Pages   With the Review Pages
Home Page PR              0.9536152797               2.439718935
Pages B, C, D PR (each)   0.4201909959               0.8412536982
Total                     2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
• Hierarchical
• Looping
• Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note that it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
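The effect can be sketched for three closed four-page sites. The diagrams are not reproduced here, so the link structures below are assumptions: a home page A linking to and from B, C and D for the hierarchy, a single ring for the loop, and every page linking to every other page for extensive interlinking. The helper names are our own.

```python
def converge(links, d=0.85, tol=1e-12, max_passes=10_000):
    """Repeat the PageRank passes on a closed set of pages until they settle."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        if max(abs(new[p] - ranks[p]) for p in links) < tol:
            return new
        ranks = new
    return ranks


PAGES = "ABCD"
# Assumed link structures standing in for the three diagrams.
STRUCTURES = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {p: [q for q in PAGES if q != p] for p in PAGES},
}

results = {name: converge(links) for name, links in STRUCTURES.items()}
for name, ranks in results.items():
    values = ", ".join(f"{p}={ranks[p]:.4f}" for p in PAGES)
    print(f"{name:13s} {values}  total={sum(ranks.values()):.4f}")
```

Under these assumptions every structure holds a total of 4, but only the hierarchy moves a larger share onto one page (the home page A); the loop and the fully interlinked site leave each page at 1.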
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$$(1-d)\,PR_i(A)$$
That is, we are replacing the $(1-d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier (the home page A links to pages B, C and D, and each of those links back to A).
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
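The two computations above can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not Google's code: the names `LINKS`, `step` and `iterate` are ours, the site is the four-page hierarchy (A links to B, C and D, each of which links back to A), and the stopping tolerance is an arbitrary choice, so the exact iteration counts it reports depend on that tolerance.

```python
# Four-page hierarchy from the example: A links to B, C, D; each links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(pr, links, d=0.85, accelerated=False):
    """One simultaneous update of every page's PageRank estimate."""
    new = {}
    for page in links:
        # PageRank flowing in from every page that links to `page`.
        inflow = sum(pr[src] * out.count(page) / len(out) for src, out in links.items())
        # Classic: constant (1 - d). Accelerated: (1 - d) * PR_i(page).
        norm = (1 - d) * pr[page] if accelerated else (1 - d)
        new[page] = norm + d * inflow
    return new


def iterate(links, accelerated, tol=1e-10, max_iter=1000):
    """Start every page at 1 and iterate until no value moves by more than tol."""
    pr = {page: 1.0 for page in links}
    for n in range(1, max_iter + 1):
        nxt = step(pr, links, accelerated=accelerated)
        if max(abs(nxt[p] - pr[p]) for p in links) < tol:
            return nxt, n
        pr = nxt
    return pr, max_iter


classic, classic_iters = iterate(LINKS, accelerated=False)
fast, fast_iters = iterate(LINKS, accelerated=True)
print(classic_iters, classic)
print(fast_iters, fast)
```

As in the two tables, the accelerated scheme needs far fewer iterations, and the values it settles on (2 for A, about 0.667 for the others) are close to, but not identical with, the classic ones (about 1.919 and 0.694).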
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to:
$$(0.4n + 0.6) \times 0.15$$
as long as we set the other pages to:
$$0.6 \times 0.15$$
i.e. $(0.4n + 0.6) + 0.6\,(n-1) = n$, where n is the number of pages in the site.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
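To make the effect concrete, here is a rough Python sketch of the same idea. It assumes the four-page hierarchy used earlier (A links to B, C and D, each of which links back to A) and d = 0.85; the function `pagerank` and the dictionaries `uniform` and `favour_b` are our own illustrative names, not part of the spreadsheet.

```python
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85


def pagerank(links, coeff, d=D, iterations=200):
    """Classic iteration, but with a separate normalization co-efficient per page."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coeff[page]
            + d * sum(pr[src] * out.count(page) / len(out) for src, out in links.items())
            for page in links
        }
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}            # 0.15 for every page
favour_b = {page: 0.6 * (1 - D) for page in LINKS}   # 0.09 for A, C and D
favour_b["B"] = (0.4 * n + 0.6) * (1 - D)            # 0.33 for Page B

base = pagerank(LINKS, uniform)
boosted = pagerank(LINKS, favour_b)
print(base)
print(boosted)
```

Under these assumptions Page B settles near 0.86 instead of roughly 0.69, pages C and D drop to about 0.62, and the total over all four pages is still 4, because the co-efficients still add up to n(1-d).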
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
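For readers who want to check the direction of the effect by hand, the limiting values can be worked out directly. The derivation below is our own sketch; it assumes the same four-page hierarchy as before (A links to B, C and D, each of which links back to A) and d = 0.85, with co-efficients 0.06 for Page B and 0.18 for the others.

```latex
\begin{align*}
PR(A) &= 0.18 + 0.85\,\bigl(PR(B) + PR(C) + PR(D)\bigr) \\
PR(B) &= 0.06 + 0.85\,\tfrac{PR(A)}{3}, \qquad
PR(C) = PR(D) = 0.18 + 0.85\,\tfrac{PR(A)}{3} \\[4pt]
PR(B) + PR(C) + PR(D) &= 0.42 + 0.85\,PR(A) \\
PR(A) &= 0.18 + 0.357 + 0.7225\,PR(A)
\;\Longrightarrow\; PR(A) = \frac{0.537}{0.2775} \approx 1.9351 \\
PR(B) &\approx 0.6083, \qquad PR(C) = PR(D) \approx 0.7283
\end{align*}
```

Compared with the uniform co-efficient of 0.15 (about 1.9189 for A and 0.6937 for each of B, C and D), Page B falls while the pages around it rise, and the total is still 4.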
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ (k = 1, ..., n; m = 1, ..., n).
What does that mean? Let us explain the designations:
T1, T2, ..., Tn correspond to Web site pages (n is the number of pages).
The components $a_{km}$, which sit at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)

       T1     T2     ...   Tn     Number of links off the page
T1     a_11   a_12   ...   a_1n   Cr(T1)
T2     a_21   a_22   ...   a_2n   Cr(T2)
...
Tn     a_n1   a_n2   ...   a_nn   Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
Where:
$PR_{I+1}(T_m)$ is the Page Rank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, m = 1, ..., n).
$PR_I(T_k)$ is the Page Rank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, k = 1, ..., n).
$Cr(T_k)$ is the total number of links off the page $T_k$.
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1, ..., n; m = 1, ..., n).
d is a dampening factor (usually 0.85).
(1-d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme allows us to compute the estimating values of the pages of a Web site while taking into account the possibly complex pattern of links between those pages.
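A single step of this scheme can be sketched as follows. The code is only an illustration: the function names `cr` and `next_iteration` and the three-page site are hypothetical, chosen to show a multigraph in which one page links to another twice.

```python
def cr(matrix):
    """Cr(T_k): total number of links off each page (row sums of the link matrix)."""
    return [sum(row) for row in matrix]


def next_iteration(matrix, pr, d=0.85):
    """PR_{I+1}(T_m) = (1 - d) + d * sum over k of a_km * PR_I(T_k) / Cr(T_k)."""
    n = len(matrix)
    out = cr(matrix)
    return [
        # Pages with no outgoing links are skipped (an assumption of this sketch).
        (1 - d) + d * sum(matrix[k][m] * pr[k] / out[k] for k in range(n) if out[k])
        for m in range(n)
    ]


# Hypothetical three-page site: T1 links to T2 twice and to T3 once;
# T2 and T3 each link back to T1.
A = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
print(cr(A))
print(next_iteration(A, [1.0, 1.0, 1.0]))
```

Because T1 links to T2 twice, T2 receives two thirds of T1's outgoing PageRank and T3 only one third, which is exactly what the factor a_km / Cr(Tk) expresses.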
Take a look at the following example. In general, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the $PR_{i+1}(T)$ computation (on the (i+1)-th iteration) can be calculated by the formula $(1-d)\,PR_i(T)$.
When we apply the accelerated technology of computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
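The recognition of external pages can be reproduced with a short Python sketch. It takes the matrix of links (2) as given above and d = 0.85; the names `MATRIX`, `accelerated`, `base` and `external`, the fixed count of 100 iterations and the threshold of 1e-6 for "zero" are our own choices for illustration.

```python
PAGES = "ABCDEFGHI"
# Matrix of links (2): row k, column m = number of links from page k to page m.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated(matrix, d=0.85, iterations=100):
    """Iterate with the per-page normalization coefficient (1 - d) * PR_i(T)."""
    n = len(matrix)
    out = [sum(row) for row in matrix]  # Cr(T_k) for every page
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            (1 - d) * pr[m] + d * sum(matrix[k][m] * pr[k] / out[k] for k in range(n))
            for m in range(n)
        ]
    return pr


limits = accelerated(MATRIX)
external = [p for p, v in zip(PAGES, limits) if v < 1e-6]
base = [p for p, v in zip(PAGES, limits) if v >= 1e-6]
print(base, external)
```

The limiting values of A, C and D come out at 1.5 and that of B at 4.5, as in the table, while E, F, G, H and I fall to zero and are therefore reported as external.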
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting values of the estimates for the site's pages are shaded in the table.
Dividing each estimate by the total, which stays constant throughout the iteration process, gives a set of probability values (which sum to one):
P: 0.154451 0.427246 0.154451 0.154451 0.019489 0.023237 0.026827 0.019924 0.019924 (Amount: 1.000)
As a parallel demonstration, we'd like to use Shannon's classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome: A1, A2, ..., An
Probabilities: p1, p2, ..., pn
$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$
Calculating the individual terms of this formula for the set of probabilities that corresponds to the site's pages gives the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):
PR: 0.416211 0.524171 0.416211 0.416211 0.110722 0.126119 0.140040 0.112558 0.112558 (Amount: 2.375)
The Toolbar shows: 1 1 1 1 1 1 1 1 1 1
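The two steps described here (normalising the limiting values, then taking each term of Shannon's formula with base-2 logarithms) can be checked with a short sketch. The function names are ours and purely illustrative; the input numbers are the limiting values from the last row of the iteration table.

```python
import math

# Limiting values for the nine pages, taken from the last row of the table.
limiting_values = [1.390055, 3.845218, 1.390055, 1.390055,
                   0.175403, 0.209136, 0.241441, 0.179318, 0.179318]


def to_probabilities(values):
    """Divide each value by the total so that the results sum to one."""
    total = sum(values)
    return [v / total for v in values]


def shannon_terms(probabilities):
    """Return the individual terms -p * log2(p) of Shannon's formula."""
    return [-p * math.log2(p) for p in probabilities]


probabilities = to_probabilities(limiting_values)
terms = shannon_terms(probabilities)

for p, term in zip(probabilities, terms):
    print(f"P = {p:.6f}   -P*log2(P) = {term:.6f}")
print(f"Sum of terms: {sum(terms):.3f}")
```

Up to rounding, the printed columns should agree with the P and PR rows given above.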
PageRank is a successful representation of a directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems that model the behavior of animals.
Let's assume that every time a stimulus is presented an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
where P is the estimating value.
The theory of this subject can be found in the classic work by Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and we provide the following comments and links as thanks to the people mentioned. If you'd like to comment on this document, please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy-to-follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan, author of Search Engine Marketing: The essential best practice guide
To put this another way:
PageRank is determined by the links pointing to a page. But if PageRank itself has an influence on the number of links to a page, it is influencing (in a circular way) the quality of that page. The links are no longer based solely on human judgement. If a Webmaster picks their outbound links by searching on Google or by looking at the Google toolbar (even if this is only part of their judgement), then there is a corresponding increase in a page's PageRank. This increase is not solely because it is a good page, but because its PageRank is already high.
Google's First 1000 Results
Remember, PageRank alone cannot get you high rankings. We've mentioned before that PageRank is a multiplier, so if your score for all other factors is 0 and your PageRank is twenty billion, then you still score 0 (last in the results). This is not to say PageRank is worthless, but there is some confusion over when PageRank is useful and when it is not. This leads to many misinterpretations of its worth. The only way to clear up these misinterpretations is to point out when PageRank is not worthwhile.
If you perform any broad search on Google, it will appear as if you've found several thousand results. However, you can only view the first 1000 of them. Understanding why this is so explains why you should always concentrate on "on the page" factors and anchor text first, and PageRank last.
Assume that you perform a search on Google and it returns 200,000 results. If we were to calculate every factor for each of the 200,000 pages, do you think it would really take just 0.34 seconds to search? The answer to speeding up the search is to get a subset of documents that are most likely to be related to the query. This subset of documents needs to be larger than the number of search results shown. For example, let's say that number is 2000. What the search engine does is query the whole database using 2 or 3 factors, finding the 2000 documents that rank highest for them. (Remember, there were 200,000 possible documents, and that's the number that actually gets reported.) Then the engine applies all the factors to those 2000 and ranks them accordingly. Because there's a drop in the quality of the results (not the pages) at the bottom of this subset, the engine just shows the first 1000. PageRank is almost certainly not one of those initial factors. Notice how we highlighted the word "related" in creating the subset of 2000 pages. The search engine is looking for pages that are on-topic. If we included PageRank in that list, we'd get a lot of high PageRank pages with topics that are only slightly related (because of the second factor), but that's not what we want.
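The two-stage process described above can be sketched as follows. This is only an illustration of the principle, not Google's implementation: the factor names, the scoring functions and the sample documents are all hypothetical.

```python
import heapq


def two_stage_search(documents, cheap_score, full_score,
                     subset_size=2000, shown=1000):
    """Pick an on-topic subset cheaply, then rank only that subset in full."""
    # Stage 1: keep the documents that score highest on a few cheap,
    # topic-related factors. PageRank is deliberately not used here.
    subset = heapq.nlargest(subset_size, documents, key=cheap_score)
    # Stage 2: apply every factor (including PageRank) to the subset only.
    ranked = sorted(subset, key=full_score, reverse=True)
    return ranked[:shown]


# Hypothetical documents with made-up factor scores.
docs = [
    {"url": "a", "on_page": 0.9, "anchor": 0.8, "pagerank": 1.0},
    {"url": "b", "on_page": 0.7, "anchor": 0.9, "pagerank": 5.0},
    {"url": "c", "on_page": 0.1, "anchor": 0.0, "pagerank": 50.0},
]


def cheap(doc):
    return doc["on_page"] + doc["anchor"]


def full(doc):
    # PageRank acts as a multiplier on the other factors.
    return (doc["on_page"] + doc["anchor"]) * doc["pagerank"]


results = two_stage_search(docs, cheap, full, subset_size=2, shown=2)
print([doc["url"] for doc in results])
```

In this toy data set, page "c" has by far the highest PageRank but never reaches the second stage, because it does not score well enough on the on-topic factors to enter the subset.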
Why this is critical
You must do enough "on the page" work and/or anchor text work to get into that subset of 2000 pages for your chosen key phrase; otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:
Title Tag: Can only be listed once.
Keywords in Body Text: Each successive repetition is less important. Proximity is important.
Anchor Text: Highly weighted but, like keywords in body text, there is a cut-off point where further anchor text is no longer worthwhile.
PageRank: Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut-off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut-off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which each factor adds so little to the total that further work on it is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure of 1000 on it.
If we have a query where the results are Page A and Page B, then Pages A and B have scores for that query which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold, so without any change in PageRank it is possible for Page B to improve its optimization and beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve on all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what each page's individual scores are. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant". They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they don't care at all about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important". We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors
Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search-engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies use PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
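A small sketch may make the "Minimum PageRank level" concrete. It assumes, purely for illustration, that the Non-PageRank factors are capped at a made-up ceiling of 1000 and that the final score is the simple product given above; neither the ceiling nor the sample scores come from Google.

```python
# Hypothetical ceiling on the combined Non-PageRank factors (an assumption).
NON_PAGERANK_CAP = 1000.0


def final_rank_score(non_pagerank_score, pagerank):
    """Final Rank Score = (Non-PageRank factors, capped) x (PageRank)."""
    return min(non_pagerank_score, NON_PAGERANK_CAP) * pagerank


def minimum_pagerank(competitor_score):
    """PageRank needed to match a competitor once the other factors are maxed out."""
    return competitor_score / NON_PAGERANK_CAP


competitor = final_rank_score(900.0, 3.0)    # strong page under heavy competition
ours_maxed = final_rank_score(5000.0, 2.0)   # extra on-page work beyond the cap is wasted

print(competitor, ours_maxed, minimum_pagerank(competitor))
```

Under these assumptions, no amount of additional on-page work lets the second page overtake the first; only raising its PageRank above the query-specific minimum does.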
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the overriding principles of how PageRank works, it is as accurate today as the day it was written.
$$PR(A) = (1 - d) + d\left(\frac{PR(T_1)}{C(T_1)} + \ldots + \frac{PR(T_n)}{C(T_n)}\right)$$
Where:
PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevance to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
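The vote-splitting in this example follows directly from the d x PR(T)/C(T) term of the formula. A minimal sketch, using the made-up PageRank of 5 from the text and the nominal dampening factor of 0.85 (the function name is ours):

```python
def pagerank_passed_per_link(pagerank, outbound_links, d=0.85):
    """Share of a page's PageRank passed along each of its outbound links."""
    return d * pagerank / outbound_links


one_link = pagerank_passed_per_link(5.0, 1)   # Page B links only to Page A
two_links = pagerank_passed_per_link(5.0, 2)  # Page B links to Page A and one other page

print(one_link, two_links)
```

Page B's total voting power (0.85 x 5) is the same in both cases; only the share that reaches Page A changes.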
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages imaginatively titled Page A, Page B, Page C and Page D. They link to each other as follows: Page A links to Pages B, C and D; Page B links to Page C; Page C links to Page A; and Page D links to Page C.
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to Pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to Page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to Page C, Page C will get it all.
Page C links to Page A; all 0.1275 passes to Page A.
Page D links to Page C. Again, all 0.1275 passes to Page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
So we've got:
[Figure: Pages A to D labelled with the PageRank values calculated above]
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach the final values:
[Figure: final PageRank values for Pages A to D]
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
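The three rules and the repeated calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' spreadsheet: the function names, the stopping tolerance and the iteration cap are our assumptions, and the link structure is the one given in the worked steps (A links to B, C and D; B to C; C to A; D to C).

```python
DAMPING = 0.85

# Link structure of the four-page example: page -> pages it links to.
links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def iterate_once(ranks, links, d=DAMPING):
    """Apply the three rules once to every page."""
    new_ranks = {page: 1.0 - d for page in links}    # rule 3: the 0.15 base
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)       # rule 1: split the vote
        for target in targets:
            new_ranks[target] += share               # rule 2: pass it on
    return new_ranks


def pagerank(links, start=0.0, tolerance=1e-10, max_iterations=1000):
    """Repeat the calculation until no value changes by `tolerance` or more."""
    ranks = {page: start for page in links}
    for iteration in range(1, max_iterations + 1):
        new_ranks = iterate_once(ranks, links)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tolerance:
            return ranks, iteration
    return ranks, max_iterations


first = iterate_once({page: 0.0 for page in links}, links)   # every page: 0.15
second = iterate_once(first, links)                          # the totals worked out above
final, iterations = pagerank(links)

print(second)
print(final, iterations)
```

The second pass should reproduce the totals 0.2775, 0.1925, 0.4475 and 0.1925. The exact iteration count depends on the tolerance chosen, so it need not match the figure quoted for the spreadsheet.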
A word on convergence
Convergence is an important mathematical aspect of PageRank, which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
[Figure: link structure of Pages A, B, C and D]
PageRank for pages A, B, C, D at various stages of iteration
Iteration A B C D
Start 1 1 1 1
1 1.0000000000 0.5750000000 2.2750000000 0.1500000000
2 2.0837500000 0.5750000000 1.1912500000 0.1500000000
3 1.1625625000 1.0355937500 1.6518437500 0.1500000000
19 1.4900124031 0.7833688246 1.5766187723 0.1500000000
20 1.4901259564 0.7832552713 1.5766187723 0.1500000000
21 1.4901259564 0.7833035315 1.5765705121 0.1500000000
46 1.4901074052 0.7832956473 1.5765969475 0.1500000000
47 1.4901074054 0.7832956472 1.5765969474 0.1500000000
48 1.4901074053 0.7832956473 1.5765969474 0.1500000000
49 1.4901074053 0.7832956473 1.5765969474 0.1500000000
50 1.4901074053 0.7832956473 1.5765969474 0.1500000000
51 1.4901074053 0.7832956473 1.5765969474 0.1500000000
52 1.4901074053 0.7832956473 1.5765969474 0.1500000000
53 1.4901074053 0.7832956473 1.5765969474 0.1500000000
54 1.4901074053 0.7832956473 1.5765969474 0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
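The stopping rule described here, and the claim that the starting values do not matter, can be tried out with the sketch below. The link structure used (A links to B and C; B to C; C to A; D to C) is an assumption: it is inferred from the first rows of the table, which it reproduces, rather than taken from the original diagram. The function names and the iteration cap are ours.

```python
# Assumed link structure, inferred from the first rows of the table above.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def step(ranks, d=0.85):
    """One iteration: every page passes d * PR / (number of links) along each link."""
    new = {page: 1.0 - d for page in ranks}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def iterate_until_stable(ranks, tolerance=1e-10, max_iterations=1000):
    """Stop as soon as no value changes by `tolerance` or more."""
    for n in range(1, max_iterations + 1):
        new = step(ranks)
        if max(abs(new[p] - ranks[p]) for p in ranks) < tolerance:
            return new, n
        ranks = new
    return ranks, max_iterations


first_row = step({page: 1.0 for page in links})
from_ones, n_ones = iterate_until_stable({page: 1.0 for page in links})
from_zero, n_zero = iterate_until_stable({page: 0.0 for page in links})

print(first_row)
print(from_ones, n_ones)
print(from_zero, n_zero)
```

Both runs should settle on the same limiting values as the table, to within the chosen tolerance, whichever starting value is used.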
Derived specifics of how Google calculates PageRank
We started our values from zero, and we now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
[Figure: the earlier structure with a new Page E added and one link changed]
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with the others. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was widely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Suppose we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view the very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
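The three figures quoted for Page A (0.15, 1 and 1.4594594595) can be reproduced with a short sketch that runs the iteration to convergence for each of the three link structures. The function name, starting value and tolerance are our assumptions.

```python
def converge(links, d=0.85, tolerance=1e-12, max_iterations=10000):
    """Iterate the PageRank calculation until the values stop changing."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iterations):
        new = {page: 1.0 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        if max(abs(new[p] - ranks[p]) for p in links) < tolerance:
            return new
        ranks = new
    return ranks


alone = converge({"A": []})                                          # no links at all
one_partner = converge({"A": ["B"], "B": ["A"]})                     # A <-> B
two_partners = converge({"A": ["B", "C"], "B": ["A"], "C": ["A"]})   # A <-> B and A <-> C

print(alone["A"], one_partner["A"], two_partners["A"])
```

Note that in the last structure the three pages still sum to 3: Page A's gain comes at the expense of Pages B and C, which is the redistribution the text describes.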
Let's put this diagrammatically, using a slightly more complicated system:
[Figure: link structure of Pages A to E]
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank Feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entries in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations; would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters contrary to the suggestions of some we have a fair amount ofcontrol over PageRank It is however the most difficult of ranking factors tocontrol and you might well be able to achieve the desired effects by using othertechniques However a well-worked PageRank added to your optimization mixcan make it difficult for competitors to catch up
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put into getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
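As a rough illustration of the sharing, the sketch below computes the amount one link passes on as d * PR(T) / C(T), the per-link term of the basic formula. The link counts are invented, the damping factor of 0.85 is an assumption, and the figures 4 and 6 are treated as actual PageRank values, which Toolbar readings may well not be.

```python
def share_per_link(pagerank, total_links, d=0.85):
    """PageRank passed along each link on a page: d * PR(T) / C(T)."""
    return d * pagerank / total_links


# Hypothetical pages: a busy links page with PR 6 and a quiet page with PR 4.
busy_page = share_per_link(pagerank=6.0, total_links=40)
quiet_page = share_per_link(pagerank=4.0, total_links=5)

print(f"PR 6 page with 40 links passes {busy_page:.4f} per link")
print(f"PR 4 page with 5 links passes {quiet_page:.4f} per link")
```

With these made-up numbers the lower-ranked page passes on more per link, which is the point made above; the real extent of the effect remains unknown.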
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback, and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them. It is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9].
If we perform the same calculations but include the review pages, here's what we get.
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                        Without the Review Pages   With the Review Pages
The HomePage PR is      0.9536152797               2.439718935
B, C, D PR (each) is    0.4201909959               0.8412536982
Total is                2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most important (if not the most important) internal factors in improving your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
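For readers who want to repeat the closed-system comparison without the spreadsheets, here is a minimal sketch. The three four-page layouts are our reading of the structures (a home page A with three children, a simple loop, and every page linking to every other); the damping factor of 0.85 and the starting value of 1 are likewise assumptions.

```python
def pagerank(links, d=0.85, iterations=200):
    """Basic iteration: PR(p) = (1 - d) + d * sum(PR(t) / C(t)) over pages t linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[src] / len(outs)
                                    for src, outs in links.items() if page in outs)
            for page in links
        }
    return pr


# Assumed four-page layouts for the three structures.
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "interlinked": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    rounded = {page: round(value, 4) for page, value in pr.items()}
    print(f"{name}: {rounded} total={sum(pr.values()):.4f}")
```

In each case the total stays at 4, but only the hierarchical layout moves PageRank towards one page (the home page) at the expense of the others.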
Closed systems (with no inbound or outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
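The open-system totals can be approximated with a short sketch. The diagrams are not reproduced here, so the layout is an assumption on our part: one external page X links to the home page A, and page C of the site links out to an external page Y that links nowhere. With these assumptions, a damping factor of 0.85 and the basic formula, the site totals come out close to the figures quoted above.

```python
def pagerank(links, d=0.85, iterations=200):
    """Basic iteration: PR(p) = (1 - d) + d * sum(PR(t) / C(t)) over pages t linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[src] / len(outs)
                                    for src, outs in links.items() if page in outs)
            for page in links
        }
    return pr


def open_system(site_links, out_page="C"):
    """Add an assumed inbound link (X -> A) and outbound link (out_page -> Y)."""
    links = {page: list(outs) for page, outs in site_links.items()}
    links["X"] = ["A"]            # external page linking in to the home page
    links[out_page].append("Y")   # one site page linking out
    links["Y"] = []               # the external target links nowhere
    return links


site_structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "interlinked": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

totals = {}
for name, site_links in site_structures.items():
    pr = pagerank(open_system(site_links))
    totals[name] = sum(pr[page] for page in "ABCD")  # count only the site's own pages
    print(f"{name}: total PageRank in site = {totals[name]:.10f}")
```

Moving the outbound link to a different page changes the totals, so the figures here illustrate the principle under this one layout and should not be read as general constants.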
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably the most misunderstood search engine concept in existence today. There are those who profess to know everything about PageRank, and there are those who claim not to need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the value from the previous computation, using the formula:
(1 - d) × PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iter   A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3      2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4      1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5      2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6      1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7      2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8      1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9      2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10     1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11     2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12     1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13     2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14     1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15     1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16     1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17     1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18     1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19     1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20     1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21     1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22     1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23     1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24     1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25     1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26     1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27     1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28     1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29     1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30     1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31     1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32     1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33     1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34     1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35     1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36     1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37     1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38     1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39     1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40     1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41     1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42     1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43     1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44     1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45     1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46     1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47     1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48     1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49     1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50     1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51     1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52     1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53     1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54     1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55     1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56     1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57     1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58     1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59     1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60     1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61     1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62     1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63     1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64     1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65     1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66     1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67     1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68     1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69     1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70     1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71     1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72     1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73     1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74     1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75     1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76     1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77     1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78     1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79     1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80     1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81     1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82     1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83     1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84     1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85     1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86     1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87     1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88     1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89     1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90     1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91     1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92     1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93     1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94     1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95     1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96     1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97     1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98     1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99     1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100    1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101    1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102    1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103    1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104    1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105    1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106    1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107    1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108    1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109    1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110    1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111    1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112    1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113    1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114    1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115    1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116    1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117    1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118    1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119    1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120    1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121    1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122    1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123    1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124    1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125    1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126    1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127    1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128    1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129    1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130    1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131    1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132    1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133    1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134    1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135    1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136    1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iter   A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
That is a very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
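The two tables can be regenerated with a few lines of code. The sketch below is our own reading of the technique: the fixed co-efficient (1 - d) is swapped for (1 - d) × PR_i(page), and each run stops once no page moves by more than a chosen tolerance. The stopping rule and tolerance are assumptions, so the iteration counts it reports need not match the 148 and 67 of the spreadsheets exactly.

```python
# Hierarchical structure used in the tables: A links to B, C, D; each links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def iterate(links, accelerated, d=0.85, tolerance=1e-10, max_iterations=1000):
    """Return (final values, iterations used) for the fixed or recalculated co-efficient."""
    pr = {page: 1.0 for page in links}
    for count in range(1, max_iterations + 1):
        new = {}
        for page in links:
            votes = sum(pr[src] / len(outs)
                        for src, outs in links.items() if page in outs)
            # Fixed co-efficient (1 - d), or (1 - d) * PR_i(page) when accelerated.
            coefficient = (1 - d) * pr[page] if accelerated else (1 - d)
            new[page] = coefficient + d * votes
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tolerance:
            return pr, count
    return pr, max_iterations


fixed_pr, fixed_iters = iterate(LINKS, accelerated=False)
accel_pr, accel_iters = iterate(LINKS, accelerated=True)
print("fixed:      ", {p: round(v, 10) for p, v in fixed_pr.items()}, fixed_iters)
print("accelerated:", {p: round(v, 10) for p, v in accel_pr.items()}, accel_iters)
```

Note that the accelerated run settles on A = 2 and B = C = D = 0.6667 rather than 1.9189 and 0.6937, as the second table shows; it trades some accuracy for fewer iterations.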
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors, like "quantity of text", fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(04n + 06)015
as long as we set the other pages to
06015
ie [01n + 09+09 (n-1)=n where n is the number of Pages in the Site]
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (that meets our arbitrary criteria) we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
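The idea of unequal normalization co-efficients can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-page link structure (a Home page A linking to B, C and D, each of which links back) is hypothetical, and the function name `pagerank_with_coefficients` is ours. Only the co-efficient values (0.4n + 0.6) × 0.15 and 0.6 × 0.15 come from the text.

```python
def pagerank_with_coefficients(links, coeffs, d=0.85, iterations=200):
    """Iterate PR(m) = coeffs[m] + d * sum(PR(k) / C(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = dict(coeffs)  # each total starts from the page's own co-efficient
        for page, targets in links.items():
            share = d * pr[page] / len(targets)  # vote passed along each link
            for target in targets:
                new_pr[target] += share
        pr = new_pr
    return pr


# Hypothetical four-page site (an assumption, not taken from the paper):
# Home (A) links to B, C and D, and each of those links back to Home.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(links)

uniform = {page: 0.15 for page in links}
favoured = {page: 0.6 * 0.15 for page in links}
favoured["B"] = (0.4 * n + 0.6) * 0.15  # Page B meets our arbitrary criteria

plain = pagerank_with_coefficients(links, uniform)
boosted = pagerank_with_coefficients(links, favoured)

print("sum of co-efficients:", round(sum(favoured.values()), 6))
print("Page B, uniform:", round(plain["B"], 4), "favoured:", round(boosted["B"], 4))
print("site total, favoured:", round(sum(boosted.values()), 4))
```

In this sketch the co-efficients still add up to n × 0.15, so the site total is unchanged while Page B gains at the expense of the other pages.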
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T1, T2, …, Tn correspond to Web site pages (n is the number of pages).
Components a_km, which are on the intersection of the k-th row and m-th column in the matrix of links, determine the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …   Tn     Number of links off the page
T1    a_11   a_12   …   a_1n   Cr(T1)
T2    a_21   a_22   …   a_2n   Cr(T2)
Tn    a_n1   a_n2   …   a_nn   Cr(Tn)
Let's write down the equation for the computation scheme:

PR_{I+1}(Tm) = (1 - d) + d × Σ_{k=1}^{n} [ a_km × PR_I(Tk) / Cr(Tk) ]

Where:
PR_{I+1}(Tm) is the PageRank of the page Tm on the (I+1)-th iteration (the estimating value of the page Tm, m = 1…n)
PR_I(Tk) is the PageRank of the page Tk on the I-th iteration (the estimating value of the page Tk, k = 1…n)
Cr(Tk) is the total number of links off the page Tk
Cm(Tk) = a_km is the number of individual links off the page Tk to the page Tm (k = 1…n), (m = 1…n)
d is a dampening factor (usually 0.85)
(1 - d) is a normalization coefficient
n is the number of pages on the site
This recursive computational scheme lets us compute the estimating values of a Web site's pages while taking into account the possible complexity of the links between them.
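One iteration of the scheme can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: the function name `pagerank_step` and the three-page link matrix are assumptions made for the example. The matrix deliberately contains a double link (a value of 2) to show why the site is treated as a multigraph.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_next(Tm) = (1 - d) + d * sum_k a[k][m] * PR(Tk) / Cr(Tk)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk): total number of links off page Tk
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical three-page site: T1 links twice to T2 and once to T3,
# while T2 and T3 each link once to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

pr = [1.0, 1.0, 1.0]
for iteration in range(1, 4):
    pr = pagerank_step(a, pr)
    print(iteration, [round(value, 6) for value in pr])
```

Because every page in this example has at least one outgoing link, the sum of the estimating values stays equal to the number of pages after each step.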
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, with pages A, B, C, D forming subgraph 1 and pages E, F, G, H, I forming subgraph 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
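As a quick check of the matrix above, the Cr values are simply the row sums. The short sketch below (variable names are ours) recomputes them and also counts the links that leave pages A to D for pages E to I, which is what separates the two subgraphs.

```python
pages = "ABCDEFGHI"

# Matrix of links (2): row k, column m holds the number of links off page k to page m.
matrix = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T): total number of links off each page (row sums).
cr = {page: sum(row) for page, row in zip(pages, matrix)}

# Number of links pointing at each page (column sums).
inbound = {page: sum(row[i] for row in matrix) for i, page in enumerate(pages)}

# Links that leave the first subgraph (A-D) for the second (E-I).
first = "ABCD"
links_out_of_first = sum(
    matrix[pages.index(k)][pages.index(m)]
    for k in first
    for m in pages
    if m not in first
)

print("Cr:", cr)
print("inbound:", inbound)
print("links from A-D to E-I:", links_out_of_first)
```

The last figure is zero: pages A to D only link among themselves, while every page in E to I links into them.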
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the PR_{i+1}(T) computation (on the (i+1)-th iteration) can be calculated by the formula (1 - d) × PR_i(T).
When the accelerated technology of computation is applied, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
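The accelerated computation can be reproduced with a short script. This is a sketch under the reading that the constant (1 - d) in the classic scheme is replaced by (1 - d) × PR_i(T). The function and variable names are ours, and the 1e-6 cut-off used to call a value "zero" is an arbitrary choice.

```python
PAGES = "ABCDEFGHI"
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(a, pr, d=0.85):
    """PR_next(Tm) = (1 - d) * PR(Tm) + d * sum_k a[k][m] * PR(Tk) / Cr(Tk)."""
    n = len(a)
    cr = [sum(row) for row in a]  # every page here has at least one link
    return [
        (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
history = [pr]
for _ in range(100):
    pr = accelerated_step(LINKS, pr)
    history.append(pr)

# Pages whose limiting value has (numerically) vanished are the external ones.
external = [page for page, value in zip(PAGES, pr) if value < 1e-6]

print("iteration 1:", [round(v, 6) for v in history[1]])
print("iteration 2:", [round(v, 6) for v in history[2]])
print("limiting values:", [round(v, 6) for v in pr])
print("external pages:", external)
```

The first two iterations agree with the table above to the precision shown there, and the values for A to D settle at 1.5, 4.5, 1.5 and 1.5 while the total stays at 9.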
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
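For comparison, the classic computation with the constant coefficient (1 - d) = 0.15 can be sketched the same way. The names and the stopping tolerance of 1e-9 are our own assumptions; the link matrix and d = 0.85 are the ones used in the table.

```python
PAGES = "ABCDEFGHI"
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_step(a, pr, d=0.85):
    """PR_next(Tm) = (1 - d) + d * sum_k a[k][m] * PR(Tk) / Cr(Tk)."""
    n = len(a)
    cr = [sum(row) for row in a]  # every page here has at least one link
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
first_two = []
iterations = 0
while iterations < 1000:  # safety cap; the loop normally stops much earlier
    new_pr = classic_step(LINKS, pr)
    iterations += 1
    if iterations <= 2:
        first_two.append(new_pr)
    change = max(abs(x - y) for x, y in zip(new_pr, pr))
    pr = new_pr
    if change < 1e-9:
        break

print("iterations run:", iterations)
print("limiting values:", dict(zip(PAGES, (round(v, 6) for v in pr))))
print("total:", round(sum(pr), 6))
```

Unlike the accelerated scheme, the external pages E to I keep small non-zero values here, because every page receives the constant 0.15 on each pass.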
The limiting estimating values for the site pages are those in the final row of the classic computation table (darkened in the original).

Having divided the estimating values by their total amount, which is unchanged throughout the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:

Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn

H(p1, p2, …, pn) = - p(A1) log p(A1) - p(A2) log p(A2) - … - p(An) log p(An)

Let's calculate the individual terms -p log p for the set of probability values which correspond to the site pages (logarithms are to be calculated to base 2). We obtain the following set of PageRank values for the pages of the site examined:

                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
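The PR row can be checked by evaluating each term -p × log2(p) of Shannon's formula for the probabilities in the P row. A small sketch (variable names are ours):

```python
import math

pages = "ABCDEFGHI"

# Probability values from the P row (estimating values divided by their total).
p = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]

# One Shannon term per page: -p * log2(p).
terms = [-value * math.log2(value) for value in p]
entropy = sum(terms)  # H(p1, ..., pn)

for page, term in zip(pages, terms):
    print(page, round(term, 6))
print("Amount:", round(entropy, 3))
```

Because the probabilities are rounded to six places, the last digit of a term can differ slightly from the table.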
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behaviour of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_{n+1} = Z+(P_n) = d × P_n + (1 - d), 0 < d < 1 (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
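The behaviour of the operator Z+ is easy to see numerically. In the sketch below the starting value 0 and the choice d = 0.85 are assumptions made for illustration (the text only requires 0 < d < 1); repeated encouragement moves P towards the fixed point 1.

```python
def z_plus(p, d=0.85):
    """Apply the operator Z+ once: P_next = d * P + (1 - d)."""
    return d * p + (1 - d)


p = 0.0  # assumed starting estimate
trace = []
for _ in range(60):
    p = z_plus(p)
    trace.append(p)

print("after 1 step:", round(trace[0], 6))
print("after 10 steps:", round(trace[9], 6))
print("after 60 steps:", round(trace[-1], 6))
```

Starting from 0, the value after n applications is 1 - d^n, so the gap to 1 shrinks by a factor of d at every step.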
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Why this is critical
You must do enough "on the page" work and/or anchor text work to get into that subset of 2000 pages for your chosen key phrase, otherwise your high PageRank will be completely in vain. PageRank means nothing if you do not have enough ranking from other factors to make it into the first subset.
The difference between PageRank and other factors
To assess when PageRank is important and when it is not, we need to understand how PageRank is different from all other ranking factors. To do this, here's a quick table that lists a few other factors and how they add to the ranking score:

Title Tag: Can only be listed once.
Keywords in Body Text: Each successive repetition is less important. Proximity is important.
Anchor Text: Highly weighted, but like keywords in body text there is a cut-off point where further anchor text is no longer worthwhile.
PageRank: Potentially infinite. You are always capable of increasing your PageRank significantly, but it takes work.
All other ranking factors have cut-off points beyond which they will no longer add to your ranking score, or will not add significantly enough for it to be worthwhile. PageRank has no cut-off point.
Non-PageRank Factor Threshold
Having written about the difference between PageRank and other factors, and how PageRank is harder to get, it should now be clear that whilst we could use many methods to get good rankings, there is a threshold which defines when high PageRank is worth striving for and when it is not.
With ranking factors other than PageRank, there is a score beyond which each factor adds so little to the total that further effort is not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure on it of 1000.
If we have a query where the results are Page A and Page B, then Page A and Page B have scores for that query which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold, thus without any change in PageRank it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve for all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores are. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant". They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care less about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important". We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors

Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies are using PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, we can use the factors to different degrees depending on the strategy that best suits our style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up at a later time.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old and we suspect that it has changed somewhat since then. However, for detailing the over-riding principles of how PageRank works, it is as accurate today as the day it was written.
PR(A) = (1 - d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
Where PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
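Read as code, the formula is a one-line function. The sketch below is only illustrative: the function name `pagerank` and the sample PageRank values and link counts for the inbound pages are made up.

```python
def pagerank(inbound, d=0.85):
    """PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)).

    inbound is a list of (PR(T), C(T)) pairs, one per page T that links to Page A.
    """
    return (1 - d) + d * sum(pr_t / c_t for pr_t, c_t in inbound)


# Illustrative numbers: three pages link to Page A. Each currently has a
# PageRank of 0.15; the first has three links on it, the others one each.
inbound = [(0.15, 3), (0.15, 1), (0.15, 1)]
print(round(pagerank(inbound), 4))

# A page with no inbound links is left with just the (1 - d) term.
print(round(pagerank([]), 4))
```

Note that the function needs the PageRank of every inbound page as input, which is exactly why a single pass is not enough.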
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could, and is quite likely to, be Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
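The splitting of the vote can be put in numbers with a tiny sketch. The helper name `contribution` is ours, and it assumes the published formula with d = 0.85, where one linking page adds d × PR / C to the target.

```python
def contribution(pr_source, links_on_source, d=0.85):
    """PageRank passed along one link: d * PR(source) / C(source)."""
    return d * pr_source / links_on_source


one_link = contribution(5, 1)   # Page B has a single link, pointing to Page A
two_links = contribution(5, 2)  # Page B has two links, one of them to Page A

print("gain with one link on Page B:", one_link)
print("gain with two links on Page B:", two_links)
```

With two links on Page B, Page A gains exactly half of what it gained when it was the only page linked to.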
Now put the formula out of your mind for a moment as its easier to understandhow it works using a diagram Letrsquos say we have a hypothetical set of pagesimaginatively titled Page A Page B Page C and Page D They link to each otheras shown below
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around, in order to show that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.
Page C links to Page A; all 0.1275 passes to page A.
Page D links to Page C. Again, all 0.1275 passes to page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
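The three rules and the two passes worked through above can be sketched in Python. This is a minimal illustration, not Google's implementation: `pagerank_pass` and the dictionary layout are names chosen for this example, and the link structure is the one described in the walkthrough (A links to B, C and D; B to C; C to A; D to C).

```python
def pagerank_pass(links, ranks, d=0.85):
    """Apply the three rules once to every page.

    links maps each page to the list of pages it links to;
    ranks maps each page to its current PageRank.
    """
    new_ranks = {page: 0.0 for page in links}
    for page, targets in links.items():
        if not targets:
            continue  # a page with no links passes nothing on
        share = d * ranks[page] / len(targets)  # rule 1
        for target in targets:
            new_ranks[target] += share          # rule 2
    for page in new_ranks:
        new_ranks[page] += 1 - d                # rule 3
    return new_ranks


links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}
start = {page: 0.0 for page in links}   # every page starts at zero
first = pagerank_pass(links, start)     # every page becomes 0.15
second = pagerank_pass(links, first)    # the totals listed above
for page in sorted(second):
    print(page, round(second[page], 4))
```

Starting from zero, the first pass leaves every page at 0.15 and the second pass gives the totals listed above.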
So we've got:
[Figure: the four-page diagram with the new totals]
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach final values of:
[Figure: final PageRank values for Pages A, B, C and D]
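Carrying on until the values stop changing can be sketched as a loop. This is a rough illustration under stated assumptions: the function name `pagerank`, the tolerance of 1e-10 and the iteration cap are choices made for this example, and the number of passes it reports depends on that tolerance, so it need not match the count quoted from the spreadsheet.

```python
def pagerank(links, start=0.0, d=0.85, tol=1e-10, max_iter=1000):
    """Repeat the calculation until no page changes by tol or more."""
    ranks = {page: start for page in links}
    for iteration in range(1, max_iter + 1):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks, iteration
    raise RuntimeError("no convergence within max_iter iterations")


# The four-page example: A links to B, C and D; B to C; C to A; D to C.
links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}
final, passes = pagerank(links)
print(passes, {page: round(value, 10) for page, value in final.items()})
```

In this sketch the four final values add up to 4 (the number of pages), and Page C ends up with the largest value.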
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) that has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparably low costs. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
[Figure: link structure of Pages A, B, C and D]
PageRank for pages A, B, C, D at various stages of iteration

Iteration  A             B             C             D
Start      1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
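The stopping rule and the claim that the starting values do not matter can both be checked with a short sketch. One loud assumption: the diagram for this example is not reproduced in the text, so the link structure below (A links to B and C; B to C; C to A; D to C) is inferred from the first rows of the table, which it reproduces. The function name and the 1e-10 threshold are choices made for this illustration.

```python
def pagerank(links, start, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate until no page changes by tol or more; keep every stage."""
    ranks = {page: start for page in links}
    history = []
    for _ in range(max_iter):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        history.append(new_ranks)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks, history
    raise RuntimeError("no convergence within max_iter iterations")


# Structure inferred from the table (an assumption, see above).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
from_one, history_one = pagerank(links, start=1.0)
from_zero, history_zero = pagerank(links, start=0.0)
print(history_one[0])            # first row of the table
print(len(history_one), from_one)
print(len(history_zero), from_zero)
```

Both runs settle on the same limiting values; only the number of passes needed differs.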
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions by a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
[Figure: the earlier structure with Page E added and one link changed]
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
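The experiment can be imitated with a sketch that compares a fresh start with carrying on from the old values. Everything about the structures here is an assumption: the diagrams are not reproduced in the text, so `old_links` and `new_links` are hypothetical stand-ins (a four-page site, then the same site with a Page E added and one link changed), and the iteration counts they produce are not expected to match the 75 and 78 quoted above.

```python
def converge(links, ranks, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate from the given starting ranks; return (ranks, passes used)."""
    for passes in range(1, max_iter + 1):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks, passes
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical structures, for illustration only.
old_links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}
new_links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["E"], "E": ["C"]}

old_ranks, _ = converge(old_links, {page: 1.0 for page in old_links})

# Fresh start: every page begins at 1.
cold, cold_passes = converge(new_links, {page: 1.0 for page in new_links})

# Carry on from the old values; the new Page E begins at 1.
warm_start = dict(old_ranks, E=1.0)
warm, warm_passes = converge(new_links, warm_start)

print(cold_passes, warm_passes)
```

Whichever start is used, the two runs end at the same limiting values, so keeping the previous values costs nothing in accuracy.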
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run their iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
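The three figures quoted for Page A can be reproduced with a small sketch. The function name, the starting value of 1 and the 1e-10 tolerance are choices made for this illustration; the three tiny link structures are the ones described in the text.

```python
def pagerank(links, start=1.0, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate the PageRank calculation until it stops changing."""
    ranks = {page: start for page in links}
    for _ in range(max_iter):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks
    raise RuntimeError("no convergence within max_iter iterations")


alone = pagerank({"A": []})                                   # Page A on its own
pair = pagerank({"A": ["B"], "B": ["A"]})                     # A and B link to each other
trio = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})    # A also swaps links with C

for name, result in (("alone", alone), ("pair", pair), ("trio", trio)):
    print(name, round(result["A"], 10))
```

Each extra reciprocal link raises Page A's value, while the total across all pages in each little system still equals the number of pages with links pointing around it.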
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to reciprocal link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied to a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
    <a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
[Figure: example site without review pages, with final PageRank values]
If we perform the same calculations but include the review pages, here's what we get:
[Figure: example site with review pages, with final PageRank values]
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                      Without the Review Pages   With the Review Pages
The Home Page PR is   0.9536152797               2.439718935
B, C, D PR (each) is  0.4201909959               0.8412536982
Total is              2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it to go to is the Webmaster's easiest way to manipulate PageRank.
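The effect of the three closed structures can be sketched as follows. The diagrams are not reproduced in the text, so the four-page layouts below are assumptions: the hierarchical one (a home page A linking to B, C and D, each linking back to A) agrees with the table later in this paper, while the looping and extensively interlinked layouts are simply the most obvious four-page versions.

```python
def pagerank(links, start=1.0, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate the PageRank calculation until it stops changing."""
    ranks = {page: start for page in links}
    for _ in range(max_iter):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks
    raise RuntimeError("no convergence within max_iter iterations")


pages = "ABCD"
structures = {
    # Home page A links to B, C and D; each of them links back to A.
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    # Each page links only to the next one, and the last links to the first.
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    # Every page links to every other page.
    "extensive": {p: [q for q in pages if q != p] for p in pages},
}
results = {name: pagerank(links) for name, links in structures.items()}
for name, ranks in results.items():
    rounded = {page: round(value, 4) for page, value in ranks.items()}
    print(name, rounded, "total", round(sum(ranks.values()), 4))
```

Under these assumed layouts the total stays at 4 in every case; only the hierarchical layout moves PageRank towards one page (the home page), while the other two leave every page equal.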
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non-mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably the most misunderstood search engine concept in existence today. There are those who profess to know everything about PageRank and there are those who claim to not need to know anything. There are those who claim PageRank is important and there are those who claim it is unimportant. There are those who claim Google oversells PageRank and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1 - d) \cdot PR_i(A)$
That is, we are replacing the $(1 - d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of Page A deduced by the previous iteration.
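One way to read this substitution is sketched below. It is an interpretation, not a confirmed description of the authors' spreadsheet or of Google's method: each page's fixed base of (1 - d) is replaced by (1 - d) times that page's own value from the previous pass. The function name `iterate`, the flag `variable_coefficient` and the four-page hierarchical layout (A links to B, C and D, which each link back to A) are choices made for this illustration.

```python
def iterate(links, passes, variable_coefficient, d=0.85, start=1.0):
    """Run a fixed number of passes with either form of the base term."""
    ranks = {page: start for page in links}
    for _ in range(passes):
        new_ranks = {}
        for page in links:
            if variable_coefficient:
                new_ranks[page] = (1 - d) * ranks[page]  # (1 - d) * PR_i(page)
            else:
                new_ranks[page] = 1 - d                  # fixed co-efficient
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        ranks = new_ranks
    return ranks


links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
for passes in (5, 20, 200):
    fixed = iterate(links, passes, variable_coefficient=False)
    variable = iterate(links, passes, variable_coefficient=True)
    print(passes, round(fixed["A"], 6), round(variable["A"], 6))
```

In this sketch the variable form keeps the total at 4 and its swings die away faster than those of the fixed form, but it settles on slightly different values (about 2 for Page A rather than about 1.92), which fits the description of it as an approximation.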
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
38
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 15100000000 08300000000 08300000000 08300000000 400000000003 23430000000 05523333333 05523333333 05523333333 400000000004 17599000000 07467000000 07467000000 07467000000 400000000005 21680700000 06106433333 06106433333 06106433333 400000000006 18823510000 07058830000 07058830000 07058830000 400000000007 20823543000 06392152333 06392152333 06392152333 400000000008 19423519900 06858826700 06858826700 06858826700 400000000009 20403536070 06532154643 06532154643 06532154643 40000000000
10 19717524751 06760825083 06760825083 06760825083 4000000000011 20197732674 06600755775 06600755775 06600755775 4000000000012 19861587128 06712804291 06712804291 06712804291 40000000000
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
This gives a very good approximation of the values (A settles at 2.0000 instead of 1.9189, and B, C and D at 0.6667 instead of 0.6937) with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
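The two tables above can be reproduced with a few lines of Python. This is only a sketch: it assumes the four-page structure implied by the first rows of the tables (A links to B, C and D, and each of those links back to A), a dampening factor of 0.85, and a stopping tolerance of 1e-10 that is our own choice rather than anything the spreadsheets state.

```python
D = 0.85  # dampening factor used throughout the document


def step(a, b, accelerated):
    """One iteration for the hub site: A links to B, C, D; each links back to A.

    B, C and D always share one value, so a single number `b` stands for all three.
    """
    # Classic scheme: constant coefficient (1 - D).
    # Accelerated scheme: coefficient recalculated as (1 - D) * previous value.
    coeff_a = (1 - D) * a if accelerated else (1 - D)
    coeff_b = (1 - D) * b if accelerated else (1 - D)
    new_a = coeff_a + D * 3 * b   # three inbound links, each from a page with 1 outlink
    new_b = coeff_b + D * a / 3   # A splits its vote across its 3 outlinks
    return new_a, new_b


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Iterate from 1.0 until successive values differ by less than `tol`."""
    a, b = 1.0, 1.0
    for i in range(1, max_iter + 1):
        new_a, new_b = step(a, b, accelerated)
        if abs(new_a - a) < tol and abs(new_b - b) < tol:
            return new_a, new_b, i
        a, b = new_a, new_b
    raise RuntimeError("did not converge")


classic = iterate(accelerated=False)   # (A, B, iterations used)
fast = iterate(accelerated=True)
print(classic, fast)
```

The exact iteration counts depend on the tolerance chosen, but the accelerated scheme should need markedly fewer iterations, because its error shrinks by a factor of 0.7 per step instead of 0.85.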
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because page B meets some criteria we define (e.g. it has the correct amount of text). So we could set page B to
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to
$0.6 \times 0.15$
i.e. [$0.4n + 0.6 + 0.6(n - 1) = n$, where n is the number of Pages in the Site, so the co-efficients still total $0.15n$]
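As a quick check that the total really is unchanged, here is the arithmetic for a four-page site (n = 4 is our assumption, chosen to match the small example site used earlier):

```latex
\underbrace{(0.4 \cdot 4 + 0.6) \times 0.15}_{\text{Page B}}
\;+\; 3 \times \underbrace{0.6 \times 0.15}_{\text{each other page}}
\;=\; 0.33 + 0.27
\;=\; 0.60
\;=\; 4 \times 0.15
```

Page B receives 0.33 instead of 0.15, the other three pages receive 0.09 each, and the sum is still 0.60.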
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
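The effect can be sketched in Python. Two things here are assumptions rather than facts from the spreadsheets: that the "previous example site structure" is the four-page hub (A links to B, C, D; each links back to A), and the helper name `pagerank`, which is ours. The co-efficients are the ones given above for n = 4.

```python
D = 0.85  # dampening factor


def pagerank(links, coeff, iterations=200):
    """Iterate PR(m) = coeff[m] + D * sum(PR(k) / outlinks(k)) over pages k linking to m.

    `links` maps each page to the list of pages it links to;
    `coeff` maps each page to its own normalization co-efficient.
    """
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = dict(coeff)                      # start each page at its co-efficient
        for src, targets in links.items():
            share = D * pr[src] / len(targets)  # vote passed along each outlink
            for dst in targets:
                new[dst] += share
        pr = new
    return pr


# Assumed structure of the example site.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(links)

uniform = {page: 0.15 for page in links}
favour_b = {page: 0.6 * 0.15 for page in links}
favour_b["B"] = (0.4 * n + 0.6) * 0.15

normal = pagerank(links, uniform)
boosted = pagerank(links, favour_b)
print(normal)
print(boosted)
```

Under these assumptions Page B rises from about 0.694 to about 0.865, while A, C and D all fall, and the total stays at 4.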
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
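The limiting values can also be worked out by hand. The derivation below is a sketch that assumes the same four-page hub structure as before (A links to B, C and D; each links back to A) with d = 0.85; the spreadsheet itself may use a different layout.

```latex
\begin{aligned}
A &= 0.18 + 0.85\,(B + C + D), \qquad
B = 0.06 + \tfrac{0.85}{3}A, \qquad
C = D = 0.18 + \tfrac{0.85}{3}A \\[4pt]
A &= 0.18 + 0.85\,(0.42 + 0.85A) = 0.537 + 0.7225A
\;\Longrightarrow\; A = \frac{0.537}{0.2775} \approx 1.9351 \\[4pt]
B &\approx 0.6083, \qquad C = D \approx 0.7283, \qquad A + B + C + D = 4
\end{aligned}
```

Compared with the uniform co-efficient of 0.15 (A ≈ 1.9189, B = C = D ≈ 0.6937), Page B drops while the pages it exchanges links with gain, and the total is unchanged.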
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ ($k = 1, \dots, n$; $m = 1, \dots, n$).
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages (n is the number of pages).
Components $a_{km}$, which are on the intersection of row k and column m in the matrix of links, determine the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
        T1      T2      ...     Tn      Number of links off the page
T1      a11     a12     ...     a1n     Cr(T1)
T2      a21     a22     ...     a2n     Cr(T2)
...     ...     ...     ...     ...     ...
Tn      an1     an2     ...     ann     Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km} \, PR_I(T_k)}{Cr(T_k)}$$
Where:
$PR_{I+1}(T_m)$ is the Page Rank of the page $T_m$ on iteration (I+1) (the estimating value of the page $T_m$, m = 1, ..., n).
$PR_I(T_k)$ is the Page Rank of the page $T_k$ on iteration I (the estimating value of the page $T_k$, k = 1, ..., n).
$Cr(T_k)$ is the total number of links off the page $T_k$.
$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1, ..., n; m = 1, ..., n).
d is a dampening factor (usually 0.85).
(1-d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values for the pages of a Web site while taking into account links of any complexity between its pages.
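One iteration of the scheme translates almost directly into Python. This is a minimal sketch: the function name `pagerank_step` and the three-page matrix are hypothetical illustrations, and the guard for pages with no outgoing links is our own addition (the text does not say how such pages are handled).

```python
def pagerank_step(a, pr, d=0.85):
    """Apply one iteration: PR_next(m) = (1 - d) + d * sum_k a[k][m] * PR(k) / Cr(k).

    `a[k][m]` is the number of links from page k to page m (the matrix of links);
    `pr` holds the current estimating values.
    """
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total links off each page
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)         # pages with no outlinks (cr[k] == 0) pass on nothing
    ]


# Hypothetical multigraph: page 0 links twice to page 1 and once to page 2;
# pages 1 and 2 each link back to page 0.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
values = pagerank_step(a, [1.0, 1.0, 1.0])
print(values)
```

Because an entry of the matrix can be greater than one, a page that is linked twice from the same source receives twice the share, which is the reason for modelling the site as a multigraph.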
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
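The matrix is small enough to check mechanically. The sketch below copies the rows of the table into Python, recomputes the Cr column, and counts links that leave the first subgraph; the variable names are ours.

```python
PAGES = "ABCDEFGHI"

# Rows are source pages, columns are target pages (copied from the matrix above).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T): total number of links off each page (row sums).
outlinks = {page: sum(row) for page, row in zip(PAGES, LINKS)}

# Number of links pointing at each page (column sums).
inlinks = {page: sum(row[m] for row in LINKS) for m, page in enumerate(PAGES)}

# Links that leave the first subgraph (A-D) for the second (E-I).
escaping = sum(LINKS[k][m] for k in range(4) for m in range(4, 9))

print(outlinks, inlinks, escaping)
```

No link runs from A, B, C or D to any of E to I, so whatever value flows into the first subgraph never flows back out.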
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration i + 1) can be calculated by the formula $(1 - d) \, PR_i(T)$.
When the accelerated technology of the computation of the estimating values of the site pages is applied, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
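The accelerated table can be reproduced with the sketch below. It encodes the matrix of links as a mapping from each page to the pages it links to, uses d = 0.85, and treats any limiting value below 1e-9 as zero; that threshold and the fixed count of 300 iterations are our assumptions.

```python
D = 0.85  # dampening factor

# Each page mapped to the pages it links to (same data as the matrix of links).
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI", "H": "ABCDEG", "I": "ABCDG",
}


def accelerated_step(pr):
    """One iteration with the coefficient recalculated per page as (1 - D) * PR_i(T)."""
    new = {page: (1 - D) * value for page, value in pr.items()}
    for src, targets in LINKS.items():
        share = D * pr[src] / len(targets)   # value passed along each outlink
        for dst in targets:
            new[dst] += share
    return new


pr = {page: 1.0 for page in LINKS}
first = accelerated_step(pr)                 # should match row 1 of the table
for _ in range(300):
    pr = accelerated_step(pr)

# Pages whose limiting value is (numerically) zero are the external pages.
external = sorted(page for page, value in pr.items() if value < 1e-9)
print(first, pr, external)
```

The total stays at 9 on every iteration, so everything that drains out of E to I ends up shared among the base pages A to D.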
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which does not change during the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
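The classic limiting values and the probabilities derived from them can be recomputed as follows. As before, this is a sketch under our own assumptions: d = 0.85, a start value of 1.0 for every page, and a fixed 400 iterations, which is far more than the table needs.

```python
D = 0.85  # dampening factor

# Each page mapped to the pages it links to (same data as the matrix of links).
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI", "H": "ABCDEG", "I": "ABCDG",
}


def classic_step(pr):
    """One iteration with the constant normalization coefficient (1 - D)."""
    new = {page: 1 - D for page in pr}
    for src, targets in LINKS.items():
        share = D * pr[src] / len(targets)
        for dst in targets:
            new[dst] += share
    return new


pr = {page: 1.0 for page in LINKS}
for _ in range(400):
    pr = classic_step(pr)

total = sum(pr.values())                                   # stays at 9
prob = {page: value / total for page, value in pr.items()}  # sums to 1
print(pr, prob)
```

Unlike the accelerated scheme, the classic scheme leaves the external pages E to I with small but non-zero values.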
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with set probabilities:
Experiment outcome   A1   A2   ...   An
Probabilities        p1   p2   ...   pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1) \log p(A_1) - p(A_2) \log p(A_2) - \dots - p(A_n) \log p(A_n)$$
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages (logarithms are to be calculated on base 2). We obtain the following set of Page Rank values for the pages of the site examined:
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows   1         1         1         1         1         1         1         1         1         1
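The PR row above is what you get by evaluating each term of Shannon's formula separately. A short sketch, using the probabilities from the P row as input (the variable names are ours):

```python
import math

# Probabilities for pages A..I, taken from the P row of the table.
p = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]

# One term of Shannon's formula per page: -p * log2(p).
terms = [-x * math.log2(x) for x in p]

# The "Amount" column is the sum of the terms, i.e. the entropy H in bits.
entropy = sum(terms)

for page, term in zip("ABCDEFGHI", terms):
    print(page, round(term, 6))
print("Amount", round(entropy, 3))
```

Note that this transformation flattens the differences: page B has nearly three times the probability of page A, but its term is only about a quarter larger.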
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d \, P_n + (1 - d), \quad 0 < d < 1$$ (P is the estimating value)
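Repeated application of the operator has a simple closed form, which shows why the recursion settles down. The derivation below is our own worked step, not part of the original text; the numbers use d = 0.85 and a starting value of 0 purely for illustration.

```latex
\begin{aligned}
P_{n+1} - 1 &= d\,(P_n - 1)
\;\Longrightarrow\;
P_n = 1 - d^{\,n}\,(1 - P_0),
\qquad \lim_{n \to \infty} P_n = 1 \quad (0 < d < 1) \\[4pt]
d = 0.85,\; P_0 = 0:&\quad P_1 = 0.15,\quad P_2 = 0.2775,\quad P_3 = 0.385875 = 1 - 0.85^{3}
\end{aligned}
```

The gap to the limiting value shrinks by the factor d at every step, the same geometric behaviour seen in the iteration tables.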
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research we would recommendvisiting the following resources
PageRank Question and Answer Areahttpwwwsupportforumsorgpagerankqampa
Search Engine Marketing ndash the essential best practice guide (an in depth view ofthe other aspects of how search engine work)httpwwwsupportforumsorgpageranksem
PageRank Bringing order to the webhttphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Enginehttpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Googlersquos) by Mark Horrellhttpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read: it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
not worthwhile. This is the Non-PageRank Factor Threshold. To illustrate this, let's put an example figure on it: 1000.
If we have a query where the results are Page A and Page B, then Pages A and B have scores for that query which are the total scores for all ranking factors (including PageRank). Let's say Page A's score is 900 and Page B's score is 500. Obviously Page A will be listed first. These are both below our hypothetical Non-PageRank Factor Threshold; thus, without any change in PageRank, it is possible for Page B to improve its optimization to beat Page A for this particular query. There are lots of queries like this on Google; they're more commonly thought of as less competitive queries.
Now assume Page A raises its score to 1100. Suddenly Page B cannot compete in the SERPs (search engine results pages) without increasing its PageRank. In all probability Page B must also improve on all the other ranking factors, but an increase in PageRank is almost certainly necessary. There are also lots of queries like this on Google, which are more commonly thought of as more competitive queries.
Generally, when querying Google, the group of pages in the SERPs will contain some pages that have a score above the Non-PageRank Factor Threshold and some that do not.
There is an important point to be made here:
To be competitive you must raise your page's search engine ranking score beyond the Non-PageRank Factor Threshold. To fail to do so means that you can easily be beaten in the search results for your query terms. The quickest way to approach the Non-PageRank Factor Threshold is through "on the page" factors; however, you cannot move above the Non-PageRank Factor Threshold without PageRank.
The obvious question is: what's the numerical value of the Non-PageRank Factor Threshold, and how much work do you need to do to get past it? The answer is that it has no value; it is a hypothetical line. Google could put a value on it, but that would not help us unless we knew what the pages' individual scores were. We need only be aware that the threshold exists and that it gives us information about principles.
Using the threshold to derive the worth of two ranking strategies
The threshold explains principles and the different ways that search engine marketers work. It also demonstrates why some of the misunderstandings about PageRank occur. Let's consider the strategies of two people: Person A considers PageRank to be unimportant, and Person B considers PageRank to be very important.
Person A says "PageRank is unimportant." They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care less about PageRank.
What's happening: Person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank is important." We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: Person B is doing the reverse of Person A. Whilst Person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, Person B concentrates on the PageRank factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, Person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors

Person B
Advantages:
• Solid position. Can easily modify "on the page" factors for a quick boost if required
• Probably higher traffic from non-search-engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies use PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use the factors to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank Factor Threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
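The idea of a minimum PageRank level can be sketched with the multiplicative score above. This is a toy model, not Google's scoring: the cap of 1000 on the Non-PageRank factors reuses the hypothetical threshold figure from earlier, and the function names and sample scores are assumptions made for the example.

```python
NON_PAGERANK_CAP = 1000.0  # hypothetical maximum benefit of the Non-PageRank factors


def final_rank_score(non_pagerank_score, pagerank):
    """Final Rank Score = (Non-PageRank score, capped at the threshold) x (actual PageRank)."""
    return min(non_pagerank_score, NON_PAGERANK_CAP) * pagerank


def minimum_pagerank(score_to_beat):
    """PageRank needed to match `score_to_beat` when the on-the-page factors are already maxed out."""
    return score_to_beat / NON_PAGERANK_CAP


print(final_rank_score(5000.0, 2.0))  # extra on-the-page work beyond the cap adds nothing
print(minimum_pagerank(3000.0))       # the query-specific minimum PageRank level in this toy model
```

In this sketch, once the on-the-page factors hit the cap, the only way to raise the final score is to raise PageRank, which is the point the paragraph above makes.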
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the overriding principles of how PageRank works, it is as accurate today as the day it was written.
$$PR(A) = (1 - d) + d\left(\frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)}\right)$$
Where:
PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, httpwww-dbstanfordedu~backrubgooglehtml
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us, however you slice it and however the formula may have changed since it was written, is this:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
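The made-up numbers above can be checked with a short Python sketch. It uses the d x PR / C term of the published formula with d = 0.85; the function name is hypothetical and the PageRank of 5 is the demonstration value from the text, not a real measurement.

```python
D = 0.85  # dampening factor from the published formula


def share_per_link(pagerank, outbound_links, d=D):
    """PageRank a page passes through each one of its outbound links."""
    return d * pagerank / outbound_links


one_link = share_per_link(5.0, 1)   # Page B links only to Page A
two_links = share_per_link(5.0, 2)  # Page B links to Page A and one other page
print(one_link, two_links)
```

With two links, each link carries half of what the single link carried, while the amount Page B hands out in total is unchanged.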
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages, imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below.
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to show that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to Page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to Page C, Page C will get it all.
Page C links to Page A; all 0.1275 passes to Page A.
Page D links to Page C. Again, all 0.1275 passes to Page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
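The three rules and the two passes worked through above can be sketched in Python as follows. The link structure is the one described in the text (A links to B, C and D; B to C; C to A; D to C); the names `LINKS` and `iterate_once` are hypothetical, and this is an illustration of the published formula rather than of how Google implements it.

```python
D = 0.85  # dampening factor

# Link structure of the worked example: page -> pages it links to.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def iterate_once(ranks, links=LINKS, d=D):
    """Run one pass of the three rules and return the new PageRank totals."""
    new = {page: 1 - d for page in links}        # rule 3: every page gets the 0.15 base
    for page, targets in links.items():
        if not targets:
            continue                             # a page without links passes nothing on
        share = d * ranks[page] / len(targets)   # rule 1: 0.85 * PageRank / number of links
        for target in targets:
            new[target] += share                 # rule 2: add the share to each linked page
    return new


start = {page: 0.0 for page in LINKS}  # starting values of zero, as in the text
first = iterate_once(start)
second = iterate_once(first)
print(first)
print(second)
```

The first pass gives every page 0.15, and the second pass reproduces the totals listed above.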
So we've got:
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice, Google probably doesn't wait for this convergence but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach final values of:
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
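A sketch of running a fixed number of passes over the same four-page structure is given below. The pass counts of 50 and 500 are arbitrary assumptions for the illustration, and the function names are hypothetical; the sketch makes no claim about how many iterations Google actually runs.

```python
# Same four-page structure as the worked example: page -> pages it links to.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def run_passes(links, passes, d=0.85, start=0.0):
    """Repeat the PageRank calculation a fixed number of times."""
    ranks = {page: start for page in links}
    for _ in range(passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        ranks = new
    return ranks


def order(ranks):
    """Pages sorted from highest to lowest PageRank."""
    return sorted(ranks, key=ranks.get, reverse=True)


early = run_passes(LINKS, 50)    # a limited number of passes
late = run_passes(LINKS, 500)    # effectively converged
print(order(early), order(late))
print(late)
```

In this small system the ordering of the pages is already settled after the smaller number of passes: Page C comes first, then Page A, with Pages B and D tied, and the values sum to the number of pages.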
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparably low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration

Iteration  A             B             C             D
Start      1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
...
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
...
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice, there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above, we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
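The stopping rule can be sketched as below. Note the assumption: the diagram for this example is not reproduced in the text, so the link structure used here (A links to B and C; B to C; C to A; D to C) is inferred from the first rows of the table, which it reproduces. The function names and the `max_passes` safety limit are hypothetical.

```python
# Assumed structure, inferred from the table values: page -> pages it links to.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def one_pass(ranks, links, d):
    """One synchronous pass: every new value is computed from the previous pass."""
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def iterate_until_stable(links, d=0.85, start=1.0, tolerance=1e-10, max_passes=10_000):
    """Stop once no page changes by `tolerance` or more between two passes."""
    ranks = {page: start for page in links}
    for passes in range(1, max_passes + 1):
        new = one_pass(ranks, links, d)
        change = max(abs(new[page] - ranks[page]) for page in links)
        ranks = new
        if change < tolerance:
            return ranks, passes
    raise RuntimeError("PageRank values did not settle within max_passes")


first = one_pass({page: 1.0 for page in LINKS}, LINKS, 0.85)
final, passes = iterate_until_stable(LINKS)
print(first)
print(final, passes)
```

Under that assumed structure, the first pass matches iteration 1 of the table and the loop settles on the limiting values shown in its last rows.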
Derived specifics of how Google calculates PageRank
We started our values from zero, and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank, by definition, will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view the very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
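The three figures quoted for Page A can be reproduced with a short Python sketch of the published formula. The `pagerank` helper, the starting value of 1 and the fixed count of 500 passes are assumptions made for the illustration.

```python
def pagerank(links, d=0.85, passes=500):
    """Iterate the published formula a fixed number of times over a small link map."""
    ranks = {page: 1.0 for page in links}
    for _ in range(passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        ranks = new
    return ranks


alone = pagerank({"A": []})                                         # Page A on its own
with_b = pagerank({"A": ["B"], "B": ["A"]})                         # A and B link to each other
with_b_and_c = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A also trades links with C

print(alone["A"], with_b["A"], with_b_and_c["A"])
print(sum(with_b_and_c.values()))
```

Page A's value rises from 0.15 to 1 and then to about 1.4594594595, while the total across the three-page system stays at 3: the PageRank is redistributed towards Page A rather than created.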
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entries in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations; would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because it means the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
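The comparison can be sketched with the d x PR / C term of the published formula. The link counts of 5 and 50 are invented for the example, and the sketch treats the numbers 4 and 6 as raw PageRank values, which is an assumption: values shown by the Toolbar need not behave this simply.

```python
def share_per_link(pagerank, links_on_page, d=0.85):
    """PageRank passed along each link on a page, per the published formula."""
    return d * pagerank / links_on_page


# Hypothetical candidates: the link counts are made up for illustration.
candidates = {
    "PR 4 page with 5 links": share_per_link(4.0, 5),
    "PR 6 page with 50 links": share_per_link(6.0, 50),
}
best = max(candidates, key=candidates.get)
print(candidates)
print(best)
```

In this made-up case the lower PageRank page passes on more per link, because its vote is split fewer ways.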
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up in its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                     Without the Review Pages   With the Review Pages
Home Page PR         0.9536152797               2.439718935
B, C, D PR (each)    0.4201909959               0.8412536982
Total                2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus, the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final values for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
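The closed-system totals can be checked with a short script. This is only a sketch: the original diagrams are not reproduced here, so the three four-page layouts below are assumptions (home page A linking to and from B, C, D for the hierarchy; a single ring for looping; every page linking to every other for extensive interlinking). The function name `pagerank` and the fixed iteration count are likewise illustrative choices, not taken from the spreadsheets.

```python
D = 0.85  # damping factor used throughout the document


def pagerank(links, d=D, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = {page: 1.0 - d for page in links}
        for source, targets in links.items():
            share = d * pr[source] / len(targets)  # vote passed along each link
            for target in targets:
                new[target] += share
        pr = new
    return pr


# Assumed four-page layouts (hypothetical stand-ins for the missing diagrams).
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {
        "A": ["B", "C", "D"],
        "B": ["A", "C", "D"],
        "C": ["A", "B", "D"],
        "D": ["A", "B", "C"],
    },
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    ranks = ", ".join(f"{page}={value:.4f}" for page, value in pr.items())
    print(f"{name:12s} total={sum(pr.values()):.4f}  {ranks}")
```

Under these assumptions every structure totals 4, but only the hierarchy concentrates PageRank on one page; the other two leave every page at 1.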
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its vote is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
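The effect of adding links to the page that carries the outbound link follows from the PageRank formula used earlier in this document, in which a page passes d × PR / C to each page it links to. As a rough illustration, assume a page P with one outbound link and k internal links (k is a symbol introduced here for the sketch, not one used by the authors):

$$\text{passed off-site} = \frac{d \, PR(P)}{k + 1}, \qquad \text{kept on-site} = \frac{k \, d \, PR(P)}{k + 1}$$

With d = 0.85 and PR(P) = 1, a page with k = 3 internal links sends 0.2125 off-site, whereas a page with k = 9 internal links sends only 0.085.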
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not doing so. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the value from the previous computation, using the formula:
$$(1-d)\,PR_i(A)$$
That is, we are replacing the (1-d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values (2.0 and 0.667, against 1.919 and 0.694 from the fixed co-efficient) with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
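The two schemes compared above can be reproduced with a few lines of code. The sketch below assumes the hierarchical structure is home page A linking to B, C and D, each of which links back to A (this matches the figures in the tables). The stopping rule, a largest change below 1e-10, is an assumption, since the spreadsheets do not state one, so the exact iteration counts may differ slightly from 148 and 67.

```python
D = 0.85

# Hierarchical example: home page A links to B, C, D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def iterate(links, accelerated, d=D, tol=1e-10, max_iter=1000):
    """Return (final values, iterations used).

    Classic scheme:      PR_{i+1}(p) = (1 - d)           + d * sum(PR_i(q) / C(q))
    Accelerated scheme:  PR_{i+1}(p) = (1 - d) * PR_i(p) + d * sum(PR_i(q) / C(q))
    """
    pr = {page: 1.0 for page in links}
    for step in range(1, max_iter + 1):
        new = {
            page: (1.0 - d) * (pr[page] if accelerated else 1.0) for page in links
        }
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tol:  # assumed stopping rule; not taken from the spreadsheets
            return pr, step
    return pr, max_iter


classic, classic_steps = iterate(LINKS, accelerated=False)
fast, fast_steps = iterate(LINKS, accelerated=True)
print(f"classic:     A={classic['A']:.6f} B={classic['B']:.6f} after {classic_steps} iterations")
print(f"accelerated: A={fast['A']:.6f} B={fast['B']:.6f} after {fast_steps} iterations")
```

Note that the two schemes settle on slightly different values, so the accelerated result is an approximation of the classic one rather than the same number reached sooner.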
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
$$(0.4n + 0.6) \times 0.15$$
as long as we set the other pages to
$$0.6 \times 0.15$$
i.e. $0.4n + 0.6 + 0.6(n-1) = n$, where n is the number of pages in the site, so the co-efficients still total $0.15n$.
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
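To see the effect numerically, here is a minimal sketch that gives each page its own normalization co-efficient. It assumes the "previous example" is the four-page hierarchy (A linking to and from B, C, D), which may not match the original spreadsheet exactly; the function name `pagerank_with_coefficients` is hypothetical.

```python
D = 0.85
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}  # assumed layout


def pagerank_with_coefficients(links, coefficients, d=D, iterations=300):
    """Classic iteration, but each page has its own normalization co-efficient."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = dict(coefficients)  # start each page from its own co-efficient
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        pr = new
    return pr


n = len(LINKS)
uniform = {page: 1.0 - D for page in LINKS}        # 0.15 for every page
favour_b = {page: 0.6 * 0.15 for page in LINKS}    # 0.09 for ordinary pages
favour_b["B"] = (0.4 * n + 0.6) * 0.15             # 0.33 for the favoured page

baseline = pagerank_with_coefficients(LINKS, uniform)
boosted = pagerank_with_coefficients(LINKS, favour_b)
for page in LINKS:
    print(f"{page}: {baseline[page]:.4f} -> {boosted[page]:.4f}")
```

Because the co-efficients still add up to 0.15n, the site total stays at 4; only the distribution between pages changes.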
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank. It is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
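As a rough check on the direction of the change, the final values can be worked out by hand. The following is only a sketch under the assumption that the site is the four-page hierarchy (A linking to and from B, C and D); the actual spreadsheet layout may differ. Writing x = PR(A), z = PR(B) and y = PR(C) = PR(D), with d = 0.85:

$$x = 0.18 + 0.85\,(z + 2y), \qquad z = 0.06 + \frac{0.85\,x}{3}, \qquad y = 0.18 + \frac{0.85\,x}{3}$$

Substituting z and y into the first equation gives

$$x = 0.18 + 0.85 \times 0.42 + 0.7225\,x \;\Rightarrow\; x = \frac{0.537}{0.2775} \approx 1.9351, \qquad z \approx 0.6083, \qquad y \approx 0.7283$$

Compared with the uniform co-efficient of 0.15 (A ≈ 1.9189, B = C = D ≈ 0.6937), Page B drops while the other pages rise, and the total remains 4.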
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1, …, n; m = 1, …, n).
What does that mean? Let us explain the designations:
T_1, T_2, …, T_n correspond to Web site pages (n is the number of pages).
Components a_km, which are at the intersection of row k and column m of the matrix of links, give the number of links off T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)
       T_1    T_2    …    T_n    Number of links off the page
T_1    a_11   a_12   …    a_1n   Cr(T_1)
T_2    a_21   a_22   …    a_2n   Cr(T_2)
…      …      …      …    …      …
T_n    a_n1   a_n2   …    a_nn   Cr(T_n)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\,PR_I(T_k)}{Cr(T_k)}$$
where:
PR_{I+1}(T_m) is the PageRank of the page T_m on iteration I+1 (the estimating value of the page T_m, m = 1, …, n).
PR_I(T_k) is the PageRank of the page T_k on iteration I (the estimating value of the page T_k, k = 1, …, n).
Cr(T_k) is the total number of links off the page T_k.
Cm(T_k) = a_km is the number of individual links off the page T_k to the page T_m (k = 1, …, n; m = 1, …, n).
d is a dampening factor (usually 0.85).
(1-d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computation scheme allows us to compute the estimating values of the pages of a Web site while taking into account the potentially complex pattern of links between them.
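The computation scheme above translates almost directly into code. The sketch below is illustrative only: the function name `pagerank_step` and the three-page multigraph are hypothetical, chosen to show a page that links to the same target more than once.

```python
def pagerank_step(matrix, pr, d=0.85):
    """One iteration of PR_{I+1}(T_m) = (1 - d) + d * sum_k a_km * PR_I(T_k) / Cr(T_k)."""
    n = len(matrix)
    cr = [sum(row) for row in matrix]  # Cr(T_k): total number of links off page T_k
    return [
        # Pages with no outgoing links (Cr = 0) are skipped to avoid dividing by zero.
        (1.0 - d) + d * sum(matrix[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical three-page multigraph: T1 links twice to T2 and once to T3,
# while T2 and T3 each link once to T1. Entry [k][m] counts links from T_k to T_m.
A = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

first = pagerank_step(A, [1.0, 1.0, 1.0])
pr = first
for _ in range(200):
    pr = pagerank_step(A, pr)
print("after one iteration:", [round(value, 4) for value in first])
print("after many iterations:", [round(value, 4) for value in pr])
```

Because T1 links to T2 twice, T2 receives two thirds of T1's vote and T3 only one third, which is the distinction the multigraph view is meant to capture.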
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
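[Figure: multigraph of the site, showing pages A, B, C, D, E, F, G, H, I grouped into two subgraphs labelled 1 and 2]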
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the PR_{i+1}(T) computation (on iteration i+1) can be calculated by the formula (1-d) PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the estimating values of pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
Let's calculate the estimating values for the pages of the Web site examined, with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
Start 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 9.000
1 1.206429 3.473095 1.206429 1.206429 0.291667 0.441429 0.631667 0.271429 0.271429 9.000
2 1.419967 3.512317 1.419967 1.419967 0.188452 0.276286 0.309638 0.226702 0.226702 9.000
3 1.332416 3.958177 1.332416 1.332416 0.182116 0.219636 0.267624 0.187599 0.187599 9.000
4 1.430747 3.706925 1.430747 1.430747 0.176577 0.213457 0.245806 0.182497 0.182497 9.000
5 1.353327 3.951436 1.353327 1.353327 0.175854 0.209866 0.243166 0.179848 0.179848 9.000
6 1.420726 3.752137 1.420726 1.420726 0.175478 0.209422 0.241730 0.179527 0.179527 9.000
7 1.363844 3.923590 1.363844 1.363844 0.175433 0.209184 0.241554 0.179353 0.179353 9.000
8 1.412299 3.778418 1.412299 1.412299 0.175408 0.209155 0.241460 0.179332 0.179332 9.000
9 1.371139 3.901949 1.371139 1.371139 0.175405 0.209140 0.241448 0.179320 0.179320 9.000
10 1.406132 3.796985 1.406132 1.406132 0.175404 0.209138 0.241442 0.179319 0.179319 9.000
11 1.376390 3.886213 1.376390 1.376390 0.175403 0.209137 0.241441 0.179318 0.179318 9.000
12 1.401671 3.810371 1.401671 1.401671 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
13 1.380182 3.874838 1.380182 1.380182 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
14 1.398448 3.820041 1.398448 1.398448 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
15 1.382922 3.866618 1.382922 1.382922 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
16 1.396119 3.827028 1.396119 1.396119 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
17 1.384901 3.860680 1.384901 1.384901 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
18 1.394436 3.832076 1.394436 1.394436 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
19 1.386332 3.856389 1.386332 1.386332 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
20 1.393220 3.835723 1.393220 1.393220 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
21 1.387365 3.853289 1.387365 1.387365 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
22 1.392342 3.838358 1.392342 1.392342 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
23 1.388112 3.851049 1.388112 1.388112 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
24 1.391708 3.840261 1.391708 1.391708 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
25 1.388651 3.849431 1.388651 1.388651 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
26 1.391249 3.841637 1.391249 1.391249 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
27 1.389041 3.848262 1.389041 1.389041 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
28 1.390918 3.842631 1.390918 1.390918 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
29 1.389322 3.847417 1.389322 1.389322 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
30 1.390678 3.843349 1.390678 1.390678 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
31 1.389526 3.846807 1.389526 1.389526 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
32 1.390506 3.843867 1.390506 1.390506 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
33 1.389673 3.846366 1.389673 1.389673 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
34 1.390381 3.844242 1.390381 1.390381 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
35 1.389779 3.846048 1.389779 1.389779 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
36 1.390290 3.844513 1.390290 1.390290 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
37 1.389856 3.845817 1.389856 1.389856 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
38 1.390225 3.844709 1.390225 1.390225 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
39 1.389911 3.845651 1.389911 1.389911 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
40 1.390178 3.844850 1.390178 1.390178 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
41 1.389951 3.845531 1.389951 1.389951 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
42 1.390144 3.844952 1.390144 1.390144 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
43 1.389980 3.845444 1.389980 1.389980 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
44 1.390119 3.845026 1.390119 1.390119 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
45 1.390001 3.845381 1.390001 1.390001 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
46 1.390102 3.845079 1.390102 1.390102 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
47 1.390016 3.845336 1.390016 1.390016 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
48 1.390089 3.845118 1.390089 1.390089 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
49 1.390027 3.845303 1.390027 1.390027 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
50 1.390080 3.845146 1.390080 1.390080 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
51 1.390035 3.845280 1.390035 1.390035 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
52 1.390073 3.845166 1.390073 1.390073 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
53 1.390041 3.845263 1.390041 1.390041 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
54 1.390068 3.845180 1.390068 1.390068 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
55 1.390045 3.845250 1.390045 1.390045 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
56 1.390064 3.845191 1.390064 1.390064 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
57 1.390048 3.845241 1.390048 1.390048 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
58 1.390062 3.845198 1.390062 1.390062 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
59 1.390050 3.845235 1.390050 1.390050 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
60 1.390060 3.845204 1.390060 1.390060 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
61 1.390051 3.845230 1.390051 1.390051 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
62 1.390059 3.845208 1.390059 1.390059 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
63 1.390052 3.845227 1.390052 1.390052 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
64 1.390058 3.845211 1.390058 1.390058 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
65 1.390053 3.845224 1.390053 1.390053 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
66 1.390057 3.845213 1.390057 1.390057 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
67 1.390054 3.845223 1.390054 1.390054 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
68 1.390057 3.845214 1.390057 1.390057 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
69 1.390054 3.845221 1.390054 1.390054 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
70 1.390056 3.845215 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
71 1.390055 3.845220 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
72 1.390056 3.845216 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
73 1.390055 3.845220 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
74 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
75 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
76 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
77 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
78 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
79 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
80 1.390056 3.845218 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
81 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
82 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
83 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
84 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
85 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which does not change while the iteration process runs, gives a set of probability values (the sum of all probability values is equal to one).
A B C D E F G H I Amount
P 0.154451 0.427246 0.154451 0.154451 0.019489 0.023237 0.026827 0.019924 0.019924 1.000
As a parallel demonstration, we would like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome: A1, A2, ..., An
Probabilities: p1, p2, ..., pn
H(p1, p2, ..., pn) = - p(A1) log p(A1) - p(A2) log p(A2) - ... - p(An) log p(An)
Let's calculate the individual terms - p log p for the set of probability values which correspond to the site pages. This gives the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
A B C D E F G H I Amount
PR 0.416211 0.524171 0.416211 0.416211 0.110722 0.126119 0.140040 0.112558 0.112558 2.375
The Toolbar shows 1 1 1 1 1 1 1 1 1 1
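The PR row above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the list `P` is copied from the probability row, and the helper name `shannon_terms` is our own, not something from the original spreadsheet.

```python
import math

# Probability values for pages A..I, copied from the P row above.
P = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]

def shannon_terms(probabilities):
    """Return the individual -p * log2(p) terms of Shannon's formula."""
    return [-p * math.log2(p) for p in probabilities]

terms = shannon_terms(P)
entropy = sum(terms)  # H(p1, ..., pn), the "Amount" column of the PR row

for page, term in zip("ABCDEFGHI", terms):
    print(f"{page}: {term:.6f}")
print(f"Amount: {entropy:.3f}")
```

Each printed term should agree with the PR row to within the rounding of the six-digit probabilities.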
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
P(n+1) = Z+(P(n)) = d P(n) + (1 - d), 0 < d < 1 (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
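To see what repeated encouragement does, the operator can be applied many times in a row. The sketch below assumes d = 0.85 (the value used elsewhere in this paper); the function names are illustrative only. Because P = d P + (1 - d) has the single solution P = 1, every starting value drifts towards 1.

```python
def z_plus(p, d=0.85):
    """One application of the operator: P(n+1) = d * P(n) + (1 - d)."""
    return d * p + (1 - d)

def apply_repeatedly(p0, steps, d=0.85):
    """Apply Z+ `steps` times, starting from the estimating value p0."""
    p = p0
    for _ in range(steps):
        p = z_plus(p, d)
    return p

for start in (0.0, 0.5, 5.0):
    print(start, "->", apply_repeatedly(start, 200))
```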
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work): httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Person A says "PageRank" is unimportant. They have optimised pages for years and know how to use "on the page" factors very successfully. They understand the basics of anchor text, but they couldn't care at all about PageRank.
What's happening: person A is reaching the Non-PageRank Factor Threshold very quickly because they are maximising the "on the page" factors. Through carefully choosing keywords, they jump-start themselves up the SERPs. As long as their content is good, high-ranking sites (over time) tend to get linked to. Whilst they didn't directly ask for it, a slow trickle of sites will begin to link to them and give them PageRank, which helps consolidate their position.
Person B says "PageRank" is important. We've all seen those pages in the results that have no content but great rankings. (With big brands this can often occur naturally, even when they have no idea what PageRank is. This would be Person C, who is not relevant to the discussion at hand.) Person B understands lots about PageRank and concentrates heavily on it.
What's happening: person B is doing the reverse of person A. Whilst person A concentrated on the Non-PageRank factors and found herself getting PageRank anyway, person B concentrates on the PageRank Factor and finds himself getting Non-PageRank factors. The reason for this is that increasing PageRank requires links, and links have anchor text. Thus, through carefully choosing the anchor text linking to his page, person B automatically increases his Non-PageRank factor scores whilst obtaining his high PageRank score.
Obviously these are two extremes, but we can use them to extrapolate the advantages and disadvantages of each approach.
Person A
Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed
Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors

Person B
Advantages:
• Solid position. Can easily modify on-the-page factors for a quick boost if required
• Probably higher traffic from non-search-engine sources
Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies use PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use them to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the "on the page" factors for later in case I need a quick boost if the competition heats up at a later time.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, "web hosting"). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren't important. Consider what your final rank score is:
Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)
Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:
There exists a query-specific "Minimum PageRank level". For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank factor threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because, when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the overriding principles of how PageRank works, it is as accurate today as the day it was written.
PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))
Where:
PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn't be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here's the deal: this isn't a one-time calculation. It looks pretty above, but you can't just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that's because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page's PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So to be totally clear, let's put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A's PageRank is improved by a proportion of Page B's value of 5 (Page B doesn't lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
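The same made-up numbers can be put through the d * PR(T)/C(T) term of the formula. This is a minimal sketch: the function name is hypothetical, and d = 0.85 is the nominal dampening factor quoted above.

```python
def pagerank_passed_per_link(pagerank, outbound_links, d=0.85):
    """PageRank passed along each link on a page: d * PR(T) / C(T)."""
    if outbound_links <= 0:
        raise ValueError("the page needs at least one outbound link")
    return d * pagerank / outbound_links

# Page B has a (made-up) PageRank of 5.
one_link = pagerank_passed_per_link(5, 1)   # Page B links only to Page A
two_links = pagerank_passed_per_link(5, 2)  # Page B has a second link elsewhere

print(one_link, two_links)
```

With two links on Page B, the amount reaching Page A is exactly half of what it was with one.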
Now put the formula out of your mind for a moment, as it's easier to understand how it works using a diagram. Let's say we have a hypothetical set of pages, imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below.
To begin with, in our example at least, we don't know what the pages' starting PageRanks are. There's nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.
Page C links to Page A; all 0.1275 passes to page A.
Page D links to Page C. Again, all 0.1275 passes to page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
So we've got:

Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach final values of:
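The three rules and the repeat-until-stable loop can be sketched in Python as follows. The link structure is the one described in the text (A links to B, C and D; B to C; C to A; D to C); the function names, the 10-decimal-place stopping tolerance and the iteration cap are our own assumptions, not the contents of the Excel example.

```python
D = 0.85  # dampening factor

# Link structure described in the text.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}

def iterate_once(ranks, links, d=D):
    """Apply the three rules once, using only the previous pass's values."""
    new_ranks = {page: 1 - d for page in links}      # rule 3: base of 0.15
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)       # rule 1
        for target in targets:
            new_ranks[target] += share               # rule 2
    return new_ranks

def run_until_stable(ranks, links, tolerance=1e-10, max_passes=10_000):
    """Repeat passes until no value changes by `tolerance` or more."""
    for passes in range(1, max_passes + 1):
        new_ranks = iterate_once(ranks, links)
        if max(abs(new_ranks[p] - ranks[p]) for p in links) < tolerance:
            return new_ranks, passes
        ranks = new_ranks
    raise RuntimeError("did not converge")

start = {page: 0.0 for page in LINKS}
first = iterate_once(start, LINKS)    # every page at 0.15
second = iterate_once(first, LINKS)   # the totals worked out by hand above
print({page: round(value, 4) for page, value in second.items()})

final, passes = run_until_stable(start, LINKS)
print(passes, {page: round(value, 4) for page, value in final.items()})
```

The second pass reproduces the hand-calculated totals, and the final values keep Page C on top. The exact number of passes reported depends on the stopping tolerance chosen.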
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration:

Iteration A B C D
Start 1 1 1 1
1 1.0000000000 0.5750000000 2.2750000000 0.1500000000
2 2.0837500000 0.5750000000 1.1912500000 0.1500000000
3 1.1625625000 1.0355937500 1.6518437500 0.1500000000
...
19 1.4900124031 0.7833688246 1.5766187723 0.1500000000
20 1.4901259564 0.7832552713 1.5766187723 0.1500000000
21 1.4901259564 0.7833035315 1.5765705121 0.1500000000
...
46 1.4901074052 0.7832956473 1.5765969475 0.1500000000
47 1.4901074054 0.7832956472 1.5765969474 0.1500000000
48 1.4901074053 0.7832956473 1.5765969474 0.1500000000
49 1.4901074053 0.7832956473 1.5765969474 0.1500000000
50 1.4901074053 0.7832956473 1.5765969474 0.1500000000
51 1.4901074053 0.7832956473 1.5765969474 0.1500000000
52 1.4901074053 0.7832956473 1.5765969474 0.1500000000
53 1.4901074053 0.7832956473 1.5765969474 0.1500000000
54 1.4901074053 0.7832956473 1.5765969474 0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, |PR48(A) - PR50(A)| < 10^-10).
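The stopping rule can be sketched in Python. The link structure used here (A links to B and C; B to C; C to A; D to C) is an assumption on our part: it is the structure that reproduces the first rows of the table above, since the diagram itself is not reproduced in this text. The function names and the iteration cap are also illustrative.

```python
D = 0.85  # dampening factor

# Assumed structure, inferred from the table rows: A -> B, C; B -> C; C -> A; D -> C.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

def iterate_once(ranks, links, d=D):
    """One pass of PR = (1 - d) + d * sum(PR(T) / C(T))."""
    new_ranks = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new_ranks[target] += d * ranks[page] / len(targets)
    return new_ranks

def converge(start_value, links, tolerance=1e-10, max_iterations=10_000):
    """Iterate until no page changes by `tolerance` or more between passes."""
    ranks = {page: float(start_value) for page in links}
    for iteration in range(1, max_iterations + 1):
        new_ranks = iterate_once(ranks, links)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tolerance:
            return ranks, iteration
    raise RuntimeError("no convergence within max_iterations")

# Different starting values end at the same limiting values.
for start_value in (0, 1, 10):
    limits, iterations = converge(start_value, LINKS)
    print(start_value, iterations, {p: round(v, 10) for p, v in limits.items()})
```

Under that assumption, every starting value should end at the limiting values shown in the last rows of the table; only the number of iterations needed differs.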
Derived specifics of how Google calculates PageRank
We started our values from zero, and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes
1 calculation to converge at A = 1, C = 1.
Then for Pages B, D and E it takes
3 calculations to converge at B = 0.15, D = 0.3954375000, E = 0.3954375000.
If we then do calculations for all pages using these as starting values, it takes
a huge 146 iterations to converge. [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with the others. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
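The three figures quoted above (0.15, 1 and 1.4594594595) can be reproduced by iterating the formula on each tiny system. This is a sketch only: the function name `pagerank`, the starting value of 1 and the stopping tolerance are our own choices, not the original spreadsheet.

```python
D = 0.85  # dampening factor

def pagerank(links, d=D, tolerance=1e-12, max_iterations=10_000):
    """Iterate PR = (1 - d) + d * sum(PR(T) / C(T)) until values stop changing."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iterations):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        if max(abs(new_ranks[p] - ranks[p]) for p in links) < tolerance:
            return new_ranks
        ranks = new_ranks
    raise RuntimeError("no convergence within max_iterations")

alone = pagerank({"A": []})                                   # Page A on its own
pair = pagerank({"A": ["B"], "B": ["A"]})                     # A <-> B
triple = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B and A <-> C

print(round(alone["A"], 10), round(pair["A"], 10), round(triple["A"], 10))
```

In the last system the total PageRank is still 3 (one per page); Page A simply holds a larger share of it, which is the redistribution described above.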
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations. Would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
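As a rough illustration of this sharing, the sketch below applies the d × PR / C term from the PageRank calculation to two hypothetical linking pages. The link counts (10 and 50) and the function name are made up for the example, and the inputs stand for real PageRank values. Toolbar PageRank is only an estimate of those, so treat the output as showing the direction of the effect, not its size.

```python
def share_per_link(pagerank, total_links, d=0.85):
    """PageRank passed along one link: d * PR(page) / C(page)."""
    return d * pagerank / total_links


# Hypothetical pages: the lower-ranked one carries far fewer links.
from_pr4_page = share_per_link(4.0, total_links=10)
from_pr6_page = share_per_link(6.0, total_links=50)

print(f"PR 4 page, 10 links: {from_pr4_page:.3f} per link")
print(f"PR 6 page, 50 links: {from_pr6_page:.3f} per link")
```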
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them. It is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9].
If we perform the same calculations but include the review pages, here's what we get.
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                      Without the Review Pages   With the Review Pages
Home page PR          0.9536152797               2.439718935
B, C, D PR (each)     0.4201909959               0.8412536982
Total                 2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
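The three closed structures can be reproduced with a short script. The link layouts below are assumptions based on the structure names (the original diagrams and spreadsheets may differ in detail), d is taken as 0.85, and the function name is hypothetical.

```python
def pagerank(links, d=0.85, iterations=200):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d)
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


# Assumed four-page layouts for the three structures.
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {
        page: [other for other in "ABCD" if other != page] for page in "ABCD"
    },
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    values = ", ".join(f"{page}={value:.4f}" for page, value in pr.items())
    print(f"{name}: {values} (total {sum(pr.values()):.4f})")
```

Under these assumptions every structure totals 4, but only the hierarchical layout concentrates PageRank on one page (A).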
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
(1 - d) × PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
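Written out in full, and assuming the standard formula used for the earlier calculations, the change amounts to the following. Here T_1 ... T_n are the pages linking to A, C(T_k) is the number of outbound links on T_k, and the iteration subscript i+1 is our own notation:

```latex
\begin{align*}
\text{standard:} \quad PR_{i+1}(A) &= (1-d) + d \sum_{k=1}^{n} \frac{PR_i(T_k)}{C(T_k)} \\
\text{modified:} \quad PR_{i+1}(A) &= (1-d)\,PR_i(A) + d \sum_{k=1}^{n} \frac{PR_i(T_k)}{C(T_k)}
\end{align*}
```

As a check with d = 0.85 on the hierarchical structure, where the first iteration gives 2.7 for A and 0.4333 for each of B, C and D, the modified form yields 0.15 × 2.7 + 0.85 × (3 × 0.4333) = 1.51 for page A at the second iteration.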
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iter   A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3      2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4      1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5      2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6      1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7      2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8      1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9      2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10     1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11     2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12     1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13     2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14     1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15     1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16     1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17     1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18     1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19     1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20     1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21     1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22     1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23     1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24     1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25     1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26     1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27     1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28     1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29     1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30     1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31     1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32     1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33     1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34     1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35     1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36     1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37     1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38     1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39     1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40     1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41     1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42     1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43     1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44     1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45     1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46     1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47     1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48     1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49     1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50     1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51     1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52     1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53     1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54     1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55     1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56     1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57     1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58     1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59     1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60     1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61     1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62     1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63     1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64     1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65     1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66     1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67     1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68     1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69     1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70     1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71     1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72     1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73     1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74     1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75     1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76     1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77     1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78     1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79     1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80     1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81     1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82     1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83     1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84     1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85     1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86     1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87     1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88     1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89     1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90     1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91     1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92     1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93     1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94     1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95     1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96     1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97     1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98     1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99     1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100    1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101    1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102    1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103    1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104    1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105    1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106    1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107    1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108    1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109    1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110    1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111    1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112    1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113    1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114    1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115    1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116    1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117    1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118    1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119    1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120    1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121    1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122    1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123    1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124    1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125    1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126    1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127    1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128    1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129    1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130    1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131    1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132    1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133    1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134    1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135    1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136    1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iter   A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
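A minimal sketch of the comparison follows. It assumes the hierarchical structure (A linking to B, C and D, each linking back to A), d = 0.85, and a stopping rule of our own choosing (no value changes by more than 1e-10), so the iteration counts it reports need not equal the 148 and 67 above. The function name and its adaptive flag are hypothetical.

```python
def iterate(links, d=0.85, adaptive=False, tolerance=1e-10, max_iterations=1000):
    """Return (final values, iterations used).

    adaptive=False uses a fixed (1 - d) term; adaptive=True replaces it with
    (1 - d) * PR_i(page), the previous iteration's value for that page.
    """
    pr = {page: 1.0 for page in links}
    for count in range(1, max_iterations + 1):
        new = {}
        for page in links:
            votes = sum(pr[q] / len(links[q]) for q in links if page in links[q])
            base = (1 - d) * pr[page] if adaptive else (1 - d)
            new[page] = base + d * votes
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tolerance:
            return pr, count
    return pr, max_iterations


# Assumed hierarchical structure: A links to B, C, D; each links back to A.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

fixed, fixed_count = iterate(links)
varied, varied_count = iterate(links, adaptive=True)

print("fixed co-efficient:  ", fixed_count, "iterations, A =", round(fixed["A"], 6))
print("varying co-efficient:", varied_count, "iterations, A =", round(varied["A"], 6))
```

As in the two tables, the runs settle on slightly different values: about 1.9189 and 0.6937 with the fixed co-efficient, against 2.0 and 0.6667 with the varying one.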
How PageRank could be more than it seems
Therersquos a possibility just a possibility that PageRank could well be more than itseems If we think about the factors that could affect a sitersquos ranking in Googlewe come up with the obvious things such as PageRank keywords anchor textand so on But where do the less obvious factors like ldquoquantity of textrdquo fit Whatif for example some pages were preferred because they had the correctquantity of text This doesnrsquot seem to fit into the traditional factors but it can beworked into PageRank Letrsquos explain
How we distribute the normalization co-efficients is pretty much up to us as longas the total value stays the same So with our previous example we may set theldquoAbout Usrdquo page (Page B) higher and lower the rest of the pages because pageB meets some criteria we define (eg it has the correct amount of text) So wecould set page B to
(0.4n + 0.6) × 0.15

as long as we set the other pages to

0.6 × 0.15

i.e. [0.4n + 0.6 + 0.6(n − 1) = n, where n is the number of pages in the site, so the coefficients still total 0.15n]
This strongly favours Page B. Let’s give an example. Normally, the final values for the previous example site structure would be:

Using our new values for the coefficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we’ve boosted the PageRank of the About Us page at the expense of all the other pages, simply because we’ve introduced a new arbitrary factor called “amount of text”.
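The effect of uneven normalization coefficients can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-page link structure `site` is hypothetical (it is not necessarily the structure behind Excel Example 19), the function name `pagerank` is ours, and the coefficients are the ones given above, (0.4n + 0.6) × 0.15 for Page B and 0.6 × 0.15 for every other page.

```python
def pagerank(out_links, coeff, d=0.85, passes=200):
    """Iterate PageRank with a per-page normalization coefficient.

    out_links maps each page to the pages it links to;
    coeff maps each page to the value used in place of the usual (1 - d).
    """
    pr = {page: 1.0 for page in out_links}
    for _ in range(passes):
        new_pr = dict(coeff)                      # start every page at its coefficient
        for page, targets in out_links.items():
            share = d * pr[page] / len(targets)   # split the damped vote between links
            for target in targets:
                new_pr[target] += share
        pr = new_pr
    return pr


# Hypothetical four-page site (an assumption made for this sketch).
site = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}
n = len(site)

uniform = {page: 0.15 for page in site}           # the classic (1 - d) everywhere
favoured = {page: 0.6 * 0.15 for page in site}    # lowered for most pages...
favoured["B"] = (0.4 * n + 0.6) * 0.15            # ...and raised for Page B

base = pagerank(site, uniform)
boosted = pagerank(site, favoured)
for page in site:
    print(page, round(base[page], 4), round(boosted[page], 4))
```

Because the coefficients still add up to 0.15n, the total PageRank of this small site stays at n; only its distribution between the pages changes.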
PageRank not only considers links, but it may also consider certain other factors.

There is, of course, no reason why we should stop at boosting a single page’s PageRank; it is just as easy to boost an entire site’s PageRank based on any factor we like. Examples of possible factors are “it’s popular” or “we like it”.
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let’s pretend that Page B is such a list, say a category page with links in ODP (the Open Directory Project).

We’ll set Page B’s normalization coefficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.

This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won’t be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you’ve seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).

You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google’s spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.

Let’s write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).

What does that mean? Let us explain the designations:

T_1, T_2, …, T_n correspond to the Web site pages (n is the number of pages).

The components a_km, which are on the intersection of row k and column m in the matrix of links, determine the number of links off T_k to T_m.

Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …    Tn     Number of links off the page
T1    a11    a12    …    a1n    Cr(T1)
T2    a21    a22    …    a2n    Cr(T2)
…
Tn    an1    an2    …    ann    Cr(Tn)
Let’s write down the equation for the computation scheme:

\[ PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km} \, PR_I(T_k)}{Cr(T_k)} \]

Where:

PR_{I+1}(T_m) is the PageRank of page T_m on iteration I+1 (the estimating value of page T_m, m = 1…n)

PR_I(T_k) is the PageRank of page T_k on iteration I (the estimating value of page T_k, k = 1…n)

Cr(T_k) is the total number of links off page T_k

Cm(T_k) = a_km is the number of individual links off page T_k to page T_m (k = 1…n), (m = 1…n)

d is a dampening factor (usually 0.85)

(1 − d) is a normalization coefficient

n is the number of pages on the site
This recursive computational scheme lets us compute the estimating values for the pages of a Web site while taking into account the potentially complex pattern of links between the pages.
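A minimal sketch of one pass of this scheme in Python may help to make the notation concrete. The function name `pagerank_step` and the three-page matrix `links` are hypothetical, chosen only for illustration; `a[k][m]` plays the role of a_km and `cr[k]` the role of Cr(T_k).

```python
def pagerank_step(a, pr, d=0.85):
    """One pass of PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]      # Cr(T_k): total number of links off page k
    new_pr = []
    for m in range(n):
        inbound = sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        new_pr.append((1 - d) + d * inbound)
    return new_pr


# Hypothetical three-page site: T1 -> T2; T2 -> T1 and T3; T3 -> T1 (two links).
links = [
    [0, 1, 0],
    [1, 0, 1],
    [2, 0, 0],
]

first = pagerank_step(links, [1.0, 1.0, 1.0])   # estimating values after one pass

pr = [1.0, 1.0, 1.0]
for _ in range(100):                            # repeat until the values settle
    pr = pagerank_step(links, pr)
print(first, pr)
```

Note that the double link from T3 to T1 simply counts twice in the matrix, which is why the site is treated as a multigraph rather than an ordinary graph.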
Take a look at the following example. It is a multigraph of a small Web site. Let’s apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE

[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]

You may have already noticed that every page of the second subgraph links to every page of the first subgraph.

Let’s construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term “base pages” to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.

Let’s apply the so-called accelerated technology of computation of the estimating values of the site pages.

Suppose PR_i(T) is the estimating value for page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the PR_{i+1}(T) computation (on iteration i+1) can then be calculated by the formula (1 − d) × PR_i(T). In other words, the constant (1 − d) is replaced by a share of the page’s own previous value.

When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.

This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:

The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
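The following Python sketch applies both variants to the nine-page matrix of links given above. It is an illustration rather than the authors’ spreadsheet: the function name `iterate` and its `accelerated` flag are our own, and the only difference between the two variants is whether the normalization coefficient is (1 − d) × PR_i(T) or the constant (1 − d).

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def iterate(links, passes, accelerated, d=0.85):
    """Run the given number of passes, starting every page at 1."""
    n = len(links)
    cr = [sum(row) for row in links]              # links off each page
    pr = [1.0] * n
    for _ in range(passes):
        pr = [
            # accelerated: (1 - d) * PR_i(T); classic: the constant (1 - d)
            (1 - d) * (pr[m] if accelerated else 1.0)
            + d * sum(links[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return dict(zip(PAGES, pr))


accelerated = iterate(LINKS, 200, accelerated=True)
classic = iterate(LINKS, 300, accelerated=False)
for page in PAGES:
    print(page, round(accelerated[page], 6), round(classic[page], 6))
```

With the accelerated coefficient, the values of E, F, G, H and I shrink towards zero while A, B, C and D keep the whole total of 9, which is exactly the property used above to define external pages.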
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Total
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.

Let’s now calculate the estimating values for the pages of the same Web site with the constant normalization coefficient (1 − d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Total
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final rows of the table (shaded in the original).

Dividing the estimating values by their total (which does not change during the iteration process) gives a set of probability values (the sum of all the probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Total
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

Running a parallel demonstration, we’d like to use Shannon’s famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:

Experiment outcome: A1, A2, …, An

Probabilities: p1, p2, …, pn

H(p1, p2, …, pn) = − p(A1) log p(A1) − p(A2) log p(A2) − … − p(An) log p(An)

Let’s calculate the individual terms − p log p for the set of probability values that correspond to the site pages (logarithms are to be taken to base 2). We receive the following set of PageRank values for the pages of the site examined:

                     A         B         C         D         E         F         G         H         I         Total
PR                   0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows    1         1         1         1         1         1         1         1         1         1
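The two tables above can be reproduced with a short Python sketch. It takes the limiting values from the last row of the classic computation as given; the variable names are ours and the snippet is only meant to show the arithmetic, not any tool used by the authors.

```python
import math

PAGES = "ABCDEFGHI"

# Limiting estimating values from the classic computation (iteration 85).
limiting = [1.390055, 3.845218, 1.390055, 1.390055,
            0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(limiting)                              # stays at 9 during the iteration
p = [value / total for value in limiting]          # probability values, summing to 1
terms = [-pi * math.log2(pi) for pi in p]          # one Shannon term per page, base 2
entropy = sum(terms)                               # H(p1, ..., pn)

for page, pi, term in zip(PAGES, p, terms):
    print(page, round(pi, 6), round(term, 6))
print("H =", round(entropy, 3))
```

Each entry in the PR row is a single term of Shannon’s sum, and the total in the last column is the entropy H of the whole distribution.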
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.

Let’s assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:

P_{n+1} = Z+(P_n) = d × P_n + (1 − d), 0 < d < 1 (P is the estimating value)

The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
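To see how the operator Z+ behaves, here is a minimal Python sketch. The function name `reward`, the starting value 0 and the choice d = 0.85 are assumptions made for illustration; the closed form P_n = 1 − d^n × (1 − P_0) follows directly from the recurrence.

```python
def reward(p, d=0.85):
    """Apply the operator Z+ once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


p = 0.0                 # assumed starting value
history = [p]
for _ in range(50):
    p = reward(p)
    history.append(p)

# The values climb towards the fixed point P = 1, where Z+(P) = P.
print(history[1], history[10], history[50])
```

The first application gives 0.15 and every further application closes 15% of the remaining gap to 1, which mirrors the (1 − d) and d terms of the PageRank formula.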
We suppose that it was a variation of Shannon’s formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:

PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a

Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem

PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm

The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html

The first PageRank calculator (with the exception of Google’s) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important and provide the following comments and links as thanks for the feedback of the people mentioned. If you’d like to make a comment on this document, then please email chris@supportforums.org

“In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you’re at in the field of SEO, this is not something you should read, it’s something you must read.”

Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Person A

Advantages:
• Quick entry to the results pages
• Self-generating links limit the amount of work needed

Disadvantages:
• Securing the results to make it more difficult for competitors to compete takes more work
• Slower to react to new competitors

Person B

Advantages:
• Solid position. Can easily modify on-the-page factors for a quick boost if required
• Probably higher traffic from non-search-engine sources

Disadvantages:
• Slower entry to results pages
• Difficult to do well
• Increased likelihood of tripping spam filters
It is clear that both strategies can and do work. Both strategies use PageRank as part of the mix of factors that will ultimately improve their ranking in the SERPs. Because there is such a mix, you can use them to different degrees depending on the strategy that best suits your style. My personal strategy is to use a combination, but to save some of the “on the page” factors for later, in case I need a quick boost if the competition heats up.
Really heavy competition
No explanation of PageRank strategies would be complete without a final statement regarding heavy keyword competition. There are some queries where competition is so intense that you must do everything possible to maximize your ranking score (for example, “web hosting”). In such situations it is impossible to rank highly through Non-PageRank factors alone (as you will not initially be listed high enough to be noticed and linked to). That is not to say that Non-PageRank factors aren’t important. Consider what your final rank score is:

Final Rank Score = (score for all Non-PageRank factors) x (actual PageRank score)

Improving either side of the equation can have a positive effect. However, because the Non-PageRank factors have a restricted maximum benefit, the actual PageRank score must be improved in order to compete successfully. Under really heavy competition, it holds true that you cannot rank well unless your actual PageRank score is above a certain level. In other words:

There exists a query-specific “Minimum PageRank level”. For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank factor threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
How PageRank is calculated
On a simple level, we can tell quite a lot about how PageRank is calculated. This is because when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the overriding principles of how PageRank works, it is as accurate today as the day it was written.

PR(A) = (1 − d) + d × (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))

Where:

PR(A) is the PageRank of Page A (the one we want to work out).

d is a dampening factor. Nominally this is set to 0.85.

PR(T1) is the PageRank of a page pointing to Page A.

C(T1) is the number of links off that page.

PR(Tn)/C(Tn) means we do that for each page pointing to Page A.

Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn’t be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here’s the deal: this isn’t a one-time calculation. It looks pretty above, but you can’t just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that’s because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:

The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page’s PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.

This really is important. So, to be totally clear, let’s put some numbers to it (these numbers are made up for demonstration purposes only and have no relevance to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A’s PageRank is improved by a proportion of Page B’s value of 5 (Page B doesn’t lose anything, but Page A gains). If Page B had two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
Now put the formula out of your mind for a moment, as it’s easier to understand how it works using a diagram. Let’s say we have a hypothetical set of pages, imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below:

To begin with, in our example at least, we don’t know what the pages’ starting PageRanks are. There’s nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we’re going to set them to zero this time around in order to show that it doesn’t matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:

1. We take 0.85 × a page’s PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it’s being passed to.
3. We add 0.15 to each of those totals.

The first calculation is easy. Because we’ve started at zero, 0 × 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we’re not done: we want to show the importance of each page based upon links, and they’re all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A’s PageRank is 0.15, so it will add 0.85 × 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.

Page B links to page C. Page B’s PageRank is 0.15, so it will add 0.85 × 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.

Page C links to Page A; all 0.1275 passes to page A.

Page D links to Page C. Again, all 0.1275 passes to page C.

The new totals for each page then become:

Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
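The three rules and the two passes worked through above can be sketched in Python as follows. The names `LINKS` and `one_pass` are ours; the link structure is the one just described (A links to B, C and D; B to C; C to A; D to C).

```python
# Link structure from the worked example: page -> pages it links to.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def one_pass(pr, links=LINKS, d=0.85):
    """Apply rules 1 to 3 once and return the new PageRank of every page."""
    new_pr = {page: 0.0 for page in links}
    for page, targets in links.items():
        share = d * pr[page] / len(targets)   # rule 1: 0.85 x PageRank / number of links
        for target in targets:
            new_pr[target] += share           # rule 2: add it to each page linked to
    return {page: value + (1 - d) for page, value in new_pr.items()}  # rule 3: add 0.15


start = {page: 0.0 for page in LINKS}         # starting values of zero
first = one_pass(start)                       # every page ends up at 0.15
second = one_pass(first)                      # the totals listed above
print(first)
print(second)
```

Calling `one_pass` repeatedly on its own output is all that the full calculation amounts to.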
So after this pass we’ve got A = 0.2775, B = 0.1925, C = 0.4475 and D = 0.1925.

Pretty neat, huh? Already we’re beginning to see that Page C is probably the most important page in the system (but we can’t be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence, and there’s more about it in the next section). In practice Google probably doesn’t wait for this convergence, but instead runs a number of iterations of the calculation which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] to reach the final values.
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high-PageRank page (Page C) that has only one outbound link. Then look at Pages B and D: both share links from a high-PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We’ve tried to make it as simple as possible, but unless you’re Sergey Brin or Larry Page you’ll still need to concentrate.

Whilst it will take some concentration, it isn’t that hard to understand.

We’ve shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it’s known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?

The answer is “convergence”. Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that’s the value mentioned in the Stanford papers).

This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.

Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.

This is easier to understand with an example. Let’s take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration

Iteration  A             B             C             D
0          1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
…
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
…
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how many times we repeat the calculation past the 48th iteration, the values do not change: they have converged at the limiting values.

In practice there’s no need to wait until they do not change at all before we stop calculating. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, |PR_48(A) − PR_50(A)| < 10^−10).
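A stopping rule of this kind can be sketched in Python. This is an illustration under assumptions: the function name `converge` is ours, the test compares successive passes rather than passes two apart, and the link structure is the four-page example from the earlier calculation (A links to B, C and D; B to C; C to A; D to C), not the structure behind the table above.

```python
def converge(links, start, d=0.85, tol=1e-10, max_passes=10_000):
    """Iterate until no page changes by tol or more; return values and pass count."""
    pr = dict(start)
    for passes in range(1, max_passes + 1):
        new_pr = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_pr[target] += d * pr[page] / len(targets)
        change = max(abs(new_pr[page] - pr[page]) for page in links)
        pr = new_pr
        if change < tol:                      # values no longer change "very much"
            return pr, passes
    raise RuntimeError("did not converge within max_passes")


links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}

from_zero, passes_zero = converge(links, {page: 0.0 for page in links})
from_one, passes_one = converge(links, {page: 1.0 for page in links})
print(passes_zero, from_zero)
print(passes_one, from_one)
```

Both runs end at the same limiting values, which is the point of convergence: the starting values only affect how many passes are needed, not where the calculation ends up.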
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions by a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
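The experiment can be repeated with a short Python sketch. The two structures below are hypothetical stand-ins (the paper's diagrams are not reproduced here), so the iteration counts it prints will not match the 75 and 78 quoted above; the function and variable names are ours.

```python
def pagerank(links, start, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate the PageRank formula from the given starting values; return (values, iterations)."""
    pr = dict(start)
    for iteration in range(1, max_iter + 1):
        new = {page: 1 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        if max(abs(new[page] - pr[page]) for page in links) < tol:
            return new, iteration
        pr = new
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical "before" and "after" structures (not the ones from the paper's diagrams).
old_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
new_links = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"], "E": ["A"]}

old_values, _ = pagerank(old_links, {page: 1.0 for page in old_links})

# Cold start: every page begins at 1.
cold, cold_iterations = pagerank(new_links, {page: 1.0 for page in new_links})

# Warm start: carry on from the previous limiting values; the new page E begins at 1.
warm_start = {**old_values, "E": 1.0}
warm, warm_iterations = pagerank(new_links, warm_start)

print("cold start:", cold_iterations, "iterations")
print("warm start:", warm_iterations, "iterations")
```

Whichever starting values are used, both runs should arrive at the same limiting values; only the number of iterations can differ.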
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages, using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used – although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system – that is, the very act of linking out has fed PageRank back in.
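The three figures quoted for Page A can be checked with a small Python sketch. It assumes d = 0.85 and a starting value of 1 for every page; the function name `pagerank` and the dictionaries describing each structure are ours.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        if max(abs(new[page] - pr[page]) for page in links) < tol:
            return new
        pr = new
    raise RuntimeError("no convergence within max_iter iterations")


alone = pagerank({"A": []})                                      # Page A with no links at all
with_b = pagerank({"A": ["B"], "B": ["A"]})                      # A <-> B
with_b_and_c = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B and A <-> C

for label, result in [("alone", alone), ("with B", with_b), ("with B and C", with_b_and_c)]:
    print(label, f"{result['A']:.10f}")
```

The results for Page A should come out close to 0.15, 1 and 1.4594594595 respectively.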
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this set-up, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will of course vary depending on circumstances. But the point is easily proved – linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to reciprocal link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations – would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied to a PR0 page to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
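Both points can be illustrated with a Python sketch. It is our own experiment, not a description of what Google does: the four-page structure is an assumed example (A links to B and C, B links to C, C links to A, D links to C), page C plays the penalised page, and all names are ours.

```python
def pagerank(links, start=None, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate the PageRank formula from optional starting values until the values settle."""
    pr = {page: 1.0 for page in links} if start is None else dict(start)
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        if max(abs(new[page] - pr[page]) for page in links) < tol:
            return new
        pr = new
    raise RuntimeError("no convergence within max_iter iterations")


links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
normal = pagerank(links)

# Attempt 1: set page C to zero BEFORE calculating. The iteration simply recovers.
zero_start = pagerank(links, start={"A": 1.0, "B": 1.0, "C": 0.0, "D": 1.0})

# Attempt 2: remove C's outbound link from the link data, so its vote no longer counts.
links_without_c_votes = {**links, "C": []}
no_votes = pagerank(links_without_c_votes)

print("normal:          ", {p: round(v, 6) for p, v in normal.items()})
print("C started at 0:  ", {p: round(v, 6) for p, v in zero_start.items()})
print("C's link removed:", {p: round(v, 6) for p, v in no_votes.items()})
```

In this sketch the zero starting value makes no difference to the limiting values, whereas removing C's link drops Page A to the minimum of 0.15, because C was its only inbound link.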
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high-PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
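The sharing-out can be put into numbers with a short Python sketch. The figures are hypothetical: the PageRank values 4 and 6 are treated as if they were raw values in the formula (Toolbar values are only an estimate), and the link counts of 5 and 20 are made up for illustration.

```python
def pagerank_passed(page_rank, links_on_page, d=0.85):
    """PageRank one link passes on: the page's PageRank, damped, shared among all its links."""
    return d * page_rank / links_on_page


# Hypothetical figures: a PR 4 page with 5 links versus a PR 6 page with 20 links.
from_pr4_page = pagerank_passed(4, links_on_page=5)
from_pr6_page = pagerank_passed(6, links_on_page=20)

print("from the PR 4 page:", from_pr4_page)
print("from the PR 6 page:", from_pr6_page)
```

With these made-up numbers the link from the PR 4 page passes on roughly 0.68, against roughly 0.255 from the PR 6 page.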
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites, which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No – neither can I. My advice therefore is this – get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
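A rough Python sketch shows why both conditions matter. It only looks at one step of the formula (how much of a page's PageRank flows out through its external links) and ignores feedback; the page values and link counts are hypothetical, and the function name is ours.

```python
def pagerank_leaked(page_rank, internal_links, external_links=1, d=0.85):
    """Damped PageRank that one page passes to external sites in a single step."""
    total_links = internal_links + external_links
    return d * page_rank * external_links / total_links


# Hypothetical pages, each carrying a single outbound link.
low_pr_many_internal = pagerank_leaked(page_rank=0.5, internal_links=9)
high_pr_few_internal = pagerank_leaked(page_rank=3.0, internal_links=1)

print("low PageRank, 9 internal links:", low_pr_many_internal)
print("high PageRank, 1 internal link:", high_pr_few_internal)
```

In this example the low-PageRank page with many internal links gives away about 0.0425, while the high-PageRank page with a single internal link gives away about 1.275.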
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a>
<a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them – it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect:
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                    Without the Review Pages  With the Review Pages
The HomePage PR     0.9536152797              2.439718935
B, C, D PR (each)   0.4201909959              0.8412536982
Total               2.2141882674              4.9634800296

Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
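Why the total grows with the number of indexed pages can be seen by summing the PageRank formula over a closed set of pages. The following is a short derivation under stated assumptions, not something taken from Google: $N$ is the number of pages, every page has at least one outbound link, no links enter or leave the set, and $\mathrm{out}(q)$ (our notation) is the set of pages that $q$ links to, so that $C(q) = |\mathrm{out}(q)|$.

```latex
\begin{align*}
\sum_{p} PR(p)
  &= \sum_{p} \Bigl[ (1-d) + d \sum_{q \,:\, p \in \mathrm{out}(q)} \frac{PR(q)}{C(q)} \Bigr] \\
  &= N(1-d) + d \sum_{q} PR(q) \sum_{p \in \mathrm{out}(q)} \frac{1}{C(q)} \\
  &= N(1-d) + d \sum_{q} PR(q) \\
\Longrightarrow \quad \sum_{p} PR(p) &= N \qquad (d < 1)
\end{align*}
```

Under these assumptions each page adds exactly 1 to the total at the limiting values. Pages without outbound links, or links that leave the set, lower the total that the set keeps.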
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly – but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it to go to is the Webmaster's easiest way to manipulate PageRank.
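The three closed structures can be compared with a Python sketch. The four-page layouts below are our reading of the structure names, not copies of the paper's diagrams: in the hierarchy the home page A links to B, C and D and each links back; in the loop A -> B -> C -> D -> A; with extensive interlinking every page links to every other page.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        if max(abs(new[page] - pr[page]) for page in links) < tol:
            return new
        pr = new
    raise RuntimeError("no convergence within max_iter iterations")


# Assumed four-page versions of the three structures.
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {
        page: [other for other in "ABCD" if other != page] for page in "ABCD"
    },
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, values in results.items():
    rounded = {page: round(value, 4) for page, value in values.items()}
    print(name, rounded, "total:", round(sum(values.values()), 4))
```

Under these assumptions every structure totals 4, but only the hierarchy concentrates PageRank on one page; the loop and the fully interlinked site spread it evenly.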
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site – whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises – "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1-d) \cdot PR_i(A)$
That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of Page A deduced by the previous iteration.
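The two update rules can be put side by side in a Python sketch. It assumes the hierarchical structure is a home page A that links to B, C and D, each of which links back to A, and it stops when no value moves by 10^-10 or more; the function name `iterate` and the `carry_over` switch are ours, and the iteration counts it reports depend on that stopping rule.

```python
def iterate(links, carry_over, d=0.85, tol=1e-10, max_iter=10_000):
    """carry_over=False uses the constant term (1 - d); carry_over=True uses (1 - d) * PR_i(page)."""
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iter + 1):
        new = {page: (1 - d) * (pr[page] if carry_over else 1.0) for page in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        if max(abs(new[page] - pr[page]) for page in links) < tol:
            return new, iteration
        pr = new
    raise RuntimeError("no convergence within max_iter iterations")


# Assumed hierarchy: home page A links to B, C and D; each of them links back to A.
hierarchy = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

standard, standard_iterations = iterate(hierarchy, carry_over=False)
variant, variant_iterations = iterate(hierarchy, carry_over=True)

print("fixed (1 - d):      ", {p: round(v, 6) for p, v in standard.items()}, standard_iterations)
print("(1 - d) * PR_i(page):", {p: round(v, 6) for p, v in variant.items()}, variant_iterations)
```

In this sketch the varying co-efficient settles in fewer iterations, but on slightly different values (A near 2 rather than near 1.9189), which is why it is only an approximation of the original calculation.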
Let's do this by example. Assume we have the hierarchical structure mentioned earlier:
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the “accelerated technology of the computation of the PageRank estimating values”. So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
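The two schemes compared above can be sketched in a few lines of Python. This is only an illustration under assumptions: the link structure (page A links to B, C and D, and each of those links back to A) is inferred from the first rows of the table, and the names `LINKS`, `step` and `run` are ours rather than anything taken from the Excel examples.

```python
# Hypothetical sketch: constant vs. recalculated normalization co-efficient.
# Assumed structure (inferred from the tables): A links to B, C, D; each links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def step(pr, recalc):
    """One iteration. If recalc is True the co-efficient is (1 - d) * PR_i(T)."""
    new = {}
    for page in LINKS:
        base = (1 - D) * (pr[page] if recalc else 1.0)
        inbound = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        new[page] = base + D * inbound
    return new


def run(recalc, tol=1e-10, max_iter=1000):
    """Iterate from 1.0 until no value moves by more than tol."""
    pr = {page: 1.0 for page in LINKS}
    for iteration in range(1, max_iter + 1):
        nxt = step(pr, recalc)
        if max(abs(nxt[p] - pr[p]) for p in LINKS) < tol:
            return nxt, iteration
        pr = nxt
    return pr, max_iter


classic, classic_iterations = run(recalc=False)
accelerated, accelerated_iterations = run(recalc=True)
print(classic_iterations, accelerated_iterations)
```

Under these assumptions the recalculated scheme stops in fewer iterations. It settles at A = 2.0 where the constant co-efficient settles at about 1.9189, so the two sets of final values are close rather than identical.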
How PageRank could be more than it seems
There’s a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site’s ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like “quantity of text” fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn’t seem to fit into the traditional factors, but it can be worked into PageRank. Let’s explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the “About Us” page (Page B) higher and lower the rest of the pages, because page B meets some criteria we define (e.g. it has the correct amount of text). So we could set page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [0.4n + 0.6 + 0.6(n - 1) = n, where n is the number of pages in the site]
This strongly favours Page B. Let’s give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we’ve boosted the PageRank of the About Us page at the expense of all other pages, simply because we’ve introduced a new arbitrary factor called amount of text.
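The effect can be reproduced with a short Python sketch. It rests on assumptions: the site structure is not shown in this section, so we assume a home page A that links to B, C and D, each of which links back to A, and the helper names `favour` and `pagerank` are purely illustrative.

```python
# Hypothetical sketch: per-page normalization co-efficients that still total n * 0.15.
# Assumed structure: home page A links to B, C, D and each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def favour(pages, favoured, weight=0.4):
    """Give one page (weight*n + 1 - weight) * 0.15 and every other page (1 - weight) * 0.15."""
    n = len(pages)
    low = (1 - weight) * (1 - D)
    return {p: weight * n * (1 - D) + low if p == favoured else low for p in pages}


def pagerank(coeff, iterations=500):
    """Classic iteration, but with a separate co-efficient for each page."""
    pr = {p: 1.0 for p in LINKS}
    for _ in range(iterations):
        pr = {
            p: coeff[p] + D * sum(pr[s] / len(out) for s, out in LINKS.items() if p in out)
            for p in LINKS
        }
    return pr


uniform = pagerank({p: 1 - D for p in LINKS})
boosted = pagerank(favour(list(LINKS), "B"))
for page in LINKS:
    print(page, round(uniform[page], 4), round(boosted[page], 4))
```

With four pages the favoured page gets a co-efficient of 0.33 and the others 0.09, which still adds up to 0.6, so the total PageRank in the system is unchanged while Page B rises and the rest fall.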
PageRank not only considers links but it may also consider certain other factors
There is, of course, no reason why we should stop at boosting a single page’s PageRank; it is just as easy to boost an entire site’s PageRank based on any factor we like. Examples of possible factors are “it’s popular” or “we like it”.
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let’s pretend that Page B is such a list, say a category page with links in ODP.
We’ll set Page B’s normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And, sure enough, it does exactly what we want it to do.
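As a quick check that these co-efficients keep the total unchanged (assuming the four-page site that the two numbers imply), the sum can be written out:

```latex
0.06 + 3 \times 0.18 = 0.60 = 4 \times 0.15
```

The penalised page gives up 0.09 of its usual 0.15, and each of the other three pages picks up 0.03.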
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won’t be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you’ve seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google’s spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let’s write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations.
T_1, T_2, …, T_n correspond to Web site pages (n is the number of pages).
Components a_km, which are on the intersection of the k-th row and the m-th column of the matrix of links, determine the number of links off T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T_1    T_2    …    T_n    Number of links off the page
T_1   a_11   a_12   …    a_1n   Cr(T_1)
T_2   a_21   a_22   …    a_2n   Cr(T_2)
T_n   a_n1   a_n2   …    a_nn   Cr(T_n)
Let’s write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} a_{km} \frac{PR_I(T_k)}{Cr(T_k)}$$
Where:
PR_{I+1}(T_m) is the PageRank of the page T_m on iteration I+1 (the estimating value of the page T_m, m = 1…n)
PR_I(T_k) is the PageRank of the page T_k on iteration I (the estimating value of the page T_k, k = 1…n)
Cr(T_k) is the total number of links off the page T_k
Cm(T_k) = a_km is the number of individual links off the page T_k to the page T_m (k = 1…n), (m = 1…n)
d is a dampening factor (usually 0.85)
(1 - d) is a normalization coefficient
n is the number of pages on the site
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the possible complexity of the links between pages.
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let’s apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: directed multigraph with two subgraphs; subgraph 1 contains pages A, B, C, D and subgraph 2 contains pages E, F, G, H, I]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let’s construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
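To make the computation scheme concrete, here is a minimal Python sketch that takes the matrix above, derives Cr(T_k) as the row sums and performs one iteration starting from 1.0 for every page. The names `A_KM`, `CR` and `iterate` are ours, and d = 0.85 is the usual value quoted earlier.

```python
# Sketch of one iteration of the computation scheme, driven by the matrix of links.
PAGES = "ABCDEFGHI"
A_KM = [  # row k, column m: number of links off page k to page m
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85  # dampening factor

CR = [sum(row) for row in A_KM]  # Cr(T_k): total number of links off each page


def iterate(pr):
    """PR_{I+1}(T_m) = (1 - d) + d * sum_k a_km * PR_I(T_k) / Cr(T_k)."""
    n = len(pr)
    return [
        (1 - D) + D * sum(A_KM[k][m] * pr[k] / CR[k] for k in range(n))
        for m in range(n)
    ]


first = iterate([1.0] * len(PAGES))
for page, value in zip(PAGES, first):
    print(page, round(value, 6))
```

The row sums reproduce the last column of the matrix, and the values after one pass agree with the first iteration of the computation tables for this site.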
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term “base pages” to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let’s apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the PR_{i+1}(T) computation (on iteration i + 1) can be calculated by the formula (1 - d) × PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus, we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
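The recognition of external pages can be sketched in Python as follows. This is an illustration only: the outbound links are read off the matrix of links (2), while the threshold of 1e-6 used to decide that a limiting value is "zero", the fixed count of 200 iterations and the names `LINKS` and `accelerated` are our own assumptions.

```python
# Hypothetical sketch: accelerated computation used to spot external pages.
# Outbound links per page, read off the matrix of links (2).
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI",
    "H": "ABCDEG", "I": "ABCDG",
}
D = 0.85  # dampening factor


def accelerated(pr):
    """One iteration where the co-efficient for page T is (1 - d) * PR_i(T)."""
    return {
        t: (1 - D) * pr[t]
        + D * sum(pr[k] / len(out) for k, out in LINKS.items() if t in out)
        for t in LINKS
    }


pr = {t: 1.0 for t in LINKS}
for _ in range(200):  # assumed to be plenty for this small site
    pr = accelerated(pr)

# Pages whose limiting value is (numerically) zero are treated as external.
external = sorted(t for t, value in pr.items() if value < 1e-6)
print(external)
```

The pages left with non-zero values are the base pages; their values absorb the whole total of 9, which is why they end up at 1.5 and 4.5.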
Let’s calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.
Having divided the estimating values by their total amount, which is unchanged throughout the iteration process, we receive the set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we’d like to use Shannon’s famous classical formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the set probabilities.
Experiment outcome   A_1  A_2  …  A_n
Probabilities        p_1  p_2  …  p_n
$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \cdots - p(A_n)\log p(A_n)$$
Let’s calculate the values for the set of probability values which correspond to the site pages, and we’ll receive the following set of values. This is the set of values of PageRank for the pages of the site examined (logarithms are to be calculated to base 2).
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows   1         1         1         1         1         1         1         1         1         1
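The PR row can be checked with a few lines of Python. This is a sketch of the arithmetic only: it takes the limiting values from the classic computation, normalises them and evaluates each term -p log2 p of Shannon’s formula. The variable names are ours.

```python
import math

# Limiting values from the classic computation (pages A to I).
LIMITS = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(LIMITS)
probabilities = [value / total for value in LIMITS]

# One term of Shannon's formula per page, with logarithms to base 2.
terms = [-p * math.log2(p) for p in probabilities]
entropy = sum(terms)

for page, p, term in zip("ABCDEFGHI", probabilities, terms):
    print(page, round(p, 6), round(term, 6))
print("Amount", round(entropy, 3))
```

Each term corresponds to one entry of the PR row, and their sum is the Amount shown at the end of that row.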
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let’s assume that every time an incentive is engaged an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:
$$P_{n+1} = Z^{+}(P_n) = d P_n + (1 - d), \quad 0 < d < 1$$ (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
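To see what repeated encouragement does, the operator Z+ can be applied in a loop. The sketch below is illustrative only: the starting estimate of 0.2 and the choice of 50 applications are arbitrary assumptions, and d = 0.85 is borrowed from the PageRank formula.

```python
def z_plus(p, d=0.85):
    """Encouragement operator Z+: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


p = 0.2  # arbitrary starting estimate (assumption)
history = [p]
for _ in range(50):
    p = z_plus(p)
    history.append(p)

print(round(history[1], 4), round(history[-1], 4))
```

Because 0 < d < 1, each application moves the estimate a fixed fraction of the way towards 1; after n applications the estimate equals 1 - d^n × (1 - P_0).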
We suppose that it was a variation of Shannon’s formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research we would recommendvisiting the following resources
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: The essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google’s), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you’d like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you’re at in the field of SEO, this is not something you should read; it’s something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
There exists a query-specific “Minimum PageRank level”. For queries that do not have heavy competition, this level is easy to achieve without even trying. However, where heavy competition exists, Non-PageRank factors are just as important (and easier to get) until they reach the Non-PageRank factor threshold. This is why careful keyword choice can help you avoid the extensive work associated with highly competitive search phrases.
How PageRank is calculated
On a simple level we can tell quite a lot about how PageRank is calculated. This is because, when Google was just a university research project, the creators of PageRank published a paper that detailed a formula for calculating it. This formula is now more than a handful of years old, and we suspect that it has changed somewhat since then. However, for detailing the overriding principles of how PageRank works, it is as accurate today as the day it was written.
$$PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)$$
Where PR(A) is the PageRank of Page A (the one we want to work out).
d is a dampening factor. Nominally this is set to 0.85.
PR(T1) is the PageRank of a page pointing to Page A.
C(T1) is the number of links off that page.
PR(Tn)/C(Tn) means we do that for each page pointing to Page A.
Source: The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin and Lawrence Page, http://www-db.stanford.edu/~backrub/google.html
Couldn’t be simpler, right? Depending on your mathematical competence, this is either remarkably easy to understand or remarkably complex. So here’s the deal: this isn’t a one-time calculation. It looks pretty above, but you can’t just do one simple calculation and get an answer with the PageRank of a page. If you look at it, to calculate the PageRank of Page A we need to know the PageRank of all the pages pointing to it. To calculate the PageRank of those pages, we need to know the PageRank of all the pages pointing to them (one of which could be, and is quite likely to be, Page A). If this seems like it could go around in circles forever, that’s because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page’s PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let’s put some numbers to it (these numbers are made up for demonstration purposes only and have no relevance to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A’s PageRank is improved by a proportion of Page B’s value of 5 (Page B doesn’t lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
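In code, the vote carried by each link is simply the dampened PageRank divided by the number of links on the page. The helper below is a hypothetical illustration using the same made-up value of 5; the function name is ours, and we assume d = 0.85 as the "proportion".

```python
D = 0.85  # dampening factor from the formula


def vote_per_link(pagerank, outbound_links, d=D):
    """PageRank passed along each link of a page: d * PR(T) / C(T)."""
    return d * pagerank / outbound_links


one_link = vote_per_link(5, 1)   # Page B has a single link, pointing to Page A
two_links = vote_per_link(5, 2)  # Page B has two links, one of them to Page A
print(one_link, two_links)
```

Page B’s own value of 5 is untouched in both cases; only the share that reaches Page A is halved.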
Now put the formula out of your mind for a moment, as it’s easier to understand how it works using a diagram. Let’s say we have a hypothetical set of pages imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below:
To begin with, in our example at least, we don’t know what the pages’ starting PageRanks are. There’s nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we’re going to set them to zero this time around in order to prove that it doesn’t matter what the starting values are.
Next, we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 × a page’s PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it’s being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we’ve started at zero, 0 × 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we’re not done: we want to show the importance of each page based upon links, and they’re all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A’s PageRank is 0.15, so it will add 0.85 × 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to page C. Page B’s PageRank is 0.15, so it will add 0.85 × 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.
Page C links to Page A; all 0.1275 passes to page A.
Page D links to Page C. Again, all 0.1275 passes to page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
So we’ve got:
Pretty neat, huh? Already we’re beginning to see that Page C is probably the most important page in the system (but we can’t be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there’s more about this in the next section). In practice, Google probably doesn’t wait for this convergence, but instead runs a number of iterations of the calculation which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we’d reach final values of:
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
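The passes worked through above can be reproduced with a few lines of Python. This is only a minimal sketch of the three rules as stated in this paper, not a description of Google's implementation. The names LINKS and pagerank_step are our own, and the 200 extra passes are an arbitrary, generous choice rather than a convergence test.

```python
# Links from the worked example: A -> B, C, D; B -> C; C -> A; D -> C.
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def pagerank_step(ranks, links=LINKS, d=0.85):
    """Apply rules 1 to 3 once and return the new PageRank of every page."""
    totals = {page: 0.0 for page in links}
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)  # rule 1: 0.85 * PR / number of links
        for target in targets:
            totals[target] += share             # rule 2: add to each linked page
    return {page: total + (1 - d) for page, total in totals.items()}  # rule 3


start = {page: 0.0 for page in LINKS}
first = pagerank_step(start)    # every page: 0.15
second = pagerank_step(first)   # A 0.2775, B 0.1925, C 0.4475, D 0.1925

# Keep going for a fixed number of passes (200 is an arbitrary assumption).
ranks = second
for _ in range(200):
    ranks = pagerank_step(ranks)

for page in sorted(ranks):
    print(page, round(second[page], 4), round(ranks[page], 4))
```

The first printed column repeats the hand-calculated totals; the second shows where the values settle, with Page C ahead of Page A and Pages B and D level with each other.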
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration
Iteration  A             B             C             D
Start      1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
...
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
...
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, |PR_48(A) - PR_50(A)| < 10^-10).
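The stopping rule just described can be sketched as a loop that compares each pass with the previous one. Since the diagram is not reproduced here, the link structure below is an assumption inferred from the first rows of the table (A -> B, C; B -> C; C -> A; D -> C). The function names and the max_iter safety limit are also our own.

```python
# Assumed structure, inferred from the table: A -> B, C; B -> C; C -> A; D -> C.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}


def pagerank_step(ranks, links, d=0.85):
    """One pass of the calculation: (1 - d) plus d * PR / links from each linking page."""
    new = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def run_until_converged(links, start=1.0, tol=1e-10, max_iter=10_000):
    """Iterate until no page changes by tol or more between two passes."""
    ranks = {page: start for page in links}
    for iteration in range(1, max_iter + 1):
        nxt = pagerank_step(ranks, links)
        change = max(abs(nxt[page] - ranks[page]) for page in links)
        ranks = nxt
        if change < tol:
            return ranks, iteration
    raise RuntimeError("no convergence within max_iter passes")


ranks, iterations = run_until_converged(LINKS)
print(iterations, {page: round(value, 10) for page, value in ranks.items()})
```

Loosening tol (say to 1e-4) stops the loop much earlier at the price of fewer reliable decimal places, which is the trade-off the paragraph above describes.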
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
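The three figures quoted for Page A can be checked with a short sketch. It assumes the same calculation rules used earlier in this paper. The helper name pagerank and the fixed 500 passes are our own choices for illustration.

```python
def pagerank(links, d=0.85, passes=500):
    """Run a fixed number of passes of the basic calculation, starting from zero."""
    ranks = {page: 0.0 for page in links}
    for _ in range(passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        ranks = new
    return ranks


alone = pagerank({"A": []})                                         # Page A on its own
with_b = pagerank({"A": ["B"], "B": ["A"]})                         # A <-> B
with_b_and_c = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B, A <-> C

print(round(alone["A"], 10))         # 0.15
print(round(with_b["A"], 10))        # 1.0
print(round(with_b_and_c["A"], 10))  # 1.4594594595
```

Note that the total across all pages is 0.15, then 2, then 3: Page A's gain comes from the pages that link back to it, not from nowhere.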
Let's put this diagrammatically, using a slightly more complicated system:
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to reciprocal link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied to a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
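Both halves of this argument can be illustrated with a small sketch. We do not have Excel Example 8 to hand, so the structure below simply reuses the four-page example from earlier in the paper (A -> B, C, D; B -> C; C -> A; D -> C) and treats Page C as the penalised page. That choice, the helper name and the 300 passes are all assumptions made for illustration.

```python
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def pagerank(links, start, d=0.85, passes=300):
    """Run a fixed number of passes of the basic calculation from the given start values."""
    ranks = dict(start)
    for _ in range(passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        ranks = new
    return ranks


ones = {page: 1.0 for page in LINKS}

normal = pagerank(LINKS, ones)
zeroed_first = pagerank(LINKS, {**ones, "C": 0.0})  # C set to zero before calculating
muted = pagerank({**LINKS, "C": []}, ones)          # C's outbound link removed instead

gap = max(abs(normal[page] - zeroed_first[page]) for page in LINKS)
print(gap)                                          # effectively zero
print(round(normal["A"], 4), round(muted["A"], 4))  # A loses C's vote only in the second case
```

Zeroing the starting value changes nothing once the calculation has converged, whereas removing the link entry does stop Page C from voting for Page A.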
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
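The sharing-out effect is easy to put into numbers. The sketch below uses made-up raw PageRank values and link counts purely for illustration. They are not Toolbar values (whose scale is not public), and the function name pagerank_passed is our own.

```python
def pagerank_passed(page_rank, links_on_page, d=0.85):
    """PageRank handed to each linked page under the rules used in this paper."""
    return d * page_rank / links_on_page


# Hypothetical raw values: a stronger page with many links vs a weaker page with few.
busy_page = pagerank_passed(page_rank=6.0, links_on_page=60)
quiet_page = pagerank_passed(page_rank=4.0, links_on_page=10)

print(round(busy_page, 4))   # 0.085
print(round(quiet_page, 4))  # 0.34
```

With these assumed numbers, the link from the lower-ranked page passes on four times as much, simply because it is shared between fewer links.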
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them. It is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                      Without the Review Pages   With the Review Pages
The HomePage PR       0.9536152797               2.439718935
B, C, D PR (each)     0.4201909959               0.8412536982
Total                 2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
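The fixed total and the shifting distribution can both be seen in a short sketch. The diagrams are not reproduced here, so the three four-page structures below are assumptions: a home page A linking to and from B, C and D; a single loop A -> B -> C -> D -> A; and every page linking to every other page. The helper name and the 500 passes are likewise our own.

```python
def pagerank(links, d=0.85, passes=500):
    """Run a fixed number of passes of the basic calculation, starting every page at 1."""
    ranks = {page: 1.0 for page in links}
    for _ in range(passes):
        new = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new[target] += d * ranks[page] / len(targets)
        ranks = new
    return ranks


PAGES = "ABCD"
STRUCTURES = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {p: [q for q in PAGES if q != p] for p in PAGES},
}

results = {name: pagerank(links) for name, links in STRUCTURES.items()}
for name, ranks in results.items():
    rounded = {page: round(value, 4) for page, value in ranks.items()}
    print(name, rounded, "total", round(sum(ranks.values()), 4))
```

Under these assumptions every structure totals 4, but only the hierarchical one concentrates PageRank on a single page (A), leaving the other three pages with less than 1 each.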
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link:
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the final value of the previous computation, using the formula:
(1 - d) * PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PageRank of Page A deduced by the previous iteration.
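One literal reading of this substitution is sketched below: the fixed 0.15 base is swapped for 0.15 times the page's own value from the previous pass. This is our interpretation of the sentence above, not a confirmed description of what Google does. The hierarchical structure (A linking to and from B, C and D), the function names and the 10^-10 stopping rule are assumptions carried over from earlier sections.

```python
HIERARCHY = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(ranks, links, d=0.85, scaled_base=False):
    """One pass; with scaled_base the fixed (1 - d) becomes (1 - d) * previous value."""
    new = {}
    for page in links:
        new[page] = (1 - d) * ranks[page] if scaled_base else (1 - d)
    for page, targets in links.items():
        for target in targets:
            new[target] += d * ranks[page] / len(targets)
    return new


def run(links, scaled_base, tol=1e-10, max_iter=10_000):
    """Start every page at 1 and iterate until no value changes by tol or more."""
    ranks = {page: 1.0 for page in links}
    for count in range(1, max_iter + 1):
        nxt = step(ranks, links, scaled_base=scaled_base)
        change = max(abs(nxt[page] - ranks[page]) for page in links)
        ranks = nxt
        if change < tol:
            return ranks, count
    raise RuntimeError("no convergence within max_iter passes")


fixed_ranks, fixed_passes = run(HIERARCHY, scaled_base=False)
scaled_ranks, scaled_passes = run(HIERARCHY, scaled_base=True)
print(fixed_passes, {p: round(v, 4) for p, v in fixed_ranks.items()})
print(scaled_passes, {p: round(v, 4) for p, v in scaled_ranks.items()})
```

Under this reading the scaled version settles in noticeably fewer passes on this structure, but at slightly different values (A near 2 rather than near 1.9189), so it should be thought of as an approximation rather than a shortcut to the same numbers.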
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration A B C D Sum
Start 1.0000000000 1.0000000000 1.0000000000 1.0000000000 4.0000000000
1 2.7000000000 0.4333333333 0.4333333333 0.4333333333 4.0000000000
2 1.5100000000 0.8300000000 0.8300000000 0.8300000000 4.0000000000
3 2.3430000000 0.5523333333 0.5523333333 0.5523333333 4.0000000000
4 1.7599000000 0.7467000000 0.7467000000 0.7467000000 4.0000000000
5 2.1680700000 0.6106433333 0.6106433333 0.6106433333 4.0000000000
6 1.8823510000 0.7058830000 0.7058830000 0.7058830000 4.0000000000
7 2.0823543000 0.6392152333 0.6392152333 0.6392152333 4.0000000000
8 1.9423519900 0.6858826700 0.6858826700 0.6858826700 4.0000000000
9 2.0403536070 0.6532154643 0.6532154643 0.6532154643 4.0000000000
10 1.9717524751 0.6760825083 0.6760825083 0.6760825083 4.0000000000
11 2.0197732674 0.6600755775 0.6600755775 0.6600755775 4.0000000000
12 1.9861587128 0.6712804291 0.6712804291 0.6712804291 4.0000000000
13 2.0096889010 0.6634370330 0.6634370330 0.6634370330 4.0000000000
14 1.9932177693 0.6689274102 0.6689274102 0.6689274102 4.0000000000
15 2.0047475615 0.6650841462 0.6650841462 0.6650841462 4.0000000000
16 1.9966767069 0.6677744310 0.6677744310 0.6677744310 4.0000000000
17 2.0023263051 0.6658912316 0.6658912316 0.6658912316 4.0000000000
18 1.9983715864 0.6672094712 0.6672094712 0.6672094712 4.0000000000
19 2.0011398895 0.6662867035 0.6662867035 0.6662867035 4.0000000000
20 1.9992020773 0.6669326409 0.6669326409 0.6669326409 4.0000000000
21 2.0005585459 0.6664804847 0.6664804847 0.6664804847 4.0000000000
22 1.9996090179 0.6667969940 0.6667969940 0.6667969940 4.0000000000
23 2.0002736875 0.6665754375 0.6665754375 0.6665754375 4.0000000000
24 1.9998084188 0.6667305271 0.6667305271 0.6667305271 4.0000000000
25 2.0001341069 0.6666219644 0.6666219644 0.6666219644 4.0000000000
26 1.9999061252 0.6666979583 0.6666979583 0.6666979583 4.0000000000
27 2.0000657124 0.6666447625 0.6666447625 0.6666447625 4.0000000000
28 1.9999540013 0.6666819996 0.6666819996 0.6666819996 4.0000000000
29 2.0000321991 0.6666559336 0.6666559336 0.6666559336 4.0000000000
30 1.9999774607 0.6666741798 0.6666741798 0.6666741798 4.0000000000
31 2.0000157775 0.6666614075 0.6666614075 0.6666614075 4.0000000000
32 1.9999889557 0.6666703481 0.6666703481 0.6666703481 4.0000000000
33 2.0000077310 0.6666640897 0.6666640897 0.6666640897 4.0000000000
34 1.9999945883 0.6666684706 0.6666684706 0.6666684706 4.0000000000
35 2.0000037882 0.6666654039 0.6666654039 0.6666654039 4.0000000000
36 1.9999973483 0.6666675506 0.6666675506 0.6666675506 4.0000000000
37 2.0000018562 0.6666660479 0.6666660479 0.6666660479 4.0000000000
38 1.9999987007 0.6666670998 0.6666670998 0.6666670998 4.0000000000
39 2.0000009095 0.6666663635 0.6666663635 0.6666663635 4.0000000000
40 1.9999993633 0.6666668789 0.6666668789 0.6666668789 4.0000000000
41 2.0000004457 0.6666665181 0.6666665181 0.6666665181 4.0000000000
42 1.9999996880 0.6666667707 0.6666667707 0.6666667707 4.0000000000
43 2.0000002184 0.6666665939 0.6666665939 0.6666665939 4.0000000000
44 1.9999998471 0.6666667176 0.6666667176 0.6666667176 4.0000000000
45 2.0000001070 0.6666666310 0.6666666310 0.6666666310 4.0000000000
46 1.9999999251 0.6666666916 0.6666666916 0.6666666916 4.0000000000
47 2.0000000524 0.6666666492 0.6666666492 0.6666666492 4.0000000000
48 1.9999999633 0.6666666789 0.6666666789 0.6666666789 4.0000000000
49 2.0000000257 0.6666666581 0.6666666581 0.6666666581 4.0000000000
50 1.9999999820 0.6666666727 0.6666666727 0.6666666727 4.0000000000
51 2.0000000126 0.6666666625 0.6666666625 0.6666666625 4.0000000000
52 1.9999999912 0.6666666696 0.6666666696 0.6666666696 4.0000000000
53 2.0000000062 0.6666666646 0.6666666646 0.6666666646 4.0000000000
54 1.9999999957 0.6666666681 0.6666666681 0.6666666681 4.0000000000
55 2.0000000030 0.6666666657 0.6666666657 0.6666666657 4.0000000000
56 1.9999999979 0.6666666674 0.6666666674 0.6666666674 4.0000000000
57 2.0000000015 0.6666666662 0.6666666662 0.6666666662 4.0000000000
58 1.9999999990 0.6666666670 0.6666666670 0.6666666670 4.0000000000
59 2.0000000007 0.6666666664 0.6666666664 0.6666666664 4.0000000000
60 1.9999999995 0.6666666668 0.6666666668 0.6666666668 4.0000000000
61 2.0000000004 0.6666666665 0.6666666665 0.6666666665 4.0000000000
62 1.9999999998 0.6666666667 0.6666666667 0.6666666667 4.0000000000
63 2.0000000002 0.6666666666 0.6666666666 0.6666666666 4.0000000000
64 1.9999999999 0.6666666667 0.6666666667 0.6666666667 4.0000000000
65 2.0000000001 0.6666666666 0.6666666666 0.6666666666 4.0000000000
66 1.9999999999 0.6666666667 0.6666666667 0.6666666667 4.0000000000
67 2.0000000000 0.6666666667 0.6666666667 0.6666666667 4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the “accelerated technology of the computation of the Page Rank estimating values”. So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
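The two tables above can be reproduced with a short sketch. It assumes the four-page site in which Page A links to B, C and D and each of those pages links back to A (a structure inferred from the table values), d = 0.85, and function names that are purely illustrative. The only difference between the two schemes is the normalization term: a constant (1 - d) in the classic form, and (1 - d) multiplied by the page's previous value in the accelerated form.

```python
D = 0.85  # damping factor used throughout the paper

# Assumed site: A links to B, C, D; each of B, C, D links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(pr, accelerated):
    """One iteration; `accelerated` recalculates the normalization term."""
    new = {}
    for page in LINKS:
        received = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        base = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = base + D * received
    return new


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Iterate from 1.0 until no value moves by more than `tol`."""
    pr = {page: 1.0 for page in LINKS}
    for count in range(1, max_iter + 1):
        nxt = step(pr, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in LINKS) < tol:
            return nxt, count
        pr = nxt
    return pr, max_iter


classic, classic_iters = iterate(accelerated=False)
fast, fast_iters = iterate(accelerated=True)
print(classic_iters, classic)
print(fast_iters, fast)
```

The iteration counts depend on the stopping tolerance chosen here, so they need not match the 148 and 67 quoted above exactly. Note also that the two schemes settle on slightly different limits, as the tables themselves show.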
How PageRank could be more than it seems
There’s a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site’s ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like “quantity of text” fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn’t seem to fit into the traditional factors, but it can be worked into PageRank. Let’s explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the “About Us” page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to
$0.6 \times 0.15$
i.e. $(0.4n + 0.6) + 0.6(n - 1) = n$, where $n$ is the number of pages in the site, so the co-efficients still total $0.15n$.
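The condition that the total stays the same can be checked with a minimal sketch. It uses the co-efficients (0.4n + 0.6) × 0.15 for the favoured page and 0.6 × 0.15 for every other page; the function name and the sample values of n are hypothetical.

```python
D = 0.85


def favoured_coefficients(n, favoured_index):
    """(0.4n + 0.6) * 0.15 for one page, 0.6 * 0.15 for each of the others."""
    base = 1 - D
    coeffs = [0.6 * base] * n
    coeffs[favoured_index] = (0.4 * n + 0.6) * base
    return coeffs


for n in (4, 10, 100):
    coeffs = favoured_coefficients(n, favoured_index=1)
    # The total should equal the n * 0.15 shared out in the uniform case.
    print(n, round(sum(coeffs), 10), round(n * (1 - D), 10))
```

For a four-page site this gives 0.33 for the favoured page and 0.09 for each of the other three, which again adds up to 0.6.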
This strongly favours Page B. Let’s give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we’ve boosted the PageRank of the About Us page at the expense of all other pages – simply because we’ve introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page’s PageRank – it is just as easy to boost an entire site’s PageRank based on any factor we like. Examples of possible factors are “it’s popular” or “we like it”.
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let’s pretend that Page B is such a list, say a category page with links in ODP.
We’ll set Page B’s normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
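The effect of uneven co-efficients can be sketched as follows. This is an illustration under assumptions, not the spreadsheet itself: it assumes a four-page site in which Page A links to B, C and D and each of them links back to A, and it simply replaces the constant 0.15 with a per-page value. The boosted case uses 0.33 for Page B and 0.09 for the rest; the demoted case uses 0.06 for Page B and 0.18 for the rest.

```python
D = 0.85

# Assumed site: page A links to B, C, D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank(coefficients, iterations=500):
    """Classic iteration with a per-page term in place of the constant 0.15."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + D * sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
    return pr


uniform = pagerank({"A": 0.15, "B": 0.15, "C": 0.15, "D": 0.15})
boosted = pagerank({"A": 0.09, "B": 0.33, "C": 0.09, "D": 0.09})
demoted = pagerank({"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for name, values in (("uniform", uniform), ("boosted", boosted), ("demoted", demoted)):
    print(name, {page: round(value, 4) for page, value in values.items()})
```

In each case the co-efficients add up to 0.6, so the PageRank of the whole site still totals 4; only its distribution between the pages changes.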
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won’t be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you’ve seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google’s spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let’s write down the elements of the matrix of links: $a_{km}$ $(k = 1, \dots, n;\ m = 1, \dots, n)$.
What does that mean? Let us explain the designations.
$T_1, T_2, \dots, T_n$ correspond to Web site pages ($n$ is the number of pages).
Components $a_{km}$, which are on the intersection of the $k$-th row and the $m$-th column in the matrix of links, determine the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
       T1     T2     …     Tn     Number of links off the page
T1     a11    a12    …     a1n    Cr(T1)
T2     a21    a22    …     a2n    Cr(T2)
…      …      …      …     …      …
Tn     an1    an2    …     ann    Cr(Tn)
Let’s write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \dots, n$),
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \dots, n$),
$Cr(T_k)$ is the total number of links off the page $T_k$,
$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$),
$d$ is a dampening factor (usually 0.85),
$(1 - d)$ is a normalization coefficient,
$n$ is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the possible complexity of the links between pages.
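One pass of this scheme can be sketched directly from the matrix of links. The function name and the three-page multigraph below are hypothetical and chosen only to show that a page linking twice to the same target passes it twice the share; pages with no outbound links are simply skipped, which is an assumption the formula itself does not address.

```python
def pagerank_step(a, pr, d=0.85):
    """One pass of PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page T_k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical three-page multigraph: T1 links twice to T2 and once to T3,
# while T2 and T3 each link once back to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
pr = [1.0, 1.0, 1.0]
for _ in range(100):
    pr = pagerank_step(a, pr)
print([round(value, 6) for value in pr])
```

Because T1 splits its vote over three links, two of which point to T2, T2 ends up with a higher value than T3.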
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let’s apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, with pages A, B, C, D in subgraph 1 and pages E, F, G, H, I in subgraph 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let’s construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
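The last column of the diagram is just the sum of each row, which a few lines of code can confirm. The sketch below encodes the matrix as strings of 0s and 1s (an encoding chosen here for compactness) and also counts the links pointing at each page, which is the sum of each column.

```python
PAGES = "ABCDEFGHI"

# Rows are the linking page, columns the page linked to (same order as PAGES).
ROWS = [
    "010000000",  # A
    "101100000",  # B (home/index)
    "010000000",  # C
    "010000000",  # D
    "111101000",  # E
    "111100100",  # F
    "111101011",  # G
    "111110100",  # H
    "111100100",  # I
]
matrix = [[int(ch) for ch in row] for row in ROWS]

# Cr(T): total number of links off each page (row sums).
links_off = {page: sum(row) for page, row in zip(PAGES, matrix)}
# Number of links pointing at each page (column sums).
links_in = {page: sum(row[i] for row in matrix) for i, page in enumerate(PAGES)}

print(links_off)
print(links_in)
```

The column sums make one feature of the site visible: pages A, B, C and D never link to pages E to I, so nothing they hold is ever passed to the second subgraph.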
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term “base pages” to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let’s apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on iteration $i$. The normalization coefficient for the $PR_{i+1}(T)$ computation (on iteration $i + 1$) can be calculated by the formula $(1 - d)\, PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
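The recognition of external pages can be sketched in code. The example below applies the accelerated scheme to the nine-page matrix of links with d = 0.85 and treats any page whose value falls below a small threshold as external; the threshold, the fixed number of passes and the function name are assumptions made for this illustration.

```python
D = 0.85
PAGES = "ABCDEFGHI"

# Matrix of links (2): rows are the linking page, columns the page linked to.
ROWS = [
    "010000000", "101100000", "010000000", "010000000", "111101000",
    "111100100", "111101011", "111110100", "111100100",
]
matrix = [[int(ch) for ch in row] for row in ROWS]
cr = [sum(row) for row in matrix]  # total links off each page


def accelerated_step(pr):
    """PR_{i+1}(T) = (1 - d) * PR_i(T) + d * (sum of the shares passed to T)."""
    n = len(pr)
    return [
        (1 - D) * pr[m] + D * sum(matrix[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(200):
    pr = accelerated_step(pr)

external = [page for page, value in zip(PAGES, pr) if value < 1e-6]
base = [page for page, value in zip(PAGES, pr) if value >= 1e-6]
print("base pages:", base)
print("external pages:", external)
```

Everything held by pages E to I drains into the first subgraph and is never returned, which is why their limiting values are zero while A, C and D settle at 1.5 and B at 4.5.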
Let’s calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient $(1 - d) = 0.15$, $d = 0.85$.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
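The division can be written out as a minimal sketch. The list below holds the limiting values of the classic computation for pages A to I; the variable names are illustrative.

```python
# Limiting values from the classic computation (pages A to I).
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(limits)  # the "Amount" column, which stays at 9 on every iteration
probabilities = [value / total for value in limits]

print(round(total, 3))
print([round(p, 6) for p in probabilities])
```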
Running a parallel demonstration, we’d like to use the famous classical Shannon’s formula, which is normally used to calculate the volume of information contained in an experiment that has $n$ outcomes with set probabilities.
Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let’s calculate the individual terms of this formula for the set of probability values which correspond to the site pages (logarithms are to be calculated to base 2). We receive the following set of PageRank values for the pages of the site examined:
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1 1 1 1 1 1 1 1 1 1
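The PR row can be reproduced by evaluating each term of Shannon’s formula separately. This is a small sketch using the probability values listed for pages A to I and base-2 logarithms; the variable names are illustrative.

```python
import math

# Probability values for pages A to I.
p = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]

terms = [-value * math.log2(value) for value in p]  # one term of H per page
entropy = sum(terms)                                # H(p1, ..., pn)

print([round(term, 6) for term in terms])
print(round(entropy, 3))
```

The sum of the terms is the “Amount” of roughly 2.375 shown in the table.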
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let’s assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
($P$ is the estimating value).
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
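The behaviour of the operator is easy to see numerically. The sketch below applies it repeatedly, assuming d = 0.85 and a starting estimate of 0.2 (both chosen only for illustration), and compares the result with the closed form of the same recurrence, $P_n = 1 - d^n (1 - P_0)$.

```python
def reward(p, d=0.85):
    """Operator Z+: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


p = 0.2  # hypothetical starting estimate
history = [p]
for _ in range(50):
    p = reward(p)
    history.append(p)

# Closed form of the same recurrence: P_n = 1 - d**n * (1 - P_0).
closed_form = 1 - 0.85 ** 50 * (1 - 0.2)
print(history[-1], closed_form)
```

Each application moves the estimate a fixed fraction of the remaining distance towards 1, which is the same structure as the (1 - d) + d term in the PageRank scheme.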
We suppose that it was a variation of Shannon’s formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing – the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google’s) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you’d like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you’re at in the field of SEO, this is not something you should read, it’s something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
that’s because it nearly does. We have to do a whole lot of little calculations many times over before we actually get the answer. What this formula tells us is that, however you slice it, and however the formula may have changed since it was written:
The PageRank given to Page A by a Page B pointing to it is decreased with each link to anywhere that exists on Page B. This means that a page’s PageRank is essentially a measure of its vote; it can split that vote between one link or two or many more, but its overall voting power will always remain the same.
This really is important. So, to be totally clear, let’s put some numbers to it (these numbers are made up for demonstration purposes only and have no relevancy to any particular page). Say Page B has a PageRank value of 5 and has a single link on it pointing to Page A. Page A’s PageRank is improved by a proportion of Page B’s value of 5 (Page B doesn’t lose anything, but Page A gains). If Page B has two links, that PageRank improvement would be split, and Page A would only gain half the PageRank that it did before.
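Put as arithmetic, the split looks like this. The sketch assumes the 0.85 factor used in the calculation rules of this paper; the function name is hypothetical and the value of 5 is the made-up figure from the paragraph above.

```python
def passed_per_link(pagerank, outbound_links, d=0.85):
    """Share of a page's PageRank handed to each page it links to."""
    return d * pagerank / outbound_links


one_link = passed_per_link(5, 1)   # Page B links only to Page A
two_links = passed_per_link(5, 2)  # Page B links to Page A and one other page
print(one_link, two_links)
```

With one link Page A receives 4.25 from Page B; with two links it receives 2.125, exactly half, while Page B’s own value of 5 is untouched.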
Now put the formula out of your mind for a moment, as it’s easier to understand how it works using a diagram. Let’s say we have a hypothetical set of pages imaginatively titled Page A, Page B, Page C and Page D. They link to each other as shown below:
To begin with, in our example at least, we don’t know what the pages’ starting PageRanks are. There’s nothing special about the number that we pick to start with (in fact, if you read forward to the section on convergence, you can see that we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we’re going to set them to zero this time around in order to prove that it doesn’t matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for eachpage The rules are
1 We take a 085 a pagersquos PageRank and divide it by the number oflinks on the page
2 We add that amount on to a new total for each page itrsquos being passedto
3 We add 015 to each of those totals
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.
Page C links to Page A; all 0.1275 passes to page A.
Page D links to Page C. Again, all 0.1275 passes to page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
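The three rules and the totals above can be checked with a few lines of Python. This is only a minimal sketch of the hand calculation: the names `LINKS`, `DAMPING` and `iterate_once` are ours, and the link structure is the one described in the text (A links to B, C and D; B to C; C to A; D to C).

```python
# Link structure from the worked example: page -> pages it links to.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
DAMPING = 0.85  # the 0.85 used in rule 1


def iterate_once(ranks, links=LINKS, d=DAMPING):
    """Apply rules 1-3 once and return the new PageRank of every page."""
    totals = {page: 0.0 for page in links}
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)  # rule 1
        for target in targets:
            totals[target] += share             # rule 2
    return {page: total + (1 - d) for page, total in totals.items()}  # rule 3


start = {page: 0.0 for page in LINKS}  # every page starts at zero
first = iterate_once(start)            # every page becomes 0.15
second = iterate_once(first)           # the totals listed above

for page in sorted(second):
    print(page, round(second[page], 4))
```

Running it should print the same four totals as the hand calculation (0.2775, 0.1925, 0.4475 and 0.1925).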
So we've got
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet; it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence; there's more about this in the next section). In practice Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] and we'd reach final values of
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high-PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high-PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
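To see the "carry on until nothing changes" idea in action, the sketch below repeats the calculation for the same four pages until no value moves by more than a small tolerance. The function name `pagerank` and the tolerance of 1e-10 are our own assumptions; the number of iterations reported depends on that tolerance and on the starting values, so it need not match the spreadsheet's count.

```python
def pagerank(links, d=0.85, start=0.0, tol=1e-10, max_iter=10_000):
    """Repeat the PageRank calculation until no page changes by tol or more.

    links maps every page to the list of pages it links to.
    Returns (final PageRanks, number of iterations used).
    """
    ranks = {page: start for page in links}
    for iteration in range(1, max_iter + 1):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks, iteration
    raise RuntimeError("no convergence within max_iter iterations")


# A links to B, C and D; B to C; C to A; D to C (as in the worked example).
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}
final_ranks, iterations = pagerank(LINKS)

for page in sorted(final_ranks, key=final_ranks.get, reverse=True):
    print(page, round(final_ranks[page], 6))
print("iterations used:", iterations)
```

Page C should come out on top, with Pages B and D tied at the bottom because each receives only a third of Page A's vote.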
A word on convergence
Convergence is an important mathematical aspect of PageRank, which allows Google to provide unprecedented search quality at comparably low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, then convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure.
PageRank for pages A, B, C, D at various stages of iteration

Iteration  A             B             C             D
(start)    1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the values past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
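The stopping rule can be sketched in code. The link structure used here is an assumption on our part: it is inferred from the first rows of the table (A links to B and C; B to C; C to A; D to C, with nothing linking to D), since the original diagram is not reproduced in this text. The names `LINKS`, `step` and `history` are ours.

```python
# Structure inferred from the table's first iterations (an assumption).
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
TOLERANCE = 1e-10  # stop once no value changes by this much or more


def step(ranks, links=LINKS, d=0.85):
    """One iteration of the PageRank calculation."""
    new_ranks = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new_ranks[target] += d * ranks[page] / len(targets)
    return new_ranks


history = [{page: 1.0 for page in LINKS}]  # every page starts at 1
for _ in range(1000):                      # safety bound on the loop
    history.append(step(history[-1]))
    change = max(abs(history[-1][p] - history[-2][p]) for p in LINKS)
    if change < TOLERANCE:
        break

for i in (1, 2, 3, len(history) - 1):
    row = "  ".join(f"{history[i][p]:.10f}" for p in "ABCD")
    print(f"{i:>3}  {row}")
```

The first three printed rows should agree with iterations 1 to 3 of the table, and the last row with its limiting values; the exact iteration at which the loop stops depends on how the tolerance is applied.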
Derived specifics of how Google calculates PageRank
We started our values from zero, and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how it is sometimes good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
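The three figures quoted for Page A (0.15, 1 and 1.4594594595) can be reproduced with a short sketch. The function `pagerank` and the three dictionaries are our own illustrative names; a page with no outbound links simply passes nothing on.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate the PageRank calculation from a start of 1 until it settles."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks
    raise RuntimeError("no convergence within max_iter iterations")


alone = {"A": []}                                      # Page A on its own
with_b = {"A": ["B"], "B": ["A"]}                      # A and B link to each other
with_b_and_c = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}  # C joins in as well

for name, system in [("alone", alone), ("A<->B", with_b),
                     ("A<->B, A<->C", with_b_and_c)]:
    print(name, round(pagerank(system)["A"], 10))
```

Each extra page that links back raises Page A's own value, which is the feedback effect described above.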
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entries in the SERPs.
Your score from other factors, multiplied by your PageRank score of zero, will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations. Would this work? From what we know so far we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because it means the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
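Why zeroing a page before the calculation has no lasting effect follows directly from convergence, and a small sketch makes the point. It is not a reconstruction of Excel Example 8; it reuses the four-page example from earlier in the paper (A links to B, C and D; B to C; C to A; D to C) and starts Page C at zero. The names `converge`, `normal` and `penalised_start` are ours.

```python
LINKS = {"A": ["B", "C", "D"], "B": ["C"], "C": ["A"], "D": ["C"]}


def converge(start, links=LINKS, d=0.85, tol=1e-10, max_iter=10_000):
    """Run the PageRank calculation from the given starting values."""
    ranks = dict(start)
    for _ in range(max_iter):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks
    raise RuntimeError("no convergence within max_iter iterations")


normal = converge({page: 1.0 for page in LINKS})
# "Penalise" Page C by setting it to zero before the calculation starts.
penalised_start = converge({"A": 1.0, "B": 1.0, "C": 0.0, "D": 1.0})

largest_gap = max(abs(normal[p] - penalised_start[p]) for p in LINKS)
print("Page C, normal start:   ", round(normal["C"], 8))
print("Page C, started at zero:", round(penalised_start["C"], 8))
print("largest difference:", largest_gap)
```

Both runs settle on the same values, so a zero applied at the start is simply washed out by the iterations.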
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high-PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high-PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower-PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
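As a rough illustration of the PR 4 versus PR 6 comparison, the sketch below applies the sharing rule from the calculations to two made-up pages. The link counts (5 and 20) are purely hypothetical, and the figures 4 and 6 are treated as raw PageRank values for the sake of the arithmetic, which is a simplification.

```python
def value_passed_per_link(page_rank, links_on_page, d=0.85):
    """PageRank handed to each linked page: d * PR / number of links."""
    return d * page_rank / links_on_page


# Hypothetical link counts chosen only to illustrate the point.
from_pr4_page = value_passed_per_link(page_rank=4, links_on_page=5)
from_pr6_page = value_passed_per_link(page_rank=6, links_on_page=20)

print("link from the PR 4 page passes:", round(from_pr4_page, 3))
print("link from the PR 6 page passes:", round(from_pr6_page, 3))
```

With these invented numbers the link from the lower-ranked page is worth more, because its vote is split fewer ways.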
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high-quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                          Without the Review Pages   With the Review Pages
The HomePage PR is        0.9536152797               2.439718935
B, C, D PR is (each)      0.4201909959               0.8412536982
Total is                  2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier, and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final values for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
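The point that a closed four-page site always holds a total of 4, however it is wired, can be checked with a sketch. The exact link structures are assumptions, since the diagrams are not reproduced in this text: "hierarchical" is taken as a home page A linking to B, C and D, which each link back to A; "looping" as A to B to C to D and back to A; and "extensive interlinking" as every page linking to every other page.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate the PageRank calculation from a start of 1 until it settles."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tol:
            return ranks
    raise RuntimeError("no convergence within max_iter iterations")


PAGES = "ABCD"
structures = {
    # Assumed shapes; the original figures are not available here.
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {p: [q for q in PAGES if q != p] for p in PAGES},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, ranks in results.items():
    values = "  ".join(f"{p}={ranks[p]:.4f}" for p in PAGES)
    print(f"{name:<13}{values}  total={sum(ranks.values()):.4f}")
```

Under these assumptions only the hierarchical shape concentrates PageRank on one page (the home page); the other two spread it evenly, while the total stays at 4 in every case.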
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its vote is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation coefficient at a fixed amount. We can vary this coefficient for each page, depending upon the final value of the previous computation, using the formula
$(1-d)\,PR_i(A)$
That is, we are replacing the $(1-d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
38
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
A B C D Sum
Start 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 15100000000 08300000000 08300000000 08300000000 400000000003 23430000000 05523333333 05523333333 05523333333 400000000004 17599000000 07467000000 07467000000 07467000000 400000000005 21680700000 06106433333 06106433333 06106433333 400000000006 18823510000 07058830000 07058830000 07058830000 400000000007 20823543000 06392152333 06392152333 06392152333 400000000008 19423519900 06858826700 06858826700 06858826700 400000000009 20403536070 06532154643 06532154643 06532154643 40000000000
10 19717524751 06760825083 06760825083 06760825083 4000000000011 20197732674 06600755775 06600755775 06600755775 4000000000012 19861587128 06712804291 06712804291 06712804291 40000000000
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
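As a minimal sketch of the two schemes compared above (not the authors' spreadsheet), the following Python iterates the four-page example both ways. The link structure (A links to B, C and D; each of B, C and D links back to A) is an assumption inferred from the table values, and the names `LINKS`, `step`, `iterate` and the 1e-10 stopping tolerance are illustrative.

```python
# Four-page example. Assumed structure: A -> B, C, D and B, C, D -> A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor


def step(pr, accelerated):
    """One iteration. Classic uses the constant (1 - d);
    accelerated uses (1 - d) * PR_i(T) instead."""
    new = {}
    for page in LINKS:
        base = (1 - D) * pr[page] if accelerated else (1 - D)
        incoming = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        new[page] = base + D * incoming
    return new


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Start every page at 1 and stop once no value changes by more than tol."""
    pr = {page: 1.0 for page in LINKS}
    for count in range(1, max_iter + 1):
        new = step(pr, accelerated)
        if max(abs(new[p] - pr[p]) for p in LINKS) < tol:
            return new, count
        pr = new
    return pr, max_iter


classic, classic_iters = iterate(accelerated=False)
fast, fast_iters = iterate(accelerated=True)
print(classic_iters, classic)
print(fast_iters, fast)
```

Under these assumptions the classic scheme settles near A = 1.9189 and B = C = D = 0.6937, while the accelerated scheme settles at A = 2 and B = C = D = 0.6667 in fewer iterations. The exact iteration counts depend on the stopping tolerance chosen.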
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) * 0.15
as long as we set the other pages to
0.6 * 0.15
where n is the number of pages in the site. The total is unchanged because (0.4n + 0.6) + 0.6(n - 1) = n; any split of the same form works, e.g. 0.1n + 0.9 + 0.9(n - 1) = n.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
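The redistribution described in this section can be sketched as follows. This is only an illustration under assumptions: it uses a four-page structure (A links to B, C and D, which each link back to A), which may not be the site behind [Excel Example 19], and `pagerank_with_coefficients` is a hypothetical helper name.

```python
D = 0.85
# Assumed structure: A -> B, C, D and B, C, D -> A (not necessarily the paper's site).
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank_with_coefficients(coeff, iterations=500):
    """Classic iteration, except each page has its own normalization co-efficient."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        pr = {
            page: coeff[page]
            + D * sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
    return pr


n = len(LINKS)
uniform = {page: 0.15 for page in LINKS}
# Favour Page B: (0.4n + 0.6) * 0.15 for B and 0.6 * 0.15 for every other page.
favoured = {page: 0.6 * 0.15 for page in LINKS}
favoured["B"] = (0.4 * n + 0.6) * 0.15

normal = pagerank_with_coefficients(uniform)
boosted = pagerank_with_coefficients(favoured)
print(normal)
print(boosted)
```

With these assumed inputs Page B's value rises while A, C and D fall, and the total over all pages stays the same.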
PageRank not only considers links but it may also consider certain other factors
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
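For a site small enough to solve by hand, the effect can be checked directly. The following worked calculation is a sketch that assumes a four-page structure in which A links to B, C and D and each of those links back only to A (this may differ from the site behind [Excel Example 20]); c_T denotes the normalization co-efficient assigned to page T and d = 0.85.

```latex
\begin{aligned}
PR(A) &= c_A + d\,\bigl(PR(B) + PR(C) + PR(D)\bigr), \qquad
PR(T) = c_T + \tfrac{d}{3}\,PR(A) \quad (T = B, C, D) \\
\Rightarrow\; PR(A) &= \frac{c_A + d\,(c_B + c_C + c_D)}{1 - d^2}
  = \frac{0.18 + 0.85 \times 0.42}{1 - 0.85^2} \approx 1.9351 \\
PR(B) &\approx 0.06 + \tfrac{0.85}{3} \times 1.9351 \approx 0.6083, \qquad
PR(C) = PR(D) \approx 0.18 + \tfrac{0.85}{3} \times 1.9351 \approx 0.7283
\end{aligned}
```

With every co-efficient equal to 0.15, the same formulas give PR(A) of about 1.9189 and PR(B) = PR(C) = PR(D) of about 0.6937, so under these assumptions Page B drops while the other pages rise.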
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1, ..., n), (m = 1, ..., n)
What does that mean? Let us explain the designations:
T1, T2, ..., Tn correspond to Web site pages (n is the number of pages)
Components a_km, which are on the intersection of the k-th row and the m-th column in the matrix of links, determine the number of links off Tk to Tm
Cr(Tk) is the total number of links off Tk
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1     T2     ...    Tn     Number of links off the page
T1    a_11   a_12   ...    a_1n   Cr(T1)
T2    a_21   a_22   ...    a_2n   Cr(T2)
...
Tn    a_n1   a_n2   ...    a_nn   Cr(Tn)
Let's write down the equation for the computation scheme:
PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km} \, PR_I(T_k)}{Cr(T_k)}
Where
PR_{I+1}(T_m) is the PageRank of the page T_m on the (I+1)-th iteration (the estimating value of the page T_m, m = 1, ..., n)
PR_I(T_k) is the PageRank of the page T_k on the I-th iteration (the estimating value of the page T_k, k = 1, ..., n)
Cr(T_k) is the total number of links off the page T_k
Cm(T_k) = a_{km} is the number of individual links off the page T_k to the page T_m (k = 1, ..., n), (m = 1, ..., n)
d is a dampening factor (usually 0.85)
(1 - d) is a normalization coefficient
n is the number of pages on the site
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account links of any complexity between the pages.
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: directed multigraph of the site, with pages A, B, C and D in subgraph 1 and pages E, F, G, H and I in subgraph 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
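To make the computation scheme concrete, here is a short Python sketch (an illustration, not the authors' spreadsheet) that encodes the matrix of links (2) and applies one step of the equation given earlier with the constant normalization coefficient. The names `A_MATRIX`, `out_links` and `classic_step` are chosen here for illustration.

```python
PAGES = "ABCDEFGHI"
# Row k, column m holds a_km: the number of links off page k to page m.
A_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85  # dampening factor


def out_links(matrix):
    """Cr(Tk): total number of links off each page (the row sums)."""
    return [sum(row) for row in matrix]


def classic_step(matrix, pr, d=D):
    """PR_{I+1}(Tm) = (1 - d) + d * sum over k of a_km * PR_I(Tk) / Cr(Tk)."""
    cr = out_links(matrix)
    n = len(matrix)
    return [
        (1 - d) + d * sum(matrix[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


first = classic_step(A_MATRIX, [1.0] * len(PAGES))
print(dict(zip(PAGES, (round(v, 6) for v in first))))
```

Starting from 1 for every page, one step gives values that round to the first iteration row of the tables for this site (for example A = 1.206429 and B = 3.473095), and the row sums reproduce the Cr column of the matrix.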
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the PR_{i+1}(T) computation (on the (i+1)-th iteration) can be calculated by the formula (1 - d) * PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
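The definition of external pages can be tried out with a small Python sketch. It is an illustration only: the matrix is the matrix of links (2), while the zero threshold `EPS`, the fixed iteration count and the function names are assumptions made here.

```python
PAGES = "ABCDEFGHI"
# Matrix of links (2): row k, column m is the number of links off page k to page m.
A_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85
EPS = 1e-6  # assumption: values below this count as a zero limiting value


def accelerated_step(matrix, pr, d=D):
    """Like the classic step, but the constant (1 - d) becomes (1 - d) * PR_i(T)."""
    cr = [sum(row) for row in matrix]
    n = len(matrix)
    return [
        (1 - d) * pr[m] + d * sum(matrix[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


def limiting_values(matrix, iterations=200):
    """Run a fixed number of accelerated iterations starting from 1 for every page."""
    pr = [1.0] * len(matrix)
    for _ in range(iterations):
        pr = accelerated_step(matrix, pr)
    return pr


limits = limiting_values(A_MATRIX)
external = [p for p, v in zip(PAGES, limits) if v < EPS]
base = [p for p, v in zip(PAGES, limits) if v >= EPS]
print(base, external)
```

The sketch separates the pages by their limiting values alone, without inspecting the link structure directly, which is the point of the definition.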
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are shaded in the table above.
Dividing the estimating values by their total amount, which does not change during the iteration process, gives the set of probability values (the sum of all probability values is equal to one):
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome   A1   A2   ...   An
Probabilities        p1   p2   ...   pn
H(p_1, p_2, \ldots, p_n) = -p(A_1) \log p(A_1) - p(A_2) \log p(A_2) - \ldots - p(A_n) \log p(A_n)
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We receive the following set of PageRank values for the pages of the site examined (logarithms are to base 2):
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows   1         1         1         1         1         1         1         1         1         1
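The last two steps can be reproduced with a few lines of Python. This sketch takes the limiting values of the classic computation as given; the variable names are illustrative.

```python
import math

PAGES = "ABCDEFGHI"
# Limiting estimating values from the classic computation (d = 0.85).
LIMITS = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(LIMITS)
probabilities = [v / total for v in LIMITS]          # the row labelled P
terms = [-p * math.log2(p) for p in probabilities]   # the row labelled PR
entropy = sum(terms)                                 # Shannon's H, in bits

for page, p, t in zip(PAGES, probabilities, terms):
    print(f"{page}: P = {p:.6f}  -P*log2(P) = {t:.6f}")
print(f"H = {entropy:.3f}")
```

Each entry of the PR row is a single term of Shannon's sum, and the Amount column of that row is the sum H itself.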
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_{n+1} = Z^{+}(P_n) = d P_n + (1 - d),  0 < d < 1  (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
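A short worked derivation (added here as an aside, not taken from Bush and Mosteller) shows what repeated application of Z+ does: the value moves towards 1, closing a fraction (1 - d) of the remaining gap each time.

```latex
\begin{aligned}
P_{n+1} - 1 &= d\,P_n + (1 - d) - 1 = d\,(P_n - 1) \\
\Rightarrow\quad P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1
  \quad (n \to \infty,\; 0 < d < 1)
\end{aligned}
```

For example, with d = 0.85 and P_0 = 0, one application gives P_1 = 0.15 and two give P_2 = 0.2775.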
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing Order to the Web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
we can start at any number we want). Since in the last version of this paper we performed the calculations by setting these values to 1, we're going to set them to zero this time around, in order to prove that it doesn't matter what the starting values are.
Next we perform the necessary calculation to obtain the PageRank for each page. The rules are:
1. We take 0.85 * a page's PageRank and divide it by the number of links on the page.
2. We add that amount on to a new total for each page it's being passed to.
3. We add 0.15 to each of those totals.
The first calculation is easy. Because we've started at zero, 0 * 0.85 is always 0. So each page gets just 0.15 + 0, meaning each page now has a PageRank of 0.15. Clearly we're not done: we want to show the importance of each page based upon links, and they're all the same, so we need to run the calculation again.
Page A links to pages B, C and D. Page A's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank scores of the pages it links to. There are three of them, so they each get 0.0425.
Page B links to page C. Page B's PageRank is 0.15, so it will add 0.85 * 0.15 = 0.1275 to the new PageRank score of the pages it links to. Since it only links to page C, page C will get it all.
Page C links to Page A: all 0.1275 passes to page A.
Page D links to Page C. Again, all 0.1275 passes to page C.
The new totals for each page then become:
Page A: 0.15 (base) + 0.1275 (from Page C) = 0.2775
Page B: 0.15 (base) + 0.0425 (from Page A) = 0.1925
Page C: 0.15 (base) + 0.0425 (from Page A) + 0.1275 (from Page B) + 0.1275 (from Page D) = 0.4475
Page D: 0.15 (base) + 0.0425 (from Page A) = 0.1925
So we've got: Page A = 0.2775, Page B = 0.1925, Page C = 0.4475, Page D = 0.1925.
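The three rules and the two passes worked through above can be sketched in a few lines of code. This is only an illustration of the arithmetic in this section, not a description of how Google implements it; the names DAMPING, LINKS and pagerank_step are our own.

```python
DAMPING = 0.85  # the 0.85 used in rule 1; rule 3 adds 1 - 0.85 = 0.15

# Link structure of the worked example: page -> pages it links to.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank_step(ranks, links, d=DAMPING):
    """Apply the three rules once and return the new totals."""
    new_ranks = {page: 0.0 for page in links}
    for page, targets in links.items():
        share = d * ranks[page] / len(targets)  # rule 1
        for target in targets:
            new_ranks[target] += share  # rule 2
    for page in new_ranks:
        new_ranks[page] += 1 - d  # rule 3
    return new_ranks


ranks = {page: 0.0 for page in LINKS}  # every page starts at zero
for iteration in (1, 2):
    ranks = pagerank_step(ranks, LINKS)
    print(iteration, {page: round(value, 4) for page, value in ranks.items()})
```

The first pass gives every page 0.15 and the second pass reproduces the totals listed above. Note that each pass uses only the values from the previous pass, which is what the hand calculation does.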
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet, as it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence, and there's more about it in the next section). In practice, Google probably doesn't wait for this convergence, but instead runs a number of iterations of the calculation which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1] to reach the final values.
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration
Iteration  A             B             C             D
Start      1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
...
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
...
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
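The stopping rule described here can be sketched as follows. The link structure used (A links to B and C, B to C, C to A, D to C) is an assumption on our part: it is inferred from the first rows of the table, which it reproduces, rather than taken from the original diagram. The function names and the tolerance parameter are also ours.

```python
DAMPING = 0.85

# Assumed structure, inferred from the first rows of the table above.
LINKS = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank_step(ranks, links, d=DAMPING):
    """One pass of the calculation, using only the previous pass's values."""
    new_ranks = {page: 1 - d for page in links}
    for page, targets in links.items():
        for target in targets:
            new_ranks[target] += d * ranks[page] / len(targets)
    return new_ranks


def iterate_until_converged(start, links, tolerance=1e-10, max_iterations=1000):
    """Repeat the pass until no page changes by `tolerance` or more."""
    ranks = dict(start)
    for iteration in range(1, max_iterations + 1):
        new_ranks = pagerank_step(ranks, links)
        largest_change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if largest_change < tolerance:
            return ranks, iteration
    raise RuntimeError("no convergence within max_iterations")


from_ones, n_ones = iterate_until_converged({p: 1.0 for p in LINKS}, LINKS)
from_zeros, n_zeros = iterate_until_converged({p: 0.0 for p in LINKS}, LINKS)
print(n_ones, {p: round(v, 10) for p, v in from_ones.items()})
print(n_zeros, {p: round(v, 10) for p, v in from_zeros.items()})
```

Starting from ones or from zeros, the sketch settles on the same limiting values, which is the point made above about starting values. The exact iteration count depends on how the change is measured, so it need not match the count in the table.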
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into independent chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with the others. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because it is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
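The three cases just described (Page A alone, A and B linking to each other, then A and C linking to each other as well) can be checked with a short sketch. The pagerank function and its parameters are our own naming; the calculation is the same iterative one used throughout this paper, run until the values stop changing.

```python
def pagerank(links, d=0.85, tolerance=1e-12, max_iterations=10000):
    """Iterate the PageRank calculation from 1.0 until the values settle."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iterations):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        if max(abs(new_ranks[p] - ranks[p]) for p in links) < tolerance:
            return new_ranks
        ranks = new_ranks
    raise RuntimeError("no convergence within max_iterations")


alone = pagerank({"A": []})
with_b = pagerank({"A": ["B"], "B": ["A"]})
with_b_and_c = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})

for label, result in (("A alone", alone), ("A <-> B", with_b), ("A <-> B, A <-> C", with_b_and_c)):
    print(label, round(result["A"], 10), "total:", round(sum(result.values()), 10))
```

In each case the total across the system stays at 0.15 for the isolated page or at one per page for the linked ones, while Page A's share rises, which is the redistribution described above.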
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this set up, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to reciprocal link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors, multiplied by your PageRank score of zero, will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because it means the PageRank must be set to zero at the end of the process. Having a PageRank of zero does not in itself disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
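To make the sharing concrete, here is a tiny sketch using the raw numbers from the worked example earlier in this paper (a page with a PageRank of 0.15 and either three links or one link). The function name is ours, and the inputs are raw PageRank values rather than Toolbar values, since the mapping between the two is not known.

```python
def share_per_link(page_rank, links_on_page, d=0.85):
    """PageRank passed along each link on a page, per the rules used earlier."""
    return d * page_rank / links_on_page


print(share_per_link(0.15, 3))  # each of three links gets a third of 0.1275
print(share_per_link(0.15, 1))  # a single link gets all of 0.1275
```

Two pages with the same PageRank therefore pass on very different amounts per link, depending on how many links each one carries.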
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback, and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them. It is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                      Without the Review Pages   With the Review Pages
The Home Page PR is   0.9536152797               2.439718935
B, C, D PR is         0.4201909959               0.8412536982
Total is              2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier, and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
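A short sketch can show both halves of this observation for closed four-page sites: the total stays at 4, while the distribution depends on the structure. The three link maps below are our assumptions about what each structure looks like (a home page A linking to and from B, C and D; a ring A to B to C to D to A; every page linking to every other), since the original diagrams are not reproduced here. The function and variable names are also ours.

```python
def pagerank(links, d=0.85, tolerance=1e-12, max_iterations=10000):
    """Iterate the PageRank calculation from 1.0 until the values settle."""
    ranks = {page: 1.0 for page in links}
    for _ in range(max_iterations):
        new_ranks = {page: 1 - d for page in links}
        for page, targets in links.items():
            for target in targets:
                new_ranks[target] += d * ranks[page] / len(targets)
        if max(abs(new_ranks[p] - ranks[p]) for p in links) < tolerance:
            return new_ranks
        ranks = new_ranks
    raise RuntimeError("no convergence within max_iterations")


PAGES = "ABCD"
STRUCTURES = {
    # Assumed shapes of the three structures named in the text.
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {p: [q for q in PAGES if q != p] for p in PAGES},
}

results = {name: pagerank(links) for name, links in STRUCTURES.items()}
for name, ranks in results.items():
    rounded = {p: round(v, 4) for p, v in ranks.items()}
    print(name, rounded, "total:", round(sum(ranks.values()), 4))
```

Under these assumptions, the looping and fully interlinked sites leave every page at 1, while the hierarchical site concentrates PageRank on the home page without changing the total.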
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$$(1 - d)\,PR_i(A)$$
That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
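One way to read this substitution is sketched below, comparing the fixed co-efficient with the varied one. Treat it as our interpretation rather than the authors' spreadsheet: the hierarchical structure is assumed to be a home page A linking to B, C and D, each of which links back to A, and the names step, iterations_to_settle and the tolerance are ours.

```python
D = 0.85

# Assumed hierarchical structure: home page A <-> B, C, D.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(ranks, links, varied, d=D):
    """One pass; `varied` swaps the fixed (1 - d) for (1 - d) * PR_i(page)."""
    new_ranks = {}
    for page in links:
        incoming = sum(
            ranks[source] / len(targets)
            for source, targets in links.items()
            if page in targets
        )
        base = (1 - d) * ranks[page] if varied else (1 - d)
        new_ranks[page] = base + d * incoming
    return new_ranks


def iterations_to_settle(links, varied, tolerance=1e-10, max_iterations=10000):
    """Count passes until no page changes by `tolerance` or more."""
    ranks = {page: 1.0 for page in links}
    for iteration in range(1, max_iterations + 1):
        new_ranks = step(ranks, links, varied)
        change = max(abs(new_ranks[p] - ranks[p]) for p in links)
        ranks = new_ranks
        if change < tolerance:
            return iteration, ranks
    raise RuntimeError("no convergence within max_iterations")


n_fixed, fixed = iterations_to_settle(LINKS, varied=False)
n_varied, varied_ranks = iterations_to_settle(LINKS, varied=True)
print("fixed ", n_fixed, {p: round(v, 6) for p, v in fixed.items()})
print("varied", n_varied, {p: round(v, 6) for p, v in varied_ranks.items()})
```

Under this reading, the varied co-efficient settles in noticeably fewer passes on this structure, at values close to but not identical with the fixed co-efficient results, which fits the idea of a quicker approximation.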
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
38
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
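The two schemes compared above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-page link structure (A links to B, C and D; each of B, C and D links back to A) is inferred from the table values rather than stated here, and the names `LINKS`, `step` and `iterate` are hypothetical. The stopping rule (largest change below a tolerance) is also an assumption, so the iteration counts it reports need not match the spreadsheet exactly.

```python
# Assumed link structure, inferred from the tables: A -> B, C, D and
# B, C, D -> A. Rows are the linking page, columns the linked-to page.
LINKS = [
    [0, 1, 1, 1],  # A
    [1, 0, 0, 0],  # B
    [1, 0, 0, 0],  # C
    [1, 0, 0, 0],  # D
]


def step(pr, links, d=0.85, recalculate=False):
    """Run one iteration over all pages.

    With recalculate=False the normalization co-efficient is the constant
    (1 - d); with recalculate=True it is (1 - d) times the page's current value.
    """
    n = len(pr)
    out_degree = [sum(row) for row in links]
    new_pr = []
    for m in range(n):
        incoming = sum(links[k][m] * pr[k] / out_degree[k] for k in range(n))
        coefficient = (1 - d) * pr[m] if recalculate else (1 - d)
        new_pr.append(coefficient + d * incoming)
    return new_pr


def iterate(links, recalculate=False, tol=1e-10, max_iter=1000):
    """Iterate from all-ones until no value changes by more than tol."""
    pr = [1.0] * len(links)
    for count in range(1, max_iter + 1):
        new_pr = step(pr, links, recalculate=recalculate)
        if max(abs(a - b) for a, b in zip(new_pr, pr)) < tol:
            return new_pr, count
        pr = new_pr
    return pr, max_iter


classic, classic_iterations = iterate(LINKS)
recalculated, recalculated_iterations = iterate(LINKS, recalculate=True)
print(classic_iterations, [round(v, 6) for v in classic])
print(recalculated_iterations, [round(v, 6) for v in recalculated])
```

Note that, as in the two tables, the schemes do not settle on identical numbers: the constant co-efficient leads to roughly 1.9189 for Page A, while the recalculated co-efficient leads to 2.0.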
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [(0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site]
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
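Both adjustments (favouring Page B, then pushing it down as a list of links) amount to giving each page its own normalization co-efficient while keeping the total the same. The sketch below illustrates that idea only; it assumes the four-page structure in which A links to B, C and D and each of them links back to A, and the function name `pagerank_with_coefficients` is hypothetical rather than anything Google is known to use.

```python
# Assumed four-page structure: A -> B, C, D and B, C, D -> A.
LINKS = [
    [0, 1, 1, 1],  # A
    [1, 0, 0, 0],  # B ("About Us" in the first case, a link list in the second)
    [1, 0, 0, 0],  # C
    [1, 0, 0, 0],  # D
]


def pagerank_with_coefficients(links, coefficients, d=0.85, iterations=200):
    """Iterate PageRank with a separate normalization co-efficient per page."""
    n = len(links)
    out_degree = [sum(row) for row in links]
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            coefficients[m]
            + d * sum(links[k][m] * pr[k] / out_degree[k] for k in range(n))
            for m in range(n)
        ]
    return pr


n = len(LINKS)
uniform = [0.15] * n                   # the usual (1 - d) for every page

favour_b = [0.6 * 0.15] * n            # other pages: 0.6 x 0.15
favour_b[1] = (0.4 * n + 0.6) * 0.15   # Page B: (0.4n + 0.6) x 0.15

penalise_b = [0.18] * n                # other pages: 0.18
penalise_b[1] = 0.06                   # Page B: 0.06

for label, coefficients in [("uniform", uniform),
                            ("favour B", favour_b),
                            ("penalise B", penalise_b)]:
    values = pagerank_with_coefficients(LINKS, coefficients)
    print(label, [round(v, 4) for v in values], round(sum(values), 4))
```

In all three cases the co-efficients add up to 0.6, so the total PageRank of the site stays at 4; only its distribution between the pages changes.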
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.
Let's write down the elements of the matrix of links: akm (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T1, T2, …, Tn correspond to Web site pages (n is the number of pages).
The components akm, which lie at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    …    Tn    Number of links off the page
T1    a11   a12   …    a1n   Cr(T1)
T2    a21   a22   …    a2n   Cr(T2)
…
Tn    an1   an2   …    ann   Cr(Tn)
Let's write down the equation for the computation scheme:
PR_{I+1}(Tm) = (1 - d) + d × Σ (k = 1…n) [ akm × PR_I(Tk) / Cr(Tk) ]
Where:
PR_{I+1}(Tm) is the PageRank of the page Tm on iteration (I+1), i.e. the estimating value of the page Tm (m = 1…n).
PR_I(Tk) is the PageRank of the page Tk on iteration I, i.e. the estimating value of the page Tk (k = 1…n).
Cr(Tk) is the total number of links off the page Tk.
Cm(Tk) = akm is the number of individual links off the page Tk to the page Tm (k = 1…n), (m = 1…n).
d is a dampening factor (usually 0.85).
(1 - d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme lets us compute the estimating values for the pages of a Web site while taking into account the possible complexity of the links between pages.
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
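As a quick check of the matrix, the Cr values are simply its row sums. The short sketch below (the names `PAGES`, `LINK_MATRIX`, `cr` and `links_in` are illustrative, not from the original) recomputes them and also counts inbound links per page as column sums.

```python
PAGES = "ABCDEFGHI"

# Rows are the linking page, columns the linked-to page
# (values copied from the matrix of links above).
LINK_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T): total number of links off each page (row sums).
cr = {page: sum(row) for page, row in zip(PAGES, LINK_MATRIX)}

# Number of links pointing at each page (column sums).
links_in = {
    page: sum(row[m] for row in LINK_MATRIX) for m, page in enumerate(PAGES)
}

# Does any of A, B, C, D link into E, F, G, H, I?
base_links_out = any(
    LINK_MATRIX[k][m] for k in range(4) for m in range(4, 9)
)

print(cr)
print(links_in)
print(base_links_out)
```

The last value shows that pages A to D never link to pages E to I, while every one of E to I links to all of A to D.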
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i.
The normalization coefficient for the computation of PR_{i+1}(T) (on iteration i + 1) can be calculated by the formula (1 - d) × PR_i(T).
When we apply the accelerated technology of computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the pages that are external in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
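The definition of external pages can be tried out directly. The following is a minimal sketch, assuming the nine-page matrix of links given earlier and a zero threshold of 1e-9 chosen purely for illustration; `accelerated_step` is a hypothetical name for one pass of the scheme with the coefficient (1 - d) × PR_i(T).

```python
PAGES = "ABCDEFGHI"

# Rows are the linking page, columns the linked-to page.
LINK_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(pr, links, d=0.85):
    """One iteration with the normalization coefficient (1 - d) * PR_i(T)."""
    n = len(pr)
    cr = [sum(row) for row in links]  # links off each page
    return [
        (1 - d) * pr[m]
        + d * sum(links[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


first = accelerated_step([1.0] * len(PAGES), LINK_MATRIX)

pr = [1.0] * len(PAGES)
for _ in range(200):  # far more passes than the 39 shown in the table
    pr = accelerated_step(pr, LINK_MATRIX)

# Pages whose limiting value is (numerically) zero are the external ones.
external = [page for page, value in zip(PAGES, pr) if value < 1e-9]
base = [page for page, value in zip(PAGES, pr) if value >= 1e-9]

print([round(v, 6) for v in first])
print([round(v, 6) for v in pr])
print("base:", base, "external:", external)
```

The total of the values stays at 9 throughout, so the weight lost by the external pages ends up on the base pages.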
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final row of the table (darkened in the original).
Dividing the estimating values by their total (which stays constant throughout the iteration process), we obtain a set of probability values (the sum of all the probability values is equal to one):
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome: A1, A2, …, An
Probabilities: p1, p2, …, pn
H(p1, p2, …, pn) = - p(A1) log p(A1) - p(A2) log p(A2) - … - p(An) log p(An)
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages (logarithms are to be calculated to base 2).
We obtain the following set of PageRank values for the pages of the site examined:
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
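The two tables can be reproduced from the limiting values of the classic computation. The sketch below is only a check of the arithmetic: the dictionary name `limits` and the six-decimal input values are taken from the last row of the classic table, so small rounding differences are to be expected.

```python
import math

# Limiting estimating values from the classic computation (six decimals).
limits = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441, "H": 0.179318,
    "I": 0.179318,
}

total = sum(limits.values())

# Probability of each page: its estimating value divided by the total.
probabilities = {page: value / total for page, value in limits.items()}

# One term of Shannon's formula per page: -p * log2(p).
terms = {page: -p * math.log2(p) for page, p in probabilities.items()}

# The "Amount" column of the PR row is the sum of the terms.
amount = sum(terms.values())

for page in limits:
    print(page, round(probabilities[page], 6), round(terms[page], 6))
print("Amount:", round(amount, 3))
```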
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_{n+1} = Z+(P_n) = d × P_n + (1 - d), 0 < d < 1 (P is the estimating value)
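To see what repeated encouragement does, the operator can simply be applied over and over. The sketch below assumes d = 0.85 and a starting value of 0.2, both chosen only for illustration, and `z_plus` is a hypothetical name. Unrolling the recurrence gives the closed form P_n = 1 - d^n × (1 - P_0), so the value climbs towards 1.

```python
def z_plus(p, d=0.85):
    """Apply the operator once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


d = 0.85   # assumed value, matching the usual dampening factor
p0 = 0.2   # arbitrary starting value for illustration

history = [p0]
for _ in range(50):
    history.append(z_plus(history[-1], d))

# Closed form obtained by unrolling the recurrence.
closed_form = [1 - d ** n * (1 - p0) for n in range(len(history))]

print([round(v, 4) for v in history[:6]])
print(round(history[-1], 6))
```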
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing Order to the Web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Pretty neat, huh? Already we're beginning to see that Page C is probably the most important page in the system (but we can't be sure yet – it could well change). We carry on doing these calculations until the value for each page no longer changes (this is called convergence – there's more about this in the next section). In practice, Google probably doesn't wait for this convergence but instead runs a number of iterations of the calculation, which is likely to give them fairly accurate values (more on this later as well). If we carried out all the calculations for the example given, it would take us 143 calculations [Excel Example 1], and we'd reach final values of:
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) which has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page, you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and that these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C and D at various stages of iteration

Iteration   A              B              C              D
(start)     1              1              1              1
1           1.0000000000   0.5750000000   2.2750000000   0.1500000000
2           2.0837500000   0.5750000000   1.1912500000   0.1500000000
3           1.1625625000   1.0355937500   1.6518437500   0.1500000000
…
19          1.4900124031   0.7833688246   1.5766187723   0.1500000000
20          1.4901259564   0.7832552713   1.5766187723   0.1500000000
21          1.4901259564   0.7833035315   1.5765705121   0.1500000000
…
46          1.4901074052   0.7832956473   1.5765969475   0.1500000000
47          1.4901074054   0.7832956472   1.5765969474   0.1500000000
48          1.4901074053   0.7832956473   1.5765969474   0.1500000000
49          1.4901074053   0.7832956473   1.5765969474   0.1500000000
50          1.4901074053   0.7832956473   1.5765969474   0.1500000000
51          1.4901074053   0.7832956473   1.5765969474   0.1500000000
52          1.4901074053   0.7832956473   1.5765969474   0.1500000000
53          1.4901074053   0.7832956473   1.5765969474   0.1500000000
54          1.4901074053   0.7832956473   1.5765969474   0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change – they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
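The stopping rule can be sketched in a few lines of Python. This is a minimal illustration, not Google's implementation. The link structure below (A links to B and C, B links to C, C links to A, D links to C) is an assumption we inferred from the values in the table, since it reproduces the first rows; the function names `step` and `iterate` are our own.

```python
def step(links, pr, d=0.85):
    """One pass of PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p.

    All new values are computed from the previous pass's values.
    """
    return {
        p: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if p in out)
        for p in links
    }


def iterate(links, start, d=0.85, tol=1e-10, max_iter=10_000):
    """Repeat step() until no value changes by tol or more; return (values, passes)."""
    pr = dict(start)
    for n in range(1, max_iter + 1):
        new = step(links, pr, d)
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new, n
        pr = new
    raise RuntimeError("no convergence within max_iter passes")


# Assumed link structure (inferred from the table; the original diagram is missing)
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
start = {page: 1.0 for page in links}

first = step(links, start)
print({p: round(v, 10) for p, v in first.items()})
# {'A': 1.0, 'B': 0.575, 'C': 2.275, 'D': 0.15}

final, passes = iterate(links, start)
print({p: round(v, 8) for p, v in final.items()}, passes)
```

The first pass matches iteration 1 of the table, and the limiting values agree with the last rows to within the tolerance. The number of passes reported depends on the tolerance and on how a "pass" is counted, so it need not match the spreadsheet exactly.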
Derived specifics of how Google calculates PageRank
We started our values from zero and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
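The experiment can be sketched as follows. Since the diagram of the modified structure is not reproduced here, both link structures in this example are assumptions: `old_links` is the structure we inferred from the convergence table, and `new_links` is a hypothetical change (a new Page E plus one extra link) chosen only to show the method. The iteration counts it prints therefore need not match the 75 and 78 quoted above; the point is that both starting points reach the same limiting values.

```python
def step(links, pr, d=0.85):
    """One pass of PR(p) = (1 - d) + d * sum(PR(q) / C(q)) using the previous values."""
    return {
        p: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if p in out)
        for p in links
    }


def iterate(links, start, d=0.85, tol=1e-10, max_iter=10_000):
    """Repeat step() until no value changes by tol or more; return (values, passes)."""
    pr = dict(start)
    for n in range(1, max_iter + 1):
        new = step(links, pr, d)
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new, n
        pr = new
    raise RuntimeError("no convergence within max_iter passes")


# Assumed original structure and a hypothetical modified one with a new Page E
old_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
new_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C", "E"], "E": ["A"]}

old_values, _ = iterate(old_links, {p: 1.0 for p in old_links})

# Cold start: every page begins at 1
cold, cold_passes = iterate(new_links, {p: 1.0 for p in new_links})

# Warm start: carry on from the old limiting values, with Page E set to 1
warm, warm_passes = iterate(new_links, {**old_values, "E": 1.0})

print("cold start passes:", cold_passes)
print("warm start passes:", warm_passes)
print(max(abs(cold[p] - warm[p]) for p in new_links) < 1e-8)  # True
```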
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into independent chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was widely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system – that is, the very act of linking out has fed PageRank back in.
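The three figures quoted above (0.15, 1 and 1.4594594595) can be checked with a short sketch. The function name `limiting_values` and the dictionary layout are our own choices for the illustration; the formula is the same PR = (1 - d) + d * sum(PR(q) / C(q)) with d = 0.85 used elsewhere in this paper.

```python
def limiting_values(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {p: 1.0 for p in links}
    for _ in range(max_iter):
        new = {
            p: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if p in out)
            for p in links
        }
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new
        pr = new
    raise RuntimeError("no convergence within max_iter passes")


alone = limiting_values({"A": []})                                  # Page A on its own
with_b = limiting_values({"A": ["B"], "B": ["A"]})                  # A <-> B
with_b_c = limiting_values({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B, A <-> C

print(alone["A"])     # about 0.15
print(with_b["A"])    # about 1.0
print(with_b_c["A"])  # about 1.4595
```

In the last case Pages B and C each settle at about 0.77, so the total across the three pages is still 3: Page A's gain comes from redistribution, not from PageRank being created.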
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is: how and when can or will Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entries in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations – would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation… but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
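To make the sharing concrete, here is a minimal sketch. It assumes the share passed through each link is d * PR / (number of links on the page), as in the PageRank formula used throughout this paper. The function name and the link counts (5 and 40) are hypothetical, and the values 4 and 6 are treated as raw PageRank values rather than Toolbar readings, whose scale we do not know.

```python
def passed_per_link(page_rank, outbound_links, d=0.85):
    """PageRank handed to each linked page: the page's value shared across its links."""
    return d * page_rank / outbound_links


# Hypothetical pages: a PR 4 page with 5 links and a PR 6 page with 40 links
from_pr4_page = passed_per_link(4, 5)
from_pr6_page = passed_per_link(6, 40)

print(from_pr4_page, from_pr6_page)
print(from_pr4_page > from_pr6_page)  # True
```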
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No – neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up in its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                      Without the Review Pages   With the Review Pages
The Home Page PR is   0.9536152797               2.439718935
B, C, D PR is (each)  0.4201909959               0.8412536982
Total is              2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most important (if not the most important) internal factors in improving your PageRank. This boost may not be as large as that from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
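A short sketch shows both effects: the fixed total and the shifted distribution. The three four-page link structures below are assumptions, since the diagrams are not reproduced here: a home page A linking to and from B, C and D (hierarchical), a single ring (looping), and every page linking to every other page (extensive interlinking). The function name is our own.

```python
def limiting_values(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {p: 1.0 for p in links}
    for _ in range(max_iter):
        new = {
            p: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if p in out)
            for p in links
        }
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new
        pr = new
    raise RuntimeError("no convergence within max_iter passes")


pages = ["A", "B", "C", "D"]
structures = {
    # Assumed shapes of the three closed four-page sites
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {p: [q for q in pages if q != p] for p in pages},
}

results = {name: limiting_values(links) for name, links in structures.items()}
for name, values in results.items():
    total = sum(values.values())
    print(name, {p: round(v, 4) for p, v in values.items()}, "total:", round(total, 4))
```

Under these assumptions each total comes out at 4, while only the hierarchical structure concentrates PageRank on Page A (about 1.92, against about 0.69 for each of the other pages).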
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound or outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to use to try to boost PageRank is link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not doing so. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation coefficient at a fixed amount. We can vary this coefficient for each page, depending upon the final value of the previous computation, using the formula:
$(1 - d)\,PR_i(A)$
That is, we are replacing the (1 - d) of the previous formula with the above. $PR_i(A)$ simply means the PR of Page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
40
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
This gives a very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
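
The comparison between the two tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-page structure (A links to B, C and D, each of which links back to A) is inferred from the first rows of the tables rather than given in the text, and the names `LINKS`, `step` and `iterate`, as well as the stopping tolerance, are hypothetical choices.

```python
# Assumed site structure (inferred from the tables, not stated in the text):
# A links to B, C and D; B, C and D each link back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def step(pr, accelerated):
    """One synchronous pass over all pages."""
    new = {}
    for page in LINKS:
        # Share of PageRank passed on by every page that links to `page`.
        incoming = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        # Classic: constant (1 - d). Accelerated: (1 - d) times the page's current value.
        base = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = base + D * incoming
    return new


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Start every page at 1.0 and stop when no value moves by more than `tol`."""
    pr = {page: 1.0 for page in LINKS}
    for count in range(1, max_iter + 1):
        nxt = step(pr, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in LINKS) < tol:
            return nxt, count
        pr = nxt
    return pr, max_iter


classic, classic_iters = iterate(accelerated=False)
fast, fast_iters = iterate(accelerated=True)
print(classic_iters, {p: round(v, 6) for p, v in classic.items()})
print(fast_iters, {p: round(v, 6) for p, v in fast.items()})
```

The exact iteration counts depend on the tolerance chosen, so they need not match 148 and 67 exactly, but the recalculated co-efficient should need roughly half as many passes. Note also that the two schemes settle on slightly different values (A near 1.92 in the first table, 2.0 in the second).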
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors, like "quantity of text", fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
$$(0.4n + 0.6) \times 0.15$$
as long as we set the other pages to
$$0.6 \times 0.15$$
i.e. $(0.4n + 0.6) + 0.6(n - 1) = n$, where $n$ is the number of pages in the site, so the co-efficients still add up to $0.15n$.
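
The arithmetic behind this redistribution can be checked with a short sketch. The function name `coefficients` and the `boost` parameter are hypothetical; `boost` generalises the 0.4 used above, on the assumption that any split of the form (b·n + (1 − b)) for the favoured page and (1 − b) for the rest keeps the total unchanged.

```python
D = 0.85  # damping factor, so the usual co-efficient is 1 - D = 0.15


def coefficients(n, favoured, boost=0.4):
    """Per-page normalization co-efficients that favour one page.

    The favoured page gets (boost * n + (1 - boost)) * (1 - D); every other
    page gets (1 - boost) * (1 - D). The total stays at n * (1 - D).
    """
    base = (1 - boost) * (1 - D)
    coeffs = [base] * n
    coeffs[favoured] = (boost * n + (1 - boost)) * (1 - D)
    return coeffs


# Four pages A, B, C, D with Page B (index 1) favoured.
coeffs = coefficients(4, favoured=1)
print([round(c, 4) for c in coeffs], round(sum(coeffs), 4))
```

For a four-page site this gives roughly 0.33 for Page B and 0.09 for each of the others, which still totals 0.6, the same as four pages at 0.15 each.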
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
PageRank not only considers links; it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
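
A rough sketch of this effect follows. It assumes the same four-page structure as the convergence tables (A links to B, C and D, which each link back to A); that structure is an assumption made for illustration, and the name `pagerank` and the fixed number of passes are hypothetical choices, not the spreadsheet's method.

```python
# Assumed structure: A links to B, C and D; B, C and D each link back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def pagerank(coeff, steps=300):
    """Iterate PageRank with a separate normalization co-efficient per page."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(steps):
        # The comprehension reads the old `pr`, so all pages update together.
        pr = {
            page: coeff[page]
            + D * sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
    return pr


uniform = pagerank({page: 0.15 for page in LINKS})
directory = pagerank({"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})
for page in LINKS:
    print(page, round(uniform[page], 6), round(directory[page], 6))
```

Under these assumptions Page B falls (from about 0.694 to about 0.608) while every other page rises, and the total stays at 4.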
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce that example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$).
What does that mean? Let us explain the designations:
$T_1, T_2, \ldots, T_n$ correspond to Web site pages ($n$ is the number of pages).
Components $a_{km}$, which lie at the intersection of the $k$-th row and the $m$-th column of the matrix of links, give the number of links off $T_k$ to $T_m$.
$C_r(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
       T1     T2     ...    Tn     Number of links off the page
T1     a11    a12    ...    a1n    Cr(T1)
T2     a21    a22    ...    a2n    Cr(T2)
Tn     an1    an2    ...    ann    Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{C_r(T_k)}$$
where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \ldots, n$).
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \ldots, n$).
$C_r(T_k)$ is the total number of links off the page $T_k$.
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$).
$d$ is a damping factor (usually 0.85).
$(1 - d)$ is a normalization coefficient.
$n$ is the number of pages on the site.
This recursive computational scheme allows us to compute the estimating values of a Web site's pages while taking into account links of any complexity between them, including several links from one page to another.
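
As a minimal sketch, one pass of this scheme can be written directly from the matrix of links. The three-page multigraph below is hypothetical (page 0 links twice to page 1 and once to page 2, and both link back to page 0), and skipping pages with no outgoing links is an assumption, since the text does not say how such pages are handled.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration of the scheme: a[k][m] is the number of links from page k to page m."""
    n = len(a)
    # Cr(Tk): total number of links off each page (the row sums).
    out_links = [sum(row) for row in a]
    return [
        (1 - d)
        + d * sum(a[k][m] * pr[k] / out_links[k] for k in range(n) if out_links[k])
        for m in range(n)
    ]


# Hypothetical multigraph: the double link 0 -> 1 is recorded as a[0][1] = 2.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
pr = pagerank_step(a, [1.0, 1.0, 1.0])
print([round(v, 6) for v in pr])
```

Because page 0 sends two of its three links to page 1, page 1 receives twice the share that page 2 does; this is the only place where the multigraph differs from an ordinary graph.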
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The letters A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on iteration $i$. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration $i+1$) can then be calculated by the formula $(1 - d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
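
The recognition of external pages can be sketched as follows, using the link counts from the matrix of links for this nine-page site. The name `iterate` and the fixed number of passes are hypothetical choices; the same function also runs the classic scheme with the constant coefficient (1 - d) for comparison.

```python
# a[k][m]: number of links from page k to page m, in the order A..I,
# copied from the matrix of links for the nine-page example site.
PAGES = "ABCDEFGHI"
A_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85  # damping factor


def iterate(accelerated, steps=200):
    """Run `steps` synchronous passes, starting from 1.0 on every page."""
    n = len(A_MATRIX)
    out_links = [sum(row) for row in A_MATRIX]
    pr = [1.0] * n
    for _ in range(steps):
        pr = [
            # Accelerated: (1 - d) * PR_i(T); classic: constant (1 - d).
            ((1 - D) * pr[m] if accelerated else (1 - D))
            + D * sum(A_MATRIX[k][m] * pr[k] / out_links[k] for k in range(n))
            for m in range(n)
        ]
    return dict(zip(PAGES, pr))


accelerated = iterate(accelerated=True)
classic = iterate(accelerated=False)
external = [page for page in PAGES if accelerated[page] < 1e-9]
print("external pages:", external)
print({page: round(value, 6) for page, value in classic.items()})
```

With the accelerated scheme the values of E to I shrink towards zero while the total of 9 ends up on the base pages, so a simple threshold is enough to pick out the external pages; the threshold of 1e-9 is an arbitrary illustrative choice.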
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are the values in the last row of the table.
Dividing the estimating values by their total amount, which does not change during the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities.
Experiment outcome   A1   A2   ...   An
Probabilities        p1   p2   ...   pn
$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows  1         1         1         1         1         1         1         1         1         1
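
The two steps above (normalising to probabilities, then taking the individual terms of Shannon's formula) can be reproduced with a short sketch. The starting figures are the limiting values from the classic computation; the variable names are illustrative only.

```python
import math

# Limiting estimating values from the classic computation (d = 0.85).
limits = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441, "H": 0.179318, "I": 0.179318,
}

total = sum(limits.values())
# Probability value of each page: its share of the total amount.
probs = {page: value / total for page, value in limits.items()}
# One term of Shannon's formula per page, with logarithms to base 2.
terms = {page: -p * math.log2(p) for page, p in probs.items()}
entropy = sum(terms.values())

for page in limits:
    print(f"{page}: P = {probs[page]:.6f}, -P*log2(P) = {terms[page]:.6f}")
print(f"Amount = {entropy:.3f}")
```

The per-page terms correspond to the PR row of the table, and their sum, which is the value of H for this site, corresponds to the Amount column.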
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is applied, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^+$:
$$P_{n+1} = Z^+(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$$ (where $P$ is the estimating value)
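
To see what repeated encouragement does, the operator can be applied a few times in a row. This is a minimal sketch: the starting value 0.2, the ten repetitions and the name `reward` are hypothetical, and the closed form follows from rewriting the operator as 1 - P(n+1) = d (1 - P(n)).

```python
def reward(p, d=0.85):
    """Apply the operator Z+ once: P -> d * P + (1 - d)."""
    return d * p + (1 - d)


p0 = 0.2  # hypothetical starting estimating value
p = p0
for _ in range(10):
    p = reward(p)

# Each application shrinks the gap to 1 by the factor d,
# so after n applications P = 1 - d**n * (1 - P0).
closed_form = 1 - 0.85 ** 10 * (1 - p0)
print(round(p, 6), round(closed_form, 6))
```

Each application moves the estimating value a fixed fraction of the way towards 1, which is the same linear form as the (1 - d) + d(...) structure of the PageRank equation.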
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank Bringing order to the webhttphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Enginehttpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Googlersquos) by Mark Horrellhttpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and we provide the following comments and links as thanks to the people mentioned. If you'd like to comment on this document, please email chris@supportforums.org.
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
As suspected, Page C is the most important. If we take a quick look at these raw values, we can see something about the number of links pointing out from a page. Look at Page A, which has a link from a high PageRank page (Page C) that has only one outbound link. Then look at Pages B and D: both share links from a high PageRank page (Page A) with three outbound links. The number of links significantly alters the way PageRank is distributed.
A word on convergence
Convergence is an important mathematical aspect of PageRank which allows Google to provide unprecedented search quality at comparatively low cost. This is a semi-complex topic, but it is important to your understanding of how and why PageRank works. We've tried to make it as simple as possible, but unless you're Sergey Brin or Larry Page you'll still need to concentrate.
Whilst it will take some concentration, it isn't that hard to understand.
We've shown that the outlet values (final values) of one stage of the calculation become the inlet values (starting values) of the next stage, and that we keep on doing this (it's known as a recursive procedure). But the really big question is: how and when does the recursive procedure stop?
The answer is "convergence". Provided the dampening factor (d in our equation) is less than one, convergence will occur. Nominally we set it to 0.85 (because that's the value mentioned in the Stanford papers).
This convergence basically means that, whatever values we start at, after running the calculation a number of times we will end up with the same final values, and these values will no longer change if we do further iterations of the calculation. These final values are known as limiting values.
Once the limiting values have been reached, Google no longer needs to expend processing power on calculating the PageRank. They can finish there.
This is easier to understand with an example. Let's take a look at the following structure:
PageRank for pages A, B, C, D at various stages of iteration
Iteration  A             B             C             D
Start      1             1             1             1
1          1.0000000000  0.5750000000  2.2750000000  0.1500000000
2          2.0837500000  0.5750000000  1.1912500000  0.1500000000
3          1.1625625000  1.0355937500  1.6518437500  0.1500000000
...
19         1.4900124031  0.7833688246  1.5766187723  0.1500000000
20         1.4901259564  0.7832552713  1.5766187723  0.1500000000
21         1.4901259564  0.7833035315  1.5765705121  0.1500000000
...
46         1.4901074052  0.7832956473  1.5765969475  0.1500000000
47         1.4901074054  0.7832956472  1.5765969474  0.1500000000
48         1.4901074053  0.7832956473  1.5765969474  0.1500000000
49         1.4901074053  0.7832956473  1.5765969474  0.1500000000
50         1.4901074053  0.7832956473  1.5765969474  0.1500000000
51         1.4901074053  0.7832956473  1.5765969474  0.1500000000
52         1.4901074053  0.7832956473  1.5765969474  0.1500000000
53         1.4901074053  0.7832956473  1.5765969474  0.1500000000
54         1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how far we calculate past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculating. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
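
The stopping rule can be sketched in Python as follows. Two things here are our own assumptions rather than part of the original worked example: the link structure (A links to B and C, B links to C, C links to A, D links to C) is inferred from the tabulated values because the diagram is not reproduced here, and the rule compares each iteration with the one immediately before it. The function and variable names are hypothetical.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until values settle.

    links maps each page to the list of pages it links out to.
    Returns the final values and the number of iterations used.
    """
    pr = {page: 1.0 for page in links}  # every page starts at 1
    for iteration in range(1, max_iter + 1):
        new_pr = {}
        for page in links:
            # Each page q linking here passes on PR(q) divided by its link count.
            inbound = sum(pr[q] / len(out) for q, out in links.items() if page in out)
            new_pr[page] = (1 - d) + d * inbound
        biggest_change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if biggest_change < tol:  # nothing moved by 0.0000000001 or more
            return pr, iteration
    return pr, max_iter

# Assumed structure, inferred from the table of iterations.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks, iterations = pagerank(links)
for page in sorted(ranks):
    print(page, round(ranks[page], 10))
print("iterations:", iterations)
```

Loosening `tol` (say to 1e-4) stops the loop much earlier at the price of less exact values, which is the trade-off described above.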
Derived specifics of how Google calculates PageRank
We started our values from zero, and we can now see that the calculations are fairly complicated and resource intensive. So two questions arise. First, when Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Carrying on from the previous values costs about the same as starting afresh, so it would appear logical that Google does not reset the values with each calculation; they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into independent chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run the iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was widely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how it is sometimes good to link to outside pages. Suppose we have a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view the very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
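
The three figures quoted for Page A (0.15, 1 and 1.4594594595) can be reproduced with a short Python sketch. It assumes the same iterative calculation used throughout this paper with d = 0.85; the function and variable names are our own and purely illustrative.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new_pr = {}
        for page in links:
            inbound = sum(pr[q] / len(out) for q, out in links.items() if page in out)
            new_pr[page] = (1 - d) + d * inbound
        biggest_change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if biggest_change < tol:
            break
    return pr

alone = pagerank({"A": []})                                    # no links at all
one_loop = pagerank({"A": ["B"], "B": ["A"]})                  # A <-> B
two_loops = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B and A <-> C

print(round(alone["A"], 10))      # Page A on its own
print(round(one_loop["A"], 10))   # after exchanging links with B
print(round(two_loops["A"], 10))  # after also exchanging links with C
```

In the last case Pages B and C each settle below 1, which shows where Page A's extra PageRank comes from: the total across the three pages is still 3.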
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank Feedback that occurs.
Influencing the results
The question that arises from all this is how and when Google can or will influence the results of the PageRank calculation. In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entries in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations. Would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because it means the PageRank must be set to zero at the end of the process. And having a PageRank of zero does not in itself stop a page from voting exactly as it would have done in the first place.
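
The reason a zero set before the calculations does not stick is convergence: the limiting values do not depend on the starting values. A small Python sketch of this, using a hypothetical four-page structure and hypothetical names of our own (it is not the content of Excel Example 8):

```python
def pagerank(links, start, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) from the given start values."""
    pr = dict(start)
    for _ in range(max_iter):
        new_pr = {}
        for page in links:
            inbound = sum(pr[q] / len(out) for q, out in links.items() if page in out)
            new_pr[page] = (1 - d) + d * inbound
        biggest_change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if biggest_change < tol:
            break
    return pr

# Hypothetical structure: A -> B, A -> C, B -> C, C -> A, D -> C.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

normal = pagerank(links, {page: 1.0 for page in links})
zeroed_first = pagerank(links, {"A": 1.0, "B": 1.0, "C": 0.0, "D": 1.0})

for page in sorted(links):
    print(page, round(normal[page], 8), round(zeroed_first[page], 8))
```

Both runs print the same figures for every page, including Page C, so a zero applied before the iterations is simply washed out.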
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
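
The sharing-out can be made concrete with a little arithmetic. In the sketch below, the numbers 4 and 6 are treated as raw PageRank values purely for illustration (Toolbar values are not raw values), the link counts are invented, and the function name is our own.

```python
def pagerank_passed_per_link(page_rank, links_on_page, d=0.85):
    """Share of a page's PageRank carried by each of its outbound links.

    This is the d * PR(T) / C(T) term of the PageRank calculation.
    """
    return d * page_rank / links_on_page

# Hypothetical pages: a lower-ranked page with few links,
# and a higher-ranked page with many links.
from_small_page = pagerank_passed_per_link(4.0, 10)
from_busy_page = pagerank_passed_per_link(6.0, 40)

print(from_small_page)
print(from_busy_page)
```

Under these assumed figures the link from the lower-ranked page passes on more than twice as much as the link from the higher-ranked one.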
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be to write reviews of the sites we link out to on a separate page of our site, and to provide a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them. It is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                      Without the Review Pages   With the Review Pages
Home page PR          0.9536152797               2.439718935
B, C, D PR (each)     0.4201909959               0.8412536982
Total                 2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most important internal factors (if not the most important) in improving your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank Feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final values for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
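
The constant total and the shifting distribution can be checked with a short Python sketch. The three four-page structures below are our own assumptions, since the diagrams are not reproduced here: a home page exchanging links with three sub-pages (hierarchical), a single ring of links (looping), and every page linking to every other page (extensive interlinking). All names are hypothetical.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new_pr = {}
        for page in links:
            inbound = sum(pr[q] / len(out) for q, out in links.items() if page in out)
            new_pr[page] = (1 - d) + d * inbound
        biggest_change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if biggest_change < tol:
            break
    return pr

structures = {
    # Assumed shapes for a four-page closed site.
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {"A": ["B", "C", "D"], "B": ["A", "C", "D"],
                  "C": ["A", "B", "D"], "D": ["A", "B", "C"]},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, ranks in results.items():
    rounded = {page: round(value, 4) for page, value in ranks.items()}
    print(name, rounded, "total:", round(sum(ranks.values()), 4))
```

Under these assumptions the looping and extensively interlinked sites leave every page at 1, while the hierarchical site moves PageRank towards Page A without changing the total of 4.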
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact that it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim not to need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not doing so. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation coefficient at a fixed amount. We can vary this coefficient for each page depending upon the final value of the previous computation, using the formula:
$(1 - d)\,PR_i(A)$
That is, we are replacing the (1 - d) of the previous formula with the above. $PR_i(A)$ simply means the PR of Page A deduced by the previous iteration.
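
One possible reading of this substitution can be sketched in Python. The sketch is our own interpretation, not a confirmed description of anything Google does: it assumes a four-page hierarchical structure in which Page A exchanges links with Pages B, C and D, and the function name and the `vary_coefficient` flag are hypothetical.

```python
def iterate(links, d=0.85, vary_coefficient=False, tol=1e-10, max_iter=1000):
    """Run the PageRank iteration with a fixed or a per-page coefficient.

    Fixed:  PR_next(p) = (1 - d)            + d * sum(PR(q) / C(q))
    Varied: PR_next(p) = (1 - d) * PR_i(p)  + d * sum(PR(q) / C(q))
    Returns the final values and the number of iterations used.
    """
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iter + 1):
        new_pr = {}
        for page in links:
            inbound = sum(pr[q] / len(out) for q, out in links.items() if page in out)
            coefficient = (1 - d) * pr[page] if vary_coefficient else (1 - d)
            new_pr[page] = coefficient + d * inbound
        biggest_change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if biggest_change < tol:
            return pr, iteration
    return pr, max_iter

# Assumed hierarchical structure: A <-> B, A <-> C, A <-> D.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

fixed, fixed_iterations = iterate(links)
varied, varied_iterations = iterate(links, vary_coefficient=True)

print("fixed: ", round(fixed["A"], 6), round(fixed["B"], 6), fixed_iterations)
print("varied:", round(varied["A"], 6), round(varied["B"], 6), varied_iterations)
```

Under these assumptions the varied coefficient needs fewer than half as many passes to settle, and it settles close to, but not exactly on, the values produced by the fixed coefficient (about 2 and 0.667 instead of about 1.919 and 0.694). That is the sense in which it gives an approximation rather than the same final values.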
Let's do this by example. Assume we have the hierarchical structure mentioned earlier:
With this very simple structure we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
38
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 15100000000 08300000000 08300000000 08300000000 400000000003 23430000000 05523333333 05523333333 05523333333 400000000004 17599000000 07467000000 07467000000 07467000000 400000000005 21680700000 06106433333 06106433333 06106433333 400000000006 18823510000 07058830000 07058830000 07058830000 400000000007 20823543000 06392152333 06392152333 06392152333 400000000008 19423519900 06858826700 06858826700 06858826700 400000000009 20403536070 06532154643 06532154643 06532154643 40000000000
10 19717524751 06760825083 06760825083 06760825083 4000000000011 20197732674 06600755775 06600755775 06600755775 4000000000012 19861587128 06712804291 06712804291 06712804291 40000000000
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
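The comparison between the two tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-page link structure (A links to B, C and D, and each of those links back to A) is inferred from the first rows of the tables rather than stated here, and the names `LINKS`, `step` and `iterate`, as well as the stopping rule, are hypothetical choices made for the sketch.

```python
# Sketch: classic vs. "accelerated" iteration on an assumed four-page site.
D = 0.85  # damping factor used throughout the document

# Assumed link structure: A links to B, C, D; each of B, C, D links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(pr, accelerated):
    """One synchronous update of every page from the previous iteration's values."""
    new = {}
    for page in LINKS:
        # The accelerated variant scales (1 - d) by the page's previous value.
        coeff = (1 - D) * pr[page] if accelerated else (1 - D)
        inbound = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        new[page] = coeff + D * inbound
    return new


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Iterate from 1.0 until no value changes by tol or more; return (values, count)."""
    pr = {page: 1.0 for page in LINKS}
    for count in range(1, max_iter + 1):
        nxt = step(pr, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in LINKS) < tol:
            return nxt, count
        pr = nxt
    return pr, max_iter


classic, classic_iters = iterate(accelerated=False)
fast, fast_iters = iterate(accelerated=True)
print(classic_iters, fast_iters)
print(round(classic["A"], 6), round(fast["A"], 6))
```

The exact iteration counts depend on the stopping rule chosen, so they need not match the 148 and 67 quoted above, but the accelerated variant should need noticeably fewer passes. Note that the two variants settle on slightly different values for Page A, which is why the text speaks of an approximation.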
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [0.4n + 0.6 + 0.6(n - 1) = n, where n is the number of pages in the site], so the co-efficients still add up to 0.15n.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
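The effect of uneven co-efficients can be reproduced with a small sketch. It assumes, as an inference rather than something stated in this section, that the example site is the four-page structure in which A links to B, C and D and each of those links back to A; `LINKS`, `pagerank`, `uniform` and `favoured` are hypothetical names, and the printed numbers are not claimed to match the spreadsheet.

```python
D = 0.85
# Assumed structure (not confirmed by the text): A <-> B, A <-> C, A <-> D.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank(coeffs, iterations=500):
    """Iterate PR(p) = coeffs[p] + d * sum(PR(q) / outlinks(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        pr = {
            page: coeffs[page]
            + D * sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}            # the usual 0.15 for every page
favoured = {page: 0.6 * (1 - D) for page in LINKS}   # 0.6 x 0.15 for most pages
favoured["B"] = (0.4 * n + 0.6) * (1 - D)            # (0.4n + 0.6) x 0.15 for Page B

baseline = pagerank(uniform)
boosted = pagerank(favoured)
for page in LINKS:
    print(page, round(baseline[page], 4), round(boosted[page], 4))
```

Because the co-efficients still sum to 0.15n, the total PageRank of the site is unchanged; only its distribution shifts towards Page B.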
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
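The direction of the shift can be checked by hand. The working below is a sketch under an assumption: that the site is the four-page structure in which A links to B, C and D and each of those links back to A (inferred from the earlier tables, not stated here). With d = 0.85, Page B's co-efficient at 0.06 and the others at 0.18, the limiting values satisfy:

```latex
\begin{aligned}
PR(A) &= 0.18 + 0.85\,\bigl(PR(B) + PR(C) + PR(D)\bigr) \\
PR(B) &= 0.06 + \tfrac{0.85}{3}\,PR(A), \qquad
PR(C) = PR(D) = 0.18 + \tfrac{0.85}{3}\,PR(A) \\
PR(A) &= 0.18 + 0.85\,\bigl(0.42 + 0.85\,PR(A)\bigr) = 0.537 + 0.7225\,PR(A) \\
PR(A) &= \frac{0.537}{0.2775} \approx 1.9351, \qquad
PR(B) \approx 0.6083, \qquad PR(C) = PR(D) \approx 0.7283
\end{aligned}
```

With the uniform co-efficient of 0.15, the same assumed structure gives PR(A) ≈ 1.9189 and PR(B) = PR(C) = PR(D) ≈ 0.6937, so under these assumptions Page B drops while the pages it is connected to gain, and the total stays at 4.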
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages, and forms the matrix of links, which contains all the information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T1, T2, …, Tn correspond to the Web site pages (n is the number of pages).
The components a_km, which lie at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)
       T1      T2      …     Tn      Number of links off the page
T1     a_11    a_12    …     a_1n    Cr(T1)
T2     a_21    a_22    …     a_2n    Cr(T2)
Tn     a_n1    a_n2    …     a_nn    Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, m = 1…n),
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, k = 1…n),
$Cr(T_k)$ is the total number of links off the page $T_k$,
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1…n), (m = 1…n),
d is a dampening factor (usually 0.85),
(1 - d) is a normalization coefficient,
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the potentially complex pattern of links between its pages.
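The recursive scheme above translates almost directly into code. The following is a minimal sketch rather than anything Google is known to run: the function names, the convergence tolerance, the choice to skip pages with no outgoing links, and the three-page multigraph `A_EXAMPLE` are all assumptions made for illustration.

```python
def pagerank_step(a, pr, d=0.85):
    """Apply the update once: a[k][m] is the number of links from page k to page m."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        # Pages with no outgoing links (cr[k] == 0) contribute nothing (assumption).
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


def pagerank(a, d=0.85, tol=1e-10, max_iter=10_000):
    """Start every page at 1.0 and iterate until no value changes by tol or more."""
    pr = [1.0] * len(a)
    for _ in range(max_iter):
        nxt = pagerank_step(a, pr, d)
        if max(abs(x - y) for x, y in zip(nxt, pr)) < tol:
            return nxt
        pr = nxt
    return pr


# Hypothetical three-page multigraph: page 0 links twice to page 1 and once to page 2;
# pages 1 and 2 each link back to page 0.
A_EXAMPLE = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
result = pagerank(A_EXAMPLE)
print([round(v, 4) for v in result])
```

In this toy multigraph the doubled link means page 1 receives twice the share of page 0's value that page 2 does, which is exactly what the a_km / Cr(T_k) weighting expresses.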
Take a look at the following example. In general terms, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on the (i+1)-th iteration) can then be calculated by the formula $(1 - d) \cdot PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
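The classification of base and external pages can be reproduced from the matrix of links (2). The sketch below is illustrative only: the variable names, the fixed count of 200 iterations and the 1e-6 threshold used to decide that a value has gone to zero are assumptions, not part of the original method.

```python
D = 0.85
PAGES = "ABCDEFGHI"
# Rows of the matrix of links (2): row k lists the number of links from page k to each page.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
CR = [sum(row) for row in MATRIX]  # total number of links off each page


def accelerated_step(pr):
    """One pass in which each page's normalization coefficient is (1 - d) * PR_i(T)."""
    n = len(pr)
    return [
        (1 - D) * pr[m] + D * sum(MATRIX[k][m] * pr[k] / CR[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(200):  # generous fixed count; the table above settles well before this
    pr = accelerated_step(pr)

external = [p for p, v in zip(PAGES, pr) if v < 1e-6]   # limiting value is zero
base = [p for p, v in zip(PAGES, pr) if v >= 1e-6]
print("base:", base, "external:", external)
```

The total of nine is preserved at every step, so all of the value that drains out of the external pages ends up on the base pages.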
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which stays unchanged throughout the iteration process, we obtain a set of probability values (the sum of all the probability values is equal to one):
      A         B         C         D         E         F         G         H         I         Amount
P     0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:
Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1 1 1 1 1 1 1 1 1 1
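The two derived rows can be recomputed from the limiting values of the classic computation. This is a small sketch; the list `LIMITS` copies the final row of the classic table with its decimal points restored, and the variable names are hypothetical.

```python
import math

PAGES = "ABCDEFGHI"
# Limiting values from the classic computation table (final row).
LIMITS = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(LIMITS)                          # stays at nine throughout the iteration
probs = [v / total for v in LIMITS]          # probability values, summing to one
terms = [-p * math.log2(p) for p in probs]   # each page's -p * log2(p) term
entropy = sum(terms)                         # Shannon's H, in bits

for page, p, t in zip(PAGES, probs, terms):
    print(page, round(p, 6), round(t, 6))
print("H =", round(entropy, 3))
```

Each printed term corresponds to one entry of the PR row, and their sum corresponds to the Amount column.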
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
(P is the estimating value.)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
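To see what repeated encouragement does, the operator Z+ can be unrolled. The short derivation below is a worked sketch using only the formula above; the starting value P_0 is an arbitrary symbol introduced for illustration.

```latex
\begin{aligned}
P_{n+1} - 1 &= d\,P_n + (1 - d) - 1 = d\,(P_n - 1) \\
P_n - 1 &= d^{\,n}\,(P_0 - 1) \\
P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1 \quad \text{as } n \to \infty, \text{ since } 0 < d < 1
\end{aligned}
```

So each application closes a fixed fraction (1 - d) of the remaining gap to 1, the same geometric behaviour, governed by d, that controls how quickly the PageRank iterations settle.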
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, author of Search Engine Marketing: The essential best practice guide
PageRank for pages A, B, C, D at various stages of iteration
Iteration   A             B             C             D
Start       1             1             1             1
1           1.0000000000  0.5750000000  2.2750000000  0.1500000000
2           2.0837500000  0.5750000000  1.1912500000  0.1500000000
3           1.1625625000  1.0355937500  1.6518437500  0.1500000000
…
19          1.4900124031  0.7833688246  1.5766187723  0.1500000000
20          1.4901259564  0.7832552713  1.5766187723  0.1500000000
21          1.4901259564  0.7833035315  1.5765705121  0.1500000000
…
46          1.4901074052  0.7832956473  1.5765969475  0.1500000000
47          1.4901074054  0.7832956472  1.5765969474  0.1500000000
48          1.4901074053  0.7832956473  1.5765969474  0.1500000000
49          1.4901074053  0.7832956473  1.5765969474  0.1500000000
50          1.4901074053  0.7832956473  1.5765969474  0.1500000000
51          1.4901074053  0.7832956473  1.5765969474  0.1500000000
52          1.4901074053  0.7832956473  1.5765969474  0.1500000000
53          1.4901074053  0.7832956473  1.5765969474  0.1500000000
54          1.4901074053  0.7832956473  1.5765969474  0.1500000000
No matter how much we try to calculate the value past the 48th iteration, the values do not change: they have converged at the limiting values.
In practice there's no need to wait until they do not change at all before we stop calculations. We simply wait until they do not change very much. In our example above we have done that to 10 decimal places, i.e. we have calculated until they do not change by 0.0000000001 or more (or, mathematically, $|PR_{48}(A) - PR_{50}(A)| < 10^{-10}$).
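The stopping rule can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet behind the table: the link structure (A links to B and C, B links to C, C links to A, D links to C) is inferred from the first-iteration values shown above, and the names `pagerank`, `LINKS` and `tol` are our own.

```python
def pagerank(links, d=0.85, tol=1e-10, start=1.0, max_iter=1000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q linking to p.

    Stops as soon as no page's value moves by `tol` or more in one iteration.
    Returns the final values and the number of iterations used.
    """
    pr = {page: start for page in links}
    for iteration in range(1, max_iter + 1):
        new_pr = {}
        for page in links:
            # Each linking page q shares its PageRank equally among its C(q) links.
            votes = sum(pr[q] / len(out) for q, out in links.items() if page in out)
            new_pr[page] = (1 - d) + d * votes
        biggest_change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if biggest_change < tol:
            return pr, iteration
    raise RuntimeError("PageRank did not converge")


# Assumed structure, inferred from the table (the original figure is not shown here).
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

values, iterations = pagerank(LINKS)
for page, value in values.items():
    print(f"{page}: {value:.10f}")
print("iterations:", iterations)
```

Loosening `tol` (for example to 1e-4) stops the loop much earlier at the cost of fewer reliable decimal places, which is the trade-off described above.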
Derived specifics of how Google calculates PageRank
We started our values from an arbitrary starting point and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
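The experiment can be tried out with a short Python sketch. Note the assumptions: the modified structure `NEW` below is hypothetical (it is not the structure from the figure, which is not reproduced here), `OLD` is inferred from the earlier iteration table, and `iterate` is our own helper. The iteration counts it prints will therefore differ from the 75 and 78 quoted above; the point is only that both starting points end at the same values.

```python
def iterate(links, start, d=0.85, tol=1e-10, max_iter=10_000):
    """Run the PageRank iteration from the given starting values.

    Returns (final values, iterations needed for every change to fall below tol).
    """
    pr = dict(start)
    for n in range(1, max_iter + 1):
        new_pr = {
            page: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if page in out)
            for page in links
        }
        if max(abs(new_pr[p] - pr[p]) for p in links) < tol:
            return new_pr, n
        pr = new_pr
    raise RuntimeError("PageRank did not converge")


# Original structure, inferred from the earlier iteration table.
OLD = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
# Hypothetical modification: a new Page E is added and D's link now goes to E.
NEW = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["E"], "E": ["C"]}

old_values, _ = iterate(OLD, {page: 1.0 for page in OLD})

# Cold start: every page begins at 1.
cold, cold_iters = iterate(NEW, {page: 1.0 for page in NEW})
# Warm start: carry on from last time's values; only the new page begins at 1.
warm, warm_iters = iterate(NEW, {**old_values, "E": 1.0})

print("cold start iterations:", cold_iters)
print("warm start iterations:", warm_iters)
```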
So do they calculate across several machines or one? Obviously there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes:
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes:
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes:
A huge 146 iterations to converge. [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run their iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider a single page, Page A. Its PageRank by definition will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
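The three figures quoted for Page A can be checked with a small Python sketch. It assumes the usual formula from earlier in this paper with d = 0.85; the `pagerank` helper and the way each structure is written down as a dictionary are ours.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new_pr = {
            page: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if page in out)
            for page in links
        }
        if max(abs(new_pr[p] - pr[p]) for p in links) < tol:
            return new_pr
        pr = new_pr
    raise RuntimeError("PageRank did not converge")


# Page A on its own: nothing links to it.
alone = pagerank({"A": []})
# Page A exchanges links with Page B.
one_partner = pagerank({"A": ["B"], "B": ["A"]})
# Page A exchanges links with both Page B and Page C.
two_partners = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})

for label, result in [("alone", alone), ("A<->B", one_partner), ("A<->B, A<->C", two_partners)]:
    print(f"{label}: PR(A) = {result['A']:.10f}")
```

In the last case Pages B and C each end up below 1, which is where Page A's extra PageRank comes from: the total is redistributed rather than created.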
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this set-up, the final PageRanks for each page are all 1. [Excel Example 5] As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505. [Excel Example 7] This suggests that it is wiser to reciprocally link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is how and when Google can or will influence the results of the PageRank calculation. In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors, multiplied by your PageRank score of zero, will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far we would think not. [Excel Example 8] And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
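The difference between the two penalties can be sketched in Python. This is our own illustration and says nothing about how Google actually implements penalties: the four-page structure is inferred from the earlier iteration table, Page B is chosen arbitrarily as the penalised page, and `pagerank` is a hypothetical helper.

```python
def pagerank(links, start, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate the PageRank formula from the given starting values until it settles."""
    pr = dict(start)
    for _ in range(max_iter):
        new_pr = {
            page: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if page in out)
            for page in links
        }
        if max(abs(new_pr[p] - pr[p]) for p in links) < tol:
            return new_pr
        pr = new_pr
    raise RuntimeError("PageRank did not converge")


# Assumed structure, inferred from the earlier iteration table.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ones = {page: 1.0 for page in LINKS}

baseline = pagerank(LINKS, ones)

# Setting B to zero BEFORE the calculation: the iteration simply recovers.
zero_first = pagerank(LINKS, {**ones, "B": 0.0})

# Second penalty: remove B's link entries so that its vote no longer counts.
no_votes = pagerank({**LINKS, "B": []}, ones)

# PR0 itself: overwrite B's value AFTER the calculation has finished.
displayed = {**no_votes, "B": 0.0}

print("B after zeroing first:", round(zero_first["B"], 10), "(baseline", round(baseline["B"], 10), ")")
print("C with B's vote removed:", round(no_votes["C"], 10), "(baseline", round(baseline["C"], 10), ")")
print("B as displayed:", displayed["B"])
```

Zeroing the starting value leaves every final figure unchanged, whereas removing B's link lowers the PageRank of the page it used to vote for.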
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have pointing to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
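As a rough illustration of the sharing effect, the sketch below computes the d * PR(T) / C(T) term that a single link passes on. It treats the figures 4 and 6 as if they were actual PageRank values, which is a simplifying assumption (Toolbar values are not the raw values), and the link counts of 5 and 20 are made up for the example. The function name `vote_per_link` is ours.

```python
def vote_per_link(page_rank, links_on_page, d=0.85):
    """PageRank passed along one link: the d * PR(T) / C(T) term of the formula."""
    if links_on_page < 1:
        raise ValueError("a page must carry at least one link to pass on PageRank")
    return d * page_rank / links_on_page


# Hypothetical figures: a lower-ranked page with few links
# versus a higher-ranked page with many links.
sparse = vote_per_link(page_rank=4.0, links_on_page=5)
crowded = vote_per_link(page_rank=6.0, links_on_page=20)

print(f"link from the PR 4 page with 5 links:  {sparse:.3f}")
print(f"link from the PR 6 page with 20 links: {crowded:.3f}")
```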
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a>
<a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them. It is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D then:
                     Without the Review Pages   With the Review Pages
Home Page PR         0.9536152797               2.439718935
B, C, D PR (each)    0.4201909959               0.8412536982
Total                2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
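A short Python sketch makes the comparison concrete. The structures below are assumptions, since the diagrams are not reproduced here: a home page A linking to and from three children for the hierarchy, a single ring for looping, and every page linking to every other page for extensive interlinking. The `pagerank` helper is ours.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) from a start of 1 per page."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new_pr = {
            page: (1 - d) + d * sum(pr[q] / len(out) for q, out in links.items() if page in out)
            for page in links
        }
        if max(abs(new_pr[p] - pr[p]) for p in links) < tol:
            return new_pr
        pr = new_pr
    raise RuntimeError("PageRank did not converge")


PAGES = "ABCD"
# Assumed four-page versions of the three structures.
STRUCTURES = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {p: [q for q in PAGES if q != p] for p in PAGES},
}

results = {name: pagerank(links) for name, links in STRUCTURES.items()}
for name, pr in results.items():
    per_page = ", ".join(f"{page}={value:.4f}" for page, value in pr.items())
    print(f"{name}: {per_page} (total {sum(pr.values()):.4f})")
```

Under these assumptions only the hierarchy moves PageRank towards one page (the home page); the other two leave every page at 1, while the total stays at 4 in all three cases.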
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1 - d) \cdot PR_i(A)$
That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of Page A deduced by the previous iteration.
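One literal reading of this substitution is PR_{i+1}(A) = (1 - d) * PR_i(A) + d * (sum of PR_i(T) / C(T) over the pages T linking to A). The Python sketch below compares that reading with the fixed (1 - d) version. It is an assumption about how the replacement is applied, not a confirmed description of the authors' spreadsheet or of Google's method, and the four-page structure (a home page A exchanging links with B, C and D) and the names `iterate` and `scaled` are ours.

```python
def iterate(links, scaled, d=0.85, tol=1e-10, max_iter=10_000):
    """Run the iteration until no value changes by tol or more.

    scaled=False uses the fixed term (1 - d); scaled=True replaces it with
    (1 - d) * PR_i(page), i.e. the page's own value from the previous iteration.
    Returns (final values, iterations used).
    """
    pr = {page: 1.0 for page in links}
    for n in range(1, max_iter + 1):
        new_pr = {}
        for page in links:
            votes = sum(pr[q] / len(out) for q, out in links.items() if page in out)
            base = (1 - d) * pr[page] if scaled else (1 - d)
            new_pr[page] = base + d * votes
        if max(abs(new_pr[p] - pr[p]) for p in links) < tol:
            return new_pr, n
        pr = new_pr
    raise RuntimeError("iteration did not converge")


# Assumed structure: home page A exchanges links with B, C and D.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

standard, standard_iters = iterate(LINKS, scaled=False)
approx, approx_iters = iterate(LINKS, scaled=True)

print(f"fixed (1 - d):       A={standard['A']:.6f}  B={standard['B']:.6f}  iterations={standard_iters}")
print(f"(1 - d) * PR_i(A):   A={approx['A']:.6f}  B={approx['B']:.6f}  iterations={approx_iters}")
```

On this structure the scaled version settles in roughly half the iterations, but on slightly different values (A close to 2 rather than 1.919), which fits the description of it as an approximation.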
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
38
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
39
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization coefficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
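The comparison can be reproduced with a short script. The sketch below is a minimal illustration and not Google's implementation: the four-page link structure (A links to B, C and D, each of which links back to A) is an assumption inferred from the first rows of the table above, and the function names and the convergence tolerance are our own choices.

```python
D = 0.85  # damping factor used throughout this paper

# Assumed structure: A links to B, C and D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(pr, links, accelerated):
    """Run one iteration and return the new PageRank estimates."""
    new = {}
    for page in links:
        # Share of PageRank passed on by every page that links to `page`.
        inbound = sum(pr[src] / len(out)
                      for src, out in links.items() if page in out)
        # Classic scheme: the coefficient is the constant (1 - d).
        # Accelerated scheme: (1 - d) times the page's previous estimate.
        base = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = base + D * inbound
    return new


def run(links, accelerated, tol=1e-10, max_iter=10_000):
    """Iterate from a starting value of 1 until no page moves by more than tol."""
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iter + 1):
        nxt = step(pr, links, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in links) < tol:
            return nxt, iteration
        pr = nxt
    return pr, max_iter


classic, classic_iterations = run(LINKS, accelerated=False)
fast, fast_iterations = run(LINKS, accelerated=True)
print(classic_iterations, classic)
print(fast_iterations, fast)
```

With this stopping rule the accelerated run needs far fewer iterations than the classic one, although the exact counts depend on the tolerance chosen. Note that the two schemes settle on slightly different values for Page A (about 2.0 against about 1.919), which is why the result is described as an approximation.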
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors, like "quantity of text", fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization coefficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

$$(0.4n + 0.6) \times 0.15$$

as long as we set the other pages to

$$0.6 \times 0.15$$

i.e. $0.4n + 0.6 + 0.6(n - 1) = n$, where n is the number of pages in the site, so the coefficients still add up to $0.15n$.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the coefficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
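The effect of uneven coefficients can be sketched in a few lines. This is an illustration under stated assumptions: the site structure below (a home page A linking to B, C and D, each linking back to A) is hypothetical and is not necessarily the structure behind the Excel example, and the function name `pagerank` is our own.

```python
D = 0.85

# Hypothetical four-page site (an assumption, not the paper's exact structure):
# A links to B, C and D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank(links, coefficients, d=D, tol=1e-10, max_iter=10_000):
    """Classic iteration, but with a separate normalization coefficient per page."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        nxt = {
            page: coefficients[page] + d * sum(
                pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
        if max(abs(nxt[p] - pr[p]) for p in links) < tol:
            return nxt
        pr = nxt
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}            # 0.15 for every page
favoured = {page: 0.6 * (1 - D) for page in LINKS}   # 0.6 x 0.15 for the rest
favoured["B"] = (0.4 * n + 0.6) * (1 - D)            # (0.4n + 0.6) x 0.15 for B

normal = pagerank(LINKS, uniform)
boosted = pagerank(LINKS, favoured)
print(normal)
print(boosted)
```

Because both sets of coefficients add up to the same total, the sum of PageRank over the site is unchanged; only its distribution shifts towards Page B.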
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization coefficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links, $a_{km}$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$).
What does that mean? Let us explain the designations:
$T_1, T_2, \ldots, T_n$ correspond to the Web site pages (n is the number of pages).
The components $a_{km}$, which are at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    ...   Tn    Number of links off the page
T1    a11   a12   ...   a1n   Cr(T1)
T2    a21   a22   ...   a2n   Cr(T2)
...
Tn    an1   an2   ...   ann   Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

Where:
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, $m = 1, \ldots, n$)
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, $k = 1, \ldots, n$)
$Cr(T_k)$ is the total number of links off the page $T_k$
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$)
$d$ is a damping factor (usually 0.85)
$(1 - d)$ is a normalization coefficient
$n$ is the number of pages on the site
This recursive computational scheme lets us compute the estimating values for the pages of a Web site while taking into account links of any complexity between the pages, including several links from one page to another.
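One step of the scheme can be written directly from the matrix of links. The sketch below is only an illustration: the function name `pagerank_step` and the three-page multigraph are hypothetical, chosen so that one page links twice to the same target.

```python
def pagerank_step(a, pr, d=0.85):
    """Apply the recursive scheme once.

    a[k][m] is the number of links off page k to page m (the matrix of links);
    pr[k] is the estimating value of page k on iteration I.
    Returns the estimating values on iteration I + 1.
    """
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk): total number of links off page k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k]
                          for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]


# Hypothetical three-page multigraph: page 0 links twice to page 1 and once to
# page 2; pages 1 and 2 each link once back to page 0.
links = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
first = pagerank_step(links, [1.0, 1.0, 1.0])
print(first)
```

Page 1 receives two thirds of what page 0 passes on and page 2 one third, which is exactly the weighting by $a_{km} / Cr(T_k)$ in the equation.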
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on the (i+1)-th iteration) can be calculated by the formula $(1 - d)\, PR_i(T)$.
When the accelerated technology is applied to compute the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the pages that are external in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
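The recognition of external pages can be sketched as follows. The matrix is the matrix of links (2) given earlier; the function name `accelerated_step`, the fixed count of 200 iterations and the threshold of 1e-6 used to treat a value as zero are our own assumptions, chosen only for illustration.

```python
D = 0.85
PAGES = "ABCDEFGHI"

# Matrix of links (2): row = linking page, column = linked-to page.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(a, pr, d=D):
    """One iteration with the coefficient (1 - d) * PR_i(T) for each page T."""
    n = len(a)
    cr = [sum(row) for row in a]  # every page here has at least one link
    return [
        (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(200):  # far more iterations than needed for this small site
    pr = accelerated_step(MATRIX, pr)

limits = dict(zip(PAGES, pr))
external = [page for page, value in limits.items() if value < 1e-6]
print(limits)
print("external pages:", external)
```

The pages whose limiting value falls below the threshold are reported as external; the remaining pages are the base pages, and between them they hold the whole amount of 9.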
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which does not change during the iteration process, gives us a set of probability values (the sum of all the probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we'd like to use the famous classical Shannon's formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the set probabilities:
Experiment outcome: A1, A2, ..., An
Probabilities: p1, p2, ..., pn

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We receive the following set of values, one term $-p \log_2 p$ per page.
This is the set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
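The two tables above can be recomputed from the limiting values of the classic computation. The following is a minimal sketch; the variable names are ours, and the input figures are the rounded six-decimal values from the final row of the classic table, so the last digit of a result may differ slightly.

```python
import math

# Limiting values from the classic computation (pages A to I).
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

# Divide each estimating value by the total amount to get probabilities.
total = sum(limits)
probabilities = [value / total for value in limits]

# One term of Shannon's formula per page: -p * log2(p).
terms = [-p * math.log2(p) for p in probabilities]
amount = sum(terms)

for page, p, term in zip("ABCDEFGHI", probabilities, terms):
    print(f"{page}: P = {p:.6f}, PR = {term:.6f}")
print(f"Amount: {amount:.3f}")
```

The sum of the terms is the Shannon entropy of the distribution, which is the "Amount" shown in the PR row.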
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

(P is the estimating value)
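Because the operator is linear, its repeated application has a simple closed form. The short derivation below is our own addition, not part of Bush and Mosteller's presentation: subtracting 1 from both sides shows that the distance from 1 shrinks by a factor d with every encouragement.

```latex
P_{n+1} - 1 = d\,(P_n - 1)
\quad\Longrightarrow\quad
P_n = 1 - d^{\,n}\,(1 - P_0)
\quad\Longrightarrow\quad
\lim_{n \to \infty} P_n = 1 \qquad (0 < d < 1)
```

With d = 0.85 and a starting value of 0, for example, the first two values are 0.15 and 0.2775.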
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Derived specifics of how Google calculates PageRank
We started our values from zero, and now see that the calculations are fairly complicated and resource intensive. So two questions arise. When Google calculates PageRank themselves, do they start each month with an arbitrary value? And secondly, do they calculate PageRank across several machines? We can answer these questions with a couple of simple experiments. Assume that, having calculated the PageRanks in the structure shown earlier, we would like to add a page and change a link to make the structure look like this:
If we start each page from an arbitrary starting value of one, then how many iterations of the calculation does it take to achieve convergence? [Excel Example 2]
75
Now, if we start each page from the values we ended up with when we calculated the PageRank of the original structure (and set Page E's starting value to 1), then how many iterations does it take to achieve convergence? [Excel Example 3]
78
These numbers are really very close. Therefore it would appear logical that Google does not reset the values with each calculation, as they may simply carry on from where they left off.
So do they calculate across several machines or one? Obviously, there is too much work for one machine when you get to billions of pages, so several machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes
1 calculation to converge at A = 1, C = 1
Then for Pages B, D and E it takes
3 calculations to converge at B = 0.15, D = 0.3954375000, E = 0.3954375000
If we then do calculations for all pages using these as starting values, it takes
A huge 146 iterations to converge. [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, each machine must also run the iterations in sync with the others. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback: how linking sometimes helps
When the new term "PageRank Feedback" was presented in the original version of this paper, it was widely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how it is sometimes good to link to outside pages. Consider that we have a single page, Page A. Its PageRank, by definition, will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view the very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
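The three figures quoted above can be checked with a few lines of code. This is a minimal sketch of the classic iteration with d = 0.85; the function name `pagerank` and the stopping tolerance are our own choices.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Classic iteration: PR(page) = (1 - d) + d * sum(PR(src) / links off src)."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        nxt = {
            page: (1 - d) + d * sum(
                pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
        if max(abs(nxt[p] - pr[p]) for p in links) < tol:
            return nxt
        pr = nxt
    return pr


alone = pagerank({"A": []})                                   # Page A on its own
one_partner = pagerank({"A": ["B"], "B": ["A"]})              # A <-> B
two_partners = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})  # A <-> B, A <-> C

print(alone["A"], one_partner["A"], two_partners["A"])
```

Each extra page that links back raises Page A's value, even though the average PageRank per page in the enlarged system stays at 1.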
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1. [Excel Example 5] As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks: [Excel Example 6]
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505. [Excel Example 7] This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is how and when Google can or will influence the results of the PageRank calculation. In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entries in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations; would this work? From what we know so far, we would think not. [Excel Example 8] And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
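To see why zeroing a page before the calculation has no lasting effect, here is a small sketch. It assumes the same iterative formula, PR(p) = (1 - d) + d * sum(PR(q)/C(q)), with d = 0.85; the three-page system and the helper name `iterate` are hypothetical and are not taken from Excel Example 8.

```python
def iterate(links, start, d=0.85, tol=1e-12, max_iter=10_000):
    """Run the PageRank iteration from the given starting values."""
    pr = dict(start)
    for _ in range(max_iter):
        new = {
            p: (1 - d) + d * sum(pr[q] / len(links[q]) for q in links if p in links[q])
            for p in links
        }
        done = max(abs(new[p] - pr[p]) for p in links) < tol
        pr = new
        if done:
            break
    return pr


# Hypothetical closed system: A links to B and C, and both link back to A.
links = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

normal = iterate(links, {"A": 1.0, "B": 1.0, "C": 1.0})
zeroed = iterate(links, {"A": 0.0, "B": 1.0, "C": 1.0})  # A set to zero before the run

for page in links:
    print(page, round(normal[page], 10), round(zeroed[page], 10))
```

In this sketch both runs settle on the same values: the starting figure is washed out by the iterations, which is why a zero would have to be imposed after the calculation rather than before it.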
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and on which page of your site you place their link. (Maximising PageRank Feedback and minimising PageRank leakage.)
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank. (Whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on.) However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now, there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
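The sharing-out can be put into numbers with a tiny sketch. It assumes that each link on a page passes on d * PR / C, where C is the number of links on that page and d = 0.85. The link counts (10 and 50) are hypothetical, and plugging the figures 4 and 6 straight into the formula is itself an assumption, since we do not know how Toolbar values map onto real PageRank.

```python
def link_share(page_rank, links_on_page, d=0.85):
    """PageRank passed through ONE link on a page carrying `links_on_page` links."""
    return d * page_rank / links_on_page


# Hypothetical comparison: a PR 4 page with 10 links vs. a PR 6 page with 50 links.
from_pr4_page = link_share(4, 10)
from_pr6_page = link_share(6, 50)

print(from_pr4_page, from_pr6_page)
print("PR 4 page passes more per link:", from_pr4_page > from_pr6_page)
```

With these made-up link counts the PR 4 page hands over roughly three times as much per link as the PR 6 page, which is the effect described above.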
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites, which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now, or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up in its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values: [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                      Without the Review Pages   With the Review Pages
The Home Page PR is   0.9536152797               2.439718935
B, C, D PR is         0.4201909959               0.8412536982
Total is              2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors for improving your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier, and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus, the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
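The three closed structures can be compared side by side with a short sketch. It assumes four pages per structure, d = 0.85 and a starting value of 1 per page. The exact link layouts are our reading of the three structure names (a home page linked to and from three sub-pages, a simple ring, and every page linking to every other), so treat them as assumptions rather than as the contents of Excel Examples 11 to 13.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until the values settle."""
    pr = {p: 1.0 for p in links}
    for _ in range(max_iter):
        new = {
            p: (1 - d) + d * sum(pr[q] / len(links[q]) for q in links if p in links[q])
            for p in links
        }
        done = max(abs(new[p] - pr[p]) for p in links) < tol
        pr = new
        if done:
            break
    return pr


structures = {
    # Home page A links to B, C and D; each of them links back to A only.
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    # A -> B -> C -> D -> A.
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    # Every page links to every other page.
    "interlinked": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

results = {name: pagerank(links) for name, links in structures.items()}
totals = {name: sum(pr.values()) for name, pr in results.items()}

for name, pr in results.items():
    print(name, {p: round(v, 4) for p, v in pr.items()}, "total:", round(totals[name], 4))
```

Under these assumptions every structure totals 4, but only the hierarchical layout concentrates PageRank on one page (A at about 1.92, the others at about 0.69 each), while the looping and interlinked layouts leave every page at 1.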
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank is spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non-mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably the most misunderstood search engine concept in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
(1 - d) * PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
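The fixed co-efficient run can be reproduced with a short sketch. It assumes the hierarchical structure is a home page A linked to and from three sub-pages B, C and D, with d = 0.85 and every page starting at 1. The stopping rule (largest change below 1e-10) and the name `iterate_fixed` are our own choices, so the iteration count it reports need not match the spreadsheet exactly.

```python
def iterate_fixed(links, d=0.85, tol=1e-10, max_iter=10_000):
    """Fixed co-efficient update: PR(p) = (1 - d) + d * sum(PR(q) / C(q)).

    Returns the final values and the number of iterations performed.
    """
    pr = {p: 1.0 for p in links}
    for n in range(1, max_iter + 1):
        new = {
            p: (1 - d) + d * sum(pr[q] / len(links[q]) for q in links if p in links[q])
            for p in links
        }
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:
            return pr, n
    return pr, max_iter


# Assumed hierarchical structure: A <-> B, A <-> C, A <-> D.
hierarchy = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

values, iterations = iterate_fixed(hierarchy)
print({p: round(v, 10) for p, v in values.items()}, "after", iterations, "iterations")
```

With this stopping rule the values settle at roughly A = 1.9189 and B = C = D = 0.6937, and the run needs well over a hundred iterations, in line with the table.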
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 15100000000 08300000000 08300000000 08300000000 400000000003 23430000000 05523333333 05523333333 05523333333 400000000004 17599000000 07467000000 07467000000 07467000000 400000000005 21680700000 06106433333 06106433333 06106433333 400000000006 18823510000 07058830000 07058830000 07058830000 400000000007 20823543000 06392152333 06392152333 06392152333 400000000008 19423519900 06858826700 06858826700 06858826700 400000000009 20403536070 06532154643 06532154643 06532154643 40000000000
10 19717524751 06760825083 06760825083 06760825083 4000000000011 20197732674 06600755775 06600755775 06600755775 4000000000012 19861587128 06712804291 06712804291 06712804291 40000000000
41
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
This gives a very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization coefficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [(0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site, so the total stays at n × 0.15]
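The bookkeeping behind these coefficients can be checked with a few lines of Python. This is only a sketch: the function name `favoured_coefficients` is made up for illustration, and the four-page site is an assumption (it matches the 0.06 / 0.18 split used later for the directory example).

```python
def favoured_coefficients(n, favoured_index, d=0.85):
    """Per-page normalization coefficients that favour a single page.

    The favoured page gets (0.4n + 0.6) * (1 - d); every other page
    gets 0.6 * (1 - d), so the total stays at n * (1 - d).
    """
    base = 1.0 - d  # 0.15 in the text
    coeffs = [0.6 * base] * n
    coeffs[favoured_index] = (0.4 * n + 0.6) * base
    return coeffs


# Assumed four-page site with Page B (index 1) favoured.
coeffs = favoured_coefficients(n=4, favoured_index=1)
total = sum(coeffs)
print(coeffs)  # Page B gets about 0.33, the others about 0.09
print(total)   # about 0.6, the same as 4 * 0.15
```

Because the total is unchanged, the sum of PageRank over the site is unchanged too; only its distribution moves towards Page B.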
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be
Using our new values for the coefficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization coefficient to 0.06 and the rest to 0.18. We end up with
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
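To see the effect of per-page coefficients in numbers, here is a minimal sketch. The link structure is an assumption made for illustration (a home page A that links to B, C and D, each of which links back); it is not taken from the Excel example, and `pagerank` is a hypothetical helper.

```python
def pagerank(links, coeffs, d=0.85, iterations=200):
    """Iterate PR(m) = coeff[m] + d * sum(PR(k) / outlinks(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            inbound = sum(pr[src] / len(targets)
                          for src, targets in links.items() if page in targets)
            new_pr[page] = coeffs[page] + d * inbound
        pr = new_pr
    return pr


# Hypothetical site: home page A links to B, C and D, and each links back to A.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

uniform = pagerank(links, {page: 0.15 for page in links})
directory = pagerank(links, {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for page in links:
    print(page, round(uniform[page], 6), round(directory[page], 6))
```

Under these assumptions Page B drops from roughly 0.694 to roughly 0.608 while every other page gains a little, and the site total stays at 4.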
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$, (k = 1...n), (m = 1...n).
What does that mean? Let us explain the designations:
$T_1, T_2, \ldots, T_n$ correspond to Web site pages (n is the number of pages).
Components $a_{km}$, which are on the intersection of the k-th row and the m-th column of the matrix of links, determine the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
        T1      T2      ...     Tn      Number of links off the page
T1      a_11    a_12    ...     a_1n    Cr(T1)
T2      a_21    a_22    ...     a_2n    Cr(T2)
...     ...     ...     ...     ...     ...
Tn      a_n1    a_n2    ...     a_nn    Cr(Tn)
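As a rough illustration of how such a matrix could be assembled from a crawl, the sketch below counts links between pages and sums each row to get Cr. The function name `link_matrix` and the three-page link list are hypothetical.

```python
def link_matrix(pages, links):
    """Build a[k][m] = number of links from pages[k] to pages[m], plus Cr for each page."""
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    a = [[0] * n for _ in range(n)]
    for source, target in links:
        a[index[source]][index[target]] += 1  # a multigraph may repeat a link
    cr = [sum(row) for row in a]              # total number of links off each page
    return a, cr


# Hypothetical crawl result: T1 links to T2 twice and to T3 once, T2 links to T1.
pages = ["T1", "T2", "T3"]
links = [("T1", "T2"), ("T1", "T2"), ("T1", "T3"), ("T2", "T1")]
a, cr = link_matrix(pages, links)
print(a)   # [[0, 2, 1], [1, 0, 0], [0, 0, 0]]
print(cr)  # [3, 1, 0]
```

Counting repeated links rather than just recording their presence is what makes this a multigraph rather than a simple graph.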
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

Where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, m = 1...n)
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, k = 1...n)
$Cr(T_k)$ is the total number of links off the page $T_k$
$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1...n), (m = 1...n)
d is a dampening factor (usually 0.85)
(1-d) is a normalization coefficient
n is the number of pages on the site
This recursive computational scheme lets us compute the estimating values of a Web site's pages, however complex the links between those pages may be.
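A single step of the scheme above can be written directly from the equation. This is a minimal sketch; the function name `iterate` and the three-page matrix are illustrative assumptions, and pages with no outgoing links are simply skipped in the sum.

```python
def iterate(a, pr, d=0.85):
    """One step: PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total links off each page
    new_pr = []
    for m in range(n):
        inbound = sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        new_pr.append((1 - d) + d * inbound)
    return new_pr


# Hypothetical site: T1 links to T2 and T3, and both link back to T1.
a = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
first = iterate(a, [1.0, 1.0, 1.0])
print(first)  # approximately [1.85, 0.575, 0.575]
```

Calling `iterate` repeatedly on its own output carries the computation forward one iteration at a time until the values stop changing.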
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on the (i+1)-th iteration) can be calculated by the formula $(1-d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
0 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
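The accelerated computation can be reproduced with a short Python sketch built on the matrix of links (2). The only change from the classic scheme is that the constant (1 - d) is replaced by (1 - d) times the page's own previous value; the names `A_MATRIX` and `accelerated_step` and the 1e-6 threshold for "zero" are choices made for this illustration.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): row k, column m holds the number of links from page k to page m.
A_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(a, pr, d=0.85):
    """PR_{i+1}(T_m) = (1 - d) * PR_i(T_m) + d * sum_k a[k][m] * PR_i(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]
    return [(1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)]


first = accelerated_step(A_MATRIX, [1.0] * 9)
pr = [1.0] * 9
for _ in range(100):
    pr = accelerated_step(A_MATRIX, pr)

limits = dict(zip(PAGES, pr))
external = [page for page, value in limits.items() if value < 1e-6]
print({page: round(value, 6) for page, value in limits.items()})
print(external)  # pages whose limiting value is zero
```

Run this way, A, C and D settle at 1.5, B at 4.5, and E to I fall to zero, which is the criterion the definition above uses to mark them as external.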
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
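The classic computation and the division by the total can be sketched as follows. The matrix is the matrix of links (2) for pages A to I; the function name `classic_step` and the choice of 300 iterations are assumptions made for this illustration.

```python
A_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_step(a, pr, d=0.85):
    """PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]
    return [(1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)]


pr = [1.0] * 9
for _ in range(300):
    pr = classic_step(A_MATRIX, pr)

total = sum(pr)                               # stays at 9 throughout
probabilities = [value / total for value in pr]
print([round(value, 6) for value in pr])
print([round(p, 6) for p in probabilities])
```

The total stays at 9 because every page in this site has at least one outgoing link, so nothing leaks out of the system between iterations.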
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities.
Experiment outcome: A1, A2, ..., An
Probabilities: p1, p2, ..., pn

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We receive the following set of values of PageRank for the pages of the site examined (logarithms are calculated to base 2).
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
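Each entry in the PR row is one term, -p log2 p, of Shannon's formula, and the Amount column is their sum. A short sketch using the probability values from the P row reproduces them; the variable names are illustrative.

```python
import math

# Probability values from the P row (pages A to I).
P = {
    "A": 0.154451, "B": 0.427246, "C": 0.154451, "D": 0.154451, "E": 0.019489,
    "F": 0.023237, "G": 0.026827, "H": 0.019924, "I": 0.019924,
}

# One term of Shannon's formula per page, with logarithms to base 2.
terms = {page: -p * math.log2(p) for page, p in P.items()}
entropy = sum(terms.values())

for page, value in terms.items():
    print(page, round(value, 6))
print(round(entropy, 3))
```

Note that the term for B (about 0.524) is larger than the term for A (about 0.416), so the ordering of the pages is preserved for this site even though -p log2 p is not monotonic in general.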
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1-d), \qquad 0 < d < 1$$

(P is the estimating value.)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
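A few lines of Python show how repeated application of the operator behaves. This is only a sketch: the starting value 0.2 and the choice d = 0.85 are assumptions for illustration, and `reward` is a made-up name for Z+.

```python
def reward(p, d=0.85):
    """Apply the operator Z+ once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


p0 = 0.2            # assumed starting estimate
history = [p0]
for _ in range(50):
    history.append(reward(history[-1]))

# Unrolling the recurrence gives the closed form P_n = 1 - d**n * (1 - P_0).
closed_form = 1 - 0.85 ** 50 * (1 - p0)
print(history[1], history[10], history[50])
print(closed_form)
```

Every application moves P a fixed fraction of the remaining distance towards 1, so the sequence rises steadily and approaches 1 without passing it. The same (1 - d) + d(...) shape appears in the PageRank iteration.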
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing Order to the Web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
machines must be used. Logically, how can they do this? Assume Pages A and C are on one machine and Pages B, D and E are on another. If we ignore the links between pages across machines (i.e. A->D and D->C), then for Pages A and C it takes
1 calculation to converge at A=1, C=1
Then for Pages B, D and E it takes
3 calculations to converge at B=0.15, D=0.3954375000, E=0.3954375000
If we then do calculations for all pages using these as starting values, it takes
A huge 146 iterations to converge [Excel Example 4]
The conclusion is that splitting PageRank calculations up into chunks for different machines is not a viable proposition. Therefore they must also track PageRank transfers across links between machines. This means that each machine must notify the other machines as PageRank is given to a page not on that machine. Indeed, the machines must also run their iterations in sync with each other. Even if they don't do it exactly this way, the calculation of PageRank for so many pages is very, very complicated indeed.
PageRank Feedback how linking sometimes helps
When the new term PageRank Feedback was presented in the original version of this paper, it was largely taken up and used, although not always correctly. PageRank Feedback is difficult to understand because it is a concept that implies creation of PageRank. However, it is really an alteration of the distribution of PageRank on a large scale. PageRank Feedback is the principle that explains how sometimes it is good to link to outside pages. Consider that we have a single page, Page A. Its PageRank, by definition, will be 0.15. Now assume it links out to Page B, which in return links back to it. Page A's PageRank now becomes 1. Now assume it also links out to Page C, and Page C links back to it. Page A's PageRank now becomes 1.4594594595.
The very fact that Page A links out to a page that links back to it causes its own PageRank to rise. (This could also occur if Page A links to a page that links to a page that links to Page A, etc.) This has occurred not because we are generating PageRank, but because Page A is taking it from the system as a whole. But if we view this very small system "Page A" (which is a subsystem of all the pages in the index), then we can firmly say that linking out from Page A has generated PageRank in this system; that is, the very act of linking out has fed PageRank back in.
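The three figures quoted above (0.15, 1 and 1.4594594595) can be reproduced with a short sketch of the iterative calculation. The helper `pagerank` and its dictionary-of-links input format are assumptions made for this illustration.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(m) = (1 - d) + d * sum(PR(k) / outlinks(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[src] / len(targets)
                                    for src, targets in links.items() if page in targets)
            for page in links
        }
    return pr


alone = pagerank({"A": []})                                       # Page A on its own
with_b = pagerank({"A": ["B"], "B": ["A"]})                       # A and B link to each other
with_b_and_c = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["A"]})

print(round(alone["A"], 10))         # 0.15
print(round(with_b["A"], 10))        # 1.0
print(round(with_b_and_c["A"], 10))  # 1.4594594595
```

In the last case B and C each end up at about 0.77, so the three pages still total 3: Page A has gained at the expense of the pages it exchanges links with.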
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this set-up, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small; the amount will, of course, vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank through a mechanism known as PageRank Feedback.
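The two set-ups can be checked numerically. The link structure used below is an assumption, since the diagram is not reproduced in the text: A and B link to each other, and C, D and E form a loop (C to D, D to E, E to C). That structure does give the quoted figures, which is why it is used here; `pagerank` is a hypothetical helper.

```python
def pagerank(links, d=0.85, iterations=300):
    """Iterate PR(m) = (1 - d) + d * sum(PR(k) / outlinks(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[src] / len(targets)
                                    for src, targets in links.items() if page in targets)
            for page in links
        }
    return pr


# Assumed structure: your site is A <-> B; the rest of the index is the loop C -> D -> E -> C.
before = pagerank({"A": ["B"], "B": ["A"], "C": ["D"], "D": ["E"], "E": ["C"]})

# The same structure with the extra links A -> C and C -> A.
after = pagerank({"A": ["B", "C"], "B": ["A"], "C": ["D", "A"], "D": ["E"], "E": ["C"]})

site_before = before["A"] + before["B"]
site_after = after["A"] + after["B"]
print(round(site_before, 10))                      # 2.0
print(round(after["A"], 6), round(after["B"], 6))  # about 1.359932 and 0.727971
print(round(site_after, 6))                        # about 2.087903
```

The total over all five pages stays at 5 in both runs, so the extra 0.0879 in your site is taken from pages C, D and E rather than created.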
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is how and when can or will Google influence the results of the PageRank calculation. In this respect, the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations. Would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
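Why zeroing a page before the calculation has no lasting effect can be seen in a small sketch: the iteration converges to the same values whatever the starting values are. The three-page structure below is a hypothetical example, not the one in the Excel spreadsheet, and `pagerank` is a made-up helper.

```python
def pagerank(links, start, d=0.85, iterations=200):
    """Iterate the classic formula from the given starting values."""
    pr = dict(start)
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[src] / len(targets)
                                    for src, targets in links.items() if page in targets)
            for page in links
        }
    return pr


# Hypothetical site: A links to B and C, and both link back to A.
links = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

normal = pagerank(links, {"A": 1.0, "B": 1.0, "C": 1.0})
zeroed = pagerank(links, {"A": 0.0, "B": 1.0, "C": 1.0})  # Page A "penalized" up front

print({page: round(value, 6) for page, value in normal.items()})
print({page: round(value, 6) for page, value in zeroed.items()})
```

Both runs end at the same values, so a zero would have to be written in after the final iteration, and even then the page's outgoing links would already have cast their votes.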
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put into getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank. (Whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on.) However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation... but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now, there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
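The sharing can be put into numbers with the term d × PR / (number of links) from the PageRank formula. The sketch below treats 4 and 6 as actual PageRank values and invents the link counts; both are assumptions for illustration only, since Toolbar values are not the raw numbers used in the calculation.

```python
def passed_per_link(pagerank, outbound_links, d=0.85):
    """PageRank passed along one link: d * PR(page) / number of links on the page."""
    return d * pagerank / outbound_links


# Hypothetical pages: PageRank 4 with 10 links versus PageRank 6 with 50 links.
from_pr4_page = passed_per_link(4.0, 10)
from_pr6_page = passed_per_link(6.0, 50)

print(round(from_pr4_page, 3))  # 0.34
print(round(from_pr6_page, 3))  # 0.102
```

Under these made-up numbers the link from the lower-ranked page passes more than three times as much, simply because it is shared among fewer links.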
Now lets factor in feedback Letrsquos say for example that there are two separatepages on other peoplersquos sites which both have a PageRank of 4 Both of thesehave ten links to other pages But the page on your site that you want them tolink to already has a link to the page on the second site By getting a link fromthe second site yoursquore generating feedback and getting more PageRank than ifyou had gotten a link from the first site Thatrsquos an over simplification in factfeedback loops can get even more complicated Remember the number of linkson the page linking to you will alter the amount of feedback etc
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high-quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values: [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                      Without the Review Pages   With the Review Pages
Home page PR          0.9536152797               2.439718935
B, C, D PR (each)     0.4201909959               0.8412536982
Total                 2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
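The closed-system behaviour can be checked with a short sketch. The three four-page layouts below are assumptions, since the original diagrams are not reproduced here: the hierarchical layout (home page A linking to B, C and D, each linking back) is consistent with the iteration values the paper lists for that structure, while the looping and fully interlinked layouts are illustrative guesses. The function name `pagerank` is also just a label for this sketch.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        nxt = {page: 1.0 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                # Each page shares its previous-iteration PageRank equally among its links.
                nxt[target] += d * pr[source] / len(targets)
        pr = nxt
    return pr


# Assumed four-page layouts (page -> pages it links to).
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {
        "A": ["B", "C", "D"],
        "B": ["A", "C", "D"],
        "C": ["A", "B", "D"],
        "D": ["A", "B", "C"],
    },
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    values = {page: round(value, 4) for page, value in pr.items()}
    print(f"{name:12s} {values} total={sum(pr.values()):.4f}")
```

In this sketch every layout holds a total of 4, but only the hierarchical one concentrates PageRank on a single page.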
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the final value of the previous computation, using the formula:

$$(1-d)\,PR_i(A)$$

That is, we are replacing the $(1-d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
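The two schemes can be compared side by side with a small sketch. It assumes the hierarchical structure is A linking to B, C and D, with each of them linking back to A (this matches the first rows of both tables), and that the accelerated update is PR_{i+1}(A) = (1 - d) × PR_i(A) + d × (the usual sum). The stopping tolerance of 1e-10 and the function name `iterate` are choices made for this sketch, so the iteration counts it prints need not match the spreadsheet's exactly.

```python
def iterate(accelerated, d=0.85, tol=1e-10, max_iter=1000):
    """Return (PR of A, PR of B, iterations used) for the assumed hierarchical site."""
    a, b = 1.0, 1.0  # b stands for B, C and D, which stay equal by symmetry
    for step in range(1, max_iter + 1):
        # The fixed scheme adds (1 - d); the accelerated one scales it by the previous PR.
        base_a = (1 - d) * a if accelerated else (1 - d)
        base_b = (1 - d) * b if accelerated else (1 - d)
        new_a = base_a + d * 3 * b   # B, C and D each pass their whole PR to A
        new_b = base_b + d * a / 3   # A splits its PR over three links
        change = max(abs(new_a - a), abs(new_b - b))
        a, b = new_a, new_b
        if change < tol:
            return a, b, step
    return a, b, max_iter


standard = iterate(accelerated=False)
faster = iterate(accelerated=True)
print("fixed co-efficient:       A=%.10f B=%.10f after %d iterations" % standard)
print("recalculated co-efficient: A=%.10f B=%.10f after %d iterations" % faster)
```

In this sketch the recalculated co-efficient settles on A = 2 and B = C = D = 2/3, as in the second table, rather than on the 1.9189 and 0.6937 of the first, so the speed-up comes with a small shift in the final values.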
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

$$(0.4n + 0.6) \times 0.15$$

as long as we set the other pages to

$$0.6 \times 0.15$$

i.e. the weights still sum to $n$: $(0.4n + 0.6) + 0.6\,(n - 1) = n$, where $n$ is the number of pages in the site.
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
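A rough sketch of this idea follows. It assumes the example site is the four-page hierarchical structure (A linking to B, C and D, each linking back to A), with d = 0.85 and with Page B given the co-efficient (0.4n + 0.6) × 0.15 while the others get 0.6 × 0.15. The function name and the choice of structure are assumptions for illustration, so the figures need not match Excel Example 19.

```python
def pagerank_with_weights(links, base, d=0.85, iterations=200):
    """PageRank where each page has its own normalization co-efficient base[page]."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        nxt = dict(base)  # start every page from its own co-efficient
        for source, targets in links.items():
            for target in targets:
                nxt[target] += d * pr[source] / len(targets)
        pr = nxt
    return pr


# Assumed structure: home page A with three sub-pages that link back to it.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n, d = len(links), 0.85

uniform = {page: 1 - d for page in links}            # the usual (1 - d) everywhere
favoured = {page: 0.6 * (1 - d) for page in links}   # lowered for ordinary pages
favoured["B"] = (0.4 * n + 0.6) * (1 - d)            # raised for the favoured page

plain = pagerank_with_weights(links, uniform)
boosted = pagerank_with_weights(links, favoured)
for page in links:
    print(f"{page}: {plain[page]:.4f} -> {boosted[page]:.4f}")
```

Under these assumptions the co-efficients still add up to n × 0.15, the site total stays at 4, and Page B rises while A, C and D all fall.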
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
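As a quick check on these co-efficients, and assuming the usual d = 0.85, the total is unchanged only for a four-page site, since the lowered and raised values must still add up to n(1 - d):

$$0.06 + 3 \times 0.18 = 0.60 = 4 \times (1 - 0.85)$$

So Page B gives up 0.09 of its normal 0.15, and each of the three other pages gains 0.03.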
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ $(k = 1, \dots, n;\ m = 1, \dots, n)$
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages ($n$ is the number of pages)
Components $a_{km}$, which are on the intersection of the $k$-th row and $m$-th column in the matrix of links, determine the number of links off $T_k$ to $T_m$
$Cr(T_k)$ is the total number of links off $T_k$
DIAGRAM OF THE MATRIX OF LINKS (1)

        T1     T2     ...    Tn     Number of links off the page
T1      a11    a12    ...    a1n    Cr(T1)
T2      a21    a22    ...    a2n    Cr(T2)
...
Tn      an1    an2    ...    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{C_r(T_k)}$$

Where
$PR_{I+1}(T_m)$ is the PageRank of page $T_m$ at iteration $I+1$ (the estimating value of page $T_m$, $m = 1, \dots, n$)
$PR_I(T_k)$ is the PageRank of page $T_k$ at iteration $I$ (the estimating value of page $T_k$, $k = 1, \dots, n$)
$C_r(T_k)$ is the total number of links off page $T_k$
$C_m(T_k) = a_{km}$ is the number of individual links off page $T_k$ to page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$)
$d$ is a dampening factor (usually 0.85)
$(1 - d)$ is a normalization coefficient
$n$ is the number of pages on the site
This recursive computational scheme computes the estimating values for the pages of a Web site while taking into account the possible complexity of the links between them.
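One step of this scheme can be sketched in Python as follows. The function names and the three-page matrix are hypothetical, chosen only to show a multigraph in which one page links to another twice; the formula itself is the one given above.

```python
def link_counts(a):
    """Cr(Tk): total number of links off each page (row sums of the matrix of links)."""
    return [sum(row) for row in a]


def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_{I+1}(Tm) = (1 - d) + d * sum_k a[k][m] * PR_I(Tk) / Cr(Tk)."""
    n = len(a)
    cr = link_counts(a)
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]


# Hypothetical 3-page site: T1 links to T2 twice and to T3 once;
# T2 and T3 each link back to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
pr = pagerank_step(a, [1.0, 1.0, 1.0])
print([round(value, 6) for value in pr])
```

Pages with no outgoing links are simply skipped in this sketch; how they should be treated is a separate design choice that the formula above does not settle.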
Take a look at the following example: the multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, drawn as two subgraphs. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
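The matrix above can be rebuilt from a plain list of outgoing links, which also gives a quick check of the Cr column. This is a minimal sketch; the names `PAGES`, `LINKS`, `matrix` and `cr` are ours, and the link lists are read directly off the rows of the table.

```python
PAGES = "ABCDEFGHI"

# Outgoing links of each page, read off the rows of the matrix of links.
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI", "H": "ABCDEG", "I": "ABCDG",
}

# a[k][m] = 1 if page k links to page m, else 0 (no repeated links in this site).
matrix = [[1 if target in LINKS[source] else 0 for target in PAGES] for source in PAGES]

# Cr(Tk) is the row sum: the total number of links off page Tk.
cr = {source: sum(row) for source, row in zip(PAGES, matrix)}

for source, row in zip(PAGES, matrix):
    print(source, *row, f"Cr({source}) = {cr[source]}")
```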
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technique for computing the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for page T (T = A, B, C, D, E, F, G, H, I) at iteration i. The normalization coefficient for computing $PR_{i+1}(T)$ (at iteration i + 1) is then given by the formula $(1 - d)\,PR_i(T)$.
When the accelerated technique is applied, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technique is used to compute the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
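The accelerated computation can be reproduced with a short Python sketch. The names `PAGES`, `LINKS` and `accelerated_step`, the fixed count of 100 iterations and the 1e-9 threshold for "zero" are our assumptions; the update rule, with normalization term (1 - d) PR_i(T), is the one described above.

```python
PAGES = "ABCDEFGHI"

# Outgoing links of each page, read off the matrix of links for this site.
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI", "H": "ABCDEG", "I": "ABCDG",
}


def accelerated_step(pr, d=0.85):
    # The normalization term is (1 - d) * PR_i(T) instead of the constant (1 - d).
    new = {page: (1 - d) * pr[page] for page in PAGES}
    for source, targets in LINKS.items():
        for target in targets:
            new[target] += d * pr[source] / len(targets)
    return new


pr = {page: 1.0 for page in PAGES}
for _ in range(100):
    pr = accelerated_step(pr)

# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page in PAGES if pr[page] < 1e-9]
print("external pages:", external)
print({page: round(value, 6) for page, value in pr.items()})
```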
Let's calculate the estimating values for the pages of the same Web site with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
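The classic computation differs from the accelerated one only in its normalization term, which is the constant (1 - d). A minimal Python sketch that reproduces the limiting row of the table follows; the names `PAGES`, `LINKS` and `classic_step` and the count of 300 iterations are our own choices.

```python
PAGES = "ABCDEFGHI"

# Outgoing links of each page, read off the matrix of links for this site.
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI", "H": "ABCDEG", "I": "ABCDG",
}


def classic_step(pr, d=0.85):
    # Constant normalization coefficient (1 - d) = 0.15 for every page.
    new = {page: 1 - d for page in PAGES}
    for source, targets in LINKS.items():
        for target in targets:
            new[target] += d * pr[source] / len(targets)
    return new


pr = {page: 1.0 for page in PAGES}
for _ in range(300):
    pr = classic_step(pr)

print({page: round(value, 6) for page, value in pr.items()})
print("Amount:", round(sum(pr.values()), 3))
```

Unlike the accelerated scheme, every page keeps a non-zero value here, because each page receives at least (1 - d) at every iteration.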
The limiting estimating values for the site pages are darkened (the final row of the table).
Dividing the estimating values by their total amount, which does not change during the iteration process, we obtain a set of probability values (the sum of all the probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:

Experiment outcome   A1   A2   ...   An
Probabilities        p1   p2   ...   pn

$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values that corresponds to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):

                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
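Both rows can be recomputed from the limiting values of the classic computation. In the sketch below the variable names are ours and the input values are the limiting estimating values quoted in the table above; each PR entry is one term, -p log2 p, of Shannon's formula.

```python
import math

# Limiting estimating values from the classic computation.
limits = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441, "H": 0.179318, "I": 0.179318,
}

total = sum(limits.values())

# Probability values: each estimating value divided by the total amount.
p = {page: value / total for page, value in limits.items()}

# One term of Shannon's formula per page, with logarithms to base 2.
terms = {page: -prob * math.log2(prob) for page, prob in p.items()}
entropy = sum(terms.values())

for page in limits:
    print(page, round(p[page], 6), round(terms[page], 6))
print("Amount:", round(entropy, 3))
```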
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is applied, an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
(P is the estimating value).
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
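Repeated application of the operator is easy to follow numerically. The sketch below assumes d = 0.85 and a starting value of 0 purely for illustration (the text only requires 0 < d < 1); the name `z_plus` is ours. Each application moves P a fraction (1 - d) of the remaining distance towards 1, so after n steps P_n = 1 - d^n (1 - P_0).

```python
def z_plus(p, d=0.85):
    """Apply the encouraging operator once: Z+(P) = d * P + (1 - d)."""
    return d * p + (1 - d)


p = 0.0            # assumed starting estimate
history = [p]
for _ in range(50):
    p = z_plus(p)
    history.append(p)

print([round(value, 4) for value in history[:6]])
print(round(history[-1], 6))
```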
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing Order to the Web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Let's put this diagrammatically, using a slightly more complicated system.
Pages A to E are all of the pages in Google's index. Pages A and B are your site. When we do the calculations for this setup, the final PageRanks for each page are all 1 [Excel Example 5]. As far as your site is concerned:
Page A's PageRank is 1
Page B's PageRank is 1
The sum of the PageRank of all the pages in your site is 2
Changing the structure to link A to C and C to A gives the following final PageRanks [Excel Example 6]:
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point is easily proved: linking out can sometimes improve your PageRank, through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would rise to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to exchange reciprocal links with larger sites that employ a link structure in which more focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is how and when Google can or will influence the results of the PageRank calculation. In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entries in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations; would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because it means the PageRank must be set to zero at the end of the process. Having a PageRank of zero does not in itself disqualify a page from voting: it votes exactly as it would have done in the first place.
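The reason a zero set before the calculation has no lasting effect is that the iteration converges to the same values whatever the starting values are. A minimal sketch, assuming a hypothetical three-page site (not the one in Excel Example 8) and a function name of our own:

```python
def pagerank(links, start, d=0.85, iterations=200):
    """Classic iteration PR(m) = (1 - d) + d * sum(PR(k) / C(k)) from given start values."""
    pr = dict(start)
    for _ in range(iterations):
        new = {page: 1 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        pr = new
    return pr


# Hypothetical link structure (an assumption made for this illustration).
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

normal = pagerank(links, {"A": 1.0, "B": 1.0, "C": 1.0})
zeroed = pagerank(links, {"A": 1.0, "B": 0.0, "C": 1.0})  # B "penalized" up front

for page in links:
    print(page, round(normal[page], 6), round(zeroed[page], 6))
```

Both runs end at the same values, so page B regains its PageRank and keeps passing it on through its links.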
To stop its voting power, a second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to accompany a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of the ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put into getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high-PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high-PageRank sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower-PageRank sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
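The sharing-out can be put into numbers with a rough sketch. It rests on loud assumptions: the link counts 10 and 50 are invented, the function name is ours, and the values 4 and 6 are treated as if they were raw PageRank figures, which Toolbar values are not known to be. It only illustrates the per-link share d x PR / (number of links) from the PageRank formula.

```python
def pagerank_passed(page_rank, links_on_page, d=0.85):
    """Share of a page's PageRank passed through one of its links."""
    return d * page_rank / links_on_page


# Hypothetical link counts: the "PR 4" page has 10 links, the "PR 6" page has 50.
few_links = pagerank_passed(4, 10)
many_links = pagerank_passed(6, 50)

print(round(few_links, 3), round(many_links, 3))
```

Under these assumptions the link from the lower-ranked page passes on more, simply because it is shared with fewer other links.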
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback, and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high-quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be to write reviews of the sites we link out to on a separate page of our site, and to provide a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a>
<a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9]:
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                      Without the Review Pages    With the Review Pages
The Home Page PR is   0.9536152797                2.439718935
B, C, D PR is         0.4201909959                0.8412536982
Total is              2.2141882674                4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors for improving your PageRank. This boost may not be as large as that from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
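The constant total can be checked with a short Python sketch. The three four-page layouts below are our own assumptions about what each structure looks like (the original diagrams are not reproduced here), and the function name is ours; the iteration is the classic PageRank calculation with d = 0.85.

```python
def pagerank(links, d=0.85, iterations=300):
    """Classic iteration PR(m) = (1 - d) + d * sum(PR(k) / C(k)), starting from 1."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = {page: 1 - d for page in links}
        for source, targets in links.items():
            for target in targets:
                new[target] += d * pr[source] / len(targets)
        pr = new
    return pr


pages = ["Home", "A", "B", "C"]

# Assumed layouts for a closed four-page site.
hierarchical = {"Home": ["A", "B", "C"], "A": ["Home"], "B": ["Home"], "C": ["Home"]}
looping = {"Home": ["A"], "A": ["B"], "B": ["C"], "C": ["Home"]}
interlinked = {page: [other for other in pages if other != page] for page in pages}

results = {
    "Hierarchical": pagerank(hierarchical),
    "Looping": pagerank(looping),
    "Extensive Interlinking": pagerank(interlinked),
}

for name, pr in results.items():
    rounded = {page: round(value, 4) for page, value in pr.items()}
    print(name, rounded, "total:", round(sum(pr.values()), 4))
```

In this sketch all three totals come out at 4, but only the hierarchical layout concentrates PageRank on the home page; the other two leave every page at 1.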
In general we can state this as the following rule
It is rare that a site will want to have an entirely linking or an entirely extensivelyinterlinked structure By using elements of a hierarchical structure a Webmasteris able to shift PageRank to the pages of his or her choice If PageRank wasspread out evenly across all pages of the site then the Webmaster is very closeto obtaining maximum benefit
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
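The three totals for the non-closed systems can be reproduced with a short sketch. The diagrams are not reproduced here, so the configuration below is our assumption: one external page (called X here) with no inbound links of its own links to page A, and page C carries one extra link out to an external page (called Y here) that links nowhere. The names X, Y, `pagerank` and `site_total` are ours. With d = 0.85, this configuration gives totals that agree with the figures quoted above to at least four decimal places.

```python
def pagerank(links, d=0.85, iterations=200):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q that link to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d)
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


def site_total(internal_links):
    """Total PageRank of the site's own pages once one inbound and one outbound link exist."""
    links = {page: list(targets) for page, targets in internal_links.items()}
    links["X"] = ["A"]      # assumed external page linking in to A
    links["C"].append("Y")  # assumed outbound link from page C
    links["Y"] = []         # Y links nowhere, so it passes nothing back
    pr = pagerank(links)
    return sum(pr[page] for page in internal_links)


structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

totals = {name: site_total(site) for name, site in structures.items()}
for name, total in totals.items():
    print(name, round(total, 4))
```

The ordering matches the rule stated above: extensive interlinking retains the most PageRank, then the hierarchical structure, then the loop.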
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
\[ (1-d)\,PR_i(A) \]
That is, we are replacing the (1-d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
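The two computations can be compared side by side with a short sketch. It assumes the hierarchical structure is A linking to B, C and D, each of which links back to A, and it stops when no value changes by more than 1e-10 between iterations. That stopping rule is our assumption, so the iteration counts it reports will not necessarily be exactly 148 and 67. The accelerated variant simply uses (1-d) PR_i(page) in place of the fixed (1-d); the function name `iterate` is ours.

```python
def iterate(links, accelerated, d=0.85, tol=1e-10, max_iter=1000):
    """Return (final values, number of iterations) for the classic or accelerated scheme."""
    pr = {page: 1.0 for page in links}
    for step in range(1, max_iter + 1):
        new_pr = {}
        for page in links:
            inbound = sum(pr[q] / len(links[q]) for q in links if page in links[q])
            # Classic: fixed coefficient (1 - d). Accelerated: (1 - d) * PR_i(page).
            coefficient = (1 - d) * pr[page] if accelerated else (1 - d)
            new_pr[page] = coefficient + d * inbound
        if max(abs(new_pr[p] - pr[p]) for p in links) < tol:
            return new_pr, step
        pr = new_pr
    return pr, max_iter


hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic, classic_steps = iterate(hierarchical, accelerated=False)
accelerated, accelerated_steps = iterate(hierarchical, accelerated=True)

print("classic:    ", classic_steps, {p: round(v, 6) for p, v in classic.items()})
print("accelerated:", accelerated_steps, {p: round(v, 6) for p, v in accelerated.items()})
```

As in the two tables, the accelerated run settles in roughly half as many iterations, and it settles on slightly different values (A at 2.0 rather than about 1.9189), which is why it is described as an approximation.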
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
\[ (0.4n + 0.6) \times 0.15 \]
as long as we set the other pages to
\[ 0.6 \times 0.15 \]
i.e. [0.4n + 0.6 + 0.6(n-1) = n, where n is the number of pages in the site]
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
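The effect of uneven normalization co-efficients can be sketched in code. The example assumes the "previous example site structure" is the hierarchical one (A links to B, C and D, which each link back to A) and uses the co-efficients described above with n = 4 and d = 0.85, that is 0.33 for Page B and 0.09 for every other page. The function name `pagerank_weighted` and the variable names are ours.

```python
def pagerank_weighted(links, coefficients, d=0.85, iterations=200):
    """PageRank where each page has its own normalization co-efficient instead of (1 - d)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}  # assumed layout
n = len(site)

uniform = {page: 0.15 for page in site}        # the usual (1 - d) for every page
favoured = {page: 0.6 * 0.15 for page in site}  # 0.09 for the ordinary pages
favoured["B"] = (0.4 * n + 0.6) * 0.15          # 0.33 for the favoured page B

uniform_pr = pagerank_weighted(site, uniform)
favoured_pr = pagerank_weighted(site, favoured)

print("uniform: ", {p: round(v, 4) for p, v in uniform_pr.items()})
print("favoured:", {p: round(v, 4) for p, v in favoured_pr.items()})
```

Because the co-efficients still add up to n × 0.15, the site total stays at 4 in both runs; only the share held by Page B changes.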
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
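As a worked check of the directory example, the final values can be solved by hand. This assumes Page B sits in the hierarchical structure used earlier (A links to B, C and D, each of which links back to A; the original figure is not reproduced here, so this layout is our assumption) and that d = 0.85:

```latex
\begin{aligned}
PR(A) &= 0.18 + 0.85\,\bigl(PR(B) + PR(C) + PR(D)\bigr) \\
PR(B) &= 0.06 + 0.85\,\tfrac{PR(A)}{3}, \qquad PR(C) = PR(D) = 0.18 + 0.85\,\tfrac{PR(A)}{3} \\
PR(A) &= 0.18 + 0.85 \times 0.42 + 0.7225\,PR(A)
  \;\Rightarrow\; PR(A) = \tfrac{0.537}{0.2775} \approx 1.9351 \\
PR(B) &\approx 0.6083, \qquad PR(C) = PR(D) \approx 0.7283
\end{aligned}
```

The total is still 4. Compared with the uniform co-efficient of 0.15 (A at about 1.9189 and B, C, D at about 0.6937 each), Page B drops while the page it links to (A) and the remaining pages rise.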
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T_1, T_2, …, T_n correspond to Web site pages (n is the number of pages).
Components a_km, which are on the intersection of the k-th row and m-th column of the matrix of links, determine the number of links off T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    …    Tn    Number of links off the page
T1    a11   a12   …    a1n   Cr(T1)
T2    a21   a22   …    a2n   Cr(T2)
Tn    an1   an2   …    ann   Cr(Tn)
Let's write down the equation for the computation scheme:
\[ PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)} \]
where:
PR_{I+1}(T_m) is the PageRank of page T_m on iteration (I+1) (the estimating value of page T_m, m = 1…n)
PR_I(T_k) is the PageRank of page T_k on iteration I (the estimating value of page T_k, k = 1…n)
Cr(T_k) is the total number of links off page T_k
Cm(T_k) = a_km is the number of individual links off page T_k to page T_m (k = 1…n), (m = 1…n)
d is a dampening factor (usually 0.85)
(1-d) is a normalization coefficient
n is the number of pages on the site
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the possible complexity of the links between pages.
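The computation scheme translates almost directly into code. The sketch below is a minimal illustration rather than a definitive implementation: the function name `pagerank_from_matrix` is ours, the matrix is a plain list of rows in which a[k][m] is the number of links off page k to page m, and the example matrix encodes the four-page hierarchical structure used earlier (an assumed layout in which page 0 links to pages 1, 2 and 3, which each link back to page 0).

```python
def pagerank_from_matrix(a, d=0.85, iterations=200):
    """Iterate PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    pr = [1.0] * n                # every page starts with an estimating value of 1
    for _ in range(iterations):
        pr = [
            (1 - d)
            # Pages with no outbound links (cr[k] == 0) contribute nothing.
            + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
            for m in range(n)
        ]
    return pr


# Row k lists the links off page k; column m counts the links that point to page m.
hierarchical = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

values = pagerank_from_matrix(hierarchical)
print([round(v, 4) for v in values], "total:", round(sum(values), 4))
```

Because a[k][m] is a count rather than a yes/no flag, a page that links to the same target twice (a multigraph edge) passes on two shares of its value to that target.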
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
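[Figure: multigraph of the example site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]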
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the PR_{i+1}(T) computation (on the (i+1)-th iteration) can be calculated by the formula (1-d) PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
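The recognition of external pages can be sketched as follows. The matrix is the matrix of links (2) given above; the function name `accelerated_pagerank`, the fixed count of 200 iterations and the 1e-6 threshold used to decide that a limiting value is "zero" are our assumptions.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): row = page the links come off, column = page they point to.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_pagerank(a, d=0.85, iterations=200):
    """Use (1 - d) * PR_i(T) as the normalization coefficient on every iteration."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T): number of links off each page
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return pr


link_counts = [sum(row) for row in LINKS]
limits = dict(zip(PAGES, accelerated_pagerank(LINKS)))
external = [page for page, value in limits.items() if value < 1e-6]

print({page: round(value, 6) for page, value in limits.items()})
print("external pages:", external)
```

The pages whose limiting value falls to zero are exactly E, F, G, H and I, while the base pages keep the whole total of 9 between them (4.5 for B and 1.5 each for A, C and D).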
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
Start 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 9.000
1 1.206429 3.473095 1.206429 1.206429 0.291667 0.441429 0.631667 0.271429 0.271429 9.000
2 1.419967 3.512317 1.419967 1.419967 0.188452 0.276286 0.309638 0.226702 0.226702 9.000
3 1.332416 3.958177 1.332416 1.332416 0.182116 0.219636 0.267624 0.187599 0.187599 9.000
4 1.430747 3.706925 1.430747 1.430747 0.176577 0.213457 0.245806 0.182497 0.182497 9.000
5 1.353327 3.951436 1.353327 1.353327 0.175854 0.209866 0.243166 0.179848 0.179848 9.000
6 1.420726 3.752137 1.420726 1.420726 0.175478 0.209422 0.241730 0.179527 0.179527 9.000
7 1.363844 3.923590 1.363844 1.363844 0.175433 0.209184 0.241554 0.179353 0.179353 9.000
8 1.412299 3.778418 1.412299 1.412299 0.175408 0.209155 0.241460 0.179332 0.179332 9.000
9 1.371139 3.901949 1.371139 1.371139 0.175405 0.209140 0.241448 0.179320 0.179320 9.000
10 1.406132 3.796985 1.406132 1.406132 0.175404 0.209138 0.241442 0.179319 0.179319 9.000
11 1.376390 3.886213 1.376390 1.376390 0.175403 0.209137 0.241441 0.179318 0.179318 9.000
12 1.401671 3.810371 1.401671 1.401671 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
13 1.380182 3.874838 1.380182 1.380182 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
14 1.398448 3.820041 1.398448 1.398448 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
15 1.382922 3.866618 1.382922 1.382922 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
16 1.396119 3.827028 1.396119 1.396119 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
17 1.384901 3.860680 1.384901 1.384901 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
18 1.394436 3.832076 1.394436 1.394436 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
19 1.386332 3.856389 1.386332 1.386332 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
20 1.393220 3.835723 1.393220 1.393220 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
21 1.387365 3.853289 1.387365 1.387365 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
22 1.392342 3.838358 1.392342 1.392342 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
23 1.388112 3.851049 1.388112 1.388112 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
24 1.391708 3.840261 1.391708 1.391708 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
25 1.388651 3.849431 1.388651 1.388651 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
26 1.391249 3.841637 1.391249 1.391249 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
27 1.389041 3.848262 1.389041 1.389041 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
28 1.390918 3.842631 1.390918 1.390918 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
29 1.389322 3.847417 1.389322 1.389322 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
30 1.390678 3.843349 1.390678 1.390678 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
31 1.389526 3.846807 1.389526 1.389526 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
32 1.390506 3.843867 1.390506 1.390506 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
33 1.389673 3.846366 1.389673 1.389673 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
34 1.390381 3.844242 1.390381 1.390381 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
35 1.389779 3.846048 1.389779 1.389779 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
36 1.390290 3.844513 1.390290 1.390290 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
37 1.389856 3.845817 1.389856 1.389856 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
38 1.390225 3.844709 1.390225 1.390225 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
39 1.389911 3.845651 1.389911 1.389911 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
40 1.390178 3.844850 1.390178 1.390178 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
41 1.389951 3.845531 1.389951 1.389951 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
42 1.390144 3.844952 1.390144 1.390144 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
43 1.389980 3.845444 1.389980 1.389980 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
44 1.390119 3.845026 1.390119 1.390119 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
45 1.390001 3.845381 1.390001 1.390001 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
46 1.390102 3.845079 1.390102 1.390102 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
47 1.390016 3.845336 1.390016 1.390016 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
48 1.390089 3.845118 1.390089 1.390089 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
49 1.390027 3.845303 1.390027 1.390027 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
50 1.390080 3.845146 1.390080 1.390080 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
51 1.390035 3.845280 1.390035 1.390035 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
52 1.390073 3.845166 1.390073 1.390073 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
53 1.390041 3.845263 1.390041 1.390041 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
54 1.390068 3.845180 1.390068 1.390068 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
55 1.390045 3.845250 1.390045 1.390045 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
56 1.390064 3.845191 1.390064 1.390064 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
57 1.390048 3.845241 1.390048 1.390048 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
58 1.390062 3.845198 1.390062 1.390062 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
59 1.390050 3.845235 1.390050 1.390050 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
60 1.390060 3.845204 1.390060 1.390060 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
61 1.390051 3.845230 1.390051 1.390051 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
62 1.390059 3.845208 1.390059 1.390059 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
63 1.390052 3.845227 1.390052 1.390052 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
64 1.390058 3.845211 1.390058 1.390058 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
65 1.390053 3.845224 1.390053 1.390053 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
66 1.390057 3.845213 1.390057 1.390057 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
67 1.390054 3.845223 1.390054 1.390054 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
68 1.390057 3.845214 1.390057 1.390057 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
69 1.390054 3.845221 1.390054 1.390054 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
70 1.390056 3.845215 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
71 1.390055 3.845220 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
72 1.390056 3.845216 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
73 1.390055 3.845220 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
74 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
75 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
76 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
77 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
78 1.390056 3.845217 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
79 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
80 1.390056 3.845218 1.390056 1.390056 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
81 1.390055 3.845219 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
82 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
83 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
84 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
85 1.390055 3.845218 1.390055 1.390055 0.175403 0.209136 0.241441 0.179318 0.179318 9.000
The limiting estimating values for the site pages are darkened in the table (they are the values on which the final rows settle).
Dividing the estimating values by their total, which does not change during the iteration process, gives the set of probability values (the sum of all probability values is equal to one).
  A B C D E F G H I Amount
P 0.154451 0.427246 0.154451 0.154451 0.019489 0.023237 0.026827 0.019924 0.019924 1.000
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities.
Experiment outcome: $A_1, A_2, \ldots, A_n$
Probabilities: $p_1, p_2, \ldots, p_n$
$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$
Calculating the individual terms of this formula for the probability values that correspond to the site pages gives the following set of values. This is the set of PageRank values for the pages of the site examined (logarithms are taken to base 2).
   A B C D E F G H I Amount
PR 0.416211 0.524171 0.416211 0.416211 0.110722 0.126119 0.140040 0.112558 0.112558 2.375
The Toolbar Shows 1 1 1 1 1 1 1 1 1 1
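The two steps just described (dividing by the total, then applying Shannon's formula term by term) can be retraced with a short script. This is only a sketch: the input numbers are the limiting values from the table, while the variable names are ours and not part of the original computation.

```python
import math

# Limiting estimating values for pages A..I, copied from the final row of the table.
limiting = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441, "H": 0.179318, "I": 0.179318,
}

# The total stays at 9 (the number of pages) throughout the iterations.
total = sum(limiting.values())

# Step 1: divide each value by the total to obtain the probabilities.
probabilities = {page: value / total for page, value in limiting.items()}

# Step 2: each page's term of Shannon's formula, -p * log2(p).
shannon_terms = {page: -p * math.log2(p) for page, p in probabilities.items()}

# The "Amount" column of the PR row is the sum of the terms.
entropy = sum(shannon_terms.values())

for page in limiting:
    print(f"{page}: p = {probabilities[page]:.6f}, term = {shannon_terms[page]:.6f}")
print(f"sum of terms = {entropy:.3f}")
```

Up to rounding in the last digit, the printed probabilities and terms should agree with the P and PR rows.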
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$ (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
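To see what repeated encouragement does to the estimating value, the operator can be applied in a loop. The sketch below assumes a starting value of 0.2 and d = 0.85; both are illustrative choices of ours, not figures taken from Bush and Mosteller.

```python
def reinforce(p, d):
    """Apply the operator Z+ once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


def iterate_reinforcement(p0, d, steps):
    """Return the list [P_0, P_1, ..., P_steps]."""
    values = [p0]
    for _ in range(steps):
        values.append(reinforce(values[-1], d))
    return values


# Illustrative starting value and coefficient (assumptions, see the lead-in).
history = iterate_reinforcement(p0=0.2, d=0.85, steps=100)

print(history[:4])   # the first few estimating values
print(history[-1])   # the value after 100 reinforcements
```

Each application moves the estimating value a fixed fraction (1 - d) of the remaining distance towards 1, so the sequence rises steadily and approaches 1 without exceeding it.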
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing Order to the Web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Page A's PageRank is 1.3599321536
Page B's PageRank is 0.7279711653
The sum of the PageRank of all the pages in your site is 2.0879033189
This increase is small, and the amount will of course vary depending on circumstances. But the point was easily proved: linking out can sometimes improve your PageRank, through a mechanism known as PageRank Feedback.
If we added an extra page into the loop between C -> D and E, then the total PageRank in your site would increase to 2.1462030505 [Excel Example 7]. This suggests that it is wiser to reciprocal link with larger sites that employ a link structure where a higher amount of focus is given to the page on which your link appears (or will appear). The number of pages and the structure you are linking to have a significant influence on the amount of PageRank feedback that occurs.
Influencing the results
The question that arises from all this is how and when Google can or will influence the results of the PageRank calculation. In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors multiplied by your PageRank score of zero will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero does not in itself disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to accompany a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank Feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank. (Whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on.) However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation, but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6, if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
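The sharing-out can be put into numbers with the term d × PR / (number of links) from the PageRank formula. The sketch below treats 4 and 6 as actual PageRank values, which is a simplification since the Toolbar figure is only an estimate, and the link counts (10 and 50) are hypothetical.

```python
D = 0.85  # damping factor used in the calculations in this paper


def pagerank_passed_per_link(page_rank, links_on_page, d=D):
    """PageRank that one link receives: d * PR shared equally among the page's links."""
    return d * page_rank / links_on_page


# Hypothetical linking pages; raw PageRank values, not Toolbar readings.
from_pr6_page = pagerank_passed_per_link(page_rank=6.0, links_on_page=50)
from_pr4_page = pagerank_passed_per_link(page_rank=4.0, links_on_page=10)

print(f"link from the PR 6 page with 50 links passes {from_pr6_page:.3f}")
print(f"link from the PR 4 page with 10 links passes {from_pr4_page:.3f}")
```

Under these assumed numbers the link from the lower-ranked page is worth more than three times as much, simply because it is shared with fewer other links.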
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site you're generating feedback, and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice therefore is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site there is one golden rule:
Generally you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both
a) A low PageRank
and
b) A lot of links to pages on your site
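Both conditions of the rule act on the same quantity: the part of a page's PageRank that flows through its external links. A rough sketch, assuming the page's d × PR is split equally among all of its links; the PageRank values and link counts below are hypothetical.

```python
def pagerank_sent_off_site(page_rank, internal_links, external_links, d=0.85):
    """Part of a page's d * PR that leaves the site through its external links."""
    total_links = internal_links + external_links
    return d * page_rank * external_links / total_links


# Two hypothetical pages, each carrying a single outbound link.
high_pr_few_internal = pagerank_sent_off_site(
    page_rank=2.0, internal_links=1, external_links=1
)
low_pr_many_internal = pagerank_sent_off_site(
    page_rank=0.5, internal_links=9, external_links=1
)

print(f"high-PR page with 1 internal link gives away {high_pr_few_internal:.4f}")
print(f"low-PR page with 9 internal links gives away {low_pr_many_internal:.4f}")
```

Lowering the page's PageRank shrinks the amount available to give away, and adding internal links shrinks the external link's share of it.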
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9].
If we perform the same calculations but include the review pages, here's what we get.
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
Without the Review Pages
The HomePage PR is 0.9536152797
B, C and D PR is 0.4201909959 each
Total is 2.2141882674
With the Review Pages
The HomePage PR is 2.439718935
B, C and D PR is 0.8412536982 each
Total is 4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
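The fixed total and the shifting distribution can both be checked with a few lines of code. The sketch below assumes four-page layouts of our own choosing (the original diagrams are not reproduced in this text): a home page A linked both ways with B, C and D for the hierarchy, a single ring for the loop, and every page linking to every other for extensive interlinking. It uses the usual formula PR(A) = (1-d) + d × Σ PR(T)/C(T) with d = 0.85.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q linking to p."""
    ranks = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_ranks = {page: 1.0 - d for page in links}
        for source, targets in links.items():
            share = d * ranks[source] / len(targets)  # each link gets an equal share
            for target in targets:
                new_ranks[target] += share
        ranks = new_ranks
    return ranks


# Assumed closed four-page structures (page -> pages it links to).
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {
        "A": ["B", "C", "D"], "B": ["A", "C", "D"],
        "C": ["A", "B", "D"], "D": ["A", "B", "C"],
    },
}

results = {name: pagerank(links) for name, links in structures.items()}

for name, ranks in results.items():
    per_page = ", ".join(f"{page}={value:.4f}" for page, value in ranks.items())
    print(f"{name}: {per_page} (total {sum(ranks.values()):.4f})")
```

With these assumed layouts, only the hierarchical structure moves PageRank away from an even split: it concentrates it on page A, while the total stays at 4 in all three cases.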
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1-d) \cdot PR_i(A)$
That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
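As one possible reading of this replacement, the sketch below runs a single iteration step both ways: with the constant (1-d), and with (1-d) multiplied by the page's own PR from the previous iteration. The link layout (A linking to B, C and D, each linking back to A) is our assumption about the hierarchical structure, the function names are ours, and nothing here is meant to show how much the change speeds up convergence.

```python
D = 0.85

# Assumed hierarchical structure: A links to B, C and D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(ranks, variable_coefficient):
    """One iteration. If variable_coefficient is True, (1 - d) becomes (1 - d) * PR_i(page)."""
    new_ranks = {}
    for page in LINKS:
        base = (1 - D) * ranks[page] if variable_coefficient else (1 - D)
        incoming = sum(
            ranks[source] / len(targets)
            for source, targets in LINKS.items()
            if page in targets
        )
        new_ranks[page] = base + D * incoming
    return new_ranks


def run(iterations, variable_coefficient):
    """Start every page at 1 and apply the chosen step the given number of times."""
    ranks = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        ranks = step(ranks, variable_coefficient)
    return ranks


fixed_two = run(2, variable_coefficient=False)
variable_two = run(2, variable_coefficient=True)

print("fixed (1-d):      ", {p: round(v, 4) for p, v in fixed_two.items()})
print("variable (1-d)*PR:", {p: round(v, 4) for p, v in variable_two.items()})
```

Both variants keep the total at 4 for this closed structure. The first iteration is identical in both (every PR starts at 1), and the two begin to differ from the second iteration onwards.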
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
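A good approximation of the values with only 67 iterations instead of 148. Note that the values settle close to, but not exactly at, the earlier ones: A reaches 2.0000 rather than 1.9189, and B, C and D reach 0.6667 rather than 0.6937. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.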
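The comparison between the two schemes can be reproduced with a short script. The following is only a sketch under stated assumptions: the function name `iterate`, the stopping tolerance and the dictionary layout are ours, not anything taken from the spreadsheets, and the exact iteration counts depend on the tolerance chosen.

```python
def iterate(links, d=0.85, accelerated=False, tol=1e-10, max_iter=1000):
    """Iterate PageRank until no page changes by more than tol.

    links maps each page to the list of pages it links to.
    Returns the final values and the number of iterations used.
    """
    pr = {page: 1.0 for page in links}
    for n in range(1, max_iter + 1):
        new = {}
        for page in links:
            # Classic scheme: constant (1 - d).
            # Accelerated scheme: (1 - d) scaled by the page's previous value.
            base = (1 - d) * (pr[page] if accelerated else 1.0)
            inbound = sum(pr[src] / len(out)
                          for src, out in links.items() if page in out)
            new[page] = base + d * inbound
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:
            return pr, n
    return pr, max_iter


# Hierarchical site: home page A links to B, C and D, which each link back.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic, classic_steps = iterate(site)
fast, fast_steps = iterate(site, accelerated=True)
print("classic:    ", classic_steps, {p: round(v, 6) for p, v in classic.items()})
print("accelerated:", fast_steps, {p: round(v, 6) for p, v in fast.items()})
```

Running it shows the accelerated scheme stopping in fewer iterations, and also shows that the two schemes settle at slightly different values, which is why the result is best read as an approximation.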
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to
$0.6 \times 0.15$
i.e. $0.4n + 0.6 + 0.6(n-1) = n$, where n is the number of pages in the site, so the co-efficients still total $0.15n$.
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
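The effect of uneven normalization co-efficients can be illustrated with a small sketch. It assumes, purely for illustration, that the site is the four-page hierarchy used earlier (A linking to B, C and D, each linking back to A); the function name `pagerank` and the fixed number of iterations are also our own choices rather than anything taken from the spreadsheets.

```python
def pagerank(links, coeff, d=0.85, iterations=200):
    """Iterate PageRank with a separate normalization co-efficient per page."""
    pr = {p: 1.0 for p in links}
    for _ in range(iterations):
        pr = {
            p: coeff[p] + d * sum(pr[s] / len(out)
                                  for s, out in links.items() if p in out)
            for p in links
        }
    return pr


# Assumed site structure: home page A, with B ("About Us"), C and D below it.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(site)
d = 0.85

uniform = {p: 1 - d for p in site}            # every page gets 0.15
favoured = {p: 0.6 * (1 - d) for p in site}   # other pages: 0.6 x 0.15
favoured["B"] = (0.4 * n + 0.6) * (1 - d)     # Page B: (0.4n + 0.6) x 0.15

base = pagerank(site, uniform)
boosted = pagerank(site, favoured)
for page in site:
    print(page, round(base[page], 4), "->", round(boosted[page], 4))
```

Because the co-efficients still add up to 0.15n, the total PageRank of the site is unchanged; it is only redistributed towards Page B.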
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank. It is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from httpwwwsupportforumsorgpagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.
Let's write down the elements of the matrix of links $D = (a_{km})$, $k = 1, \dots, n$, $m = 1, \dots, n$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to the Web site pages (n is the number of pages).
The components $a_{km}$, at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links from $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1     T2     …     Tn     Number of links off the page
T1    a11    a12    …     a1n    Cr(T1)
T2    a21    a22    …     a2n    Cr(T2)
…
Tn    an1    an2    …     ann    Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, $m = 1, \dots, n$)
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, $k = 1, \dots, n$)
$Cr(T_k)$ is the total number of links off the page $T_k$
$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$)
d is a dampening factor (usually 0.85)
(1-d) is a normalization coefficient
n is the number of pages on the site
This recursive computation scheme allows us to compute the estimating values of the pages of a Web site while taking into account how complex the links between its pages may be.
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph with pages A, B, C, D in subgraph 1 and pages E, F, G, H, I in subgraph 2]
DIAGRAM OF THE MATRIX OF LINKS (2)
                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
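The computation scheme can be applied directly to this matrix of links. The sketch below is only an illustration: the names `A_KM`, `CR` and `step` are ours, and the code performs a single iteration of the classic scheme (constant normalization coefficient) starting from a value of 1 for every page.

```python
PAGES = "ABCDEFGHI"

# Matrix of links: A_KM[k][m] is the number of links from page k to page m.
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(Tk): total number of links off each page, i.e. the row sums.
CR = [sum(row) for row in A_KM]


def step(pr, d=0.85):
    """One iteration: PR(Tm) = (1 - d) + d * sum_k a_km * PR(Tk) / Cr(Tk)."""
    n = len(A_KM)
    return [
        (1 - d) + d * sum(A_KM[k][m] * pr[k] / CR[k] for k in range(n))
        for m in range(n)
    ]


first = step([1.0] * len(PAGES))
for page, value in zip(PAGES, first):
    print(page, round(value, 6))
```

The row sums reproduce the "number of links off the page" column, and the values after one iteration can be checked against the first row of the iteration tables later in this section.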
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on the (i+1)-th iteration) can be calculated by the formula $(1-d) \cdot PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
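The definition of external pages lends itself to a simple check. The sketch below is an illustration under our own assumptions: the link structure is written as a dictionary of outgoing links read off the matrix of links, and the iteration count (200) and the zero threshold (1e-9) are arbitrary choices, not values from the original calculation.

```python
# Outgoing links of each page, read off the matrix of links for this site.
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI",
    "H": "ABCDEG", "I": "ABCDG",
}
d = 0.85

pr = {page: 1.0 for page in LINKS}
for _ in range(200):
    # Accelerated scheme: the normalization coefficient is (1 - d) * PR_i(T).
    pr = {
        page: (1 - d) * pr[page]
        + d * sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        for page in LINKS
    }

# Pages whose limiting value is (numerically) zero are external to the base pages.
external = sorted(page for page, value in pr.items() if value < 1e-9)
base_pages = sorted(page for page in LINKS if page not in external)
print("base pages:    ", base_pages)
print("external pages:", external)
```

The pages flagged as external are exactly those that no base page links to, so whatever value they pass to the base pages never flows back.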
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final rows of the table (darkened in the original).
Dividing the estimating values by their total, which does not change during the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one):
    A         B         C         D         E         F         G         H         I         Amount
P   0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
In parallel, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:
Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Calculating the terms of this formula for the set of probability values that correspond to the site pages (logarithms are taken to base 2), we obtain the following set of PageRank values for the pages of the site examined:
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
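The two steps above (normalising to probabilities, then applying the terms of Shannon's formula) can be checked with a few lines of code. This is a sketch only: the variable names are ours, and the input is simply the final row of the classic computation copied from the table.

```python
import math

PAGES = "ABCDEFGHI"

# Limiting estimating values from the final row of the classic computation.
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(limits)                      # stays at (about) 9 throughout
probs = [value / total for value in limits]

# One term of Shannon's formula per page: -p * log2(p).
terms = [-p * math.log2(p) for p in probs]
entropy = sum(terms)

for page, p, term in zip(PAGES, probs, terms):
    print(page, round(p, 6), round(term, 6))
print("Amount:", round(entropy, 3))
```

The individual terms correspond to the PR row of the table, and their sum is the amount shown in its last column.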
PageRank is a successful representation of a directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is applied, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_{n+1} = Z+(P_n) = d P_n + (1 - d),   0 < d < 1   (P is the estimated value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
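To see how the operator Z+ behaves when it is applied repeatedly, here is a minimal sketch. The starting estimate of 0.2 and the value d = 0.85 are assumptions chosen only for illustration; any 0 < d < 1 gives the same qualitative picture.

```python
def reinforce(p, d):
    """One application of Z+: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)

d = 0.85      # assumed value, for illustration only
p = 0.2       # hypothetical starting estimate
history = [p]
for _ in range(50):
    p = reinforce(p, d)
    history.append(p)

print(history[:4])
print(history[-1])   # creeps towards 1, the fixed point of P = d*P + (1 - d)
```

Each application moves the estimate a fraction (1 - d) of the remaining distance towards 1, so the distance from 1 shrinks by a factor d at every step.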
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing, the essential best practice guide (an in-depth view of the other aspects of how search engines work): httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web: httphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Engine: httpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Google's), by Mark Horrell: httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Influencing the results
The question that arises from all this is: how and when can, or will, Google influence the results of the PageRank calculation? In this respect the situation has changed significantly since the original PageRank paper. Google has shown that they can and will modify the data on which PageRank is based. The primary example of this is what has become known as PageRank Zero (PR0). Basically, speculation says that when Google wants to penalize a page, it is assigned a PageRank of zero. As PageRank is a multiplier, this will obviously always list PR0 pages as the very last entry in the SERPs.
Your score from other factors, multiplied by your PageRank score of zero, will always be zero (the lowest score possible).
How they do this is important. Let's say they set the PageRank of the page to zero before doing the calculations: would this work? From what we know so far, we would think not [Excel Example 8]. And true enough, it doesn't. Why is this significant? Because the PageRank must be set to zero at the end of the process. Having a PageRank of zero in itself does not disqualify a page from voting exactly as it would have done in the first place.
To stop its voting power, the second penalty must also be applied. This is the same penalty that is applied to link farms. Google has shown that they are capable of ignoring links they believe have been artificially created. This is simple enough: they merely remove the link entry from the matrix. This second penalty would also need to be applied alongside a PR0 penalty to stop the links from that page from counting in the PageRank process. However, there doesn't appear to be any way to test this. I suspect that both penalties would be applied.
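The two penalties can be illustrated with a small sketch. It uses the iteration that the tables in this document follow, PR(p) = (1 - d) + d × (sum of PR(q)/C(q) over pages q linking to p), with d = 0.85. The three-page loop and the page names are hypothetical, and nothing here is a claim about how Google actually implements a penalty.

```python
def pagerank(links, start, d=0.85, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p."""
    pr = dict(start)
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[q] / len(out)
                                    for q, out in links.items() if page in out)
            for page in links
        }
    return pr

# Hypothetical three-page loop; "B" is the page to be penalised.
links = {"A": ["B"], "B": ["C"], "C": ["A"]}

normal = pagerank(links, {"A": 1.0, "B": 1.0, "C": 1.0})
# Setting B to zero *before* the calculation changes nothing in the end.
zero_start = pagerank(links, {"A": 1.0, "B": 0.0, "C": 1.0})

# First penalty, applied at the end: B shows zero, but its vote was still counted.
penalised = dict(normal, B=0.0)

# Second penalty: B's link entry is removed, so B no longer votes for C.
no_vote = pagerank({"A": ["B"], "B": [], "C": ["A"]},
                   {"A": 1.0, "B": 1.0, "C": 1.0})
penalised_no_vote = dict(no_vote, B=0.0)

print(normal, zero_start, penalised, penalised_no_vote, sep="\n")
```

In this toy graph, zeroing B at the end leaves C's value untouched, whereas removing B's link drops C to the minimum value of 1 - d.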
Controlling PageRank
As Webmasters, contrary to the suggestions of some, we have a fair amount of control over PageRank. It is, however, the most difficult of ranking factors to control, and you might well be able to achieve the desired effects by using other techniques. However, a well-worked PageRank added to your optimization mix can make it difficult for competitors to catch up.
There are three fundamental areas to look at when trying to optimize the PageRank for your site:
1. The links you choose to have link to you, i.e. which ones you choose and how much effort you put in to getting them.
2. Who you choose to link out to from your site, and from which page of your site you place their link (maximising PageRank feedback and minimising PageRank leakage).
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site, from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation… but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
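As a rough illustration of the sharing effect, the sketch below compares what a single link passes on from two pages. The link counts (5 and 40) are invented, and the values 4 and 6 are treated as real PageRank values rather than Toolbar readings, which is a simplifying assumption; the helper name is illustrative.

```python
def share_per_link(pagerank, total_links, d=0.85):
    """PageRank passed along one link: d * PR(page) / number of links on the page."""
    return d * pagerank / total_links

# Hypothetical figures: a PR 4 page with 5 links and a PR 6 page with 40 links.
from_pr4 = share_per_link(4.0, total_links=5)
from_pr6 = share_per_link(6.0, total_links=40)

print(from_pr4, from_pr6)
print("PR 4 page passes more per link:", from_pr4 > from_pr6)
```

With these made-up numbers the link from the lower-ranked page is worth more, which is the situation the paragraph above describes.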
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both:
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values [Excel Example 9].
If we perform the same calculations but include the review pages, here's what we get.
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                       Without the Review Pages   With the Review Pages
The HomePage PR is     0.9536152797               2.439718935
B, C, D PR is (each)   0.4201909959               0.8412536982
Total is               2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
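The effect of review pages can be tried out on a toy site. The graph below is hypothetical and is not the layout of Excel Example 9 (whose diagram is not reproduced in this text), so its numbers will not match the figures above; it only sketches the principle that pairing each outbound link with a link to an internal review page, which in turn links home, keeps more PageRank in the site.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[q] / len(out)
                                    for q, out in links.items() if page in out)
            for page in links
        }
    return pr

# Hypothetical site: a home page and a links page pointing at two external sites.
without_reviews = {
    "home": ["links"],
    "links": ["home", "ext1", "ext2"],
    "ext1": [], "ext2": [],
}
# Same site, but each external link is paired with a link to an internal review
# page, and each review page links only to the home page.
with_reviews = {
    "home": ["links"],
    "links": ["home", "ext1", "review1", "ext2", "review2"],
    "review1": ["home"], "review2": ["home"],
    "ext1": [], "ext2": [],
}

before = pagerank(without_reviews)
after = pagerank(with_reviews)
for page in ("home", "links"):
    print(page, round(before[page], 4), round(after[page], 4))
```

Only the pages present in both versions are compared, in the same spirit as the comparison of pages A, B, C and D above.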
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it to go to is the Webmaster's easiest way to manipulate PageRank.
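The closed-system totals can be checked with a short script. The diagrams for the three structures are not reproduced in this text, so the four-page layouts below are assumptions: a home page A linking to and from three children, a simple ring, and every page linking to every other page. The hierarchical layout is at least consistent with the per-page values that appear in the iteration tables later in the document.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[q] / len(out)
                                    for q, out in links.items() if page in out)
            for page in links
        }
    return pr

# Assumed four-page layouts for the three structures.
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

results = {name: pagerank(graph) for name, graph in structures.items()}
for name, pr in results.items():
    values = "  ".join(f"{page}={pr[page]:.4f}" for page in "ABCD")
    print(f"{name:13s} {values}  total={sum(pr.values()):.4f}")
```

The total is the same in every case; only the hierarchical layout concentrates PageRank on one page.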
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
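The ordering stated above can be explored with the same kind of script. The diagrams for Excel Examples 14 to 16 are not reproduced in this text, so the layouts below are assumptions: one external page X links in to page A, and one site page links out to an external page Y. Which page carries the outbound link is a guess in each case; with these particular choices the site totals come out close to the figures quoted above, but other layouts might match as well.

```python
def pagerank(links, d=0.85, iterations=300):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(pr[q] / len(out)
                                    for q, out in links.items() if page in out)
            for page in links
        }
    return pr

SITE = "ABCD"   # pages belonging to the site; X and Y are external (assumed)

structures = {
    "hierarchical": {"X": ["A"], "A": ["B", "C", "D"], "B": ["A", "Y"],
                     "C": ["A"], "D": ["A"], "Y": []},
    "looping": {"X": ["A"], "A": ["B"], "B": ["C"], "C": ["D", "Y"],
                "D": ["A"], "Y": []},
    "extensive": {"X": ["A"], "A": ["B", "C", "D"], "B": ["A", "C", "D", "Y"],
                  "C": ["A", "B", "D"], "D": ["A", "B", "C"], "Y": []},
}

totals = {}
for name, graph in structures.items():
    pr = pagerank(graph)
    totals[name] = sum(pr[page] for page in SITE)
    print(f"{name:13s} total PageRank in site = {totals[name]:.10f}")
```

Adding more children to the hierarchical layout, as the paragraph above suggests, is a one-line change to the dictionary and a natural next experiment.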
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
(1 - d) × PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PR deduced by the previous iteration.
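A minimal sketch of this replacement is shown below. It assumes the four-page hierarchical layout (A linking to B, C and D, and each of those linking back to A) and d = 0.85; the function name is illustrative. Each page's fixed (1 - d) term is swapped for (1 - d) times that page's value from the previous iteration.

```python
def adaptive_step(pr, links, d=0.85):
    """One iteration with the fixed (1 - d) term replaced by (1 - d) * PR_i(page)."""
    return {
        page: (1 - d) * pr[page]
              + d * sum(pr[q] / len(out) for q, out in links.items() if page in out)
        for page in links
    }

# Assumed hierarchical layout: home page A and three children.
hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

pr = {page: 1.0 for page in hierarchical}
history = [pr]
for _ in range(60):
    pr = adaptive_step(pr, hierarchical)
    history.append(pr)

for n in (1, 2, 3, 60):
    row = history[n]
    print(n, "  ".join(f"{row[p]:.10f}" for p in "ABCD"), f"{sum(row.values()):.10f}")
```

Note that in this sketch the values settle near, but not exactly on, the values reached with the fixed co-efficient, which fits the idea of an approximation.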
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iter   A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3      2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4      1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5      2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6      1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7      2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8      1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9      2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10     1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11     2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12     1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13     2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14     1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15     1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16     1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17     1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18     1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19     1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20     1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21     1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22     1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23     1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24     1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25     1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26     1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27     1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28     1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29     1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30     1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31     1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32     1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33     1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34     1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35     1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36     1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37     1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38     1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39     1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40     1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41     1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42     1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43     1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44     1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45     1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46     1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47     1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48     1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49     1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50     1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51     1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52     1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53     1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54     1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55     1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56     1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57     1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58     1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59     1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60     1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61     1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62     1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63     1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64     1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65     1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66     1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67     1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68     1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69     1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70     1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71     1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72     1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73     1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74     1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75     1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76     1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77     1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78     1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79     1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80     1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81     1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82     1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83     1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84     1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85     1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86     1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87     1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88     1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89     1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90     1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91     1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92     1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93     1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94     1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95     1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96     1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97     1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98     1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99     1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100    1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101    1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102    1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103    1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104    1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105    1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106    1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107    1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108    1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109    1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110    1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111    1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112    1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113    1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114    1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115    1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116    1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117    1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118    1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119    1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120    1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121    1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122    1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123    1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124    1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125    1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126    1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127    1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128    1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129    1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130    1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131    1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132    1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133    1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134    1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135    1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136    1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
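The table above can be regenerated with a short loop. This is a sketch under stated assumptions: the hierarchical layout is taken to be A linking to B, C and D with each of them linking back to A, d = 0.85, all pages start at 1, and every page is updated from the previous iteration's values. The stopping tolerance of 1e-10 is an arbitrary choice, so the exact iteration count depends on it.

```python
def iterate(links, d=0.85, tolerance=1e-10, max_iterations=1000):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) until values stop changing."""
    pr = {page: 1.0 for page in links}
    rows = [pr]
    for _ in range(max_iterations):
        new = {
            page: (1 - d) + d * sum(pr[q] / len(out)
                                    for q, out in links.items() if page in out)
            for page in links
        }
        rows.append(new)
        if max(abs(new[p] - pr[p]) for p in links) < tolerance:
            break
        pr = new
    return rows

# Assumed hierarchical layout: home page A and three children.
hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

rows = iterate(hierarchical)
for n, row in enumerate(rows[:4]):
    print(n, "  ".join(f"{row[p]:.10f}" for p in "ABCD"), f"{sum(row.values()):.10f}")
print("iterations run:", len(rows) - 1)
print("final A:", f"{rows[-1]['A']:.10f}")
```

The values swing above and below the final figure with a slowly shrinking amplitude, which is why so many iterations are needed.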
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
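The comparison between the two tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the site is taken to be a home page A that links to B, C and D, each of which links back to A (a structure inferred from the limiting values, not stated here), and the accelerated variant is taken to replace the constant (1 - d) with (1 - d) times the page's previous value. The function name and the stopping tolerance are hypothetical.

```python
def pagerank(links, d=0.85, accelerated=False, tol=1e-10, max_iter=10_000):
    """Iterate until no value changes by more than tol; return (values, iterations)."""
    pages = list(links)
    pr = {p: 1.0 for p in pages}  # every page starts at 1, as in the tables
    for iteration in range(1, max_iter + 1):
        new = {}
        for p in pages:
            # Share of PageRank arriving from each page q that links to p.
            inbound = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            # Classic: constant (1 - d). Accelerated: (1 - d) * previous value.
            coeff = (1 - d) * pr[p] if accelerated else (1 - d)
            new[p] = coeff + d * inbound
        if max(abs(new[p] - pr[p]) for p in pages) < tol:
            return new, iteration
        pr = new
    return pr, max_iter


# Assumed structure: A <-> B, A <-> C, A <-> D.
SITE = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic, classic_steps = pagerank(SITE)
fast, fast_steps = pagerank(SITE, accelerated=True)
print(classic_steps, {p: round(v, 6) for p, v in classic.items()})
print(fast_steps, {p: round(v, 6) for p, v in fast.items()})
```

Under these assumptions the accelerated run stops after noticeably fewer iterations than the classic one, and it settles on slightly different values (A near 2 rather than near 1.92), which is the pattern the tables show. The exact iteration counts depend on the tolerance chosen.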
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [(0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site], so the co-efficients still add up to n × 0.15.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
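The effect of uneven co-efficients can be tried out with a small sketch. The four-page site below is hypothetical (home page A linking to B, C and D, each linking back), since the original example structure is not reproduced here; the function name is also an assumption. Page B receives (0.4n + 0.6) × 0.15 and every other page 0.6 × 0.15, so the co-efficients still total n × 0.15.

```python
def pagerank_with_coefficients(links, coeff, d=0.85, iterations=200):
    """Classic iteration, but each page has its own additive co-efficient."""
    pr = {p: 1.0 for p in links}
    for _ in range(iterations):
        pr = {
            p: coeff[p] + d * sum(pr[q] / len(links[q]) for q in links if p in links[q])
            for p in links
        }
    return pr


# Hypothetical site: A links to B, C, D; each of them links back to A.
SITE = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(SITE)

uniform = {p: 0.15 for p in SITE}
favoured = {p: 0.6 * 0.15 for p in SITE}   # lowered for everyone...
favoured["B"] = (0.4 * n + 0.6) * 0.15     # ...except the favoured page B

before = pagerank_with_coefficients(SITE, uniform)
after = pagerank_with_coefficients(SITE, favoured)
for page in SITE:
    print(page, round(before[page], 4), round(after[page], 4))
```

In this sketch Page B ends up higher and the other three pages lower, while the total PageRank of the site is unchanged.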
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank - it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
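A rough sketch of this demotion, again on a hypothetical four-page site (A linking to B, C and D, each linking back to A) rather than the original example structure. Page B plays the role of the list-of-links page and gets a co-efficient of 0.06, while the other pages get 0.18; the function name is an assumption.

```python
def pagerank(links, coeff, d=0.85, iterations=200):
    """Iterate PR(p) = coeff[p] + d * (sum of shares from pages linking to p)."""
    pr = {p: 1.0 for p in links}
    for _ in range(iterations):
        pr = {
            p: coeff[p] + d * sum(pr[q] / len(links[q]) for q in links if p in links[q])
            for p in links
        }
    return pr


# Hypothetical site: A is the home page; B is the "list of links" page.
SITE = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

normal = pagerank(SITE, {p: 0.15 for p in SITE})
demoted = pagerank(SITE, {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for page in SITE:
    print(page, round(normal[page], 4), round(demoted[page], 4))
```

Here Page B drops while A, C and D rise, and the total stays at 4 because the co-efficients still add up to 4 × 0.15.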
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T_1, T_2, …, T_n correspond to Web site pages (n is the number of pages).
Components a_km, which are at the intersection of the k-th row and m-th column of the matrix of links, give the number of links from T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)
       T1     T2     …    Tn     Number of links off the page
T1     a11    a12    …    a1n    Cr(T1)
T2     a21    a22    …    a2n    Cr(T2)
…
Tn     an1    an2    …    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_{I}(T_k)}{\mathrm{Cr}(T_k)}$$

where
PR_(I+1)(T_m) is the PageRank of the page T_m on iteration (I+1) (the estimated value of the page T_m, m = 1…n);
PR_I(T_k) is the PageRank of the page T_k on iteration I (the estimated value of the page T_k, k = 1…n);
Cr(T_k) is the total number of links off the page T_k;
Cm(T_k) = a_km is the number of individual links off the page T_k to the page T_m (k = 1…n), (m = 1…n);
d is a dampening factor (usually 0.85);
(1 - d) is a normalization coefficient;
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimated values for a Web site while taking into account the potentially complex pattern of links between its pages.
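As a minimal sketch, the scheme above can be written directly from the matrix of links. The function name, the three-page example matrix and the decision to skip pages with no outgoing links are assumptions made for illustration only.

```python
def pagerank_step(a, pr, d=0.85):
    """One pass of PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page T_k
    return [
        # Assumption: a page with no outgoing links (Cr = 0) passes nothing on.
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical three-page site: T1 links to T2 and T3, which both link back.
links = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
values = [1.0, 1.0, 1.0]
for _ in range(100):
    values = pagerank_step(links, values)
print([round(v, 6) for v in values])
```

Because a[k][m] counts links rather than merely flagging them, a page that links twice to the same target passes on two shares, which is what makes this a multigraph formulation.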
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
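The matrix above can be checked mechanically. The sketch below (variable names are illustrative) recomputes the Cr column as row sums and performs a single classic iteration from a starting value of 1 for every page, with d = 0.85.

```python
PAGES = "ABCDEFGHI"
# Rows are source pages, columns are target pages (matrix of links (2)).
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
CR = [sum(row) for row in A_KM]  # total number of links off each page

d = 0.85
start = [1.0] * len(PAGES)
first = [
    (1 - d) + d * sum(A_KM[k][m] * start[k] / CR[k] for k in range(len(PAGES)))
    for m in range(len(PAGES))
]
for page, value in zip(PAGES, first):
    print(page, round(value, 6))
```

The row sums reproduce the Cr column, and the values after one pass are the figures that the first iteration of the computation is expected to produce for this site.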
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimated value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the PR_(i+1)(T) computation (on iteration i + 1) can be calculated by the formula (1 - d) × PR_i(T).
When the accelerated technology is applied, the limiting estimated values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
Start 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
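The same classification can be sketched in code. The link structure is taken from the matrix of links (2); the function name, the fixed number of iterations and the threshold used to treat a value as zero are assumptions.

```python
# Outgoing links per page, read off the matrix of links (2).
LINKS = {
    "A": "B", "B": "ACD", "C": "B", "D": "B",
    "E": "ABCDF", "F": "ABCDG", "G": "ABCDFHI", "H": "ABCDEG", "I": "ABCDG",
}


def accelerated_pagerank(links, d=0.85, iterations=100):
    """Use (1 - d) * PR_i(T) as the normalization coefficient on every pass."""
    pr = {p: 1.0 for p in links}
    for _ in range(iterations):
        pr = {
            p: (1 - d) * pr[p]
            + d * sum(pr[q] / len(links[q]) for q in links if p in links[q])
            for p in links
        }
    return pr


limit = accelerated_pagerank(LINKS)
# Pages whose limiting value is (numerically) zero are treated as external.
external = sorted(p for p, v in limit.items() if v < 1e-6)
base = sorted(p for p in limit if p not in external)
print("base:", base, "external:", external)
```

With this structure the values of E, F, G, H and I drain away to zero while A, B, C and D absorb the whole total of 9, so the sketch separates the two groups in the same way as the definition above.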
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
Start 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimated values for the site pages are darkened.
Dividing the estimated values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:
Experiment outcome   A_1   A_2   …   A_n
Probabilities        p_1   p_2   …   p_n

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values which correspond to the site pages (logarithms are calculated to base 2). We receive the following set of PageRank values for the pages of the site examined:
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
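The two steps above (normalising to probabilities, then taking each term of Shannon's formula with base-2 logarithms) can be sketched as follows. The limiting values are copied from the final row of the classic computation, with the decimal point placed after the first digit so that they total 9; variable names are illustrative.

```python
import math

# Limiting classic values for pages A..I (final row of the classic table).
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(limits)                           # stays at (about) 9 throughout
p = [v / total for v in limits]               # probability values, summing to 1
terms = [-x * math.log2(x) for x in p]        # one Shannon term per page
entropy = sum(terms)                          # the "Amount" column for PR

for page, prob, term in zip("ABCDEFGHI", p, terms):
    print(page, round(prob, 6), round(term, 6))
print("Amount", round(entropy, 3))
```

Each printed term is the -p log2 p contribution of one page, and their sum is the value of H for the whole site.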
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$$

(P is the estimated value.)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
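To see what repeated encouragement does, the operator can be applied in a loop. The starting value of 0.2, the choice d = 0.85 and the number of repetitions are arbitrary assumptions for this sketch.

```python
def z_plus(p, d=0.85):
    """Reinforcement operator: P_{n+1} = d * P_n + (1 - d), with 0 < d < 1."""
    return d * p + (1 - d)


p = 0.2            # assumed initial estimate
history = [p]
for _ in range(50):
    p = z_plus(p)
    history.append(p)

print([round(v, 4) for v in history[:5]], "...", round(history[-1], 4))
```

Each application moves P a fixed fraction (1 - d) of the remaining distance towards 1, so the gap to 1 shrinks by a factor of d per step. The form d·P + (1 - d) is the same shape as the PageRank update, with the inbound link sum taking the place of P.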
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work): httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
3. The internal navigational structure and linkage of your pages, in order to best distribute PageRank within your site.
Links to Your Site
When looking for links to your site from a purely PageRank point of view, one might think you should simply look for pages that have the highest Toolbar PageRank (whilst keeping in mind that every page of a site has its own PageRank, so you must consider the PageRank of the links page, or whatever page the actual link will exist on). However, this way of thinking is incorrect. If you've not just jumped to this section, then you'll probably have worked out why that is. The PageRank given by a link is far more complicated than this simplification. There may have been a time when that was an okay approximation… but no more. As more and more people try to get links from only high PageRank sites, it becomes less and less of a winning proposition.
The actual PageRank from an individual page is shared out amongst the links on that page (remember the PageRank calculations). So links from pages that have the same PageRank aren't always created equal. It depends on how many other links your link is sharing the links page with. For instance, a link from a page with a PageRank of 4 might be better than a link from a page with a PageRank of 6 if there are fewer total links on the PR 4 page. Right now there just isn't enough information available to allow us to know to what extent this stretches. However, it's significant enough to make it pointless to just choose high PageRanked sites as your main linking strategy. There's also another, more matter-of-fact reason why that type of linking strategy might not be the best: sites with high PageRanks are often fussy about which sites they will link out to, making them harder to get linked from than lower PageRanked sites. However, sites struggling with their own PageRank numbers should be more receptive to exchanging reciprocal links with other like-minded sites.
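The sharing effect can be illustrated with a tiny calculation. The sketch assumes that each link on a page receives d × PR / (number of links on the page), as in the PageRank calculations, and it treats the figures 4 and 6 as actual PageRank values, which is a simplification because the relationship between Toolbar PageRank and actual PageRank is not known here. The function name and link counts are hypothetical.

```python
def passed_per_link(page_rank, links_on_page, d=0.85):
    """PageRank handed to each linked page: d * PR / number of links on the page."""
    return d * page_rank / links_on_page


# Hypothetical pages: a PR 6 page crowded with links, a PR 4 page with few.
crowded = passed_per_link(page_rank=6.0, links_on_page=100)
sparse = passed_per_link(page_rank=4.0, links_on_page=10)

print("PR 6 page, 100 links:", round(crowded, 3))
print("PR 4 page, 10 links:", round(sparse, 3))
```

With these made-up numbers the lower-ranked page passes on several times more per link, which is the point of the paragraph above.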
Now let's factor in feedback. Let's say, for example, that there are two separate pages on other people's sites which both have a PageRank of 4. Both of these have ten links to other pages. But the page on your site that you want them to link to already has a link to the page on the second site. By getting a link from the second site, you're generating feedback and getting more PageRank than if you had gotten a link from the first site. That's an oversimplification; in fact, feedback loops can get even more complicated. Remember, the number of links on the page linking to you will alter the amount of feedback, etc.
Can you work it all out for a given page's situation? No, and neither can I. My advice, therefore, is this: get links from sites that seem appropriate and have good quality, regardless of their current PageRank. If they are relevant to your site and are high quality sites, they will either help your PageRank now or will do so in the future. To really get your PageRank humming, get yourself a listing in DMOZ and Yahoo, and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, which gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be by writing reviews of the sites we link out to on a separate page of our site, and by providing a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up its structure. (It's best if this is your home page, but any important page will do.) By doing this, you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them - it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values: [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:
                      Without the Review Pages   With the Review Pages
The Home Page PR      0.9536152797               2.439718935
B, C, D PR (each)     0.4201909959               0.8412536982
Total                 2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:

A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).

We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly – but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:

The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.

There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.

For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
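The closed-system behaviour can be checked with a few lines of Python. This is only a sketch: the three link maps are an assumed reading of the diagrams (a home page A linking to and from B, C and D; a single loop A→B→C→D→A; every page linking to every other), and `pagerank` is a hypothetical helper name, not something taken from the Excel examples.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate PR(m) = (1 - d) + d * sum(PR(k) / C(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = {page: 1 - d for page in links}
        for src, targets in links.items():
            for target in targets:
                new[target] += d * pr[src] / len(targets)
        pr = new
    return pr

# Assumed structures for a four-page closed site (no inbound or outbound links).
hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
looping = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
interlinked = {p: [q for q in "ABCD" if q != p] for p in "ABCD"}

for name, site in [("hierarchical", hierarchical),
                   ("looping", looping),
                   ("interlinked", interlinked)]:
    result = pagerank(site)
    # Print each page's value and the site total.
    print(name, {p: round(v, 4) for p, v in result.items()},
          "total", round(sum(result.values()), 4))
```

Under these assumptions the total stays at 4 for every structure, while only the hierarchical layout concentrates PageRank on page A.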
In general, we can state this as the following rule:

It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.

The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.

Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515

When a system is non-closed, as most are, then we can say that the following holds true:

The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.

Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
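The point about the outbound link being shared amongst more of your own pages can be illustrated with a small sketch. The layout is an assumption made purely for illustration, not the one used in the Excel examples: a home page links to each child page and to one external page, every child links back to the home page, the external page links nowhere, and there is no inbound link. `site_average` is a hypothetical helper.

```python
def site_average(children, d=0.85, iterations=500):
    """Average PageRank per site page for an assumed hierarchical site.

    The home page has `children` internal links plus one outbound link;
    each child page links only back to the home page.
    """
    home, child = 1.0, 1.0
    for _ in range(iterations):
        # Every child passes its whole vote to the home page.
        new_home = (1 - d) + d * children * child
        # The home page's vote is split among its children and the outbound link.
        new_child = (1 - d) + d * home / (children + 1)
        home, child = new_home, new_child
    return (home + children * child) / (children + 1)

for count in (3, 10, 50):
    print(count, "child pages -> average PR per page", round(site_average(count), 4))
```

With more child pages the outbound link receives a smaller share of the home page's vote, so under these assumptions the average PageRank retained per page rises.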
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.

The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.

To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.

The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site – whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.

It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the value from the previous iteration, using the formula:
$(1-d) \cdot PR_i(A)$

That is, we are replacing the $(1-d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.

With this very simple structure, we arrive at the final values like this:
Iter.  A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
    1  2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
    2  1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
    3  2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
    4  1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
    5  2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
    6  1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
    7  2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
    8  1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
    9  2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
   10  1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
   11  2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
   12  1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
   13  2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
   14  1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
   15  1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
   16  1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
   17  1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
   18  1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
   19  1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
   20  1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
   21  1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
   22  1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
   23  1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
   24  1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
   25  1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
   26  1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
   27  1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
   28  1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
   29  1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
   30  1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
   31  1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
   32  1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
   33  1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
   34  1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
   35  1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
   36  1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
   37  1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
   38  1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
   39  1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
   40  1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
   41  1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
   42  1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
   43  1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
   44  1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
   45  1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
   46  1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
   47  1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
   48  1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
   49  1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
   50  1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
   51  1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
   52  1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
   53  1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
   54  1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
   55  1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
   56  1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
   57  1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
   58  1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
   59  1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
   60  1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
   61  1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
   62  1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
   63  1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
   64  1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
   65  1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
   66  1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
   67  1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
   68  1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
   69  1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
   70  1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
   71  1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
   72  1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
   73  1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
   74  1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
   75  1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
   76  1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
   77  1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
   78  1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
   79  1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
   80  1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
   81  1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
   82  1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
   83  1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
   84  1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
   85  1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
   86  1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
   87  1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
   88  1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
   89  1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
   90  1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
   91  1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
   92  1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
   93  1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
   94  1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
   95  1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
   96  1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
   97  1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
   98  1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
   99  1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
  100  1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
  101  1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
  102  1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
  103  1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
  104  1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
  105  1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
  106  1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
  107  1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
  108  1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
  109  1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
  110  1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
  111  1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
  112  1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
  113  1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
  114  1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
  115  1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
  116  1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
  117  1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
  118  1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
  119  1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
  120  1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
  121  1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
  122  1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
  123  1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
  124  1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
  125  1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
  126  1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
  127  1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
  128  1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
  129  1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
  130  1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
  131  1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
  132  1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
  133  1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
  134  1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
  135  1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
  136  1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
  137  1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
  138  1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  139  1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
  140  1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  141  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  142  1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  143  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  144  1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  145  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  146  1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  147  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
  148  1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iter.  A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
    1  2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
    2  1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
    3  2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
    4  1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
    5  2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
    6  1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
    7  2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
    8  1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
    9  2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
   10  1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
   11  2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
   12  1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
   13  2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
   14  1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
   15  2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
   16  1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
   17  2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
   18  1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
   19  2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
   20  1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
   21  2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
   22  1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
   23  2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
   24  1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
   25  2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
   26  1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
   27  2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
   28  1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
   29  2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
   30  1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
   31  2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
   32  1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
   33  2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
   34  1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
   35  2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
   36  1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
   37  2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
   38  1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
   39  2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
   40  1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
   41  2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
   42  1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
   43  2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
   44  1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
   45  2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
   46  1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
   47  2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
   48  1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
   49  2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
   50  1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
   51  2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
   52  1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
   53  2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
   54  1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
   55  2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
   56  1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
   57  2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
   58  1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
   59  2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
   60  1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
   61  2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
   62  1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
   63  2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
   64  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
   65  2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
   66  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
   67  2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
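The comparison can be reproduced with a short sketch, assuming the hierarchical structure is a home page A that links to B, C and D, each of which links back to A. The function names, the stopping tolerance and the iteration cap are illustrative choices, so the iteration counts it reports need not match 148 and 67 exactly.

```python
HIERARCHICAL = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

def step(pr, links, d=0.85, accelerated=False):
    """One simultaneous update of every page's PageRank."""
    new = {}
    for page in links:
        # Fixed co-efficient (1 - d), or (1 - d) * PR_i(page) when accelerated.
        new[page] = (1 - d) * (pr[page] if accelerated else 1.0)
    for src, targets in links.items():
        for target in targets:
            new[target] += d * pr[src] / len(targets)
    return new

def iterate(links, accelerated, tol=1e-10, max_iter=1000):
    """Iterate from PR = 1 until no page changes by more than `tol`."""
    pr = {page: 1.0 for page in links}
    for count in range(1, max_iter + 1):
        new = step(pr, links, accelerated=accelerated)
        if max(abs(new[p] - pr[p]) for p in links) < tol:
            return new, count
        pr = new
    return pr, max_iter

standard, n_standard = iterate(HIERARCHICAL, accelerated=False)
fast, n_fast = iterate(HIERARCHICAL, accelerated=True)
print("fixed co-efficient:", n_standard, "iterations", standard)
print("recalculated co-efficient:", n_fast, "iterations", fast)
```

In this sketch the fixed co-efficient settles near A = 1.9189 and B = C = D = 0.6937, while the recalculated co-efficient settles near A = 2 and B = C = D = 0.6667 in fewer iterations, in line with the two tables.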
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

$(0.4n + 0.6) \times 0.15$

as long as we set the other pages to

$0.6 \times 0.15$

i.e. $0.4n + 0.6 + 0.6(n-1) = n$, where $n$ is the number of pages in the site.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages – simply because we've introduced a new arbitrary factor called "amount of text".

PageRank not only considers links, but it may also consider certain other factors.
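A minimal sketch of per-page co-efficients follows. It assumes the "previous example" is the four-page hierarchical structure (A linking to and from B, C and D) and uses the values (0.4n + 0.6) × 0.15 for Page B and 0.6 × 0.15 for the others; `pagerank_weighted` is a hypothetical helper, and the figures it prints are not claimed to be those of Excel Example 19.

```python
def pagerank_weighted(links, coeff, d=0.85, iterations=200):
    """PageRank where each page has its own normalization co-efficient."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = dict(coeff)  # start each page from its own co-efficient
        for src, targets in links.items():
            for target in targets:
                new[target] += d * pr[src] / len(targets)
        pr = new
    return pr

# Assumed structure: home page A, with B ("About Us"), C and D beneath it.
hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(hierarchical)

uniform = {page: 0.15 for page in hierarchical}
favour_b = {page: 0.6 * 0.15 for page in hierarchical}
favour_b["B"] = (0.4 * n + 0.6) * 0.15  # Page B meets our arbitrary criterion

print("uniform :", pagerank_weighted(hierarchical, uniform))
print("favour B:", pagerank_weighted(hierarchical, favour_b))
```

Because the co-efficients still add up to n × 0.15, the site total is unchanged in this sketch; only its distribution shifts towards Page B.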
There is, of course, no reason why we should stop at boosting a single page's PageRank – it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.

We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.

This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
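As a rough check of the direction of the effect, the limiting values can be worked out by hand. The sketch below assumes the four-page hierarchical structure (A linking to and from B, C and D) with d = 0.85; it is not claimed to reproduce the figures of Excel Example 20.

```latex
\begin{align*}
PR(A) &= 0.18 + 0.85\,\bigl(PR(B) + PR(C) + PR(D)\bigr) \\
PR(B) &= 0.06 + 0.85\,\tfrac{PR(A)}{3}, \qquad
PR(C) = PR(D) = 0.18 + 0.85\,\tfrac{PR(A)}{3} \\
PR(A) &= 0.18 + 0.85\,\bigl(0.42 + 0.85\,PR(A)\bigr) = 0.537 + 0.7225\,PR(A) \\
PR(A) &= \tfrac{0.537}{0.2775} \approx 1.9351, \qquad
PR(B) \approx 0.6083, \qquad PR(C) = PR(D) \approx 0.7283
\end{align*}
```

Compared with the uniform co-efficient of 0.15 (A ≈ 1.9189, B = C = D ≈ 0.6937), Page B drops while the pages around it rise, and the total remains 4.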
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $D = (a_{km})$, $k = 1, \dots, n$; $m = 1, \dots, n$.

What does that mean? Let us explain the designations:

$T_1, T_2, \dots, T_n$ correspond to Web site pages ($n$ is the number of pages).

The components $a_{km}$, which sit at the intersection of the $k$-th row and the $m$-th column of the matrix of links, give the number of links from $T_k$ to $T_m$.

$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)

        T1      T2      …       Tn      Number of links off the page
T1      a11     a12     …       a1n     Cr(T1)
T2      a21     a22     …       a2n     Cr(T2)
…       …       …       …       …       …
Tn      an1     an2     …       ann     Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

where:
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the $(I+1)$-th iteration (the estimating value of the page $T_m$, $m = 1, \dots, n$).

$PR_I(T_k)$ is the PageRank of the page $T_k$ on the $I$-th iteration (the estimating value of the page $T_k$, $k = 1, \dots, n$).

$Cr(T_k)$ is the total number of links off the page $T_k$.

$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$).

$d$ is a dampening factor (usually 0.85).

$(1-d)$ is a normalization coefficient.

$n$ is the number of pages on the site.
This recursive computational scheme allows us to compute the estimating values of a Web site's pages while taking into account the possible complexity of the links between pages.

Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.

Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I    Number of links off the page
A                 0  1  0  0  0  0  0  0  0    Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0    Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0    Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0    Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0    Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0    Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1    Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0    Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0    Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.

Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.

Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on the $i$-th iteration. The normalization coefficient for the $PR_{i+1}(T)$ computation (on the $(i+1)$-th iteration) can be calculated by the formula $(1-d) \cdot PR_i(T)$.
When the accelerated technology of computation is applied, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.

This approach permits us to distinguish the pages that are external in relation to the base pages. Thus we can introduce the following definition:

The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
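The definition can be tried out on the matrix of links (2) above. The following is a sketch only: the list-of-lists layout and the names `LINKS`, `accelerated_step` and `run` are illustrative assumptions, with d = 0.85 and every page starting at 1.

```python
PAGES = "ABCDEFGHI"

# LINKS[k][m] = number of links from page k to page m (matrix of links (2)).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def accelerated_step(pr, a, d=0.85):
    """PR_{i+1}(T_m) = (1-d)*PR_i(T_m) + d * sum_k a[k][m]*PR_i(T_k)/Cr(T_k)."""
    n = len(pr)
    cr = [sum(row) for row in a]  # Cr(T_k): total links off page k
    return [
        (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

def run(iterations):
    """Apply the accelerated step `iterations` times, starting from PR = 1."""
    pr = [1.0] * len(PAGES)
    for _ in range(iterations):
        pr = accelerated_step(pr, LINKS)
    return dict(zip(PAGES, pr))

for page, value in run(31).items():
    print(page, round(value, 6))
```

Under these assumptions pages E to I tend to zero, which marks them as external, while the base pages A, C and D approach 1.5 and B approaches 4.5.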
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
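The accelerated computation can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names PAGES, LINKS, accelerated_step and accelerated_pagerank are hypothetical, d is taken as 0.85, and each row of the link matrix is read as the set of pages that the row's page links to, so that Cr is the number of outbound links.

```python
# Out-link matrix of the nine-page example site.
# LINKS[i][j] == 1 is read as "page i links to page j" (our assumption).
PAGES = "ABCDEFGHI"
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(pr, d=0.85):
    """One iteration with the coefficient (1 - d) * PR_i(T) in place of (1 - d)."""
    n = len(pr)
    out_count = [sum(row) for row in LINKS]  # Cr(T) for every page
    new = []
    for j in range(n):
        # Share of PageRank arriving from every page that links to page j.
        inbound = sum(pr[i] / out_count[i] for i in range(n) if LINKS[i][j])
        new.append((1 - d) * pr[j] + d * inbound)
    return new


def accelerated_pagerank(iterations, d=0.85):
    """Start every page at 1 and apply the accelerated step repeatedly."""
    pr = [1.0] * len(PAGES)
    for _ in range(iterations):
        pr = accelerated_step(pr, d)
    return dict(zip(PAGES, pr))


if __name__ == "__main__":
    for page, value in accelerated_pagerank(39).items():
        print(f"{page}: {value:.6f}")
```

Under these assumptions the total stays at 9 on every iteration, while the values of E, F, G, H and I shrink towards zero, which is what marks them as external pages.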
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000

The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which does not change as the iteration process runs, we obtain a set of probability values (the sum of all probability values is equal to one).

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
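The classic computation and the division by the total can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: the names PAGES, LINKS, classic_pagerank and to_probabilities are hypothetical, and we again assume that each matrix row lists the pages that the row's page links to.

```python
# Out-link matrix of the nine-page example site (row page links to column page).
PAGES = "ABCDEFGHI"
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_pagerank(iterations=200, d=0.85):
    """Classic iteration with the constant normalization coefficient (1 - d)."""
    n = len(PAGES)
    out_count = [sum(row) for row in LINKS]
    pr = [1.0] * n
    for _ in range(iterations):
        # All pages are updated together from the previous iteration's values.
        pr = [
            (1 - d) + d * sum(pr[i] / out_count[i] for i in range(n) if LINKS[i][j])
            for j in range(n)
        ]
    return pr


def to_probabilities(values):
    """Divide every estimating value by the total so that the results sum to one."""
    total = sum(values)
    return [v / total for v in values]


if __name__ == "__main__":
    limits = classic_pagerank()
    for page, p in zip(PAGES, to_probabilities(limits)):
        print(f"{page}: {p:.6f}")
```

Because the total stays at 9 throughout, dividing by it is the same as dividing each limiting value by the number of pages.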
Running a parallel demonstration, we would like to use the famous classical Shannon formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:

Experiment outcome   A1  A2  ...  An
Probabilities        p1  p2  ...  pn

H(p1, p2, ..., pn) = - p(A1) log p(A1) - p(A2) log p(A2) - ... - p(An) log p(An)

Let's calculate the terms of this formula for the set of probability values that correspond to the site pages (logarithms are taken to base 2). We obtain the following set of PageRank values for the pages of the site examined:

                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
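The PR row can be reproduced from the P row with a few lines of Python. This is a sketch only; the names P and shannon_terms are ours, and the probabilities are copied from the table as rounded to six places.

```python
import math

# Probability values for pages A..I, as listed in the P row.
P = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]


def shannon_terms(probabilities):
    """Return the individual terms -p * log2(p) of Shannon's formula."""
    return [-p * math.log2(p) if p > 0 else 0.0 for p in probabilities]


if __name__ == "__main__":
    terms = shannon_terms(P)
    for page, term in zip("ABCDEFGHI", terms):
        print(f"{page}: {term:.6f}")
    # The sum of all terms is H(p1, ..., pn), the "Amount" column.
    print(f"H = {sum(terms):.3f}")
```

Each term is the contribution of one page to H, and their sum is the value shown in the Amount column.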
PageRank is a successful representation of a directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_{n+1} = Z+(P_n) = d · P_n + (1 - d), 0 < d < 1 (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
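To see what repeated encouragement does, the operator Z+ can be applied in a loop. The sketch below is our own illustration: the function names are hypothetical, and d = 0.85 is borrowed from the PageRank examples (the formula itself only requires 0 < d < 1).

```python
def reward(p, d=0.85):
    """One application of Z+: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


def apply_repeatedly(p0, steps, d=0.85):
    """Apply Z+ to the starting value p0 the given number of times."""
    p = p0
    for _ in range(steps):
        p = reward(p, d)
    return p


if __name__ == "__main__":
    for n in (0, 1, 5, 20, 100):
        print(n, round(apply_repeatedly(0.2, n), 6))
```

Each application closes a fixed fraction (1 - d) of the remaining gap to 1, so P_n = 1 - d^n (1 - P_0) and the value approaches 1 as the reward is repeated.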
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing Order to the Web
httphcistanfordedu~pagepaperspagerankppframehtm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
httpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Google's), by Mark Horrell
httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
in DMOZ and Yahoo and enjoy the artificially enhanced PageRank that they provide.
Links out of your site
When considering links out of your site, there is one golden rule:
Generally, you will want to keep PageRank within your own site.
This does not mean that you will lose PageRank from your site by linking out, but that the total PageRank within it may be lower than it could have been had you not linked out. From this we can derive another rule, one that gives away as little PageRank as possible (meaning that we are giving our own pages as much PageRank as possible):
The best outbound link PageRank scenario occurs when the outbound link comes from a page that has both
a) A low PageRank
and
b) A lot of links to pages on your site
How can this best be achieved? One way would be to write reviews of the sites we link out to on a separate page of our site, and to provide a link to those reviews along with each hyperlink to the external site. Optionally, it would be okay if these pages open in another window, but DO NOT open these pages using JavaScript, because the spiders cannot yet follow this.
For example, we can do something like this with each link to an external site:
<a href="http://www.supportforums.org">The best search engine resource and forum site in the world</a> <a href="sfreview.htm" target="_new">Read my flattering review of them here</a>
Make sure that the review page also links back to a page in your own site that is high up in its structure. (It's best if this is your home page, but any important page will do.) By doing this you've significantly reduced the amount of PageRank you've let out of your site. We've targeted the distribution of PageRank to the home page to ensure that less is passed back through your links page (which would be a wasted opportunity) and more is put elsewhere in your site. Your links page also needs to link to your home page and the other major pages of your site. However, place no other links on the review page (besides the home page link).
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them; it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values: [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                     Without the Review Pages   With the Review Pages
Home page PR         0.9536152797               2.439718935
B, C, D PR (each)    0.4201909959               0.8412536982
Total                2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
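The fixed total and the shifting distribution can be checked with a short sketch. The three link structures below are our own assumptions about four-page versions of the diagrams (home page A linking to B, C and D, which each link back; a single loop; every page linking to every other page), and the function name pagerank is hypothetical.

```python
def pagerank(links, d=0.85, iterations=200):
    """Classic iterative PageRank; links maps each page to the pages it links to."""
    pages = list(links)
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Every page is recomputed from the previous iteration's values.
        pr = {
            p: (1 - d) + d * sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            for p in pages
        }
    return pr


# Assumed four-page closed systems (no inbound or outbound links).
HIERARCHICAL = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
LOOPING = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
INTERLINKED = {p: [q for q in "ABCD" if q != p] for p in "ABCD"}

if __name__ == "__main__":
    for name, site in [("Hierarchical", HIERARCHICAL),
                       ("Looping", LOOPING),
                       ("Extensive interlinking", INTERLINKED)]:
        result = pagerank(site)
        values = ", ".join(f"{p}={v:.4f}" for p, v in result.items())
        print(f"{name}: {values} (total {sum(result.values()):.4f})")
```

Under these assumptions all three totals come out at 4, but only the hierarchical structure concentrates PageRank on one page (the home page A).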
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank highly for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the final value of the previous computation, using the formula:
(1 - d) · PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
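The difference between the two coefficients is easiest to see as a single iteration step. The sketch below is illustrative only: LINKS is our assumed four-page hierarchical structure (home page A linking to B, C and D, each of which links back to A), and the function name step and its accelerated flag are hypothetical.

```python
# Assumed hierarchical structure: A is the home page.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(pr, d=0.85, accelerated=False):
    """Compute PR_{i+1} for every page from the values PR_i in pr."""
    new = {}
    for page in LINKS:
        # PageRank passed on by every page that links to this one.
        inbound = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        # Classic: fixed (1 - d). Accelerated: (1 - d) * PR_i(page).
        coefficient = (1 - d) * pr[page] if accelerated else (1 - d)
        new[page] = coefficient + d * inbound
    return new


if __name__ == "__main__":
    start = {page: 1.0 for page in LINKS}
    classic = step(step(start))
    fast = step(step(start, accelerated=True), accelerated=True)
    print("classic, 2 iterations:    ", {p: round(v, 6) for p, v in classic.items()})
    print("accelerated, 2 iterations:", {p: round(v, 6) for p, v in fast.items()})
```

Starting from 1 for every page, the first step is identical for both variants (A = 2.7, the others 0.4333...), because (1 - d) · 1 = (1 - d); the two only diverge from the second iteration onwards.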
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
40
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
This gives a very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
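To compare the two schemes side by side, here is a minimal Python sketch. It assumes the four-page site is a hub in which Page A links to B, C and D and each of those links back to A; that structure is inferred from the first rows of the tables above and is not stated in this section. The function name `iterate`, the tolerance and the stopping rule are illustrative choices, so the iteration counts it reports need not match the spreadsheet's 148 and 67 exactly.

```python
# Assumed link matrix: HUB[k][m] = number of links from page k to page m.
# Order of pages: A, B, C, D (A links to B, C, D; each links back to A).
HUB = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]


def iterate(links, accelerated, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate PageRank until no value changes by more than `tol`.

    accelerated=False uses the constant normalization co-efficient (1 - d).
    accelerated=True recalculates it each iteration as (1 - d) * PR_i(T).
    Returns (values, number_of_iterations).
    """
    n = len(links)
    out = [sum(row) for row in links]  # links off each page
    pr = [1.0] * n
    for i in range(1, max_iter + 1):
        new = []
        for m in range(n):
            base = (1 - d) * pr[m] if accelerated else (1 - d)
            inflow = sum(links[k][m] * pr[k] / out[k] for k in range(n))
            new.append(base + d * inflow)
        change = max(abs(a - b) for a, b in zip(new, pr))
        pr = new
        if change < tol:
            return pr, i
    return pr, max_iter


classic_pr, classic_iters = iterate(HUB, accelerated=False)
accel_pr, accel_iters = iterate(HUB, accelerated=True)
print("classic    ", classic_iters, [round(v, 10) for v in classic_pr])
print("accelerated", accel_iters, [round(v, 10) for v in accel_pr])
```

Under these assumptions the constant co-efficient settles near A = 1.9189, B = 0.6937, while the recalculated co-efficient settles near A = 2, B = 0.6667 in noticeably fewer iterations, which is the pattern the two tables show.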
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

(0.4n + 0.6) × 0.15

as long as we set the other pages to

0.6 × 0.15

i.e. [(0.4n + 0.6) + 0.6(n − 1) = n, where n is the number of pages in the site], so the co-efficients still total 0.15n.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
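The effect of uneven co-efficients can be reproduced with a short sketch. The site structure below is an assumption made for illustration (a home page A that links to B, C and D, each of which links back to A), and the function name `pagerank_with_coefficients` is hypothetical; only the co-efficient formulas (0.4n + 0.6) × 0.15 and 0.6 × 0.15 come from the text.

```python
# Assumed structure: SITE[k][m] = number of links from page k to page m.
# Order of pages: A (home), B (About Us), C, D.
SITE = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]


def pagerank_with_coefficients(links, coeffs, d=0.85, iterations=300):
    """Classic iteration, but with one normalization co-efficient per page."""
    n = len(links)
    out = [sum(row) for row in links]
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            coeffs[m] + d * sum(links[k][m] * pr[k] / out[k] for k in range(n))
            for m in range(n)
        ]
    return pr


n = len(SITE)
uniform = [0.15] * n                    # the usual (1 - d) for every page
favoured = [0.6 * 0.15] * n             # lowered co-efficient for most pages
favoured[1] = (0.4 * n + 0.6) * 0.15    # raised co-efficient for Page B

plain = pagerank_with_coefficients(SITE, uniform)
boosted = pagerank_with_coefficients(SITE, favoured)
for name, before, after in zip("ABCD", plain, boosted):
    print(f"{name}: {before:.6f} -> {after:.6f}")
```

Because both sets of co-efficients total 0.15n, the total PageRank of the site is the same in both runs; only its distribution between the pages changes.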
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.
Let's write down the elements of the matrix of links D: a_km (k = 1, …, n), (m = 1, …, n).
What does that mean? Let us explain the designations:
T1, T2, …, Tn correspond to the Web site pages (n is the number of pages).
The components a_km, which lie at the intersection of row k and column m of the matrix of links, give the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1     T2     …   Tn     Number of links off the page
T1    a_11   a_12   …   a_1n   Cr(T1)
T2    a_21   a_22   …   a_2n   Cr(T2)
…     …      …      …   …      …
Tn    a_n1   a_n2   …   a_nn   Cr(Tn)
Let's write down the equation for the computation scheme:

PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km} \, PR_I(T_k)}{Cr(T_k)}

where
PR_{I+1}(T_m) is the PageRank of the page Tm on iteration I + 1 (the estimating value of the page Tm, m = 1, …, n).
PR_I(T_k) is the PageRank of the page Tk on iteration I (the estimating value of the page Tk, k = 1, …, n).
Cr(Tk) is the total number of links off the page Tk.
Cm(Tk) = a_km is the number of individual links off the page Tk to the page Tm (k = 1, …, n), (m = 1, …, n).
d is a dampening factor (usually 0.85).
(1 − d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme lets us compute the estimating values of the pages of a Web site while taking into account the possible complexity of the links between its pages.
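One iteration of this scheme can be sketched in a few lines of Python. The three-page link matrix below is a made-up illustration (it is not one of the paper's examples) chosen so that one page carries two links to the same target, which is what makes the site a multigraph; the function name `pagerank_step` is likewise hypothetical.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k).

    a[k][m] is the number of links off page T_k to page T_m;
    Cr(T_k) is the row sum, i.e. the total number of links off T_k.
    """
    n = len(a)
    cr = [sum(row) for row in a]
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


# Hypothetical site: T1 links twice to T2 and once to T3,
# T2 links to T1, and T3 links to T1 and T2.
A = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 1, 0],
]

pr0 = [1.0, 1.0, 1.0]
pr1 = pagerank_step(A, pr0)
print([round(v, 6) for v in pr1])
```

Since every page in this small example has at least one outgoing link, the values still add up to n after the step, just as the Sum and Amount columns in the tables stay constant.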
Take a look at the following example. In general terms, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
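[Figure: multigraph of the site, drawn as two subgraphs. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]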
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of PR_{i+1}(T) (on iteration i + 1) can be calculated by the formula (1 − d) × PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
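The same classification can be reproduced with a short Python sketch that applies the accelerated scheme to the matrix of links (2). The function names, the fixed number of iterations and the 1e-9 threshold used to decide that a value has "reached zero" are illustrative assumptions, not part of the original method description.

```python
D = 0.85
PAGES = "ABCDEFGHI"

# Matrix of links (2): LINKS[k][m] = number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(pr, links, d=D):
    """One iteration with the normalization coefficient (1 - d) * PR_i(T)."""
    n = len(pr)
    cr = [sum(row) for row in links]
    return [
        (1 - d) * pr[m] + d * sum(links[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


def accelerated_limits(links, iterations=200):
    """Start every page at 1 and iterate a fixed (assumed) number of times."""
    pr = [1.0] * len(links)
    for _ in range(iterations):
        pr = accelerated_step(pr, links)
    return pr


first = accelerated_step([1.0] * 9, LINKS)
limits = accelerated_limits(LINKS)
external = [page for page, value in zip(PAGES, limits) if value < 1e-9]
base = [page for page, value in zip(PAGES, limits) if value >= 1e-9]
print("base pages:    ", base)
print("external pages:", external)
```

The first iteration reproduces row 1 of the table, and the limiting values come out as 1.5 for A, C and D, 4.5 for B and zero for E to I.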
Let's calculate the estimating values for the pages of the Web site examined, with the constant normalization coefficient (1 − d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened (the final row of the table).
Having divided the estimating values by their total amount, which does not change during the iteration process, we receive the set of probability values (the sum of all probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the set probabilities:

Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn

H(p_1, p_2, \ldots, p_n) = -p(A_1) \log p(A_1) - p(A_2) \log p(A_2) - \ldots - p(A_n) \log p(A_n)

Let's calculate the individual terms of this formula for the set of probability values which correspond to the site pages. We receive the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):

                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
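The probability and PR rows can be checked with a short Python sketch. It reruns the classic scheme on the matrix of links (2), divides by the total and evaluates each term −p·log2(p) of Shannon's formula; the helper name `classic_pagerank` and the fixed number of iterations are assumptions made for illustration.

```python
import math

PAGES = "ABCDEFGHI"

# Matrix of links (2): LINKS[k][m] = number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_pagerank(links, d=0.85, iterations=400):
    """Classic scheme with the constant normalization coefficient (1 - d)."""
    n = len(links)
    cr = [sum(row) for row in links]
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            (1 - d) + d * sum(links[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return pr


values = classic_pagerank(LINKS)
total = sum(values)                              # stays at 9 for this site
probs = [v / total for v in values]              # the "P" row
terms = [-p * math.log2(p) for p in probs]       # the "PR" row
entropy = sum(terms)                             # the "Amount" of the PR row

for page, p, t in zip(PAGES, probs, terms):
    print(f"{page}  P = {p:.6f}   -P*log2(P) = {t:.6f}")
print(f"Amount = {entropy:.3f}")
```

Read this way, the "Amount" of the PR row is simply the Shannon entropy, in bits, of the probability distribution over the nine pages.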
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is applied, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:

P_{n+1} = Z^{+}(P_n) = d P_n + (1 - d), \quad 0 < d < 1 \quad (P is the estimating value)

The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
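As a brief worked step (a sketch that follows only from the operator as written above), subtracting both sides from 1 shows how repeated encouragement behaves:

```latex
\begin{aligned}
1 - P_{n+1} &= 1 - d\,P_n - (1 - d) = d\,(1 - P_n) \\
\Rightarrow\quad P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1
\quad \text{as } n \to \infty, \text{ since } 0 < d < 1 .
\end{aligned}
```

So each application of Z+ closes a fixed fraction (1 − d) of the remaining gap to 1, and the form d·P + (1 − d) mirrors the dampening factor and normalization coefficient in the PageRank equation.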
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
It's very good if someone links to your review page, so in addition you may let the site you've linked to know that you have reviewed them – it is quite possible you will get two links from their site (one to your site and one to the review of their site). This is all very complicated in text form, so here's a simplified example to show the principle and its effect.
Doing all the calculations for PageRanks, we converge at the following values: [Excel Example 9]
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                     Without the Review Pages   With the Review Pages
HomePage PR          0.9536152797               2.439718935
B, C, D PR (each)    0.4201909959               0.8412536982
Total                2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly – but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4.
Looping [Excel Example 12]
Total PageRank in this site is 4.
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4.
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
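The closed-system behaviour described above can be checked with a short script. This is only a sketch: the original diagrams are not reproduced in this text, so the three four-page layouts below are assumptions (a home page A linked to and from B, C and D; a simple A to B to C to D to A loop; and every page linking to every other page), and the function name `pagerank` is ours.

```python
def pagerank(links, d=0.85, tol=1e-12, max_iter=10_000):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over every page q that links to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1.0 - d for page in links}
        for source, targets in links.items():
            share = d * pr[source] / len(targets)  # each outgoing link carries an equal share
            for target in targets:
                new[target] += share
        converged = max(abs(new[p] - pr[p]) for p in links) < tol
        pr = new
        if converged:
            break
    return pr


# Assumed four-page layouts (hypothetical stand-ins for the original diagrams).
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    rounded = {page: round(value, 4) for page, value in pr.items()}
    print(name, rounded, "total:", round(sum(pr.values()), 4))
```

Under these assumed layouts the total is 4 in every case, but only the hierarchical layout concentrates PageRank on page A (about 1.92); the looping and fully interlinked layouts leave every page at 1.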
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank is spread out evenly across all pages of the site, then the Webmaster is unlikely to be obtaining the maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move it away from pages that they do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site – whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises – "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1 - d) \cdot PR_i(A)$
That is, we are replacing the $(1 - d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3      2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4      1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5      2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6      1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7      2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8      1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9      2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10     1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11     2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12     1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13     2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14     1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15     1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16     1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17     1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18     1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19     1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20     1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21     1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22     1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23     1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24     1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25     1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26     1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27     1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28     1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29     1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30     1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31     1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32     1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33     1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34     1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35     1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36     1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37     1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38     1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39     1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40     1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41     1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42     1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43     1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44     1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45     1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46     1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47     1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48     1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49     1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50     1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51     1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52     1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53     1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54     1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55     1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56     1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57     1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58     1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59     1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60     1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61     1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62     1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63     1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64     1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65     1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66     1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67     1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68     1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69     1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70     1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71     1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72     1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73     1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74     1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75     1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76     1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77     1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78     1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79     1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80     1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81     1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82     1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83     1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84     1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85     1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86     1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87     1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88     1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89     1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90     1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91     1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92     1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93     1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94     1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95     1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96     1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97     1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98     1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99     1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100    1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101    1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102    1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103    1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104    1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105    1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106    1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107    1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108    1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109    1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110    1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111    1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112    1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113    1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114    1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115    1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116    1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117    1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118    1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119    1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120    1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121    1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122    1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123    1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124    1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125    1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126    1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127    1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128    1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129    1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130    1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131    1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132    1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133    1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134    1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135    1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136    1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139    1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142    1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147    1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148    1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
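The two schemes can be compared side by side in a few lines. This is a sketch under stated assumptions: the hierarchical structure is taken to be a home page A linking to B, C and D, each of which links back to A (which matches the first rows of both tables), and the stopping rule (largest change below `tol`) is our own choice, so the iteration counts will not necessarily be exactly 148 and 67.

```python
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def iterate(links, accelerated, d=0.85, tol=1e-10, max_iter=1000):
    """Return (final values, iterations used) for the standard or accelerated scheme."""
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iter + 1):
        # Standard scheme: fixed (1 - d). Accelerated scheme: (1 - d) * PR_i(page).
        new = {p: (1.0 - d) * (pr[p] if accelerated else 1.0) for p in links}
        for source, targets in links.items():
            share = d * pr[source] / len(targets)
            for target in targets:
                new[target] += share
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:
            return pr, iteration
    return pr, max_iter


standard, standard_steps = iterate(LINKS, accelerated=False)
fast, fast_steps = iterate(LINKS, accelerated=True)
print("standard:   ", {p: round(v, 6) for p, v in standard.items()}, standard_steps)
print("accelerated:", {p: round(v, 6) for p, v in fast.items()}, fast_steps)
```

As in the two tables, the accelerated variant needs noticeably fewer iterations, and it settles on slightly different values (A near 2.0 rather than 1.92), which is why it is described as an approximation.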
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because page B meets some criteria we define (e.g. it has the correct amount of text). So we could set page B to
$(0.4n + 0.6) \cdot 0.15$
as long as we set the other pages to
$0.6 \cdot 0.15$
i.e. the multipliers must still add up to n, where n is the number of pages in the site: $(0.4n + 0.6) + 0.6(n - 1) = n$ (other splits work the same way, for example $0.1n + 0.9 + 0.9(n - 1) = n$).
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages – simply because we've introduced a new arbitrary factor called "amount of text".
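The effect of uneven normalization co-efficients can be sketched as follows. The site layout is an assumption (the four-page hierarchical structure with A as the home page and B as the "About Us" page), and the function name `pagerank_with_coefficients` is hypothetical; only the co-efficient values come from the text above.

```python
def pagerank_with_coefficients(links, coefficients, d=0.85, iterations=500):
    """PR(p) = coefficient(p) + d * sum(PR(q) / C(q)) over every page q that links to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = dict(coefficients)  # each page starts from its own co-efficient
        for source, targets in links.items():
            share = d * pr[source] / len(targets)
            for target in targets:
                new[target] += share
        pr = new
    return pr


# Assumed layout: home page A links to B, C, D; each of them links back to A.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(links)
d = 0.85

uniform = {page: 1 - d for page in links}           # the usual (1 - d) for every page
favoured = {page: 0.6 * (1 - d) for page in links}  # 0.6 * 0.15 for the other pages
favoured["B"] = (0.4 * n + 0.6) * (1 - d)           # (0.4n + 0.6) * 0.15 for Page B

plain = pagerank_with_coefficients(links, uniform)
boosted = pagerank_with_coefficients(links, favoured)
print("uniform: ", {p: round(v, 4) for p, v in plain.items()})
print("favoured:", {p: round(v, 4) for p, v in boosted.items()})
```

Because the co-efficients still add up to n(1 - d), the site total stays at 4; under these assumptions Page B rises from about 0.69 to about 0.86 while A, C and D each give up a little.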
PageRank not only considers links; it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank – it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links, and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
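As a rough check of the direction of the effect, the co-efficients 0.06 and 0.18 can be worked through by hand. The derivation below assumes the four-page hierarchical structure (home page A linking to B, C and D, each linking back to A) and d = 0.85; the original diagram is not available here, so the figures are illustrative rather than a reproduction of Excel Example 20.

```latex
\begin{align*}
PR(A) &= 0.18 + 0.85\,\bigl(PR(B) + PR(C) + PR(D)\bigr) \\
PR(B) &= 0.06 + 0.85\,\tfrac{PR(A)}{3} \\
PR(C) = PR(D) &= 0.18 + 0.85\,\tfrac{PR(A)}{3} \\
\Rightarrow\; PR(A) &= 0.18 + 0.85 \cdot 0.42 + 0.85^{2}\,PR(A) = 0.537 + 0.7225\,PR(A) \\
PR(A) &= \frac{0.537}{0.2775} \approx 1.9351, \qquad
PR(B) \approx 0.6083, \qquad
PR(C) = PR(D) \approx 0.7283
\end{align*}
```

With the uniform co-efficient of 0.15, the same structure gives about 1.9189 for A and 0.6937 for B, C and D, so under these assumptions Page B drops while the page it links to (A) and the remaining pages gain, and the total stays at 4.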
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from httpwwwsupportforumsorgpagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$, $(k = 1, \dots, n)$, $(m = 1, \dots, n)$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages (n is the number of pages).
The components $a_{km}$, which are at the intersection of the k-th row and the m-th column of the matrix of links, determine the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)

       T1     T2     ...    Tn     Number of links off the page
T1     a11    a12    ...    a1n    Cr(T1)
T2     a21    a22    ...    a2n    Cr(T2)
...
Tn     an1    an2    ...    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{i+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km} \, PR_i(T_k)}{Cr(T_k)}$$

where:
$PR_{i+1}(T_m)$ is the Page Rank of the page $T_m$ on the (i+1)-th iteration (the estimating value of the page $T_m$, $m = 1, \dots, n$);
$PR_i(T_k)$ is the Page Rank of the page $T_k$ on the i-th iteration (the estimating value of the page $T_k$, $k = 1, \dots, n$);
$Cr(T_k)$ is the total number of links off the page $T_k$;
$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ $(k = 1, \dots, n)$, $(m = 1, \dots, n)$;
d is a dampening factor (usually 0.85);
$(1 - d)$ is a normalization coefficient;
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of a Web site's pages while taking into account the probable complexity of the links between them.
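The recursive scheme can be written almost literally as code. The following is a minimal sketch, not the authors' implementation: the function name `pagerank_from_matrix`, the fixed iteration count and the small four-page demonstration matrix (the hierarchical structure with A as the home page) are our assumptions.

```python
def pagerank_from_matrix(a, d=0.85, iterations=200):
    """Apply PR_{i+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_i(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    pr = [1.0] * n                # starting estimate for every page
    for _ in range(iterations):
        # The comprehension reads the previous estimates before pr is rebound.
        pr = [
            (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
            for m in range(n)
        ]
    return pr


# Rows are the linking pages, columns the pages linked to (order A, B, C, D).
demo_matrix = [
    [0, 1, 1, 1],  # A links to B, C and D
    [1, 0, 0, 0],  # B links back to A
    [1, 0, 0, 0],  # C links back to A
    [1, 0, 0, 0],  # D links back to A
]

demo_values = pagerank_from_matrix(demo_matrix)
print([round(value, 6) for value in demo_values])
```

Entries greater than 1 in the matrix (several links from one page to the same target) are handled naturally, which is the reason for treating the site as a multigraph; pages with no outgoing links are simply skipped in the sum here, which is one possible convention among several.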
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of pages A to I, drawn as two subgraphs labelled 1 and 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the $PR_{i+1}(T)$ computation (on the (i+1)-th iteration) can be calculated by the formula $(1 - d) \cdot PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of externallinks
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
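The accelerated computation can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: it assumes every page is updated from the previous iteration's values, with each page's share of a linking page's value taken as that value divided by the number of links off the linking page, and it replaces the constant (1 - d) with (1 - d) times the page's previous value. The function names and the zero threshold `tol` are hypothetical.

```python
PAGES = "ABCDEFGHI"

# LINKS[k][m] == 1 when page k links to page m (rows copied from the matrix of links).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(pr, links, d=0.85):
    """One accelerated iteration: new(T) = (1 - d) * old(T) + d * (value received via links)."""
    size = len(links)
    out = [sum(row) for row in links]  # Cr for every page
    new = []
    for m in range(size):
        received = sum(pr[k] * links[k][m] / out[k] for k in range(size) if out[k])
        new.append((1 - d) * pr[m] + d * received)
    return new


def accelerated_pagerank(links, iterations=39, d=0.85):
    """Start every page at 1 and apply the accelerated step `iterations` times."""
    pr = [1.0] * len(links)
    for _ in range(iterations):
        pr = accelerated_step(pr, links, d)
    return pr


def external_pages(links, pages=PAGES, tol=1e-6):
    """Pages whose accelerated limiting value is (numerically) zero; tol is an assumed cut-off."""
    limits = accelerated_pagerank(links, iterations=200)
    return [page for page, value in zip(pages, limits) if value < tol]


if __name__ == "__main__":
    for page, value in zip(PAGES, accelerated_pagerank(LINKS)):
        print(f"{page}: {value:.6f}")
    print("External pages:", external_pages(LINKS))
```

Under these assumptions the first two iterations agree with the table above, the total stays at 9, and the pages E, F, G, H, I are the ones reported as external.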
Let's calculate the estimating values for the pages of the Web site examined, using the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened
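The classic computation differs from the accelerated one only in its constant term. The sketch below assumes the same synchronous update and start value of 1 as the table, with the fixed normalization coefficient (1 - d) = 0.15. The function name, the stopping tolerance `tol` and the iteration cap are illustrative choices, not taken from the paper.

```python
PAGES = "ABCDEFGHI"

# LINKS[k][m] == 1 when page k links to page m (rows copied from the matrix of links).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_pagerank(links, d=0.85, tol=1e-7, max_iter=1000):
    """Iterate new(T) = (1 - d) + d * (value received via links) until values stop changing.

    Returns the final values and the number of iterations performed.
    """
    size = len(links)
    out = [sum(row) for row in links]  # Cr for every page
    pr = [1.0] * size
    for iteration in range(1, max_iter + 1):
        new = [
            (1 - d) + d * sum(pr[k] * links[k][m] / out[k] for k in range(size) if out[k])
            for m in range(size)
        ]
        change = max(abs(a - b) for a, b in zip(new, pr))
        pr = new
        if change < tol:
            return pr, iteration
    return pr, max_iter


if __name__ == "__main__":
    values, iterations = classic_pagerank(LINKS)
    print(f"stopped after {iterations} iterations")
    for page, value in zip(PAGES, values):
        print(f"{page}: {value:.6f}")
```

With these assumptions the limiting values agree with the last row of the table to about six decimal places, and the total remains 9 at every iteration.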
Dividing the estimating values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one).

    A         B         C         D         E         F         G         H         I         Amount
P   0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:

Experiment outcome   A1  A2  ...  An
Probabilities        p1  p2  ...  pn

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let's calculate the individual terms of this formula for the set of probability values that corresponds to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):

                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
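The two steps above (normalising to probabilities, then taking each page's term of Shannon's formula) can be checked with a short sketch. The limiting values are copied from the classic computation; the function names are illustrative, and the per-page value is assumed to be the single term -p * log2(p), which is what reproduces the PR row.

```python
import math

PAGES = "ABCDEFGHI"

# Limiting estimating values from the classic computation (last row of the table).
LIMITS = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]


def to_probabilities(values):
    """Divide every value by the total so that the results sum to one."""
    total = sum(values)
    return [value / total for value in values]


def shannon_terms(probabilities):
    """Return -p * log2(p) for each probability (0 for p == 0, by the usual convention)."""
    return [-p * math.log2(p) if p > 0 else 0.0 for p in probabilities]


if __name__ == "__main__":
    probs = to_probabilities(LIMITS)
    terms = shannon_terms(probs)
    for page, p, term in zip(PAGES, probs, terms):
        print(f"{page}: P = {p:.6f}, PR = {term:.6f}")
    print(f"Sum of terms (Shannon's H) = {sum(terms):.3f}")
```

The sum of the per-page terms is Shannon's H for the whole site, which corresponds to the Amount column of the PR row.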
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
(P is the estimating value.)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
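To see what repeated encouragement does to the estimating value, the operator can be applied in a loop. This is a small illustrative sketch; the function names are made up, and the closed form $P_n = 1 - d^n (1 - P_0)$ is obtained by unrolling the recurrence rather than quoted from Bush and Mosteller.

```python
def reinforce(p, d):
    """Apply the operator Z+ once: P_next = d * P + (1 - d), with 0 < d < 1."""
    if not 0 < d < 1:
        raise ValueError("d must lie strictly between 0 and 1")
    return d * p + (1 - d)


def reinforce_n(p0, d, n):
    """Apply Z+ n times in a row, starting from the estimating value p0."""
    p = p0
    for _ in range(n):
        p = reinforce(p, d)
    return p


def closed_form(p0, d, n):
    """Unrolled recurrence: P_n = 1 - d**n * (1 - P_0)."""
    return 1 - d ** n * (1 - p0)


if __name__ == "__main__":
    for n in (0, 1, 5, 20, 100):
        print(n, round(reinforce_n(0.2, 0.85, n), 6))
```

Each application moves P a fixed fraction (1 - d) of the remaining distance towards 1, so the value approaches 1 geometrically.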
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web
httphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Engine
httpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Google's) by Mark Horrell
httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers' credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and we provide the following comments and links as thanks to the people mentioned for their feedback. If you'd like to comment on this document, please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
If we perform the same calculations but include the review pages, here's what we get:
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                       Without the Review Pages   With the Review Pages
The HomePage PR        0.9536152797               2.439718935
B, C, D PR (each)      0.4201909959               0.8412536982
Total                  2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most important internal factors (if not the most important) for improving your PageRank. This boost may not be as large as the one from getting a new outside link to your site, but it is also much easier to achieve and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure distributes PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already shown you the PageRank calculations extensively, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4
Looping [Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
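The point about a fixed total and a shifting distribution can be reproduced with a small sketch. The link layouts below are assumptions, since the diagrams are not reproduced here: the hierarchical site is taken as a home page A linking to B, C and D, each of which links back to A; the looping site as A to B to C to D and back to A; and the extensively interlinked site as every page linking to every other page. The function and variable names are illustrative.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=10_000):
    """Classic iteration with a fixed (1 - d) term.

    `links` maps each page to the list of pages it links to; every page is
    assumed to have at least one outgoing link.
    """
    pr = {page: 1.0 for page in links}
    for _ in range(max_iter):
        new = {page: 1 - d for page in links}
        for source, targets in links.items():
            share = d * pr[source] / len(targets)
            for target in targets:
                new[target] += share
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tol:
            break
    return pr


# Assumed four-page layouts for the three closed structures.
STRUCTURES = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {
        "A": ["B", "C", "D"],
        "B": ["A", "C", "D"],
        "C": ["A", "B", "D"],
        "D": ["A", "B", "C"],
    },
}

if __name__ == "__main__":
    for name, links in STRUCTURES.items():
        values = pagerank(links)
        summary = ", ".join(f"{page}={value:.4f}" for page, value in values.items())
        print(f"{name}: {summary} (total {sum(values.values()):.4f})")
```

Under these assumed layouts all three totals come out at 4, but only the hierarchical layout concentrates PageRank on one page; the looping and extensively interlinked layouts leave every page at 1.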
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to use to try to boost PageRank is link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim not to need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not doing so. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation coefficient at a fixed amount. We can instead vary this coefficient for each page, depending upon the final value of the previous computation, using the formula
$$(1 - d)\,PR_i(A)$$
That is, we are replacing the (1 - d) of the previous formula with the expression above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
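The effect of this substitution on the number of iterations can be tried out on a tiny site. The sketch below is an illustration under assumptions, not Google's method: the hierarchical layout (home page A linking to B, C, D, each linking back to A), the stopping tolerance and the names `iterate` and `variable_coefficient` are all hypothetical. It runs the same loop twice, once with the fixed (1 - d) term and once with (1 - d) multiplied by the page's previous value.

```python
# Assumed hierarchical layout: A links to B, C, D and each of them links back to A.
HIERARCHY = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def iterate(links, variable_coefficient, d=0.85, tol=1e-10, max_iter=10_000):
    """Run PageRank iterations until no value changes by more than `tol`.

    With variable_coefficient=False the constant term is (1 - d);
    with variable_coefficient=True it is (1 - d) * PR_i(page).
    Returns the final values and the number of iterations used.
    """
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iter + 1):
        new = {
            page: (1 - d) * pr[page] if variable_coefficient else (1 - d)
            for page in links
        }
        for source, targets in links.items():
            share = d * pr[source] / len(targets)
            for target in targets:
                new[target] += share
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tol:
            return pr, iteration
    return pr, max_iter


if __name__ == "__main__":
    for label, flag in (("fixed (1 - d)", False), ("variable (1 - d) * PR_i", True)):
        values, count = iterate(HIERARCHY, variable_coefficient=flag)
        summary = ", ".join(f"{page}={value:.6f}" for page, value in values.items())
        print(f"{label}: {count} iterations, {summary}")
```

Under these assumptions the variable coefficient settles in fewer than half as many iterations. It also settles on slightly different values (A at 2 and the other pages at 2/3, against roughly 1.9189 and 0.6937 for the fixed coefficient), so the result is an approximation of the classic values rather than the same numbers.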
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
38
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
39
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A good approximation of the values with only 67 calculations: A settles at 2.0 rather than 1.919, and B, C and D at 0.667 rather than 0.694. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
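The two schemes can be compared with a short sketch. It assumes the four-page example is a home page A linking to B, C and D, each of which links back to A. That structure is an assumption, though it reproduces the first rows of the table above. The function names and the stopping tolerance are illustrative only.

```python
# Assumed link structure: A links to B, C, D; each of them links back to A.
SITE = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def step(pr, links, d=0.85, recalc=False):
    """Run one iteration. With recalc=True the co-efficient is (1 - d) * PR_i(page)."""
    new = {}
    for page in pr:
        coeff = (1 - d) * (pr[page] if recalc else 1.0)
        # Each linking page passes on its PageRank divided by its outbound link count.
        inbound = sum(pr[src] * out.count(page) / len(out)
                      for src, out in links.items() if page in out)
        new[page] = coeff + d * inbound
    return new


def iterate(links, d=0.85, recalc=False, tol=1e-10, max_iter=1000):
    """Iterate from PR = 1 until no value moves by more than tol."""
    pr = {page: 1.0 for page in links}
    for count in range(1, max_iter + 1):
        new = step(pr, links, d, recalc)
        if max(abs(new[p] - pr[p]) for p in pr) < tol:
            return new, count
        pr = new
    return pr, max_iter


fixed, n_fixed = iterate(SITE)
recalculated, n_recalc = iterate(SITE, recalc=True)
print(n_fixed, round(fixed["A"], 4), round(fixed["B"], 4))
print(n_recalc, round(recalculated["A"], 4), round(recalculated["B"], 4))
```

The iteration counts depend on the tolerance chosen, so they need not equal 148 and 67 exactly. What should carry over is that the recalculated scheme stops sooner, and that it settles at 2.0 and 0.667 rather than 1.919 and 0.694.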
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

(0.4n + 0.6) × 0.15

as long as we set the other pages to

0.6 × 0.15

i.e. [(0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site, so the co-efficients still total 0.15n]
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank. It is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And, sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
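Both the boost and the demotion described above can be tried with per-page co-efficients. This is a minimal sketch, assuming the example site is a home page A linking to B, C and D, each of which links back to A. The structure, the function name and the scheme names are illustrative assumptions; only the co-efficient values come from the text.

```python
# Assumed link structure: A links to B, C, D; each of them links back to A.
SITE = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank(links, coeffs, d=0.85, iterations=200):
    """Iterate PageRank with an individual normalization co-efficient per page."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coeffs[page] + d * sum(pr[src] * out.count(page) / len(out)
                                         for src, out in links.items() if page in out)
            for page in pr
        }
    return pr


n = len(SITE)
SCHEMES = {
    # Every page gets the usual (1 - d) = 0.15.
    "uniform": {p: 0.15 for p in SITE},
    # Page B gets (0.4n + 0.6) * 0.15, the others 0.6 * 0.15.
    "favour B": {p: ((0.4 * n + 0.6) if p == "B" else 0.6) * 0.15 for p in SITE},
    # Page B gets 0.06, the others 0.18.
    "demote B": {p: (0.06 if p == "B" else 0.18) for p in SITE},
}

for name, coeffs in SCHEMES.items():
    pr = pagerank(SITE, coeffs)
    print(name, {p: round(v, 4) for p, v in pr.items()}, round(sum(pr.values()), 4))
```

In each scheme the co-efficients total 0.6, so the site total stays at 4 and only the distribution between pages changes.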
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1, …, n; m = 1, …, n).
What does that mean? Let us explain the designations:
T_1, T_2, …, T_n correspond to Web site pages (n is the number of pages).
The components a_km, at the intersection of row k and column m of the matrix of links, give the number of links off T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …     Tn     Number of links off the page
T1    a11    a12    …     a1n    Cr(T1)
T2    a21    a22    …     a2n    Cr(T2)
…
Tn    an1    an2    …     ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km} \, PR_I(T_k)}{Cr(T_k)}$$

where:
PR_{I+1}(T_m) is the PageRank of the page T_m on the (I+1)-th iteration (the estimating value of the page T_m, m = 1, …, n).
PR_I(T_k) is the PageRank of the page T_k on the I-th iteration (the estimating value of the page T_k, k = 1, …, n).
Cr(T_k) is the total number of links off the page T_k.
Cm(T_k) = a_km is the number of individual links off the page T_k to the page T_m (k = 1, …, n; m = 1, …, n).
d is a damping factor (usually 0.85).
(1 - d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of a Web site's pages while taking into account the probable complexity of the links between them.
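As a minimal sketch, one pass of this scheme over a matrix of links might look as follows. The three-page matrix is a hypothetical example, not taken from this document, chosen so that one entry of a_km is greater than 1; the function name is likewise illustrative.

```python
def pagerank_step(pr, a, d=0.85):
    """Compute PR_{I+1} from PR_I, where a[k][m] is the number of links off T_k to T_m."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]


# Hypothetical site: T1 links twice to T2 and once to T3; T2 and T3 each link back to T1.
A = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

pr = pagerank_step([1.0, 1.0, 1.0], A)
print([round(v, 6) for v in pr])  # → [1.85, 0.716667, 0.433333]
```

Because T1 links to T2 twice, T2 receives two thirds of T1's vote and T3 only one third, which is the effect the multigraph interpretation is meant to capture.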
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the computation of PR_{i+1}(T), on the (i + 1)-th iteration, can be calculated by the formula (1 - d) × PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology is applied to the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
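The recognition of external pages can be sketched directly from the matrix of links (2). The matrix below is copied from that diagram; the function name, the fixed iteration count and the zero threshold are illustrative assumptions.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def run(a, d=0.85, accelerated=False, iterations=300):
    """Iterate from PR = 1; accelerated=True uses (1 - d) * PR_i(T) as the coefficient."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T): links off each page
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            (1 - d) * (pr[m] if accelerated else 1.0)
            + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
            for m in range(n)
        ]
    return dict(zip(PAGES, pr))


limits = run(LINKS, accelerated=True)
# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page, value in limits.items() if value < 1e-9]
print(external)  # → ['E', 'F', 'G', 'H', 'I']
```

Calling `run(LINKS)` without the `accelerated` flag gives the constant-coefficient computation, in which the external pages keep small non-zero values.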
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the set probabilities:

Experiment outcome: A_1, A_2, …, A_n
Probabilities: p_1, p_2, …, p_n

H(p_1, p_2, …, p_n) = - p(A_1) log p(A_1) - p(A_2) log p(A_2) - … - p(A_n) log p(A_n)

Let's calculate the terms of this formula for the set of probability values which correspond to the site pages. We receive the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
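The two rows above can be reproduced from the limiting values of the classic computation. This is a small sketch in which the variable names are illustrative; each page's value is its own term, -p log2(p), of Shannon's formula.

```python
import math

# Limiting estimating values from the classic computation (iteration 85).
LIMITS = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055, "E": 0.175403,
    "F": 0.209136, "G": 0.241441, "H": 0.179318, "I": 0.179318,
}

total = sum(LIMITS.values())

# Dividing by the total turns the estimating values into probabilities.
probabilities = {page: value / total for page, value in LIMITS.items()}

# Each page's contribution to Shannon's formula, with logarithms to base 2.
terms = {page: -p * math.log2(p) for page, p in probabilities.items()}

for page in LIMITS:
    print(page, round(probabilities[page], 6), round(terms[page], 6))
print("Amount", round(sum(probabilities.values()), 3), round(sum(terms.values()), 3))
```

The final figure is H for the whole site, the sum of the nine per-page terms.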
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_{n+1} = Z+(P_n) = d P_n + (1 - d), 0 < d < 1 (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
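Repeated application of Z+ is easy to follow numerically. In the sketch below the starting value 0.2 and the choice of 50 applications are arbitrary assumptions; unrolling the recurrence gives the closed form P_n = 1 - d^n (1 - P_0), so every application moves the estimating value closer to 1.

```python
def z_plus(p, d=0.85):
    """Apply the operator Z+ once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


p = 0.2  # hypothetical starting estimate
history = [p]
for _ in range(50):
    p = z_plus(p)
    history.append(p)

print([round(v, 4) for v in history[:5]])
print(round(history[-1], 4))
```

The gap to 1 shrinks by the factor d on every step, the same geometric behaviour that d gives the PageRank iteration.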
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale HyperTextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Just to be very clear here (and because in earlier versions of this paper some questioned the validity of this technique), if we look at only pages A, B, C and D, then:

                      Without the Review Pages   With the Review Pages
The Home Page PR is   0.9536152797               2.439718935
B, C, D PR (each) is  0.4201909959               0.8412536982
Total is              2.2141882674               4.9634800296
Some of this is the power of extra pages, some of it is feedback, and some of it is drawing PageRank away from your outbound links pages. Fundamentally, however, adding extra internal links to your outbound pages is one of the most (if not the most) important internal factors to improve your PageRank. This boost may not be as much as from getting a new outside link to your site, but it's also much easier and more beneficial for your site's readers.
Internal Structures and Linkages
Having spoken about "linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus, the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4.
Looping [Excel Example 12]
Total PageRank in this site is 4.
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4.
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
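The closed-system totals above can be checked with a few lines of Python. This is only a minimal sketch: the function name `pagerank` and the three link maps are our own illustration, and because the diagrams are not reproduced here, the exact link structures are assumptions (a home page A linked to and from B, C and D; a single loop A, B, C, D; and every page linking to every other page).

```python
def pagerank(links, d=0.85, iterations=200):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over every page q linking to p."""
    pr = {page: 1.0 for page in links}  # every page starts at 1
    for _ in range(iterations):
        new = {page: 1.0 - d for page in links}
        for source, targets in links.items():
            share = d * pr[source] / len(targets)  # vote split across outgoing links
            for target in targets:
                new[target] += share
        pr = new
    return pr


# Assumed four-page link structures (page -> pages it links to).
hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
looping = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
interlinked = {p: [q for q in "ABCD" if q != p] for p in "ABCD"}

for name, site in [("hierarchical", hierarchical),
                   ("looping", looping),
                   ("extensive interlinking", interlinked)]:
    ranks = pagerank(site)
    total = sum(ranks.values())
    print(name, {p: round(v, 4) for p, v in ranks.items()}, "total:", round(total, 4))
```

Under these assumptions the hierarchical site concentrates PageRank on page A, the other two structures leave every page at 1, and the total stays at 4 in all three cases.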
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians, and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1-d)\,PR_i(A)$
That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
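The comparison between the fixed and the recalculated co-efficient can be sketched in Python as follows. The function `iterate`, its stopping rule (stop once no page changes by more than `tol`) and the link map are our own assumptions for illustration, so the iteration counts it reports need not match the spreadsheet figures exactly.

```python
def iterate(links, d=0.85, accelerated=False, tol=1e-10, max_iter=1000):
    """Return (final PageRank values, number of iterations used).

    accelerated=False uses the fixed co-efficient (1 - d);
    accelerated=True uses (1 - d) * PR_i(page), recalculated each iteration.
    """
    pr = {page: 1.0 for page in links}
    for count in range(1, max_iter + 1):
        new = {}
        for page in links:
            new[page] = (1.0 - d) * pr[page] if accelerated else (1.0 - d)
        for source, targets in links.items():
            share = d * pr[source] / len(targets)
            for target in targets:
                new[target] += share
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:  # assumed stopping rule, not taken from the spreadsheets
            return pr, count
    return pr, max_iter


# Hierarchical structure assumed from the tables: A links to B, C, D; each links back to A.
hierarchical = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic, classic_steps = iterate(hierarchical)
fast, fast_steps = iterate(hierarchical, accelerated=True)
print("classic:    ", {p: round(v, 6) for p, v in classic.items()}, classic_steps, "iterations")
print("accelerated:", {p: round(v, 6) for p, v in fast.items()}, fast_steps, "iterations")
```

As in the two tables, the accelerated scheme settles in noticeably fewer iterations, and it settles on slightly different values (A near 2.0 rather than 1.9189), which is why we call it an approximation.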
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text, and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to:
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to:
$0.6 \times 0.15$
i.e. $0.4n + 0.6 + 0.6(n-1) = n$, so the co-efficients still total $0.15n$, where n is the number of pages in the site.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank. It is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say, a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
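Both co-efficient schemes discussed above (favouring Page B, and treating Page B as a list of links) can be tried out with a small Python sketch. The function name `pagerank_with_coefficients` is our own, and we assume the four-page hierarchical site (A linked to and from B, C and D) as the "previous example"; the spreadsheets may use a different layout, so only the direction of the changes should be read from it.

```python
def pagerank_with_coefficients(links, coefficients, d=0.85, iterations=200):
    """PageRank where page p uses coefficients[p] in place of the fixed (1 - d)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = dict(coefficients)  # each page starts from its own co-efficient
        for source, targets in links.items():
            share = d * pr[source] / len(targets)
            for target in targets:
                new[target] += share
        pr = new
    return pr


# Assumption: four-page hierarchical site, with B as the page being adjusted.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(site)

uniform = {p: 0.15 for p in site}               # the usual (1 - d) for every page
boost_b = {p: 0.6 * 0.15 for p in site}
boost_b["B"] = (0.4 * n + 0.6) * 0.15           # favour Page B
directory_b = {p: 0.18 for p in site}
directory_b["B"] = 0.06                         # treat Page B as a list of links

results = {}
for label, coeffs in [("uniform", uniform), ("boost B", boost_b), ("directory B", directory_b)]:
    results[label] = pagerank_with_coefficients(site, coeffs)
    print(label, {p: round(v, 4) for p, v in results[label].items()},
          "total:", round(sum(results[label].values()), 4))
```

In each case the co-efficients add up to 0.15n, so the site total stays at 4; only the split between pages changes.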
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links, $D = (a_{km})$, $(k = 1, \ldots, n)$, $(m = 1, \ldots, n)$.
What does that mean? Let us explain the designations:
$T_1, T_2, \ldots, T_n$ correspond to Web site pages (n is the number of pages).
Components $a_{km}$, which are on the intersection of the k-th row and the m-th column in the matrix of links, determine the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
        T1      T2      ...     Tn      Number of links off the page
T1      a_11    a_12    ...     a_1n    Cr(T1)
T2      a_21    a_22    ...     a_2n    Cr(T2)
...
Tn      a_n1    a_n2    ...     a_nn    Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\,PR_I(T_k)}{Cr(T_k)}$$
Where:
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, m = 1, ..., n).
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, k = 1, ..., n).
$Cr(T_k)$ is the total number of links off the page $T_k$.
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1, ..., n; m = 1, ..., n).
d is a dampening factor (usually 0.85).
(1-d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values for the pages of a Web site while taking into account links of any complexity between the pages.
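One pass of the scheme above can be written directly from the matrix of links. The sketch below is only an illustration: the name `pagerank_step` is ours, and it assumes every page has at least one outgoing link so that Cr(T_k) is never zero. The sample matrix is the four-page hierarchical site used earlier (A links to B, C, D; each links back to A).

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        (1.0 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


# Matrix of links for pages A, B, C, D: a[k][m] is the number of links off page k to page m.
a = [
    [0, 1, 1, 1],  # A links to B, C and D
    [1, 0, 0, 0],  # B links to A
    [1, 0, 0, 0],  # C links to A
    [1, 0, 0, 0],  # D links to A
]

first = pagerank_step(a, [1.0, 1.0, 1.0, 1.0])
print([round(v, 10) for v in first])
```

Starting from 1 for every page, the first pass gives 2.7 for A and 0.4333333333 for each of B, C and D, matching row 1 of the iteration tables earlier in this document.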
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site in two subgraphs; subgraph 1 contains pages A, B, C, D and subgraph 2 contains pages E, F, G, H, I]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the $PR_{i+1}(T)$ computation (on the (i+1)-th iteration) can be calculated by the formula $(1-d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the estimating values of pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
Let's calculate the estimating values for the pages of the Web site examined, using the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the last row of the classic computation (iteration 85).
Dividing the estimating values by their total, which remains unchanged throughout the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one).

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

As a parallel demonstration, we would like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities.

Experiment outcome  A1  A2  …  An
Probabilities       p1  p2  …  pn

$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \cdots - p(A_n)\log p(A_n)$

Let's calculate the individual terms $-p \log p$ for the set of probability values that correspond to the site pages (logarithms are taken to base 2). We obtain the following set of PageRank values for the pages of the site examined:

                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
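The two steps just described (normalising the limiting values to probabilities, then taking one Shannon term per page) can be sketched in a few lines of Python. This is only an illustrative sketch: the variable names are our own, and the input values are the limiting values from the last row of the classic computation.

```python
import math

# Limiting estimating values for pages A..I (last row of the classic computation).
limits = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441, "H": 0.179318, "I": 0.179318,
}

# The total stays at (about) 9 during the iteration, so dividing by it
# turns the estimating values into probabilities that sum to one.
total = sum(limits.values())
probabilities = {page: value / total for page, value in limits.items()}

# One Shannon term per page: -p * log2(p). Their sum is the entropy H.
terms = {page: -p * math.log2(p) for page, p in probabilities.items()}
entropy = sum(terms.values())

for page in limits:
    print(f"{page}: P = {probabilities[page]:.6f}  -p*log2(p) = {terms[page]:.6f}")
print(f"Sum of terms = {entropy:.3f}")
```

Up to rounding in the last digit, the printed values should agree with the P and PR rows tabulated in this section.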
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:
$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$ (P is the estimating value)
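To see what repeated encouragement does, the operator can be applied many times in a short Python sketch. The starting value 0.2 and d = 0.85 are arbitrary illustrative choices, and the function names are our own. Because P = 1 is the only fixed point of d P + (1 - d), the values climb towards 1.

```python
def z_plus(p: float, d: float) -> float:
    """Apply the encouragement operator once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1.0 - d)


def apply_repeatedly(p0: float, d: float, steps: int) -> list[float]:
    """Return [P_0, P_1, ..., P_steps] under repeated application of Z+."""
    values = [p0]
    for _ in range(steps):
        values.append(z_plus(values[-1], d))
    return values


# Illustrative assumption: start at 0.2 with d = 0.85.
history = apply_repeatedly(p0=0.2, d=0.85, steps=50)
print(history[:4])
print(history[-1])
```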
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work): httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web: httphcistanfordedu~pagepaperspagerankppframehtm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: httpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Google's), by Mark Horrell: httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers' Credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Internal Structures and Linkages
Having spoken about "Linking out" to external sites, the next logical progression is to discuss internal links within your site. If we recognise that PageRank is really about pages having a vote, then straight away we can derive one important thing about internal link structure and PageRank:
A page that is in the Google index has a vote, however small. Thus the more pages you have in the index, the more overall vote you are likely to have. Or, simply put, bigger sites tend to hold a greater total amount of PageRank within their site (as they have more pages to work with).
We need to clarify this a bit more. To get a high PageRank, it is not enough to have tens of thousands of pages. Those pages must also be in Google's index. To achieve this, they must contain enough content for Google's algorithms to consider them worthy of being added to the index. As you develop content for your site, you are also creating more PageRank for your site. It's hard work and you're creating it slowly, but if you're also creating pages that people will want to link to, then you're doing yourself two favours at once: PageRank from both directions. Or, to put it in very basic terms:
The best "internal" thing you can do to build PageRank is to write lots of good content. Ensure that pages aren't overly short or overly long, and break the content into several pages where necessary.
There are three different ways in which pages can be interlinked within a site. In practice, sites might use a combination of these. Using a combination is fine and normal, as long as you understand the different sections and how they are affecting your PageRank.
For the purposes of this document, we'll view the different linkage structures as separate entities. We have:
Hierarchical
Looping
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4.
Looping [Excel Example 12]
Total PageRank in this site is 4.
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4.
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
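The effect can be checked with a small Python sketch of the classic iteration. The link layouts below are assumptions made for illustration, since the original diagrams are not reproduced here: the hierarchical site has a home page A linking to B, C and D, each of which links back to A; the looping site is A to B to C to D and back to A; and in the extensively interlinked site every page links to every other page. The function and variable names are our own.

```python
def pagerank(links: dict[str, list[str]], d: float = 0.85, iterations: int = 200) -> dict[str, float]:
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over the pages T that link to A."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # Each linking page passes on its PageRank divided by its number of outbound links.
            inbound = sum(pr[src] / len(targets)
                          for src, targets in links.items() if page in targets)
            new_pr[page] = (1 - d) + d * inbound
        pr = new_pr
    return pr


# Assumed four-page closed structures (no inbound or outbound links).
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {"A": ["B", "C", "D"], "B": ["A", "C", "D"],
                  "C": ["A", "B", "D"], "D": ["A", "B", "C"]},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    values = ", ".join(f"{page}={value:.4f}" for page, value in pr.items())
    print(f"{name}: {values} (total {sum(pr.values()):.4f})")
```

Under these assumptions each structure keeps a total of 4, but only the hierarchical layout concentrates PageRank on one page (the home page A).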
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its vote is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and president, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not doing so. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$(1 - d)\,PR_i(A)$
That is, we are replacing the (1 - d) of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
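A minimal Python sketch of one iteration step under both rules may make the substitution concrete. It assumes the four-page hierarchical structure (home page A linking to B, C and D, each linking back to A) and d = 0.85; the function names and the `vary_coefficient` flag are our own illustrative choices.

```python
def step(pr: dict[str, float], links: dict[str, list[str]],
         d: float = 0.85, vary_coefficient: bool = False) -> dict[str, float]:
    """One iteration. The fixed rule adds (1 - d); the varied rule adds (1 - d) * PR_i(page)."""
    new_pr = {}
    for page in links:
        inbound = sum(pr[src] / len(targets)
                      for src, targets in links.items() if page in targets)
        base = (1 - d) * pr[page] if vary_coefficient else (1 - d)
        new_pr[page] = base + d * inbound
    return new_pr


def iterate(links: dict[str, list[str]], steps: int, **kwargs) -> dict[str, float]:
    """Start every page at 1 and apply `step` the given number of times."""
    pr = {page: 1.0 for page in links}
    for _ in range(steps):
        pr = step(pr, links, **kwargs)
    return pr


# Assumed hierarchical site: A is the home page, B, C and D link back to it.
site = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic_2 = iterate(site, 2)
varied_2 = iterate(site, 2, vary_coefficient=True)
print("fixed coefficient, iteration 2: ", classic_2)
print("varied coefficient, iteration 2:", varied_2)
```

Both rules agree on the first iteration, because every page starts at 1. They part ways from the second: A is 1.255 with the fixed co-efficient and 1.51 with the varied one, while the total stays at 4 in both cases.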
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration A B C D Sum
Start 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 15100000000 08300000000 08300000000 08300000000 400000000003 23430000000 05523333333 05523333333 05523333333 400000000004 17599000000 07467000000 07467000000 07467000000 400000000005 21680700000 06106433333 06106433333 06106433333 400000000006 18823510000 07058830000 07058830000 07058830000 400000000007 20823543000 06392152333 06392152333 06392152333 400000000008 19423519900 06858826700 06858826700 06858826700 400000000009 20403536070 06532154643 06532154643 06532154643 40000000000
10 19717524751 06760825083 06760825083 06760825083 4000000000011 20197732674 06600755775 06600755775 06600755775 4000000000012 19861587128 06712804291 06712804291 06712804291 40000000000
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
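The two schemes can be compared with a minimal sketch. It assumes a four-page site in which page A links to B, C and D and each of those links back to A (a structure chosen because it reproduces the first rows of the tables above), a damping factor of 0.85, and a stopping tolerance of 1e-10. The names `LINKS`, `iterate` and `recalc_coefficient` are illustrative only and do not come from the Excel examples.

```python
# Assumed structure: home page A links to B, C, D; each of those links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def iterate(links, recalc_coefficient, tol=1e-10, max_iter=1000):
    """Run synchronous updates until no value moves by more than tol.

    recalc_coefficient=False uses the constant (1 - d) term.
    recalc_coefficient=True uses (1 - d) * PR_i(page), recomputed on every pass.
    Returns (final values, number of passes performed).
    """
    pr = {page: 1.0 for page in links}
    for passes in range(1, max_iter + 1):
        new = {}
        for page in links:
            # PageRank flowing in from every page that links to this one
            inbound = sum(pr[src] / len(out) for src, out in links.items() if page in out)
            base = (1 - D) * (pr[page] if recalc_coefficient else 1.0)
            new[page] = base + D * inbound
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:
            return pr, passes
    return pr, max_iter


classic, classic_passes = iterate(LINKS, recalc_coefficient=False)
accelerated, accelerated_passes = iterate(LINKS, recalc_coefficient=True)
print("classic:", classic_passes, "passes", classic)
print("accelerated:", accelerated_passes, "passes", accelerated)
```

Under these assumptions the recalculated scheme settles in fewer passes, and it settles on slightly different limiting values (2 and 0.667 rather than 1.919 and 0.694), which is why the text calls the result an approximation. The exact pass counts depend on the tolerance chosen.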
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because page B meets some criteria we define (e.g. it has the correct amount of text). So we could set page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [(0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site], so the co-efficients still total 0.15n.
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
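The effect of uneven co-efficients can be sketched as follows. This is an illustration only: the link structure (home page A linking to B, C and D, each linking back to A) is an assumption, not the structure behind Excel Example 19, and the names `pagerank`, `uniform` and `favoured` are hypothetical.

```python
D = 0.85
# Assumed structure (hypothetical): A is the home page, B is the "About Us" page.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank(links, coefficients, passes=500):
    """Iterate PR(m) = coefficient(m) + d * sum(PR(k) / outlinks(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(passes):
        pr = {
            page: coefficients[page]
            + D * sum(pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}           # 0.15 for every page
favoured = {page: 0.6 * (1 - D) for page in LINKS}  # 0.6 x 0.15 for the other pages
favoured["B"] = (0.4 * n + 0.6) * (1 - D)           # (0.4n + 0.6) x 0.15 for page B

plain = pagerank(LINKS, uniform)
boosted = pagerank(LINKS, favoured)
for page in LINKS:
    print(page, round(plain[page], 4), round(boosted[page], 4))
```

Because the co-efficients still add up to 0.15n, the total PageRank of the site stays at 4 in this sketch; only its distribution changes.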
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce that example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.
Let's write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T_1, T_2, …, T_n correspond to the Web site pages (n is the number of pages).
The components a_km, which are at the intersection of row k and column m of the matrix of links, give the number of links off T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)

        T_1     T_2     …    T_n     Number of links off the page
T_1     a_11    a_12    …    a_1n    Cr(T_1)
T_2     a_21    a_22    …    a_2n    Cr(T_2)
…
T_n     a_n1    a_n2    …    a_nn    Cr(T_n)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

where

PR_{I+1}(T_m) is the PageRank of the page T_m on the (I+1)-th iteration (the estimating value of the page T_m, m = 1…n);
PR_I(T_k) is the PageRank of the page T_k on the I-th iteration (the estimating value of the page T_k, k = 1…n);
Cr(T_k) is the total number of links off the page T_k;
C_m(T_k) = a_km is the number of individual links off the page T_k to the page T_m (k = 1…n, m = 1…n);
d is a damping factor (usually 0.85);
(1 - d) is a normalization coefficient;
n is the number of pages on the site.

This recursive computational scheme makes it possible to compute the estimating values for a Web site's pages while taking into account however complex the links between those pages may be.
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the PR_{i+1}(T) computation (on the (i+1)-th iteration) can be calculated by the formula (1 - d) × PR_i(T).
When the accelerated technology of computation is applied, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of computation of the estimating values is applied to the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
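The accelerated computation for this nine-page site can be sketched as below. The matrix is taken from the diagram of the matrix of links (2); the 200-pass limit, the 1e-6 threshold for "zero" and the names `accelerated_step`, `limits` and `external` are assumptions made for illustration.

```python
D = 0.85
PAGES = "ABCDEFGHI"
# Rows are linking pages, columns are linked-to pages (matrix of links (2)).
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(pr):
    """One pass in which each page's coefficient is (1 - d) * its previous value."""
    n = len(pr)
    out_totals = [sum(row) for row in MATRIX]  # Cr(T_k) for every page
    return [
        (1 - D) * pr[m]
        + D * sum(MATRIX[k][m] * pr[k] / out_totals[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(200):
    pr = accelerated_step(pr)

limits = dict(zip(PAGES, pr))
# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page, value in limits.items() if value < 1e-6]
print(limits)
print("external pages:", external)
```

In this sketch the base pages keep all of the site's PageRank (the total remains 9), while the values of E to I decay towards zero, which is the criterion the definition above relies on.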
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration A B C D E F G H I Amount
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are shaded.
Dividing the estimating values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all the probability values is equal to one).

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:

Experiment outcome: A_1, A_2, …, A_n
Probabilities: p_1, p_2, …, p_n

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. Each value below is the term -p log p for the corresponding page, and together they give the set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
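The chain from classic estimating values to probabilities to Shannon terms can be reproduced with a short sketch. It assumes the nine-page matrix of links (2), d = 0.85 and 500 synchronous passes; the names `classic_limits`, `probabilities` and `terms` are illustrative only.

```python
import math

D = 0.85
PAGES = "ABCDEFGHI"
# Rows are linking pages, columns are linked-to pages (matrix of links (2)).
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_limits(passes=500):
    """Classic scheme with the constant (1 - d) normalization coefficient."""
    n = len(MATRIX)
    out_totals = [sum(row) for row in MATRIX]
    pr = [1.0] * n
    for _ in range(passes):
        pr = [
            (1 - D) + D * sum(MATRIX[k][m] * pr[k] / out_totals[k] for k in range(n))
            for m in range(n)
        ]
    return pr


pr = classic_limits()
total = sum(pr)                                      # stays at 9 for this site
probabilities = [value / total for value in pr]      # row "P"
terms = [-p * math.log2(p) for p in probabilities]   # row "PR", base-2 logarithms
entropy = sum(terms)                                 # the "Amount" column of row "PR"

for page, p, term in zip(PAGES, probabilities, terms):
    print(f"{page}: p = {p:.6f}, -p*log2(p) = {term:.6f}")
print(f"sum of terms = {entropy:.3f}")
```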
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

(P is the estimating value.)
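As a short worked step (our own derivation from the operator as written above, not taken from Bush and Mosteller), repeated encouragement drives the estimating value towards 1. Subtracting both sides of the update from 1 gives a geometric recursion:

```latex
\begin{aligned}
1 - P_{n+1} &= 1 - d\,P_n - (1 - d) = d\,(1 - P_n) \\
1 - P_n &= d^{\,n}\,(1 - P_0) \\
P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1 \quad \text{as } n \to \infty, \text{ since } 0 < d < 1 .
\end{aligned}
```

So the fixed point of the operator is P = 1, and the distance to it shrinks by a factor of d with every application.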
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
[Diagrams of three example link structures: Hierarchical, Looping, Extensive Interlinking]
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4.
Looping [Excel Example 12]
Total PageRank in this site is 4.
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4.
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it on is the Webmaster's easiest way to manipulate PageRank.
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank is spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank highly for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
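The closed-system behaviour can be checked with a small sketch. The three link structures below are assumptions standing in for the diagrams (a home page linked to and from three children, a four-page loop, and four pages that all link to each other), and the names `STRUCTURES` and `pagerank` are illustrative, not taken from the Excel examples.

```python
D = 0.85
# Assumed four-page link structures; the original diagrams are not reproduced here.
STRUCTURES = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive interlinking": {
        "A": ["B", "C", "D"],
        "B": ["A", "C", "D"],
        "C": ["A", "B", "D"],
        "D": ["A", "B", "C"],
    },
}


def pagerank(links, passes=500):
    """Classic scheme: PR(m) = (1 - d) + d * sum(PR(k) / outlinks(k))."""
    pr = {page: 1.0 for page in links}
    for _ in range(passes):
        pr = {
            page: (1 - D)
            + D * sum(pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
    return pr


results = {name: pagerank(links) for name, links in STRUCTURES.items()}
for name, values in results.items():
    rounded = {page: round(value, 4) for page, value in values.items()}
    print(f"{name}: {rounded} total = {sum(values.values()):.4f}")
```

Under these assumptions every structure totals 4, but only the hierarchical one concentrates PageRank on a single page; the other two leave every page at 1.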
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.

In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim not to need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.

The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.

To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.

The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.

It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the value from the previous computation, using the formula

$(1-d)\,PR_i(A)$

That is, we are replacing the $(1-d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.

With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
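The first two rows of the recalculated table can be checked by hand. The working below is a sketch that assumes the hierarchical structure is the one in which home page A links to B, C and D and each of those pages links only back to A (an assumption, but one that reproduces the tabulated values), with $d = 0.85$ and every page starting at $PR_0 = 1$:

```latex
\begin{align*}
PR_1(A) &= 0.15 \cdot 1 + 0.85\,(1 + 1 + 1) = 2.7, \\
PR_1(B) &= 0.15 \cdot 1 + 0.85 \cdot \tfrac{1}{3} \approx 0.4333, \\
PR_2(A) &= 0.15 \cdot 2.7 + 0.85 \cdot 3 \cdot 0.4333 = 0.405 + 1.105 = 1.51, \\
PR_2(B) &= 0.15 \cdot 0.4333 + 0.85 \cdot \tfrac{2.7}{3} = 0.065 + 0.765 = 0.83.
\end{align*}
```

The first iteration matches the fixed co-efficient calculation because every starting value is 1; the two methods only diverge from the second iteration onwards.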
The values settle after only 67 iterations instead of 148, and they are a good approximation of the earlier results (A comes out at 2.00 against 1.92; B, C and D at 0.67 against 0.69). This is known as the "accelerated technology of the computation of the Page Rank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
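The comparison can be reproduced with a few lines of Python. This is a minimal sketch rather than Google's method: the link structure (A links to B, C and D, each of which links back to A) is assumed from the table values, and the names `LINKS`, `step` and `iterate`, as well as the stopping tolerance, are illustrative choices of ours.

```python
# Assumed hierarchical structure: home page A links to B, C and D,
# and each of those pages links back to A only.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor used throughout the document


def step(pr, accelerated):
    """Return the PageRank values of the next iteration."""
    new = {}
    for page in LINKS:
        # PageRank flowing in from every page that links to this one.
        inflow = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        # Classic: fixed (1 - d). Accelerated: (1 - d) times the previous value.
        norm = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = norm + D * inflow
    return new


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Iterate from PR = 1 until no value changes by more than tol."""
    pr = {page: 1.0 for page in LINKS}
    for count in range(1, max_iter + 1):
        nxt = step(pr, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in LINKS) < tol:
            return nxt, count
        pr = nxt
    return pr, max_iter


classic, classic_iters = iterate(accelerated=False)
fast, fast_iters = iterate(accelerated=True)
print("classic    ", classic_iters, classic)
print("accelerated", fast_iters, fast)
```

The exact iteration counts depend on the tolerance chosen, so they need not equal 148 and 67, but the accelerated run should stop in well under half as many steps.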
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.

How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

$(0.4n + 0.6) \times 0.15$

as long as we set the other pages to

$0.6 \times 0.15$

i.e. $(0.4n + 0.6) + 0.6\,(n-1) = n$, where $n$ is the number of pages in the site, so the co-efficients still total $0.15n$.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:

Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:

[Excel Example 19]

As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
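The effect of uneven co-efficients can be tried out in code. The sketch below is illustrative only: it uses the four-page hierarchical structure (A links to B, C and D, which each link back to A) as a stand-in, which may not be the site behind the Excel example, and the helper name `pagerank` is ours.

```python
# Stand-in site (an assumption): A links to B, C and D; each links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def pagerank(coeff, iterations=200):
    """Iterate PageRank with a separate normalization co-efficient per page."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        pr = {
            page: coeff[page]
            + D * sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
    return pr


n = len(LINKS)
uniform = {page: 0.15 for page in LINKS}         # the usual (1 - d) everywhere
favoured = {page: 0.6 * 0.15 for page in LINKS}  # lowered for ordinary pages
favoured["B"] = (0.4 * n + 0.6) * 0.15           # raised for Page B

base = pagerank(uniform)
boosted = pagerank(favoured)
for page in LINKS:
    print(page, round(base[page], 4), round(boosted[page], 4))
```

Because the co-efficients still add up to 0.15n, the total PageRank of the site is unchanged; only its distribution between the pages moves.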
PageRank not only considers links, but it may also consider certain other factors.

There is of course no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.

We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:

[Excel Example 20]

And sure enough, it does exactly what we want it to do.

This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
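The co-efficients 0.06 and 0.18 can be tested on a small site. This is a sketch under the same kind of assumption as before: the four-page hierarchical structure stands in for the real example, so Page B here links only to A, and the names `LINKS`, `pagerank` and `change` are hypothetical.

```python
# Stand-in site (an assumption): A links to B, C and D; each links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def pagerank(coeff, iterations=200):
    """Iterate PageRank with a separate normalization co-efficient per page."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        pr = {
            page: coeff[page]
            + D * sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
    return pr


base = pagerank({page: 0.15 for page in LINKS})
directory = pagerank({"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

# Positive numbers are pages that gained PageRank, negative ones lost it.
change = {page: directory[page] - base[page] for page in LINKS}
print(change)
```

In this stand-in site the list page B loses PageRank while the page it links to, A, gains, which is the direction of the effect described above.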
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).

You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.

Let's write down the elements of the matrix of links: $a_{km}$ $(k = 1, \dots, n)$, $(m = 1, \dots, n)$.

What does that mean? Let us explain the designations:

$T_1, T_2, \dots, T_n$ correspond to Web site pages ($n$ is the number of pages).

The components $a_{km}$, which are at the intersection of row $k$ and column $m$ of the matrix of links, give the number of links off $T_k$ to $T_m$.

$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)

       T1     T2     ...    Tn     Number of links off the page
T1     a11    a12    ...    a1n    Cr(T1)
T2     a21    a22    ...    a2n    Cr(T2)
...    ...    ...    ...    ...    ...
Tn     an1    an2    ...    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

where

$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \dots, n$);

$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \dots, n$);

$Cr(T_k)$ is the total number of links off the page $T_k$;

$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ $(k = 1, \dots, n)$, $(m = 1, \dots, n)$;

$d$ is a dampening factor (usually 0.85);

$(1-d)$ is a normalization coefficient;

$n$ is the number of pages on the site.
This recursive computational scheme allows us to compute the estimating values of the pages of a Web site while taking into account links of any complexity between the pages.

Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.

MULTIGRAPH OF A SIMPLE WEB SITE

You may have already noticed that every page of the second subgraph links to every page of the first subgraph.

Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
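[Figure: multigraph of the site, with subgraph 1 containing pages A, B, C, D and subgraph 2 containing pages E, F, G, H, I]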
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I    Number of links off the page
A                0  1  0  0  0  0  0  0  0    Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0    Cr(B) = 3
C                0  1  0  0  0  0  0  0  0    Cr(C) = 1
D                0  1  0  0  0  0  0  0  0    Cr(D) = 1
E                1  1  1  1  0  1  0  0  0    Cr(E) = 5
F                1  1  1  1  0  0  1  0  0    Cr(F) = 5
G                1  1  1  1  0  1  0  1  1    Cr(G) = 7
H                1  1  1  1  1  0  1  0  0    Cr(H) = 6
I                1  1  1  1  0  0  1  0  0    Cr(I) = 5
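The recursive equation can be applied directly to this matrix. The following is a minimal sketch, not a reference implementation; the names `A_KM`, `CR` and `iterate_once` are ours, and the matrix rows are copied from the diagram above.

```python
PAGES = "ABCDEFGHI"

# A_KM[k][m] is the number of links off page k to page m (one row per page).
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(Tk): total number of links off each page, i.e. the row sums.
CR = [sum(row) for row in A_KM]


def iterate_once(pr, d=0.85):
    """One step of PR_{I+1}(Tm) = (1 - d) + d * sum_k a_km * PR_I(Tk) / Cr(Tk)."""
    n = len(pr)
    return [
        (1 - d) + d * sum(A_KM[k][m] * pr[k] / CR[k] for k in range(n))
        for m in range(n)
    ]


first = iterate_once([1.0] * len(PAGES))
for page, value in zip(PAGES, first):
    print(page, round(value, 6))
```

Starting every page at 1, a single step gives the values of the first iteration of the computation for this site.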
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.

Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.

Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on iteration $i$. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration $i+1$) can be calculated by the formula $(1-d)\,PR_i(T)$.

When the accelerated technology is applied to compute the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.

This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:

The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.

ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
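The recognition of external pages can be sketched in code. The example below is illustrative only: the matrix is the link matrix of the nine-page example site, while the function name `accelerated`, the fixed number of iterations and the threshold used to treat a value as zero are assumptions of ours.

```python
PAGES = "ABCDEFGHI"

# Link matrix of the example site: A_KM[k][m] = links off page k to page m.
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated(iterations=200, d=0.85):
    """Iterate with the per-page coefficient (1 - d) * PR_i(T) in place of (1 - d)."""
    n = len(PAGES)
    cr = [sum(row) for row in A_KM]  # total links off each page
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            (1 - d) * pr[m] + d * sum(A_KM[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return dict(zip(PAGES, pr))


limits = accelerated()
# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page, value in limits.items() if value < 1e-6]
base_pages = [page for page in PAGES if page not in external]
print("base pages:    ", base_pages)
print("external pages:", external)
```

The base pages keep all of the PageRank between them, while the values of the remaining pages decay towards zero with each iteration.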
Let's calculate the estimating values for the pages of the Web site examined, with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
51
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimated values for the site pages are shaded.
Dividing the estimated values by their total, which does not change as the iteration proceeds, gives a set of probability values (the sum of all probability values is equal to one):

P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924    Amount 1.000

Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:

Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn

$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We get the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):

PR   0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558    Amount 2.375
The Toolbar shows   1  1  1  1  1  1  1  1  1  1
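The two rows above can be reproduced with a few lines of Python. This is only a sketch: the input values are copied from the final row (iteration 85) of the iteration table, and the helper names `to_probabilities` and `entropy_terms` are our own, not something from the original spreadsheets.

```python
import math

# Limiting estimated values for the nine pages (final row of the iteration table).
estimates = [1.390055, 3.845218, 1.390055, 1.390055,
             0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

def to_probabilities(values):
    """Divide each value by the total so that the results sum to one."""
    total = sum(values)
    return [v / total for v in values]

def entropy_terms(probabilities):
    """Return the individual terms -p * log2(p) of Shannon's formula."""
    return [-p * math.log2(p) for p in probabilities]

probabilities = to_probabilities(estimates)
terms = entropy_terms(probabilities)

print([round(p, 6) for p in probabilities])  # the "P" row
print([round(t, 6) for t in terms])          # the "PR" row
print(round(sum(terms), 3))                  # the "Amount" of the "PR" row
```

Summing the terms gives H itself, which is what the "Amount" column of the PR row holds.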
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
where P is the estimating value.
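To see what repeated encouragement does, the operator can be applied in a loop. The sketch below is illustrative only: the starting value 0.2, the choice d = 0.85 and the function names are our assumptions, not values from Bush and Mosteller.

```python
def reinforce(p, d):
    """One application of Z+: P_{n+1} = d * P_n + (1 - d)."""
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie strictly between 0 and 1")
    return d * p + (1.0 - d)

def apply_repeatedly(p0, d, steps):
    """Return the sequence P_0, P_1, ..., P_steps."""
    history = [p0]
    for _ in range(steps):
        history.append(reinforce(history[-1], d))
    return history

# Hypothetical starting value and d, chosen only for illustration.
history = apply_repeatedly(p0=0.2, d=0.85, steps=50)
print(round(history[1], 4), round(history[-1], 4))
```

Each step shrinks the distance to 1 by the factor d, so the value rises steadily towards the fixed point P = 1.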
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Extensive Interlinking
By performing the PageRank calculations for these structures, we can see how internal link structure can distribute PageRank around your site. This is critical because, whilst at this stage of the document these are closed systems, when inbound and outbound links come into play you will want to be able to control your PageRank so that it is in the right places to maximise PageRank feedback. Since we have already extensively shown you the PageRank calculations, we'll just dive right in with the final value for each structure.
Hierarchical [Excel Example 11]
Total PageRank in this site is 4.
Looping [Excel Example 12]
Total PageRank in this site is 4.
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4.
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but must contain more of the properties of the hierarchical structure than the looping or extensive interlinking.) This ability to shift PageRank to the pages that we want it to go to is the Webmaster's easiest way to manipulate PageRank.
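The point about a fixed total but a movable distribution can be checked with a small script. It is a sketch under assumptions: the diagrams are not reproduced here, so the three link maps below are our reading of "hierarchical" (A links to B, C and D, which each link back to A), "looping" (A to B to C to D to A) and "extensive interlinking" (every page links to every other page), with d = 0.85 and every page starting at 1.

```python
def pagerank(links, d=0.85, iterations=200):
    """Repeat PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over every page q that links to p."""
    pr = {page: 1.0 for page in links}  # every page starts at 1
    for _ in range(iterations):
        # The comprehension reads the old values and only then rebinds pr.
        pr = {
            page: (1 - d) + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr

PAGES = "ABCD"
# Assumed link maps for the three four-page structures (page -> pages it links to).
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "extensive": {p: [q for q in PAGES if q != p] for p in PAGES},
}

results = {name: pagerank(links) for name, links in structures.items()}
for name, pr in results.items():
    rounded = {page: round(value, 4) for page, value in pr.items()}
    print(name, rounded, "total:", round(sum(pr.values()), 4))
```

Under these assumptions all three totals come out at 4, but only the hierarchical map concentrates PageRank on one page.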
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is to derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:
$$(1 - d)\,PR_i(A)$$
That is, we are replacing the $(1 - d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 12550000000 09150000000 09150000000 09150000000 400000000003 24832500000 05055833333 05055833333 05055833333 400000000004 14392375000 08535875000 08535875000 08535875000 400000000005 23266481250 05577839583 05577839583 05577839583 400000000006 15723490938 08092169688 08092169688 08092169688 400000000007 22135032703 05954989099 05954989099 05954989099 400000000008 16685222202 07771592599 07771592599 07771592599 400000000009 21317561128 06227479624 06227479624 06227479624 40000000000
10 17380073041 07539975653 07539975653 07539975653 4000000000011 20726937915 06424354028 06424354028 06424354028 4000000000012 17882102772 07372632409 07372632409 07372632409 4000000000013 20300212644 06566595785 06566595785 06566595785 4000000000014 18244819253 07251726916 07251726916 07251726916 40000000000
15 19991903635 06669365455 06669365455 06669365455 4000000000016 18506881910 07164372697 07164372697 07164372697 4000000000017 19769150376 06743616541 06743616541 06743616541 4000000000018 18696222180 07101259273 07101259273 07101259273 4000000000019 19608211147 06797262951 06797262951 06797262951 4000000000020 18833020525 07055659825 07055659825 07055659825 4000000000021 19491932554 06836022482 06836022482 06836022482 4000000000022 18931857329 07022714224 07022714224 07022714224 4000000000023 19407921270 06864026243 06864026243 06864026243 4000000000024 19003266921 06998911026 06998911026 06998911026 4000000000025 19347223118 06884258961 06884258961 06884258961 4000000000026 19054860350 06981713217 06981713217 06981713217 4000000000027 19303368702 06898877099 06898877099 06898877099 4000000000028 19092136603 06969287799 06969287799 06969287799 4000000000029 19271683888 06909438704 06909438704 06909438704 4000000000030 19119068696 06960310435 06960310435 06960310435 4000000000031 19248791609 06917069464 06917069464 06917069464 4000000000032 19138527133 06953824289 06953824289 06953824289 4000000000033 19232251937 06922582688 06922582688 06922582688 4000000000034 19152585853 06949138049 06949138049 06949138049 4000000000035 19220302025 06926565992 06926565992 06926565992 4000000000036 19162743279 06945752240 06945752240 06945752240 4000000000037 19211668213 06929443929 06929443929 06929443929 4000000000038 19170082019 06943305994 06943305994 06943305994 4000000000039 19205430284 06931523239 06931523239 06931523239 4000000000040 19175384259 06941538580 06941538580 06941538580 4000000000041 19200923380 06933025540 06933025540 06933025540 4000000000042 19179215127 06940261624 06940261624 06940261624 4000000000043 19197667142 06934110953 06934110953 06934110953 4000000000044 19181982929 06939339024 06939339024 06939339024 4000000000045 19195314510 06934895163 06934895163 06934895163 4000000000046 19183982666 06938672445 06938672445 06938672445 4000000000047 19193614734 
06935461755 06935461755 06935461755 4000000000048 19185427476 06938190841 06938190841 06938190841 4000000000049 19192386645 06935871118 06935871118 06935871118 4000000000050 19186471352 06937842883 06937842883 06937842883 4000000000051 19191499351 06936166883 06936166883 06936166883 4000000000052 19187225552 06937591483 06937591483 06937591483 4000000000053 19190858281 06936380573 06936380573 06936380573 4000000000054 19187770461 06937409846 06937409846 06937409846 4000000000055 19190395108 06936534964 06936534964 06936534964 4000000000056 19188164158 06937278614 06937278614 06937278614 4000000000057 19190060466 06936646511 06936646511 06936646511 4000000000058 19188448604 06937183799 06937183799 06937183799 4000000000059 19189818686 06936727105 06936727105 06936727105 4000000000060 19188654117 06937115294 06937115294 06937115294 4000000000061 19189644001 06936785333 06936785333 06936785333 4000000000062 19188802599 06937065800 06937065800 06937065800 4000000000063 19189517791 06936827403 06936827403 06936827403 4000000000064 19188909878 06937030041 06937030041 06937030041 4000000000065 19189426604 06936857799 06936857799 06936857799 40000000000
66 19188987387 06937004204 06937004204 06937004204 4000000000067 19189360721 06936879760 06936879760 06936879760 4000000000068 19189043387 06936985538 06936985538 06936985538 4000000000069 19189313121 06936895626 06936895626 06936895626 4000000000070 19189083847 06936972051 06936972051 06936972051 4000000000071 19189278730 06936907090 06936907090 06936907090 4000000000072 19189113080 06936962307 06936962307 06936962307 4000000000073 19189253882 06936915373 06936915373 06936915373 4000000000074 19189134200 06936955267 06936955267 06936955267 4000000000075 19189235930 06936921357 06936921357 06936921357 4000000000076 19189149459 06936950180 06936950180 06936950180 4000000000077 19189222959 06936925680 06936925680 06936925680 4000000000078 19189160484 06936946505 06936946505 06936946505 4000000000079 19189213588 06936928804 06936928804 06936928804 4000000000080 19189168450 06936943850 06936943850 06936943850 4000000000081 19189206817 06936931061 06936931061 06936931061 4000000000082 19189174205 06936941932 06936941932 06936941932 4000000000083 19189201926 06936932691 06936932691 06936932691 4000000000084 19189178363 06936940546 06936940546 06936940546 4000000000085 19189198391 06936933870 06936933870 06936933870 4000000000086 19189181367 06936939544 06936939544 06936939544 4000000000087 19189195838 06936934721 06936934721 06936934721 4000000000088 19189183538 06936938821 06936938821 06936938821 4000000000089 19189193993 06936935336 06936935336 06936935336 4000000000090 19189185106 06936938298 06936938298 06936938298 4000000000091 19189192660 06936935780 06936935780 06936935780 4000000000092 19189186239 06936937920 06936937920 06936937920 4000000000093 19189191697 06936936101 06936936101 06936936101 4000000000094 19189187058 06936937647 06936937647 06936937647 4000000000095 19189191001 06936936333 06936936333 06936936333 4000000000096 19189187649 06936937450 06936937450 06936937450 4000000000097 19189190498 06936936501 06936936501 06936936501 4000000000098 19189188077 
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
A B C D SumStart 10000000000 10000000000 10000000000 10000000000 40000000000
1 27000000000 04333333333 04333333333 04333333333 400000000002 15100000000 08300000000 08300000000 08300000000 400000000003 23430000000 05523333333 05523333333 05523333333 400000000004 17599000000 07467000000 07467000000 07467000000 400000000005 21680700000 06106433333 06106433333 06106433333 400000000006 18823510000 07058830000 07058830000 07058830000 400000000007 20823543000 06392152333 06392152333 06392152333 400000000008 19423519900 06858826700 06858826700 06858826700 400000000009 20403536070 06532154643 06532154643 06532154643 40000000000
10 19717524751 06760825083 06760825083 06760825083 4000000000011 20197732674 06600755775 06600755775 06600755775 4000000000012 19861587128 06712804291 06712804291 06712804291 40000000000
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
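Both tables can be regenerated with a short script, which also makes the two schemes easy to compare side by side. This is a sketch, not the authors' spreadsheet: it assumes the hierarchical site means A links to B, C and D and each of those links only back to A, it tracks a single value for B, C and D because they are always equal, and the stopping rule (successive values agreeing to within `tol`) is our own choice, so the iteration counts need not match the tables exactly.

```python
D = 0.85  # damping factor used throughout the document

def step(a, b, accelerated):
    """One simultaneous update of PR(A) and of PR(B) = PR(C) = PR(D)."""
    coeff_a = (1 - D) * a if accelerated else (1 - D)
    coeff_b = (1 - D) * b if accelerated else (1 - D)
    new_a = coeff_a + D * 3 * b  # B, C and D each pass their whole PR to A
    new_b = coeff_b + D * a / 3  # A splits its PR across three links
    return new_a, new_b

def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Iterate from the all-ones start until both values move by less than tol."""
    a, b = 1.0, 1.0
    for n in range(1, max_iter + 1):
        new_a, new_b = step(a, b, accelerated)
        if abs(new_a - a) < tol and abs(new_b - b) < tol:
            return new_a, new_b, n
        a, b = new_a, new_b
    return a, b, max_iter

std_a, std_b, std_n = iterate(accelerated=False)
fast_a, fast_b, fast_n = iterate(accelerated=True)
print("fixed co-efficient:       ", round(std_a, 6), round(std_b, 6), std_n)
print("recalculated co-efficient:", round(fast_a, 6), round(fast_b, 6), fast_n)
```

As in the tables, the recalculated scheme needs far fewer iterations; note that it settles on 2.0 and 0.6667 rather than 1.9189 and 0.6937, so the values it produces are close to, but not identical with, those of the fixed co-efficient scheme.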
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because page B meets some criteria we define (e.g. it has the correct amount of text). So we could set page B to
$$(0.4n + 0.6) \times 0.15$$
as long as we set the other pages to
$$0.6 \times 0.15$$
i.e. $(0.4n + 0.6) + 0.6(n - 1) = n$, where n is the number of pages in the site.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
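The effect of uneven co-efficients can be reproduced numerically. The sketch below assumes that the "previous example" is the four-page hierarchical site (A links to B, C and D, which each link back to A) with d = 0.85; the function and variable names are ours, and the co-efficients are the ones given above, (0.4n + 0.6) × 0.15 for Page B and 0.6 × 0.15 for the rest.

```python
def pagerank_with_coefficients(links, coefficients, d=0.85, iterations=200):
    """PageRank iteration in which each page has its own normalization co-efficient."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        # The comprehension reads the old values and only then rebinds pr.
        pr = {
            page: coefficients[page]
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr

# Assumed structure: home page A, with B ("About Us"), C and D linking back to it.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(links)
d = 0.85

uniform = {page: 1 - d for page in links}            # the usual (1 - d) everywhere
favour_b = {page: 0.6 * (1 - d) for page in links}   # lowered for A, C and D
favour_b["B"] = (0.4 * n + 0.6) * (1 - d)            # raised for Page B

baseline = pagerank_with_coefficients(links, uniform)
boosted = pagerank_with_coefficients(links, favour_b)
for page in links:
    print(page, round(baseline[page], 4), "->", round(boosted[page], 4))
```

Because the co-efficients still add up to n × 0.15, the site total stays at 4 and Page B's gain comes out of the other pages.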
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
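A rough sketch of the same idea in Python follows. The site layout is an assumption made for illustration (home page A links to the list page B, B links to C and D, and both link back to A); only the co-efficients 0.06 and 0.18 come from the text, and `pagerank` is a hypothetical helper, not Google's implementation.

```python
def pagerank(links, coeffs, d=0.85, iterations=300):
    """Iterate PR(p) = coeffs[p] + d * sum(PR(q) / C(q)) over the pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coeffs[page]
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


# Hypothetical site: A -> B (the list of links), B -> C and D, C -> A, D -> A.
links = {"A": ["B"], "B": ["C", "D"], "C": ["A"], "D": ["A"]}

uniform = {page: 0.15 for page in links}   # the usual (1 - d) for every page
lowered = {page: 0.18 for page in links}   # everyone else gets a little more...
lowered["B"] = 0.06                        # ...and the list page gets less

before = pagerank(links, uniform)
after = pagerank(links, lowered)
```

With these assumed links the list page B ends up slightly lower and the pages it points to slightly higher, while the site total remains 4.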
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.
Let's write down the elements of the matrix of links: $D = (a_{km})$, $k = 1, \dots, n$, $m = 1, \dots, n$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to the Web site pages ($n$ is the number of pages).
The components $a_{km}$, which are at the intersection of row $k$ and column $m$ of the matrix of links, give the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …     Tn     Number of links off the page
T1    a11    a12    …     a1n    Cr(T1)
T2    a21    a22    …     a2n    Cr(T2)
Tn    an1    an2    …     ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$

where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the $(I+1)$-th iteration (the estimating value of the page $T_m$, $m = 1, \dots, n$)
$PR_I(T_k)$ is the PageRank of the page $T_k$ on the $I$-th iteration (the estimating value of the page $T_k$, $k = 1, \dots, n$)
$Cr(T_k)$ is the total number of links off the page $T_k$
$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$)
$d$ is a dampening factor (usually 0.85)
$(1-d)$ is a normalization coefficient
$n$ is the number of pages on the site
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the probable complexity of the links between the pages.
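One iteration of this scheme can be written directly from the matrix of links. The sketch below is a minimal illustration, not a reference implementation: `pagerank_step` is a name chosen here, and the three-page multigraph (in which T1 links twice to T2) is a made-up example.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_next(Tm) = (1 - d) + d * sum_k a[k][m] * PR(Tk) / Cr(Tk)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk): total number of links off page Tk
    return [
        (1 - d)
        + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]


# Hypothetical multigraph: T1 links twice to T2 and once to T3; T2 and T3 link to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

pr = pagerank_step(a, [1.0, 1.0, 1.0])
```

Because T1 has three outgoing links, two of them to T2, page T2 receives two thirds of T1's contribution and T3 one third. Pages with no outgoing links are simply skipped in this sketch.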
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: a multigraph made up of subgraph 1 (pages A, B, C, D) and subgraph 2 (pages E, F, G, H, I)]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I    Number of links off the page
A                 0  1  0  0  0  0  0  0  0    Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0    Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0    Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0    Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0    Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0    Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1    Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0    Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0    Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the $PR_{i+1}(T)$ computation (on the (i+1)-th iteration) can be calculated by the formula $(1-d)\, PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology is applied to the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
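The accelerated computation can be reproduced with a short Python sketch. The matrix is the one from the diagram of the matrix of links (2); the function name `accelerated_step`, the 200-iteration cut-off and the 1e-6 threshold for "zero" are choices made for this illustration.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): row k, column m holds the number of links from page k to page m.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def accelerated_step(a, pr, d=0.85):
    """One accelerated iteration: the coefficient (1 - d) is scaled by the page's current value."""
    n = len(a)
    cr = [sum(row) for row in a]  # links off each page
    return [
        (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
first = accelerated_step(MATRIX, pr)      # corresponds to iteration 1 of the table
for _ in range(200):
    pr = accelerated_step(MATRIX, pr)

# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [PAGES[m] for m in range(len(PAGES)) if pr[m] < 1e-6]
```

Under these assumptions `external` comes out as E, F, G, H, I, and the base pages settle at 1.5 (A, C, D) and 4.5 (B), in line with the table.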
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final rows of the table, where the values no longer change.
Dividing the estimating values by their total amount (which does not change during the iteration process), we obtain a set of probability values (the sum of all the probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
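As a check, the classic computation and the division by the total can be sketched in Python. The matrix is taken from the diagram of the matrix of links (2); `classic_step` and the choice of 400 iterations are assumptions of this sketch rather than anything prescribed by the text.

```python
# Matrix of links (2): row k, column m holds the number of links from page k to page m.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]


def classic_step(a, pr, d=0.85):
    """One classic iteration with the constant normalization coefficient (1 - d)."""
    n = len(a)
    cr = [sum(row) for row in a]  # links off each page
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * 9
for _ in range(400):          # far more iterations than the 85 shown in the table
    pr = classic_step(MATRIX, pr)

total = sum(pr)               # stays at 9 throughout the iteration
probabilities = [value / total for value in pr]
```

The resulting `probabilities` list should agree with row P above to the precision shown.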
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:

Experiment outcome    A1    A2    …    An
Probabilities         p1    p2    …    pn

$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$

Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We obtain the following set of PageRank values for the pages of the site examined (the logarithms are taken to base 2):

                     A         B         C         D         E         F         G         H         I         Amount
PR                   0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows    1         1         1         1         1         1         1         1         1         1
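The PR row can be recomputed from the probabilities with a few lines of Python. This is a small check rather than part of the original spreadsheets; the probabilities are copied from row P as printed (six decimal places), so the results match only to about that precision.

```python
import math

# Probabilities for pages A..I, copied from row P.
p = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]

# One term of Shannon's formula per page: -p * log2(p).
terms = [-x * math.log2(x) for x in p]

# The sum of the terms is H(p1, ..., pn), the "Amount" column of the PR row.
entropy = sum(terms)
```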
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d)$, $0 < d < 1$ ($P$ is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
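A brief Python sketch shows what repeated application of this operator does. The starting value 0.2, the choice d = 0.85 and the function name `reinforce` are assumptions made purely for illustration.

```python
def reinforce(p, d=0.85):
    """Apply the operator Z+ once: P_next = d * P + (1 - d)."""
    return d * p + (1 - d)


p = 0.2                 # hypothetical starting estimate
history = [p]
for _ in range(50):     # 50 rewarded trials
    p = reinforce(p)
    history.append(p)
```

Each application moves the estimate a fixed fraction (1 - d) of the remaining distance towards 1, so the sequence increases steadily and approaches 1 without exceeding it.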
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale HyperTextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Hierarchical
[Excel Example 11]
Total PageRank in this site is 4
Looping
[Excel Example 12]
Total PageRank in this site is 4
Extensive Interlinking
[Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages that we want it to go to is the Webmaster's easiest way to manipulate PageRank.
In general we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that want to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
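The three closed structures can be compared with a short Python sketch. The exact link layouts of the Excel examples are not reproduced here, so the four-page layouts below are assumptions: a home page linked to and from three sub-pages (hierarchical), a single ring (looping), and every page linking to every other page (extensive interlinking).

```python
def pagerank(links, d=0.85, iterations=300):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d)
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


# Assumed four-page layouts for the three structures.
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "interlinked": {p: [q for q in "ABCD" if q != p] for p in "ABCD"},
}

results = {name: pagerank(links) for name, links in structures.items()}
totals = {name: sum(pr.values()) for name, pr in results.items()}
```

With these layouts every total is 4, but only the hierarchical structure concentrates PageRank on one page (about 1.92 for the home page against about 0.69 for each sub-page); the looping and interlinked layouts leave every page at 1.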
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical
[Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping
[Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking
[Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (it is divided amongst more pages, most of which are yours). However, as we add more pages, we also dilute the PageRank of the home page (which could be important to you).
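The ordering can be illustrated with a rough Python sketch. The placement of the inbound and outbound links in the Excel examples is not given here, so this sketch assumes a single external page X that links to the home page A and is linked from it; the figures it produces are therefore not expected to match the totals quoted above, only the ordering.

```python
def pagerank(links, d=0.85, iterations=300):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / C(q)) over the pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d)
            + d * sum(pr[q] / len(links[q]) for q in links if page in links[q])
            for page in links
        }
    return pr


def with_external(links):
    """Copy a site and add a hypothetical external page X: A -> X and X -> A."""
    site = {page: list(targets) for page, targets in links.items()}
    site["A"].append("X")
    site["X"] = ["A"]
    return site


SITE_PAGES = "ABCD"
structures = {
    "hierarchical": {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]},
    "looping": {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]},
    "interlinked": {p: [q for q in SITE_PAGES if q != p] for p in SITE_PAGES},
}

# PageRank held by the site's own pages once the external page is attached.
retained = {}
for name, links in structures.items():
    pr = pagerank(with_external(links))
    retained[name] = sum(pr[p] for p in SITE_PAGES)
```

Under this assumed model the interlinked site retains the most PageRank, the hierarchical site comes second and the looping site last, which is the ordering stated above.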
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank - now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information on the toolbar is for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim not to need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not doing so. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula

$(1-d) \times PR_i(A)$

That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
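The difference in iteration count can be sketched in Python. This is an illustration under assumptions, not a description of what Google does: the site is a hypothetical four-page hierarchy (home page A linked to and from B, C and D), and convergence is declared when no value changes by more than `tol` between iterations, a stopping rule chosen for this sketch.

```python
def run(links, accelerated, d=0.85, tol=1e-9, max_iterations=10_000):
    """Iterate until no page changes by more than tol; return (values, iterations used)."""
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iterations + 1):
        new = {}
        for page in links:
            inbound = sum(pr[q] / len(links[q]) for q in links if page in links[q])
            # Fixed co-efficient (1 - d), or (1 - d) scaled by the page's previous value.
            coefficient = (1 - d) * pr[page] if accelerated else (1 - d)
            new[page] = coefficient + d * inbound
        change = max(abs(new[page] - pr[page]) for page in links)
        pr = new
        if change < tol:
            return pr, iteration
    return pr, max_iterations


# Hypothetical hierarchical site: A links to B, C, D; each links back to A.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic, classic_iterations = run(links, accelerated=False)
fast, fast_iterations = run(links, accelerated=True)
```

In this sketch the varied co-efficient settles in noticeably fewer iterations, but on slightly different values (A at 2.0 and the sub-pages at about 0.667, against about 1.919 and 0.694 for the fixed co-efficient), which is why the result is best regarded as an approximation.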
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3      2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4      1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5      2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6      1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7      2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8      1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9      2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10     1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11     2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12     1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13     2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14     1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15     1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16     1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17     1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18     1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19     1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20     1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21     1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22     1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23     1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24     1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25     1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26     1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27     1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28     1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29     1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30     1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31     1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32     1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33     1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34     1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35     1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36     1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37     1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38     1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39     1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40     1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41     1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42     1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43     1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44     1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45     1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46     1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47     1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48     1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49     1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50     1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51     1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52     1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53     1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54     1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55     1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56     1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57     1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58     1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59     1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60     1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61     1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62     1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63     1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64     1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65     1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66     1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67     1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68     1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69     1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70     1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71     1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72     1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73     1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74     1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75     1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76     1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77     1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78     1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79     1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80     1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81     1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82     1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83     1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84     1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85     1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86     1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87     1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88     1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89     1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90     1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91     1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92     1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93     1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94     1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95     1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96     1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97     1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98     1.9189188077
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
40
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A  B  C  D  Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1  2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2  1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3  2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4  1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5  2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6  1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7  2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8  1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9  2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10  1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11  2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12  1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13  2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14  1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15  2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16  1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17  2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18  1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19  2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20  1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21  2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22  1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23  2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24  1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25  2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26  1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27  2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28  1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29  2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30  1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31  2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32  1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33  2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34  1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35  2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36  1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37  2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38  1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39  2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40  1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41  2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42  1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43  2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44  1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45  2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46  1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47  2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48  1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49  2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50  1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51  2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52  1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53  2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54  1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55  2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56  1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57  2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58  1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59  2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60  1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61  2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62  1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63  2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65  2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67  2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
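The effect of recalculating the normalization co-efficient can be checked with a short script. The sketch below is only an illustration, not Google's implementation. It assumes the example site consists of a page A that links to B, C and D, each of which links back to A (a structure we inferred from the first rows of the tables), a damping factor of 0.85, and a stopping tolerance of 1e-10 that we picked ourselves. All function and variable names are our own.

```python
# Assumed example site: A links to B, C, D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor used throughout the document


def step(pr, links, accelerated):
    """One iteration; 'accelerated' recalculates the normalization co-efficient."""
    new = {}
    for page in links:
        # Share of PageRank passed on by every page that links to 'page'.
        inbound = sum(pr[src] / len(out) for src, out in links.items() if page in out)
        # Classic: constant (1 - d). Accelerated: (1 - d) times the current value.
        norm = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = norm + D * inbound
    return new


def run(links, accelerated, tol=1e-10, max_iter=1000):
    """Iterate from PR = 1 for every page until every change is below tol."""
    pr = {page: 1.0 for page in links}
    for i in range(1, max_iter + 1):
        nxt = step(pr, links, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in links) < tol:
            return nxt, i
        pr = nxt
    return pr, max_iter


classic, classic_iters = run(LINKS, accelerated=False)
fast, fast_iters = run(LINKS, accelerated=True)
print("classic:", classic_iters, "iterations; accelerated:", fast_iters, "iterations")
```

The exact iteration counts depend on the tolerance chosen, so they need not match the 148 and 67 rows of the spreadsheets exactly, but the accelerated variant should need far fewer iterations. Note that under these assumptions the two variants settle on different limiting values (A tends to about 1.9189 in the classic scheme and to 2 in the accelerated one), just as the two tables do.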
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. the weights still total n [(0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site].
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
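Redistributing the normalization co-efficients is easy to try out. The following sketch is a hedged illustration only: it assumes a four-page site in which A links to B, C and D and each of them links back to A (the spreadsheets themselves are not reproduced here, so the structure is our assumption), and it compares the uniform co-efficient 0.15 with one set that favours Page B (0.33 for B, 0.09 for the rest, i.e. n = 4 in the formulas above) and one set that demotes it (0.06 for B, 0.18 for the rest). The function name and dictionaries are our own.

```python
# Assumed four-page site: A links to B, C, D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def pagerank(links, coeff, iterations=200):
    """Classic iteration, but with a separate normalization co-efficient per page."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coeff[page]
            + D * sum(pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}
# Favour Page B: (0.4n + 0.6) * 0.15 for B, 0.6 * 0.15 for every other page.
favour_b = {page: 0.6 * (1 - D) for page in LINKS}
favour_b["B"] = (0.4 * n + 0.6) * (1 - D)
# Demote Page B (the directory-style page): 0.06 for B, 0.18 for the rest.
demote_b = {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18}

results = {}
for name, coeff in [("uniform", uniform), ("favour B", favour_b), ("demote B", demote_b)]:
    results[name] = pagerank(LINKS, coeff)
    print(name, {page: round(value, 4) for page, value in results[name].items()})
```

In each case the co-efficients add up to 0.6 (that is, 0.15 per page), so the total PageRank of the site stays at 4; only its distribution between Page B and the other pages changes.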
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$).
What does that mean? Let us explain the designations:
$T_1, T_2, \ldots, T_n$ correspond to Web site pages ($n$ is the number of pages).
The components $a_{km}$, which are at the intersection of the $k$-th row and the $m$-th column of the matrix of links, give the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    …    Tn    Number of links off the page
T1    a11   a12   …    a1n   Cr(T1)
T2    a21   a22   …    a2n   Cr(T2)
Tn    an1   an2   …    ann   Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \ldots, n$);
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \ldots, n$);
$Cr(T_k)$ is the total number of links off the page $T_k$;
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$);
$d$ is a damping factor (usually 0.85);
$(1 - d)$ is a normalization coefficient;
$n$ is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values for the pages of a Web site while taking into account the links between its pages, however complex they may be.
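To make the recurrence concrete, here is a minimal sketch of one iteration written directly in terms of the matrix of links. The three-page matrix is a hypothetical example of our own (not one of the document's spreadsheets), chosen so that one page links to another twice, which is what makes the site a multigraph. The function name and the guard for pages without outbound links are likewise our own assumptions.

```python
def pagerank_step(a, pr, d=0.85):
    """Apply the recurrence once; a[k][m] is the number of links off page k to page m."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk): total number of links off each page
    return [
        # Pages with no outbound links (Cr = 0) pass nothing on and are skipped.
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical three-page site; T1 links to T2 twice, hence a multigraph.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 1, 0],
]
first = pagerank_step(a, [1.0, 1.0, 1.0])
pr = first
for _ in range(50):
    pr = pagerank_step(a, pr)
print([round(value, 4) for value in first])
print([round(value, 4) for value in pr])
```

Because every page in this small example has at least one outbound link, the total of the estimating values stays equal to the number of pages (3) after every iteration.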
Take a look at the following example. It is the multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
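[Figure: the multigraph of the site, with pages A, B, C, D, E, F, G, H, I as nodes arranged in two subgraphs labelled 1 and 2]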
DIAGRAM OF THE MATRIX OF LINKS (2)
Page              A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on the $i$-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration $i + 1$) can then be calculated by the formula $(1 - d)\, PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology is applied to the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A  B  C  D  E  F  G  H  I  Amount
Start  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1  1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2  1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3  1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4  1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5  1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6  1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7  1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8  1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9  1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10  1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11  1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12  1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13  1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14  1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15  1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16  1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17  1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18  1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19  1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20  1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21  1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22  1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23  1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24  1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25  1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26  1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27  1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28  1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29  1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30  1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31  1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32  1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33  1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34  1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35  1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36  1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
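The recognition of external pages can be reproduced from the matrix of links (2). The sketch below is an illustration under our own assumptions: the function names, the fixed count of 300 iterations and the threshold of 1e-9 used to decide that a limiting value is "zero" are ours and do not come from the spreadsheets.

```python
PAGES = "ABCDEFGHI"
# Matrix of links (2): row k holds the links off page k to each page in PAGES.
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85  # damping factor


def step(pr, accelerated):
    """One iteration of the classic or the accelerated scheme."""
    cr = [sum(row) for row in A_KM]  # number of links off each page
    new = []
    for m in range(len(PAGES)):
        inbound = sum(A_KM[k][m] * pr[k] / cr[k] for k in range(len(PAGES)))
        norm = (1 - D) * pr[m] if accelerated else (1 - D)
        new.append(norm + D * inbound)
    return new


def limit(accelerated, iterations=300):
    """Iterate from 1.0 for every page and return the values by page name."""
    pr = [1.0] * len(PAGES)
    for _ in range(iterations):
        pr = step(pr, accelerated)
    return dict(zip(PAGES, pr))


accelerated = limit(accelerated=True)
# Pages whose accelerated limiting value is (numerically) zero are external.
external = [page for page, value in accelerated.items() if value < 1e-9]
print("external pages:", external)
print({page: round(value, 6) for page, value in accelerated.items()})
```

With these assumptions the script should single out E, F, G, H and I as external, while the base pages keep the whole amount of 9 between them.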
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A  B  C  D  E  F  G  H  I  Amount
Start  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1  1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2  1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3  1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4  1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5  1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6  1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7  1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8  1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9  1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10  1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11  1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12  1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13  1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14  1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15  1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16  1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17  1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18  1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19  1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20  1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21  1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22  1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23  1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24  1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25  1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26  1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27  1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28  1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29  1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30  1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31  1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32  1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33  1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34  1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35  1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36  1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37  1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38  1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39  1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40  1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41  1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42  1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43  1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44  1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45  1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46  1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47  1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48  1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49  1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50  1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51  1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52  1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53  1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54  1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55  1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56  1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57  1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58  1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59  1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60  1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61  1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62  1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63  1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64  1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65  1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66  1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67  1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68  1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69  1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70  1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71  1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72  1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73  1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80  1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.
Having divided the estimating values by their total amount, which is unchanged throughout the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one):
     A  B  C  D  E  F  G  H  I  Amount
P  0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome   A1  A2  …  An
Probabilities        p1  p2  …  pn
$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages (logarithms are to be calculated to base 2). We obtain the following set of PageRank values for the pages of the site examined:
     A  B  C  D  E  F  G  H  I  Amount
PR  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows  1  1  1  1  1  1  1  1  1  1
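The P and PR rows can be checked in a few lines. This is a sketch using the limiting values of the classic computation as quoted in this document, rounded to six decimal places; the variable names are our own.

```python
import math

# Limiting values from the classic computation, pages A to I.
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(limits)
# Probability values: each estimating value divided by the total amount.
probabilities = [value / total for value in limits]
# One term of Shannon's formula per page: -p * log2(p).
terms = [-p * math.log2(p) for p in probabilities]

print([round(p, 6) for p in probabilities])
print([round(t, 6) for t in terms], round(sum(terms), 3))
```

The sum of the terms is the Shannon entropy of the distribution, in bits, which is the figure shown in the Amount column of the PR row.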
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$$ where $P$ is the estimating value.
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
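As a small illustration of the operator, the sketch below applies it repeatedly and compares the result with the closed form that follows from the recurrence. The value d = 0.85 and the starting estimate of 0.2 are our own choices for the example; the text above only requires 0 < d < 1.

```python
def z_plus(p, d=0.85):
    """Encouragement operator: P(n+1) = d * P(n) + (1 - d)."""
    return d * p + (1 - d)


p0 = 0.2  # hypothetical starting estimate
history = [p0]
for _ in range(10):
    history.append(z_plus(history[-1]))

# Closed form for comparison: P(n) = 1 - d**n * (1 - P(0)).
closed_form = 1 - 0.85 ** 10 * (1 - p0)
print(round(history[-1], 6), round(closed_form, 6))
```

Each application moves the estimate a fixed fraction of the remaining distance towards 1, so repeated encouragement drives it towards 1 without ever exceeding it.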
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thank-you to the people mentioned for their feedback. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Extensive Interlinking [Excel Example 13]
Total PageRank in this site is 4
Notice something that is critically important. Because there are no inbound or outbound links, we are restricted to having the same total PageRank across all pages in our example site (4). However, introducing a hierarchical-like structure can alter the distribution of the PageRank. (Note: it does not have to be a strict hierarchy, but it must contain more of the properties of the hierarchical structure than of the looping or extensive interlinking structures.) This ability to shift PageRank to the pages where we want it is the Webmaster's easiest way to manipulate PageRank.
In general, we can state this as the following rule:
It is rare that a site will want to have an entirely looping or an entirely extensively interlinked structure. By using elements of a hierarchical structure, a Webmaster is able to shift PageRank to the pages of his or her choice. If PageRank was spread out evenly across all pages of the site, then the Webmaster is very close to obtaining maximum benefit.
The maximum benefit from PageRank is derived when this methodology is applied towards pages that need to rank high for highly competitive keyword phrases, or towards pages that must compete on a large number of keyword phrases. To do this, the Webmaster must move PageRank away from pages that do not need to rank as highly, or from pages that can rank well on the strength of keyword phrases alone.
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank, or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non-mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:

$(1-d)\,PR_i(A)$

That is, we are replacing the (1-d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 iterations. This is known as the "accelerated technology of the computation of the Page Rank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
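The two tables above can be reproduced with a short script. What follows is only a minimal sketch, assuming the hierarchical structure is a home page A that links to B, C and D, each of which links back to A only (an assumption that matches the first rows of both tables), and assuming d = 0.85. The function name `iterate` and the stopping tolerance are illustrative choices of ours, not anything Google has published.

```python
# Assumed hierarchical site: A links to B, C, D; each of B, C, D links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # dampening factor used throughout this document


def iterate(links, accelerated, tol=1e-10, max_iter=1000):
    """Repeat the PageRank update until no value changes by more than tol.

    accelerated=False uses the fixed normalisation co-efficient (1 - d).
    accelerated=True uses (1 - d) * PR_i(page), as in the second table.
    Returns (final values, number of iterations performed).
    """
    pr = {page: 1.0 for page in links}
    for count in range(1, max_iter + 1):
        new = {}
        for page in links:
            # PageRank passed on by every page that links to this one.
            incoming = sum(
                pr[src] / len(out) for src, out in links.items() if page in out
            )
            base = (1 - D) * pr[page] if accelerated else (1 - D)
            new[page] = base + D * incoming
        change = max(abs(new[p] - pr[p]) for p in links)
        pr = new
        if change < tol:
            return pr, count
    return pr, max_iter


classic, classic_steps = iterate(LINKS, accelerated=False)
fast, fast_steps = iterate(LINKS, accelerated=True)
print(classic_steps, {p: round(v, 6) for p, v in classic.items()})
print(fast_steps, {p: round(v, 6) for p, v in fast.items()})
```

The exact iteration counts depend on the tolerance chosen, so they need not equal 148 and 67. What carries over is that the accelerated scheme settles in far fewer iterations and that the total stays at 4. Note also that the two schemes settle on slightly different values: A ends at about 1.919 in the first table and at 2.000 in the second.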
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to

$(0.4n + 0.6) \times 0.15$

as long as we set the other pages to

$0.6 \times 0.15$

i.e. [$0.4n + 0.6 + 0.6(n-1) = n$, where n is the number of pages in the site], so the co-efficients still add up to $n \times 0.15$.

This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
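The effect of uneven normalization co-efficients can be checked numerically. The sketch below assumes the "previous example" is the four-page hierarchical structure (A links to B, C and D, which each link back to A) and that d = 0.85; the function name `pagerank` and the dictionary names are hypothetical, chosen only for this illustration.

```python
# Assumed structure: A links to B, C, D; each of B, C, D links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85


def pagerank(links, coeff, iterations=200):
    """Iterate PR(page) = coeff[page] + d * (PageRank received from linking pages)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coeff[page]
            + D * sum(pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
    return pr


n = len(LINKS)
# Usual case: every page gets the same co-efficient (1 - d).
uniform = {page: 1 - D for page in LINKS}
# Favour Page B: (0.4n + 0.6) * 0.15 for B and 0.6 * 0.15 for every other page.
favour_b = {page: 0.6 * (1 - D) for page in LINKS}
favour_b["B"] = (0.4 * n + 0.6) * (1 - D)

normal = pagerank(LINKS, uniform)
boosted = pagerank(LINKS, favour_b)
for page in LINKS:
    print(page, round(normal[page], 4), round(boosted[page], 4))
```

Under these assumptions the co-efficients still add up to n × 0.15, so the total PageRank of the site is unchanged; Page B gains and the other three pages lose.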
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
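As a rough check of the directory idea, the sketch below lowers Page B's co-efficient to 0.06 and raises the others to 0.18, as in the text. It again assumes the four-page hierarchical structure (A links to B, C and D, which each link back to A) and d = 0.85; the helper `settle` is a hypothetical name used only here.

```python
# Assumed structure: A links to B, C, D; each of B, C, D links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85


def settle(coeff, iterations=200):
    """Run the PageRank update with a per-page normalization co-efficient."""
    pr = dict.fromkeys(LINKS, 1.0)
    for _ in range(iterations):
        received = {
            page: sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
        pr = {page: coeff[page] + D * received[page] for page in LINKS}
    return pr


normal = settle({"A": 0.15, "B": 0.15, "C": 0.15, "D": 0.15})
demoted = settle({"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})
for page in LINKS:
    print(page, round(normal[page], 4), round(demoted[page], 4))
```

With these assumed inputs Page B ends lower than in the uniform case while A, C and D end higher, and the site total is still 4.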
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$, $(k = 1, \dots, n)$, $(m = 1, \dots, n)$.
What does that mean? Let us explain the designations:
T1, T2, ..., Tn correspond to Web site pages (n is the number of pages).
Components a_km, which are at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     ...   Tn     Number of links off the page
T1    a_11   a_12   ...   a_1n   Cr(T1)
T2    a_21   a_22   ...   a_2n   Cr(T2)
Tn    a_n1   a_n2   ...   a_nn   Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

Where:
PR_{I+1}(Tm) is the PageRank of the page Tm on the (I+1)-th iteration (the estimating value of the page Tm (m = 1, ..., n)).
PR_I(Tk) is the PageRank of the page Tk on the I-th iteration (the estimating value of the page Tk (k = 1, ..., n)).
Cr(Tk) is the total number of links off the page Tk.
Cm(Tk) = a_km is the number of individual links off the page Tk to the page Tm (k = 1, ..., n), (m = 1, ..., n).
d is a dampening factor (usually 0.85).
(1-d) is a normalization coefficient.
n is the number of pages on the site.
This recursive computational scheme allows us to take into consideration, and then compute, the estimating values of the pages of a Web site, paying attention to the probable complexity of links between pages.
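The computation scheme translates almost directly into code. The following is a minimal sketch, not Google's implementation: `pagerank_step` is a hypothetical helper name, and the three-page matrix is an invented multigraph (page 0 links twice to page 1) used only to show that a_km may exceed 1.

```python
def pagerank_step(a, pr, d=0.85):
    """Apply one iteration of the scheme.

    a[k][m] is the number of links off page k to page m (the matrix of links),
    pr[k] is the estimating value of page k from the previous iteration.
    """
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical three-page multigraph: page 0 links twice to page 1 and once
# to page 2; pages 1 and 2 each link back to page 0.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

first = pagerank_step(a, [1.0, 1.0, 1.0])
pr = first
for _ in range(99):
    pr = pagerank_step(a, pr)
print([round(v, 4) for v in first])
print([round(v, 4) for v in pr])
```

Because page 1 receives two of page 0's three links, it ends up with a higher estimating value than page 2, while the total over all pages stays equal to n.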
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, with pages A, B, C, D forming subgraph 1 and pages E, F, G, H, I forming subgraph 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the PR_{i+1}(T) computation (on the (i+1)-th iteration) can be calculated by the formula (1-d) PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, I, H are external in relation to the base pages A, B, C, D.
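The recognition of external pages can be sketched in a few lines, using the matrix of links (2) given above and d = 0.85. The names `accelerated_step` and `A_KM`, the choice of 60 iterations and the cut-off of 1e-6 for "zero" are our own illustrative assumptions.

```python
PAGES = "ABCDEFGHI"
# Matrix of links (2): A_KM[k][m] is the number of links off page k to page m.
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85


def accelerated_step(a, pr):
    """One accelerated iteration: the coefficient (1 - d) is scaled by PR_i(T)."""
    n = len(a)
    cr = [sum(row) for row in a]  # total number of links off each page
    return [
        (1 - D) * pr[m] + D * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


first = accelerated_step(A_KM, [1.0] * len(PAGES))
pr = first
for _ in range(59):
    pr = accelerated_step(A_KM, pr)

# Pages whose limiting value is (numerically) zero are treated as external.
external = [page for page, value in zip(PAGES, pr) if value < 1e-6]
print(external)
print({page: round(value, 6) for page, value in zip(PAGES, pr)})
```

Under these assumptions the pages reported as external are E, F, G, H and I, and the base pages settle at the same limiting values as in the table (1.5 for A, C and D, and 4.5 for B).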
Let's calculate the estimating values for the pages of the Web site examined, with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total amount, which does not change during the iteration process, we obtain a set of probability values (the sum of all probability values is equal to one):
                   A         B         C         D         E         F         G         H         I         Amount
P                  0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we would like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome:  A1, A2, …, An
Probabilities:       p1, p2, …, pn
H(p1, p2, …, pn) = - p(A1) log p(A1) - p(A2) log p(A2) - … - p(An) log p(An)
Let's calculate the individual terms -p log p for the set of probability values that correspond to the site pages. We receive the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows  1         1         1         1         1         1         1         1         1         1
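The two steps above (normalising to probabilities, then taking the -p log2 p terms of Shannon's formula) can be sketched in a few lines of Python. This is only an illustration: the input values are the limiting values from the last row of the classic computation table, read with the decimal point after the first digit (an assumption about the printed layout), and the names final_values, to_probabilities and shannon_terms are ours, not the authors'.

```python
import math

# Limiting estimating values from the classic computation (iteration 85),
# assuming the decimal point follows the first digit.
final_values = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441,
    "H": 0.179318, "I": 0.179318,
}


def to_probabilities(values):
    """Divide every estimating value by the total so the results sum to one."""
    total = sum(values.values())
    return {page: value / total for page, value in values.items()}


def shannon_terms(probabilities):
    """Return the individual -p * log2(p) terms of Shannon's formula."""
    return {page: -p * math.log2(p) for page, p in probabilities.items()}


probabilities = to_probabilities(final_values)
terms = shannon_terms(probabilities)
entropy = sum(terms.values())  # the "Amount" of the PR row

for page in final_values:
    print(f"{page}: P = {probabilities[page]:.6f}, PR = {terms[page]:.6f}")
print(f"Sum of terms: {entropy:.3f}")
```

The per-page terms should agree with the PR row above to within rounding of the inputs.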
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_(n+1) = Z+(P_n) = d P_n + (1 - d), 0 < d < 1 (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
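To see what repeated encouragement does under this operator, here is a minimal Python sketch. The starting value 0.2, the 50 steps and the function names are illustrative assumptions, not taken from Bush and Mosteller; d = 0.85 mirrors the damping factor used elsewhere in this document.

```python
def reward_operator(p, d=0.85):
    """One application of Z+: P_(n+1) = d * P_n + (1 - d)."""
    return d * p + (1.0 - d)


def apply_repeatedly(p0, steps, d=0.85):
    """Apply Z+ `steps` times, returning every intermediate value."""
    history = [p0]
    for _ in range(steps):
        history.append(reward_operator(history[-1], d))
    return history


# Hypothetical starting estimate of 0.2, rewarded 50 times in a row.
history = apply_repeatedly(p0=0.2, steps=50)
print(history[:4])
print(history[-1])
```

Because 0 < d < 1, each application closes a fixed fraction (1 - d) of the remaining gap to 1, so the value rises steadily towards the fixed point P = 1.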
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' Credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Closed systems (with no inbound and outbound links) are one thing, but let's take a look at what happens when we add an inbound and an outbound link.
Hierarchical [Excel Example 14]
Total PageRank in this site: 3.2220942408
Looping [Excel Example 15]
Total PageRank in this site: 2.6641737744
Extensive Interlinking [Excel Example 16]
Total PageRank in this site: 3.5683429515
When a system is non-closed, as most are, then we can say that the following holds true:
The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.
Please keep in mind that this is just a principle. In practice you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is of course straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank, now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non-mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the final value of the previous computation, using the formula:
(1 - d) PR_i(A)
That is, we are replacing the (1 - d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
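The comparison between the fixed and the recalculated co-efficient can be reproduced with a short Python sketch. It assumes the hierarchical structure is the four-page site in which A links to B, C and D and each of those links back to A (this assumption reproduces the first rows of both tables), and it stops when no value changes by more than a tolerance of our own choosing, so its iteration counts need not equal the 148 and 67 quoted above. The function and variable names are illustrative.

```python
def iterate_pagerank(links, d=0.85, accelerated=False, tol=1e-10, max_iter=1000):
    """Iterate PageRank until no page changes by more than `tol`.

    With accelerated=False every page gets the fixed co-efficient (1 - d).
    With accelerated=True the co-efficient is (1 - d) * PR_i(page), i.e. it
    is recalculated from the previous iteration, as described in the text.
    """
    pr = {page: 1.0 for page in links}
    for step in range(1, max_iter + 1):
        new_pr = {}
        for page in links:
            # PageRank passed on by every page that links to `page`.
            inbound = sum(pr[src] / len(links[src])
                          for src in links if page in links[src])
            base = (1 - d) * pr[page] if accelerated else (1 - d)
            new_pr[page] = base + d * inbound
        change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if change < tol:
            return pr, step
    return pr, max_iter


# Assumed hierarchical structure: home page A, sub-pages B, C, D.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic, classic_steps = iterate_pagerank(links, accelerated=False)
accelerated, accelerated_steps = iterate_pagerank(links, accelerated=True)

print("classic:    ", classic_steps, "iterations,", classic)
print("accelerated:", accelerated_steps, "iterations,", accelerated)
```

In this sketch the accelerated variant needs fewer iterations, but it settles on A = 2 and B = C = D = 2/3 rather than on the classic values of roughly 1.9189 and 0.6937, which is why the text speaks of an approximation.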
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to:

(0.4n + 0.6) × 0.15

as long as we set the other pages to:

0.6 × 0.15

i.e. the co-efficients still total 0.15n, because (0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site.
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:

Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages - simply because we've introduced a new arbitrary factor called amount of text.
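The effect can be reproduced with a minimal Python sketch. It assumes, purely for illustration, a four-page site in which page A links to B, C and D and each of those links back to A; this structure and the function name `pagerank_with_coefficients` are assumptions and are not taken from the spreadsheet. With n = 4 the co-efficients become (0.4n + 0.6) × 0.15 = 0.33 for Page B and 0.6 × 0.15 = 0.09 for the others.

```python
def pagerank_with_coefficients(links, coeff, d=0.85, iterations=200):
    """Iterate PR(p) = coeff[p] + d * sum(PR(q) / outlinks(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            inbound = sum(pr[src] / len(targets)
                          for src, targets in links.items() if page in targets)
            new_pr[page] = coeff[page] + d * inbound
        pr = new_pr
    return pr

# Assumed structure (hypothetical): A links to B, C, D; each links back to A.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
n = len(links)

# Usual case: every page gets the same co-efficient (1 - d) = 0.15.
uniform = {page: 0.15 for page in links}

# Favoured case: Page B gets (0.4n + 0.6) * 0.15, the rest get 0.6 * 0.15.
favoured = {page: 0.6 * 0.15 for page in links}
favoured["B"] = (0.4 * n + 0.6) * 0.15

plain = pagerank_with_coefficients(links, uniform)
boosted = pagerank_with_coefficients(links, favoured)
for page in links:
    print(page, round(plain[page], 4), round(boosted[page], 4))
```

In this sketch the total PageRank of the site is the same in both runs; only its distribution between the pages changes.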
PageRank not only considers links, but it may also consider certain other factors.

There is, of course, no reason why we should stop at boosting a single page's PageRank - it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.

We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.

This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).

You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.

Let's write down the elements of the matrix of links: $a_{km}$, $(k = 1, \dots, n)$, $(m = 1, \dots, n)$.

What does that mean? Let us explain the designations:

$T_1, T_2, \dots, T_n$ correspond to Web site pages (n is the number of pages).

Components $a_{km}$, which are on the intersection of the k-th row and the m-th column of the matrix of links, determine the number of links off $T_k$ to $T_m$.

$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
        T1     T2     ...    Tn     Number of links off the page
T1      a11    a12    ...    a1n    Cr(T1)
T2      a21    a22    ...    a2n    Cr(T2)
...
Tn      an1    an2    ...    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

Where:

$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on the (I+1)-th iteration (the estimating value of the page $T_m$, m = 1…n).

$PR_I(T_k)$ is the PageRank of the page $T_k$ on the I-th iteration (the estimating value of the page $T_k$, k = 1…n).

$Cr(T_k)$ is the total number of links off the page $T_k$.

$Cm(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1…n), (m = 1…n).

d is a dampening factor (usually 0.85).

(1-d) is a normalization coefficient.

n is the number of pages on the site.

This recursive computational scheme allows us to compute the estimating values of the pages of a Web site while taking into account the probable complexity of the links between pages.
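As a rough illustration of the scheme above, the following Python sketch performs one iteration directly from a matrix of links. The three-page multigraph (T1 links twice to T2 and once to T3; T2 and T3 each link back to T1) and the function name `pagerank_step` are made up for the example and do not come from the paper.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_new(Tm) = (1 - d) + d * sum_k a[k][m] * PR(Tk) / Cr(Tk)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk): total number of links off page Tk
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]

# Hypothetical matrix of links: a[k][m] is the number of links off Tk to Tm.
a = [
    [0, 2, 1],  # T1 links twice to T2 and once to T3, so Cr(T1) = 3
    [1, 0, 0],  # T2 links once to T1
    [1, 0, 0],  # T3 links once to T1
]

pr = [1.0, 1.0, 1.0]       # starting estimating values
pr = pagerank_step(a, pr)  # values after the first iteration
print(pr)
```

Because T1 links twice to T2, T2 receives two thirds of the PageRank that T1 passes on, which is the multigraph case that the coefficient $a_{km}$ captures.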
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.

MULTIGRAPH OF A SIMPLE WEB SITE

You may have already noticed that every page of the second subgraph links to every page of the first subgraph.

Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.

[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.

Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.

Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the $PR_{i+1}(T)$ computation (on the (i+1)-th iteration) can be calculated by the formula $(1-d)\,PR_i(T)$.

When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.

This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:

The external pages in relation to the base pages are ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.

ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
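The recognition of external pages can be sketched in Python using the matrix of links for pages A to I. This is only an illustrative sketch: the function name `iterate`, the choice of 200 iterations and the 1e-6 threshold used to treat a limiting value as zero are assumptions of the example, not part of the method as stated.

```python
PAGES = "ABCDEFGHI"

# Matrix of links for the example site: A_KM[k][m] is the number of links off page k to page m.
A_KM = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def iterate(a, iterations, accelerated, d=0.85):
    """Run the classic scheme, or the accelerated one with coefficient (1 - d) * PR_i(T)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk) for every page
    pr = [1.0] * n
    for _ in range(iterations):
        # The whole list is built from the previous values before pr is reassigned.
        pr = [
            (1 - d) * (pr[m] if accelerated else 1.0)
            + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return pr

accelerated = iterate(A_KM, 200, accelerated=True)

# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page, value in zip(PAGES, accelerated) if value < 1e-6]
print(external)
```

Running `iterate(A_KM, 1, accelerated=False)` gives the first row of the classic computation, which coincides with the first accelerated row because every starting value is 1.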
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.

Having divided the estimating values by their total amount, which is unchanged by the iteration process, we receive the set of probability values (the sum of all probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

Running a parallel demonstration, we'd like to use the famous classical Shannon's formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the set probabilities:

Experiment outcome   A1   A2   ...   An
Probabilities        p1   p2   ...   pn

$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$

Let's calculate the terms of this formula for the set of probability values which correspond to the site pages. We receive the following set of values of PageRank for the pages of the site examined (logarithms are calculated to base 2):

                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
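The two tables above can be reproduced with a few lines of Python. This is a small sketch that starts from the limiting values of the classic computation as printed; the variable names are ours.

```python
import math

# Limiting estimating values of pages A..I from the classic computation.
limiting = [1.390055, 3.845218, 1.390055, 1.390055,
            0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

# Divide by the total amount to obtain probabilities that sum to one.
total = sum(limiting)
probabilities = [value / total for value in limiting]

# Individual terms -p * log2(p) of Shannon's formula, one per page.
terms = [-p * math.log2(p) for p in probabilities]

# Their sum is the "Amount" column of the PR row.
entropy = sum(terms)

for page, p, term in zip("ABCDEFGHI", probabilities, terms):
    print(page, round(p, 6), round(term, 6))
print("Amount", round(entropy, 3))
```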
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.

Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

(P is the estimating value.)

The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
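As a short worked step (added for illustration, and following only from the recurrence for $Z^{+}$ itself), repeated application of the operator has a simple closed form, which shows that each encouragement moves the estimating value a fixed fraction of the way towards 1:

```latex
\begin{aligned}
1 - P_{n+1} &= 1 - d\,P_n - (1 - d) = d\,(1 - P_n) \\
\Rightarrow\quad P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1
\quad \text{as } n \to \infty, \text{ since } 0 < d < 1 .
\end{aligned}
```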
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:

PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a

Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm

The anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html

The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org

"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."

Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Extensive Interlinking
[Excel Example 16]

Total PageRank in this site: 3.5683429515

When a system is non-closed, as most are, then we can say that the following holds true:

The Extensive Interlinking strategy retains the most PageRank within the site. Following this is the Hierarchical strategy, and lastly the Looping strategy.

Please keep in mind that this is just a principle. In practice, you will be unlikely to extensively interlink 10,000 pages. You must also choose the appropriate structure for sub-sections of your site. As we add more pages into the hierarchical structure, it becomes more successful. This is purely because more links are being added to the page that contains the outbound link (its PageRank is divided amongst more pages, most of which are yours). However, as we add more pages we also dilute the PageRank of the home page (which could be important to you).
What Google says
The best information about PageRank is, of course, straight from the horse's mouth. We therefore felt it would be helpful to ask Google themselves about PageRank. Obviously, we realize Google won't provide us with the exact specifics about PageRank or comment on the validity of what we've written. However, we did want to ask them some general questions to provide us with an overview of how they see PageRank - now and in the future. We hope this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.

Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms to its ranking formula.

Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.

Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.

Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.

Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.

In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim to not need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.

The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.

To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles, it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just like you remember to create good titles and anchor text, you should remember to consider PageRank.

The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site - whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.

It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises - "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page, depending upon the final value of the previous computation, using the formula:

$(1-d)\,PR_i(A)$

That is, we are replacing the (1-d) of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
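The substitution can be sketched in Python as follows. The link structure used here (page A links to B, C and D, each of which links back to A) is an assumption inferred from the values in the tables for this example, and the helper names `step` and `iterate` are ours.

```python
def step(pr, links, d=0.85, accelerated=False):
    """One iteration; the accelerated variant uses (1 - d) * PR_i(page) instead of (1 - d)."""
    new_pr = {}
    for page in pr:
        coefficient = (1 - d) * (pr[page] if accelerated else 1.0)
        inbound = sum(pr[src] / len(targets)
                      for src, targets in links.items() if page in targets)
        new_pr[page] = coefficient + d * inbound
    return new_pr

def iterate(links, iterations, accelerated=False, d=0.85):
    """Start every page at 1 and apply `step` the given number of times."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = step(pr, links, d, accelerated)
    return pr

# Assumed hierarchical structure: A links to B, C, D; each links back to A.
links = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}

classic_1 = iterate(links, 1)                    # first row of the fixed co-efficient table
accel_67 = iterate(links, 67, accelerated=True)  # accelerated values after 67 iterations
print(classic_1)
print(accel_67)
```

In this sketch the accelerated values settle on 2 for page A and 2/3 for the other pages, while the total of 4 is preserved at every step.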
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.

With this very simple structure, we arrive at the final values like this:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3      2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4      1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5      2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6      1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7      2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8      1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9      2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10     1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11     2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12     1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13     2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14     1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15     1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16     1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17     1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18     1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19     1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20     1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21     1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22     1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23     1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24     1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25     1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26     1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27     1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28     1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29     1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30     1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31     1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32     1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33     1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34     1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35     1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36     1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37     1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38     1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39     1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40     1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41     1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42     1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43     1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44     1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45     1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46     1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47     1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48     1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49     1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50     1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51     1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52     1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53     1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54     1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55     1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56     1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57     1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58     1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59     1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60     1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61     1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62     1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63     1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64     1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65     1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66     1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67     1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68     1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69     1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70     1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71     1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72     1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73     1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74     1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75     1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76     1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77     1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78     1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79     1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80     1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81     1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82     1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83     1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84     1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85     1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86     1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87     1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88     1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89     1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90     1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91     1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92     1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93     1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94     1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95     1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96     1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97     1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98     1.9189188077
06936937308 06936937308 06936937308 4000000000099 19189190135 06936936622 06936936622 06936936622 40000000000
100 19189188385 06936937205 06936937205 06936937205 40000000000101 19189189872 06936936709 06936936709 06936936709 40000000000102 19189188608 06936937131 06936937131 06936937131 40000000000103 19189189683 06936936772 06936936772 06936936772 40000000000104 19189188770 06936937077 06936937077 06936937077 40000000000105 19189189546 06936936818 06936936818 06936936818 40000000000106 19189188886 06936937038 06936937038 06936937038 40000000000107 19189189447 06936936851 06936936851 06936936851 40000000000108 19189188970 06936937010 06936937010 06936937010 40000000000109 19189189375 06936936875 06936936875 06936936875 40000000000110 19189189031 06936936990 06936936990 06936936990 40000000000111 19189189324 06936936892 06936936892 06936936892 40000000000112 19189189075 06936936975 06936936975 06936936975 40000000000113 19189189286 06936936905 06936936905 06936936905 40000000000114 19189189107 06936936964 06936936964 06936936964 40000000000115 19189189259 06936936914 06936936914 06936936914 40000000000116 19189189130 06936936957 06936936957 06936936957 40000000000
40
117 19189189240 06936936920 06936936920 06936936920 40000000000118 19189189146 06936936951 06936936951 06936936951 40000000000119 19189189226 06936936925 06936936925 06936936925 40000000000120 19189189158 06936936947 06936936947 06936936947 40000000000121 19189189216 06936936928 06936936928 06936936928 40000000000122 19189189167 06936936944 06936936944 06936936944 40000000000123 19189189208 06936936931 06936936931 06936936931 40000000000124 19189189173 06936936942 06936936942 06936936942 40000000000125 19189189203 06936936932 06936936932 06936936932 40000000000126 19189189177 06936936941 06936936941 06936936941 40000000000127 19189189199 06936936934 06936936934 06936936934 40000000000128 19189189181 06936936940 06936936940 06936936940 40000000000129 19189189196 06936936935 06936936935 06936936935 40000000000130 19189189183 06936936939 06936936939 06936936939 40000000000131 19189189194 06936936935 06936936935 06936936935 40000000000132 19189189185 06936936938 06936936938 06936936938 40000000000133 19189189193 06936936936 06936936936 06936936936 40000000000134 19189189186 06936936938 06936936938 06936936938 40000000000135 19189189192 06936936936 06936936936 06936936936 40000000000136 19189189187 06936936938 06936936938 06936936938 40000000000137 19189189191 06936936936 06936936936 06936936936 40000000000138 19189189188 06936936937 06936936937 06936936937 40000000000139 19189189191 06936936936 06936936936 06936936936 40000000000140 19189189188 06936936937 06936936937 06936936937 40000000000141 19189189190 06936936937 06936936937 06936936937 40000000000142 19189189188 06936936937 06936936937 06936936937 40000000000143 19189189190 06936936937 06936936937 06936936937 40000000000144 19189189189 06936936937 06936936937 06936936937 40000000000145 19189189190 06936936937 06936936937 06936936937 40000000000146 19189189189 06936936937 06936936937 06936936937 40000000000147 19189189190 06936936937 06936936937 06936936937 40000000000148 19189189189 06936936937 06936936937 
06936936937 40000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
       A             B             C             D             Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1      2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2      1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3      2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4      1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5      2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6      1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7      2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8      1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9      2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10     1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11     2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12     1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13     2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14     1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15     2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16     1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17     2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18     1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19     2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20     1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21     2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22     1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23     2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24     1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25     2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26     1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27     2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28     1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29     2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30     1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31     2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32     1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33     2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34     1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35     2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36     1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37     2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38     1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39     2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40     1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41     2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42     1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43     2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44     1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45     2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46     1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47     2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48     1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49     2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50     1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51     2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52     1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53     2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54     1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55     2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56     1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57     2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58     1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59     2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60     1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61     2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62     1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63     2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65     2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66     1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67     2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations: page A settles at 2.0 rather than 1.92, and pages B, C and D at 0.67 rather than 0.69. This is known as the "accelerated technology of the computation of the PageRank estimating values". So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
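The two schemes can be compared with a short script. This is only a sketch under stated assumptions: the 4-page link structure (A links to B, C and D, and each of those links back to A) is inferred from the numbers in the tables rather than stated in this section, and the stopping rule (largest change below 1e-10) is our own choice, so the iteration counts need not match the spreadsheets exactly.

```python
# Assumed structure, inferred from the tables: A -> B, C, D and B, C, D -> A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor used throughout the document


def iterate(recalculate, tol=1e-10, max_iter=1000):
    """Return (final ranks, iterations run) for one of the two schemes."""
    pr = {page: 1.0 for page in LINKS}
    for iteration in range(1, max_iter + 1):
        new = {}
        for page in LINKS:
            # Rank passed on by every page that links to `page`.
            incoming = sum(pr[src] / len(out)
                           for src, out in LINKS.items() if page in out)
            # Fixed coefficient (1 - d), or (1 - d) * PR_i(page) when it is
            # recalculated at each iteration.
            coeff = (1 - D) * pr[page] if recalculate else (1 - D)
            new[page] = coeff + D * incoming
        change = max(abs(new[p] - pr[p]) for p in LINKS)
        pr = new
        if change < tol:
            return pr, iteration
    return pr, max_iter


classic, classic_iters = iterate(recalculate=False)
accelerated, accelerated_iters = iterate(recalculate=True)
print("classic:    ", classic_iters, round(classic["A"], 6), round(classic["B"], 6))
print("accelerated:", accelerated_iters, round(accelerated["A"], 6), round(accelerated["B"], 6))
```

Under these assumptions the recalculated scheme needs noticeably fewer passes, because its oscillation shrinks by a factor of about 0.7 per pass instead of 0.85.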
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because page B meets some criteria we define (e.g. it has the correct amount of text). So we could set page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [0.4n + 0.6 + 0.6(n-1) = n, where n is the number of pages in the site, so the co-efficients still total 0.15n]
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
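The effect can be reproduced with a few lines of code. This is a rough sketch, not the spreadsheet itself: the link structure below (Home page A linking to B, C and D, each of which links back) is a hypothetical stand-in, because the "previous example" site is not shown in this section. Only the co-efficients (0.4n + 0.6) × 0.15 for Page B and 0.6 × 0.15 for the others come from the text.

```python
D = 0.85
# Hypothetical structure: A -> B, C, D and B, C, D -> A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank(coefficients, iterations=500):
    """Iterate PR(page) = coefficient(page) + d * (rank passed in by links)."""
    pr = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + D * sum(pr[src] / len(out)
                      for src, out in LINKS.items() if page in out)
            for page in LINKS
        }
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}          # 0.15 for every page
favoured = {page: 0.6 * (1 - D) for page in LINKS}  # 0.09 for the others
favoured["B"] = (0.4 * n + 0.6) * (1 - D)           # 0.33 for Page B

base = pagerank(uniform)
boosted = pagerank(favoured)
for page in LINKS:
    print(page, round(base[page], 4), round(boosted[page], 4))
```

In this assumed site the co-efficients still total 0.15n, the ranks still sum to n, and Page B gains at the expense of the other three pages.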
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
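As a hedged illustration of the directory case, the snippet below applies the co-efficients 0.06 (Page B) and 0.18 (all other pages) from the text to a hypothetical 4-page site in which B, C and D each link to A and A links to all three. The structure is an assumption made for the sketch, not the one used in the spreadsheet.

```python
D = 0.85
# Hypothetical site: B is the list-of-links page; it links to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def solve(coefficients, rounds=500):
    """Repeat the PageRank update with one co-efficient per page."""
    pr = dict.fromkeys(LINKS, 1.0)
    for _ in range(rounds):
        passed_on = dict.fromkeys(LINKS, 0.0)
        for src, targets in LINKS.items():
            for target in targets:
                passed_on[target] += pr[src] / len(targets)
        pr = {p: coefficients[p] + D * passed_on[p] for p in LINKS}
    return pr


normal = solve({p: 0.15 for p in LINKS})
directory = solve({"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})
for page in LINKS:
    print(page, round(normal[page], 4), round(directory[page], 4))
```

With these assumed links, Page B drops while the page it links to (A) rises, and the total for the site is unchanged.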
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all the information about the topology of that site.
Let's write down the elements of the matrix of links: akm (k = 1…n), (m = 1…n).
What does that mean? Let us explain the notation:
T1, T2, …, Tn correspond to the Web site's pages (n is the number of pages).
The components akm, which are at the intersection of row k and column m of the matrix of links, give the number of links from Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    …    Tn    Number of links off the page
T1    a11   a12   …    a1n   Cr(T1)
T2    a21   a22   …    a2n   Cr(T2)
…
Tn    an1   an2   …    ann   Cr(Tn)
Let's write down the equation for the computation scheme:
PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km} \, PR_I(T_k)}{Cr(T_k)}
where
PR_{I+1}(T_m) is the PageRank of the page T_m on iteration I+1 (the estimating value of the page T_m, m = 1…n);
PR_I(T_k) is the PageRank of the page T_k on iteration I (the estimating value of the page T_k, k = 1…n);
Cr(T_k) is the total number of links off the page T_k;
C_m(T_k) = a_{km} is the number of individual links off the page T_k to the page T_m (k = 1…n), (m = 1…n);
d is a dampening factor (usually 0.85);
(1 - d) is a normalization coefficient;
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of a Web site's pages while taking into account the possibly complex structure of links between them.
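One pass of this recursive scheme can be written directly from the definitions. The function name `pagerank_step` and the 3-page link matrix are hypothetical and chosen only for illustration; the matrix contains a double link (a12 = 2) to show how multiple links between the same pair of pages are weighted.

```python
def pagerank_step(a, pr, d=0.85):
    """Apply PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical multigraph: T1 links twice to T2 and once to T3;
# T2 and T3 each link once to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
pr = pagerank_step(a, [1.0, 1.0, 1.0])
print([round(value, 6) for value in pr])
```

Because T1 has three outgoing links and two of them point at T2, T2 receives two thirds of the rank T1 passes on and T3 receives one third.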
Take a look at the following example, a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: directed multigraph made up of two subgraphs. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                A  B  C  D  E  F  G  H  I   Number of links off the page
A               0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)  1  0  1  1  0  0  0  0  0   Cr(B) = 3
C               0  1  0  0  0  0  0  0  0   Cr(C) = 1
D               0  1  0  0  0  0  0  0  0   Cr(D) = 1
E               1  1  1  1  0  1  0  0  0   Cr(E) = 5
F               1  1  1  1  0  0  1  0  0   Cr(F) = 5
G               1  1  1  1  0  1  0  1  1   Cr(G) = 7
H               1  1  1  1  1  0  1  0  0   Cr(H) = 6
I               1  1  1  1  0  0  1  0  0   Cr(I) = 5
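The matrix can be checked mechanically. The sketch below copies the link matrix from the diagram, derives Cr as the row sums, and applies one pass of the standard scheme with d = 0.85 starting from 1 for every page. The variable names are ours.

```python
PAGES = "ABCDEFGHI"
# Row k lists the links off page k; column m is the target page.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85

cr = [sum(row) for row in MATRIX]  # number of links off each page
start = [1.0] * len(PAGES)
first = {
    PAGES[m]: (1 - D) + D * sum(MATRIX[k][m] * start[k] / cr[k]
                                for k in range(len(PAGES)))
    for m in range(len(PAGES))
}
print(cr)
print({page: round(value, 6) for page, value in first.items()})
```

The row sums reproduce the Cr column of the diagram, and the values after one pass agree with the first row of the iteration tables for this site.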
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of PR_{i+1}(T) (on iteration i+1) can be calculated by the formula (1 - d) PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
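The classification of pages as external or base can be sketched in code. The link matrix is the one from the diagram for this site; the update rule uses the recalculated coefficient (1 - d) PR_i(T) described above. The number of passes (100) and the threshold used to call a value "zero" (1e-9) are arbitrary choices made for this sketch.

```python
PAGES = "ABCDEFGHI"
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
CR = [sum(row) for row in MATRIX]
D = 0.85


def accelerated_step(pr):
    """One pass with the coefficient (1 - d) * PR_i(T) instead of (1 - d)."""
    n = len(pr)
    return [
        (1 - D) * pr[m] + D * sum(MATRIX[k][m] * pr[k] / CR[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(100):
    pr = accelerated_step(pr)

limits = dict(zip(PAGES, pr))
external = [page for page in PAGES if limits[page] < 1e-9]
base = [page for page in PAGES if limits[page] >= 1e-9]
print("external:", external)
print("base:", {page: round(limits[page], 6) for page in base})
```

The external pages lose everything they hold because each pass hands part of their value to the base pages and nothing flows back.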
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one):
    A         B         C         D         E         F         G         H         I         Amount
P   0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
In parallel, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:
Experiment outcome  A1  A2  …  An
Probabilities       p1  p2  …  pn
H(p1, p2, …, pn) = - p(A1) log p(A1) - p(A2) log p(A2) - … - p(An) log p(An)
Let's calculate the terms -p log p for the set of probability values that correspond to the site pages. We get the following set of values, which is the set of PageRank values for the pages of the site examined (logarithms are taken to base 2):
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows  1         1         1         1         1         1         1         1         1         1
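The two tables above can be recomputed from the limiting values of the classic computation. The snippet is a small sketch: the input numbers are copied from the last row of the classic table (rounded to six places there), so the results agree with the tables only to about that precision.

```python
import math

PAGES = "ABCDEFGHI"
# Limiting estimating values from the classic computation (d = 0.85).
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(limits)                                  # about 9
probabilities = [value / total for value in limits]  # the row "P"
terms = [-p * math.log2(p) for p in probabilities]   # the row "PR"
entropy = sum(terms)                                 # the "Amount" of the PR row

for page, p, term in zip(PAGES, probabilities, terms):
    print(page, round(p, 6), round(term, 6))
print("H =", round(entropy, 3))
```

Each entry of the "PR" row is one term of Shannon's sum, and their total is the entropy of the probability distribution over the nine pages.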
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
P_{n+1} = Z+(P_n) = d P_n + (1 - d), 0 < d < 1 (P is the estimating value)
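Because Z+ is a simple linear recursion, its long-run behaviour can be worked out directly. The short derivation below uses only the definition of Z+ given above; the symbols P* (the fixed point) and P_0 (the starting value) are introduced here for convenience.

```latex
\begin{aligned}
P^{*} &= d\,P^{*} + (1 - d) \;\Longrightarrow\; P^{*} = 1, \\
P_{n+1} - 1 &= d\,(P_n - 1), \\
P_n &= 1 + d^{\,n}\,(P_0 - 1) \;\longrightarrow\; 1
      \quad (n \to \infty,\ 0 < d < 1).
\end{aligned}
```

So repeated encouragement moves the estimating value towards 1 geometrically, with the gap shrinking by a factor d at each step. This is the same form as the PageRank update, with (1 - d) playing the role of the normalization coefficient.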
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing, the essential best practice guide (an in-depth view of the other aspects of how search engines work): httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
overview of how they see PageRank, now and in the future. We hope that this short interview will help debunk a few myths and answer some questions.
Chris: Can you answer the age-old question for us? Is PageRank named after the fact it is page based, or is it named after one of the creators?
Google: PageRank is named for Larry Page, Google co-founder and President, Products.
Chris: Does Google view PageRank as being a significant distinguishing feature from the other search engines?
Google: PageRank is a technology which contributes to the speed and relevance that differentiate Google from other search engines. In addition, Google employs 100 other algorithms in its ranking formula.
Chris: Does Google believe that PageRank significantly benefits the quality of its results pages, and does it expect this to continue in the future?
Google: Because of the scalable approach PageRank uses to analyze links, it will continue to be a significant factor in Google's search results.
Chris: Several people have been known to try to data mine the PageRank information given by the toolbar. How does Google feel about this?
Google: Mining PageRank data from the Google Toolbar is against Google's terms of service.
Chris: How useful do you think the PageRank information is on the toolbar for a) normal users, b) webmasters, c) search engine optimization professionals?
Google: The PageRank information on the Google Toolbar is an estimate and is intended only for informational purposes. Many users have found it interesting, and webmasters have been known to use it as a measure of their performance. However, because it is a rough estimate, the value to search engine optimization professionals is limited.
Chris: One method that people have been known to try to boost PageRank is by using link farms and guestbooks. How is Google likely to react to a site doing that?
Google: Google's engineers are constantly working to update Google's ranking algorithm to prevent manipulation of Google's rankings.
Final word to non-mathematicians
It feels a little strange having a final word when it isn't the end of the document. But if you read the table of contents, you'll see that we've added an advanced section for the techies, mathematicians and PhD students. If your goal was just to find out about PageRank and what you can do about it, then you're done.
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim not to need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well whether or not you concentrate heavily on PageRank. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can instead vary this co-efficient for each page, depending upon the value produced by the previous computation, using the formula
$(1 - d)\,PR_i(A)$
That is, we are replacing the $(1 - d)$ of the previous formula with the expression above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
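The values settle after only 67 iterations instead of 148. They are an approximation rather than an exact match: Page A converges to 2.0000 instead of 1.9189, and Pages B, C and D converge to 0.6667 instead of 0.6937. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.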
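The two schemes above can be compared with a short Python sketch. It assumes the hierarchical structure is a home page A that links to B, C and D, each of which links back to A (this reproduces the first rows of both tables). The names `LINKS`, `step` and `iterate`, and the stopping tolerance, are illustrative choices, so the iteration counts it prints need not equal the 148 and 67 quoted above.

```python
# Assumed link structure: A links to B, C and D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor used throughout the document


def step(pr, accelerated):
    """Compute the next iteration of values from the previous iteration `pr`."""
    new = {}
    for page in LINKS:
        # Share received from every page that links to `page`.
        received = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        # Classic: constant (1 - d). Accelerated: (1 - d) times the page's previous value.
        base = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = base + D * received
    return new


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Start every page at 1.0 and iterate until no value moves by more than `tol`."""
    pr = {page: 1.0 for page in LINKS}
    for count in range(1, max_iter + 1):
        nxt = step(pr, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in LINKS) < tol:
            return nxt, count
        pr = nxt
    return pr, max_iter


classic, classic_iters = iterate(accelerated=False)
fast, fast_iters = iterate(accelerated=True)
print(f"classic:     A={classic['A']:.4f} B={classic['B']:.4f} ({classic_iters} iterations)")
print(f"accelerated: A={fast['A']:.4f} B={fast['B']:.4f} ({fast_iters} iterations)")
```

Both runs keep the total at 4, but they settle on different values, which is why the accelerated result is best read as an approximation.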
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to
$0.6 \times 0.15$
i.e. $0.4n + 0.6 + 0.6(n - 1) = n$, where n is the number of pages in the site, so the co-efficients still total $0.15n$.
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
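The effect of uneven co-efficients can be sketched in Python. This is an illustration under stated assumptions, not a copy of Excel Example 19: it assumes the "previous example" is the four-page structure in which A links to B, C and D and each links back to A, and the names `LINKS`, `pagerank`, `uniform` and `favoured` are hypothetical.

```python
# Assumed link structure: A links to B, C and D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def pagerank(coeff, iterations=500):
    """Iterate PR(p) = coeff[p] + d * sum(PR(q) / outlinks(q)), starting from 1.0 per page."""
    pr = {p: 1.0 for p in LINKS}
    for _ in range(iterations):
        pr = {
            p: coeff[p] + D * sum(pr[q] / len(out) for q, out in LINKS.items() if p in out)
            for p in LINKS
        }
    return pr


n = len(LINKS)
uniform = {p: 1 - D for p in LINKS}            # the usual 0.15 for every page
favoured = {p: 0.6 * (1 - D) for p in LINKS}   # 0.6 x 0.15 for the other pages
favoured["B"] = (0.4 * n + 0.6) * (1 - D)      # (0.4n + 0.6) x 0.15 for Page B

# The co-efficients still add up to n * (1 - d), so the total PageRank is unchanged.
assert abs(sum(favoured.values()) - n * (1 - D)) < 1e-12

before = pagerank(uniform)
after = pagerank(favoured)
for p in LINKS:
    print(f"{p}: {before[p]:.4f} -> {after[p]:.4f}")
```

Under these assumptions Page B rises while A, C and D fall, and the four values still sum to 4.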
PageRank not only considers links; it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank. It is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
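As a rough check of the direction of the effect, the fixed point can be solved by hand. The working below assumes the same four-page structure as before (A links to B, C and D; each links back to A) and d = 0.85. It is an illustration, not a copy of Excel Example 20.

```latex
\begin{aligned}
PR(A) &= 0.18 + 0.85\,[PR(B) + PR(C) + PR(D)] \\
PR(B) &= 0.06 + 0.85\,PR(A)/3 \\
PR(C) = PR(D) &= 0.18 + 0.85\,PR(A)/3 \\
PR(B) + PR(C) + PR(D) &= 0.42 + 0.85\,PR(A) \\
PR(A) &= 0.18 + 0.85\,(0.42 + 0.85\,PR(A)) = 0.537 + 0.7225\,PR(A) \\
PR(A) &= 0.537 / 0.2775 \approx 1.9351 \\
PR(B) &\approx 0.6083, \qquad PR(C) = PR(D) \approx 0.7283
\end{aligned}
```

Compared with the uniform co-efficient of 0.15 (A = 1.9189, B = C = D = 0.6937), Page B drops while the page it links to, A, gains slightly, and the total remains 4.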
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from httpwwwsupportforumsorgpagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ ($k = 1, \dots, n$; $m = 1, \dots, n$).
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages (n is the number of pages).
Components $a_{km}$, which are at the intersection of row k and column m in the matrix of links, give the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)

       T1     T2     ...    Tn     Number of links off the page
T1     a_11   a_12   ...    a_1n   Cr(T1)
T2     a_21   a_22   ...    a_2n   Cr(T2)
...    ...    ...    ...    ...    ...
Tn     a_n1   a_n2   ...    a_nn   Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{i+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_i(T_k)}{Cr(T_k)}$$

where

$PR_{i+1}(T_m)$ is the PageRank of page $T_m$ on iteration $i + 1$ (the estimating value of page $T_m$, $m = 1, \dots, n$)
$PR_i(T_k)$ is the PageRank of page $T_k$ on iteration $i$ (the estimating value of page $T_k$, $k = 1, \dots, n$)
$Cr(T_k)$ is the total number of links off page $T_k$
$Cm(T_k) = a_{km}$ is the number of individual links off page $T_k$ to page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$)
$d$ is a dampening factor (usually 0.85)
$(1 - d)$ is a normalization coefficient
$n$ is the number of pages on the site
This recursive computational scheme takes into account the possible complexity of the links between pages when computing the estimating values of a Web site's pages.
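The recursive scheme can be sketched as one Python function that works directly on a matrix of links. The function name `pagerank_step` and the sample matrix are illustrative assumptions: the matrix encodes the four-page structure used earlier, with rows and columns in the order A, B, C, D.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration of PR_{i+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_i(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page T_k
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]


# Assumed example: A links to B, C and D; B, C and D each link back to A.
a = [
    [0, 1, 1, 1],  # links off A
    [1, 0, 0, 0],  # links off B
    [1, 0, 0, 0],  # links off C
    [1, 0, 0, 0],  # links off D
]

pr = [1.0] * len(a)  # every page starts at 1.0
for i in range(1, 4):
    pr = pagerank_step(a, pr)
    print(i, [round(v, 6) for v in pr])
```

Because `a[k][m]` counts links rather than merely flagging them, the same function handles multigraphs in which one page links to another more than once.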
Take a look at the following example, a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
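The matrix above can be checked mechanically. The following Python sketch transcribes it (row = linking page, column = linked page) and derives the row totals Cr, the inbound link counts, and the relationship between the two subgraphs. The variable names are illustrative.

```python
PAGES = "ABCDEFGHI"
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T_k) is the sum of row k; the inbound count of T_m is the sum of column m.
cr = {page: sum(row) for page, row in zip(PAGES, MATRIX)}
inbound = {page: sum(row[m] for row in MATRIX) for m, page in enumerate(PAGES)}

first, second = range(0, 4), range(4, 9)  # indices of A-D and E-I
second_links_to_first = all(MATRIX[k][m] == 1 for k in second for m in first)
first_links_to_second = any(MATRIX[k][m] for k in first for m in second)

print("Cr:", cr)
print("inbound:", inbound)
print("every page of subgraph 2 links to every page of subgraph 1:", second_links_to_first)
print("any page of subgraph 1 links into subgraph 2:", first_links_to_second)
```

The one-way flow of links from E to I towards A to D is the property the rest of this section relies on.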
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration i + 1) can be calculated by the formula $(1 - d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
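A short derivation, offered as a sketch, suggests why these limits vanish. It assumes that the accelerated iteration converges and that no base page links to an external page, as in the example site. The symbols X, S and q are introduced here for illustration only.

```latex
% Limit of the accelerated scheme: the (1 - d) terms cancel.
PR(T_m) = (1 - d)\,PR(T_m) + d \sum_{k=1}^{n} \frac{a_{km}\,PR(T_k)}{Cr(T_k)}
\;\Longrightarrow\;
PR(T_m) = \sum_{k=1}^{n} \frac{a_{km}\,PR(T_k)}{Cr(T_k)}

% Let X be the set of external pages and S the sum of their limiting values.
% Only pages of X link into X, so
S = \sum_{m \in X} PR(T_m)
  = \sum_{k \in X} PR(T_k)\,\frac{\sum_{m \in X} a_{km}}{Cr(T_k)}
  \le q\,S,
\qquad
q = \max_{k \in X} \frac{\sum_{m \in X} a_{km}}{Cr(T_k)} < 1
\;\Longrightarrow\; S = 0
```

For the example site, q = 3/7: page G keeps 3 of its 7 links within pages E to I, and every other external page keeps a smaller share. Since the values are never negative, S = 0 means each external page tends to zero.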
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Sum
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
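The classification can be reproduced with a small Python sketch of the accelerated scheme applied to the matrix of links (2). The number of iterations and the zero threshold `EPS` are assumptions chosen for illustration, as are the function and variable names.

```python
PAGES = "ABCDEFGHI"
MATRIX = [  # row = linking page, column = linked page, as in the matrix of links (2)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85     # damping factor
EPS = 1e-6   # assumed threshold below which a limiting value counts as zero


def accelerated_step(pr):
    """One accelerated iteration: (1 - d) * PR_i(T_m) replaces the constant (1 - d)."""
    n = len(MATRIX)
    cr = [sum(row) for row in MATRIX]  # total links off each page
    return [
        (1 - D) * pr[m] + D * sum(MATRIX[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(200):  # assumed to be comfortably more iterations than needed
    pr = accelerated_step(pr)

external = [p for p, v in zip(PAGES, pr) if v < EPS]
base = [p for p, v in zip(PAGES, pr) if v >= EPS]
print("base pages:    ", base)
print("external pages:", external)
```

The first iteration of this sketch matches row 1 of the accelerated table, since every page starts at 1.0 and the two schemes coincide on that step.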
Let's calculate the estimating values for the pages of the same Web site with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Sum
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened.
Dividing the estimating values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's classic formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities.
Experiment outcome: A_1, A_2, ..., A_n
Probabilities: p_1, p_2, ..., p_n
$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$
Let's calculate the individual terms -p log p of this formula for the set of probability values which correspond to the site pages.
This gives the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows  1         1         1         1         1         1         1         1         1         1
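The PR row above can be checked with a few lines of code. The following is a minimal sketch, assuming each PR entry is the single term -p * log2(p) of Shannon's formula for that page (this assumption reproduces the published figures); the names `probabilities` and `shannon_term` are illustrative only.

```python
import math

# Probabilities from the P row (estimating values divided by their total of 9).
probabilities = {
    "A": 0.154451, "B": 0.427246, "C": 0.154451, "D": 0.154451,
    "E": 0.019489, "F": 0.023237, "G": 0.026827, "H": 0.019924, "I": 0.019924,
}


def shannon_term(p: float) -> float:
    """Return one term of Shannon's formula, -p * log2(p)."""
    return -p * math.log2(p) if p > 0 else 0.0


# One term per page, plus their total (the "Amount" column).
terms = {page: shannon_term(p) for page, p in probabilities.items()}
entropy = sum(terms.values())

for page, value in terms.items():
    print(f"{page}: {value:.6f}")
print(f"Amount: {entropy:.3f}")
```

The total is the Shannon entropy of the probability set, which is why the Amount column is no longer 1.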
PageRank is a successful representation of a directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$ (P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
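To see what repeated encouragement does, the recurrence for Z+ can be unrolled. This is a worked step added for illustration and is not taken from Bush and Mosteller; P_0 denotes an assumed starting value.

```latex
\begin{align*}
P_{n+1} - 1 &= d\,P_n + (1 - d) - 1 = d\,(P_n - 1) \\
P_n - 1 &= d^{\,n}\,(P_0 - 1) \\
P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1 \quad (n \to \infty,\ 0 < d < 1)
\end{align*}
```

Under repeated application of Z+ the estimating value therefore approaches 1 geometrically at rate d. The term (1 - d) has the same form as the normalization coefficient in the PageRank formula.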
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web
httphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Engine
httpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Google's) by Mark Horrell
httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
In summary, PageRank is probably one of the most misunderstood search engine concepts in existence today. There are those who profess to know everything about PageRank, and there are those who claim not to need to know anything. There are those who claim PageRank is important, and there are those who claim it is unimportant. There are those who claim Google oversells PageRank, and there are those who do not believe this to be the case.
The truth of the matter is that nobody except Google can tell you the exact specifics. The authors of this document do not in any way profess to tell you everything. What is presented here are theories based on the evidence available. All we can do is derive the principles, not exact figures.
To quantify the worth of PageRank is an impossible task. Under differing circumstances and differing Webmaster styles it will have a different value. What PageRank is worth to you is probably an individual thing, and you must decide how it fits into your philosophy and situation. You can do well by concentrating heavily on PageRank, or by not. However, you should always keep PageRank and the principles discussed here in the back of your mind when designing sites. Just as you remember to create good titles and anchor text, you should remember to consider PageRank.
The process of improving your PageRank should always be in symbiosis with your visitor experience. The highest PageRank in the world isn't worth much if your visitors can't navigate your site. Gaining PageRank should only be done whilst adding value to your site, whether that is through better targeting of visitors or improving content.
Advanced PageRank topics
As we delve into the advanced topics, we move more into the realm of speculation. Whilst we can be relatively sure that the principles derived earlier are sound, for the advanced PageRank discussion we need to place a greater emphasis on assumption.
It should be said, however, that Google has surely thought of everything we're discussing here. It is therefore perfectly possible that they have done some of what we suggest below. If they have not already, then there's a good possibility they will in the future.
Speeding Up PageRank
It is apparent that Google does not need a 100% accurate value for PageRank, so the question arises: "Can Google calculate an approximation of the final values without using as many iterations?" The answer appears to be yes. Let us explain. In previous calculations we have set the normalisation co-efficient at a fixed amount. We can vary this co-efficient for each page depending upon the final value of the previous computation, using the formula
$$(1 - d) \cdot PR_i(A)$$
That is, we are replacing the (1-d) of the previous formula with the above. PR_i(A) simply means the PR of page A deduced by the previous iteration.
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure and the fixed co-efficient, we arrive at the final values like this:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3          2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4          1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5          2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6          1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7          2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8          1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9          2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10         1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11         2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12         1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13         2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14         1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15         1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16         1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17         1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18         1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19         1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20         1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21         1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22         1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23         1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24         1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25         1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26         1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27         1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28         1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29         1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30         1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31         1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32         1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33         1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34         1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35         1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36         1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37         1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38         1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39         1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40         1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41         1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42         1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43         1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44         1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45         1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46         1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47         1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48         1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49         1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50         1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51         1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52         1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53         1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54         1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55         1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56         1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57         1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58         1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59         1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60         1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61         1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62         1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63         1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64         1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65         1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66         1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67         1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68         1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69         1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70         1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71         1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72         1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73         1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74         1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75         1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76         1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77         1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78         1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79         1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80         1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81         1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82         1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83         1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84         1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85         1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86         1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87         1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88         1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89         1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90         1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91         1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92         1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93         1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94         1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95         1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96         1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97         1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98         1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99         1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100        1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101        1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102        1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103        1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104        1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105        1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106        1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107        1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108        1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109        1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110        1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111        1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112        1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113        1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114        1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115        1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116        1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117        1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118        1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119        1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120        1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121        1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122        1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123        1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124        1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125        1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126        1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127        1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128        1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129        1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130        1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131        1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132        1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133        1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134        1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135        1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136        1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139        1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142        1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147        1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148        1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A             B             C             D             Sum
Start      1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1          2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2          1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3          2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4          1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5          2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6          1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7          2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8          1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9          2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10         1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11         2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12         1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13         2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14         1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15         2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16         1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17         2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18         1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19         2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20         1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21         2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22         1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23         2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24         1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25         2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26         1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27         2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28         1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29         2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30         1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31         2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32         1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33         2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34         1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35         2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36         1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37         2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38         1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39         2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40         1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41         2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42         1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43         2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44         1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45         2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46         1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47         2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48         1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49         2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50         1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51         2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52         1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53         2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54         1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55         2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56         1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57         2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58         1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59         2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60         1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61         2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62         1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63         2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65         2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66         1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67         2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
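The two tables can be reproduced with a short script. This is a minimal sketch under stated assumptions: the hierarchical structure is taken to be a home page A linking to B, C and D, each of which links back to A (this reproduces the first rows of both tables), d is 0.85, and iteration stops once no value changes by more than a chosen tolerance. The function name `iterate` and the stopping rule are illustrative, not the spreadsheet's own.

```python
# Assumed site structure: home page A links to B, C and D; each links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor


def iterate(links, accelerated, tolerance=1e-10, max_iterations=1000):
    """Iterate until no page changes by more than `tolerance`.

    Returns (values, number_of_iterations). With accelerated=False the
    normalization co-efficient is the fixed (1 - d); with accelerated=True
    it is (1 - d) * PR_i(page), recomputed from the previous iteration.
    """
    pr = {page: 1.0 for page in links}
    for iteration in range(1, max_iterations + 1):
        new_pr = {}
        for page in links:
            # Share of PageRank passed on by every page that links here.
            incoming = sum(pr[src] / len(targets)
                           for src, targets in links.items() if page in targets)
            coefficient = (1 - D) * pr[page] if accelerated else (1 - D)
            new_pr[page] = coefficient + D * incoming
        change = max(abs(new_pr[p] - pr[p]) for p in links)
        pr = new_pr
        if change < tolerance:
            return pr, iteration
    return pr, max_iterations


standard, standard_steps = iterate(LINKS, accelerated=False)
fast, fast_steps = iterate(LINKS, accelerated=True)
print(f"fixed co-efficient:        {standard_steps} iterations, A = {standard['A']:.6f}")
print(f"recalculated co-efficient: {fast_steps} iterations, A = {fast['A']:.6f}")
```

The exact iteration counts depend on the stopping rule, so they need not match 148 and 67. Note also that the two schemes settle on slightly different limiting values (A near 1.9189 with the fixed co-efficient, A near 2.0 with the recalculated one), while the sum stays at 4 in both.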
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [(0.4n + 0.6) + 0.6(n - 1) = n, where n is the number of pages in the site], so the co-efficients still total n × 0.15
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (that meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links but it may also consider certain other factors
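A per-page co-efficient of this kind can be sketched as follows. The site structure here is an assumption (home page A linking to B, C and D, each linking back to A) and is not necessarily the one used in Excel Example 19; the names `coefficients` and `pagerank` are illustrative.

```python
# Assumed structure: A (home) links to B, C, D; each of them links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85
N = len(LINKS)

# Per-page normalization co-efficients; Page B ("About Us") is favoured.
coefficients = {page: 0.6 * (1 - D) for page in LINKS}
coefficients["B"] = (0.4 * N + 0.6) * (1 - D)


def pagerank(links, coefficients, iterations=200):
    """Iterate PR(page) = coefficient(page) + d * sum(PR(src) / outlinks(src))."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page] + D * sum(
                pr[src] / len(targets)
                for src, targets in links.items() if page in targets)
            for page in links
        }
    return pr


uniform = pagerank(LINKS, {page: 1 - D for page in LINKS})
favoured = pagerank(LINKS, coefficients)
for page in LINKS:
    print(f"{page}: uniform {uniform[page]:.4f}, favoured {favoured[page]:.4f}")
```

Because the co-efficients still total n × 0.15, the sum of all values is unchanged; only its distribution between the pages moves.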
There is, of course, no reason why we should stop at boosting a single page's PageRank: it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
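The effect of lowering one page's co-efficient can be illustrated in the same way. This sketch assumes a simple structure in which A links to B, C and D and each links back to A, so Page B's only outgoing link is to A; Excel Example 20 may use a different structure. The helper name `pagerank` is illustrative.

```python
# Assumed structure: A links to B, C, D; B, C and D each link back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85


def pagerank(links, coefficients, iterations=200):
    """Iterate PR(page) = coefficient(page) + d * sum(PR(src) / outlinks(src))."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page] + D * sum(
                pr[src] / len(targets)
                for src, targets in links.items() if page in targets)
            for page in links
        }
    return pr


# Uniform co-efficients versus 0.06 for the list page B and 0.18 for the rest.
uniform = pagerank(LINKS, {page: 0.15 for page in LINKS})
list_page = pagerank(LINKS, {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for page in LINKS:
    change = list_page[page] - uniform[page]
    print(f"{page}: {uniform[page]:.4f} -> {list_page[page]:.4f} ({change:+.4f})")
```

In this assumed structure B drops while A, the page B links to, rises; C and D rise as well because their own co-efficients were increased.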
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from httpwwwsupportforumsorgpagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $D = (a_{km})$, $(k = 1, \ldots, n)$, $(m = 1, \ldots, n)$
What does that mean? Let us explain the designations:
T_1, T_2, ..., T_n correspond to Web site pages (n is the number of pages).
Components a_km, which are at the intersection of the k-th row and m-th column in the matrix of links, give the number of links off T_k to T_m.
Cr(T_k) is the total number of links off T_k.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1     T2     ...   Tn     Number of links off the page
T1    a_11   a_12   ...   a_1n   Cr(T1)
T2    a_21   a_22   ...   a_2n   Cr(T2)
...
Tn    a_n1   a_n2   ...   a_nn   Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
where
PR_{I+1}(T_m) is the PageRank of the page T_m on the (I+1)-th iteration (the estimating value of the page T_m, m = 1, ..., n)
PR_I(T_k) is the PageRank of the page T_k on the I-th iteration (the estimating value of the page T_k, k = 1, ..., n)
Cr(T_k) is the total number of links off the page T_k
Cm(T_k) = a_km is the number of individual links off the page T_k to the page T_m (k = 1, ..., n; m = 1, ..., n)
d is a dampening factor (usually 0.85)
(1-d) is a normalization coefficient
n is the number of pages on the site
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the possibly complex pattern of links between them.
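The computation scheme translates almost directly into code. The following is a minimal sketch; the three-page matrix is a hypothetical example (T1 links twice to T2 and once to T3, and both link back to T1), chosen only to show that a_km may exceed 1 in a multigraph. The function name `pagerank_from_matrix` is illustrative.

```python
def pagerank_from_matrix(a, d=0.85, iterations=200):
    """Iterate PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k).

    a[k][m] is the number of links off page k to page m.
    """
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    pr = [1.0] * n                # starting estimating values
    for _ in range(iterations):
        pr = [
            (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
            for m in range(n)
        ]
    return pr


# Hypothetical matrix of links: rows are source pages, columns are targets.
matrix = [
    [0, 2, 1],  # T1: two links to T2, one link to T3
    [1, 0, 0],  # T2: one link to T1
    [1, 0, 0],  # T3: one link to T1
]
pr = pagerank_from_matrix(matrix)
print([round(value, 6) for value in pr])
```

T2 ends up with a higher value than T3 because it receives two of T1's three links. Pages with no outgoing links are skipped in the sum to avoid dividing by zero, which is a choice of this sketch rather than something the equation specifies.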
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pagesand the relationship of external pages to them Later we will use the term ldquobasepagesrdquo to describe those important pages The main feature of the base pages isthat they are traversed by a directed cycle
Letrsquos apply the so-called accelerated technology of computation of the estimatingvalues of the site pages
Suppose PR i (T) is the estimating value for the page T (T=A B C D E F GH I) page on i-iterationThe normalization coefficient for the PR i+1(T) computation (on (i + 1)-iteration)can be calculated by formula (1-d) PR i (T)
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages from the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of computing the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A  B  C  D  E  F  G  H  I  Amount
Start  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1  1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2  1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3  1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4  1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5  1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6  1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7  1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8  1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9  1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10  1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11  1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12  1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13  1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14  1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15  1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16  1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17  1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18  1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19  1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20  1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21  1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22  1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23  1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24  1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25  1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26  1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27  1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28  1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29  1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30  1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31  1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32  1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33  1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34  1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35  1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36  1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39  1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
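The definition of external pages can be tried out with a short Python sketch. It assumes the accelerated step is $PR_{i+1}(T_m) = (1-d)\,PR_i(T_m) + d \sum_k a_{km} PR_i(T_k) / Cr(T_k)$, our reading of the rule that the normalization coefficient becomes $(1-d)\,PR_i(T)$; the names `accelerated_step` and `external`, the 100 iterations and the $10^{-6}$ threshold are illustrative choices, not part of the original method.

```python
PAGES = "ABCDEFGHI"
# LINKS[k][m] = number of links from page k to page m (the matrix of links of the example site).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85


def accelerated_step(pr, links=LINKS, d=D):
    """One iteration where each page's coefficient is (1 - d) times its previous value."""
    n = len(pr)
    out = [sum(row) for row in links]
    return [
        (1 - d) * pr[m] + d * sum(links[k][m] * pr[k] / out[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(100):  # far more iterations than the table needs to settle
    pr = accelerated_step(pr)

# Pages whose limiting value is (numerically) zero are external to the base pages.
external = [page for page, value in zip(PAGES, pr) if value < 1e-6]
print(external)
print([round(value, 6) for value in pr])
```

Under these assumptions the base pages settle at 1.5 (A, C, D) and 4.5 (B), matching the last rows of the table.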
Let's calculate the estimating values for the pages of the same Web site with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A  B  C  D  E  F  G  H  I  Amount
Start  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1  1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2  1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3  1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4  1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5  1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6  1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7  1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8  1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9  1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10  1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11  1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12  1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13  1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14  1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15  1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16  1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17  1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18  1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19  1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20  1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21  1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22  1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23  1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24  1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25  1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26  1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27  1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28  1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29  1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30  1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31  1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32  1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33  1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34  1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35  1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36  1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37  1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38  1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39  1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40  1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41  1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42  1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43  1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44  1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45  1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46  1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47  1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48  1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49  1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50  1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51  1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52  1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53  1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54  1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55  1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56  1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57  1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58  1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59  1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60  1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61  1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62  1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63  1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64  1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65  1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66  1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67  1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68  1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69  1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70  1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71  1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72  1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73  1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78  1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80  1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81  1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85  1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final row of the table (iteration 85).
Dividing the estimating values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all the probability values is equal to one):
     A  B  C  D  E  F  G  H  I  Amount
P  0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:
Experiment outcome:  A1  A2  …  An
Probabilities:  p1  p2  …  pn
$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$
Let's calculate the terms of this formula for the set of probability values that correspond to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):
     A  B  C  D  E  F  G  H  I  Amount
PR  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows  1  1  1  1  1  1  1  1  1  1
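The two steps above (normalizing to probabilities, then applying Shannon's formula term by term) can be sketched in a few lines of Python. The input values are the limiting values from the final row of the classic table; the variable names are illustrative, and the sketch assumes each PR entry is the single term $-p_i \log_2 p_i$, which is consistent with the figures quoted in the text.

```python
import math

# Limiting classic values for pages A..I (final row of the classic computation table).
LIMITS = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(LIMITS)                          # stays at about 9 throughout the iteration
probabilities = [value / total for value in LIMITS]

# One Shannon term per page: -p * log2(p).
terms = [-p * math.log2(p) for p in probabilities]
entropy = sum(terms)                         # the "Amount" of the PR row

for page, p, term in zip("ABCDEFGHI", probabilities, terms):
    print(page, round(p, 6), round(term, 6))
print("Amount", round(entropy, 3))
```

With these inputs the terms for A and B come out at about 0.4162 and 0.5242, and the total at about 2.375, in line with the table.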
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^+$:
$$P_{n+1} = Z^+(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
(P is the estimating value.)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
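As a small illustration of the operator, the sketch below applies $Z^+$ repeatedly. The starting value 0.2, the choice d = 0.85 and the name `z_plus` are assumptions made for the example only. Because $1 - P_{n+1} = d\,(1 - P_n)$, the estimate approaches 1 geometrically, with closed form $P_n = 1 - d^n (1 - P_0)$.

```python
def z_plus(p, d=0.85):
    """Apply the encouragement operator once: P -> d * P + (1 - d)."""
    return d * p + (1 - d)


p0 = 0.2          # arbitrary starting estimate (assumption for the example)
history = [p0]
for _ in range(50):
    history.append(z_plus(history[-1]))

# Closed form for comparison: P_n = 1 - d**n * (1 - P_0).
closed_form = 1 - 0.85 ** 50 * (1 - p0)
print(round(history[-1], 6), round(closed_form, 6))
```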
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing Order to the Web
httphcistanfordedu~pagepaperspagerankppframehtm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
httpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Google's) by Mark Horrell
httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chrissupportforumsorg
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
fixed amount. We can vary this co-efficient for each page depending upon the final value of the previous computation, using the formula:
$(1-d)\,PR_i(A)$
That is, we are replacing the $(1-d)$ of the previous formula with the above. $PR_i(A)$ simply means the PR of page A deduced by the previous iteration.
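The substitution can be sketched in Python as follows. The link structure used here (A links to B, C and D, and each of them links back to A) is our assumption about the hierarchical example; it reproduces the first rows of both tables in this section. The names `OUT_LINKS` and `step` are illustrative.

```python
# Assumed hierarchy: A links to B, C and D; B, C and D each link back to A.
OUT_LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85


def step(pr, accelerated):
    """One iteration; the co-efficient is (1 - d), or (1 - d) * PR_i(page) when accelerated."""
    new = {}
    for page in pr:
        inbound = sum(pr[src] / len(targets)
                      for src, targets in OUT_LINKS.items() if page in targets)
        coefficient = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = coefficient + D * inbound
    return new


start = {page: 1.0 for page in OUT_LINKS}
first = step(start, accelerated=True)            # identical to the classic first step
second = step(first, accelerated=True)           # A = 1.51, B = C = D = 0.83
classic_second = step(first, accelerated=False)  # A = 1.255, B = C = D = 0.915
print(second)
print(classic_second)
```

The first iteration is the same in both variants because every page starts at 1.0; the two computations only diverge from the second iteration onwards.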
Let's do this by example. Assume we have the hierarchical structure mentioned earlier.
With this very simple structure we arrive at the final values like this:
Iteration  A  B  C  D  Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1  2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2  1.2550000000  0.9150000000  0.9150000000  0.9150000000  4.0000000000
3  2.4832500000  0.5055833333  0.5055833333  0.5055833333  4.0000000000
4  1.4392375000  0.8535875000  0.8535875000  0.8535875000  4.0000000000
5  2.3266481250  0.5577839583  0.5577839583  0.5577839583  4.0000000000
6  1.5723490938  0.8092169688  0.8092169688  0.8092169688  4.0000000000
7  2.2135032703  0.5954989099  0.5954989099  0.5954989099  4.0000000000
8  1.6685222202  0.7771592599  0.7771592599  0.7771592599  4.0000000000
9  2.1317561128  0.6227479624  0.6227479624  0.6227479624  4.0000000000
10  1.7380073041  0.7539975653  0.7539975653  0.7539975653  4.0000000000
11  2.0726937915  0.6424354028  0.6424354028  0.6424354028  4.0000000000
12  1.7882102772  0.7372632409  0.7372632409  0.7372632409  4.0000000000
13  2.0300212644  0.6566595785  0.6566595785  0.6566595785  4.0000000000
14  1.8244819253  0.7251726916  0.7251726916  0.7251726916  4.0000000000
15  1.9991903635  0.6669365455  0.6669365455  0.6669365455  4.0000000000
16  1.8506881910  0.7164372697  0.7164372697  0.7164372697  4.0000000000
17  1.9769150376  0.6743616541  0.6743616541  0.6743616541  4.0000000000
18  1.8696222180  0.7101259273  0.7101259273  0.7101259273  4.0000000000
19  1.9608211147  0.6797262951  0.6797262951  0.6797262951  4.0000000000
20  1.8833020525  0.7055659825  0.7055659825  0.7055659825  4.0000000000
21  1.9491932554  0.6836022482  0.6836022482  0.6836022482  4.0000000000
22  1.8931857329  0.7022714224  0.7022714224  0.7022714224  4.0000000000
23  1.9407921270  0.6864026243  0.6864026243  0.6864026243  4.0000000000
24  1.9003266921  0.6998911026  0.6998911026  0.6998911026  4.0000000000
25  1.9347223118  0.6884258961  0.6884258961  0.6884258961  4.0000000000
26  1.9054860350  0.6981713217  0.6981713217  0.6981713217  4.0000000000
27  1.9303368702  0.6898877099  0.6898877099  0.6898877099  4.0000000000
28  1.9092136603  0.6969287799  0.6969287799  0.6969287799  4.0000000000
29  1.9271683888  0.6909438704  0.6909438704  0.6909438704  4.0000000000
30  1.9119068696  0.6960310435  0.6960310435  0.6960310435  4.0000000000
31  1.9248791609  0.6917069464  0.6917069464  0.6917069464  4.0000000000
32  1.9138527133  0.6953824289  0.6953824289  0.6953824289  4.0000000000
33  1.9232251937  0.6922582688  0.6922582688  0.6922582688  4.0000000000
34  1.9152585853  0.6949138049  0.6949138049  0.6949138049  4.0000000000
35  1.9220302025  0.6926565992  0.6926565992  0.6926565992  4.0000000000
36  1.9162743279  0.6945752240  0.6945752240  0.6945752240  4.0000000000
37  1.9211668213  0.6929443929  0.6929443929  0.6929443929  4.0000000000
38  1.9170082019  0.6943305994  0.6943305994  0.6943305994  4.0000000000
39  1.9205430284  0.6931523239  0.6931523239  0.6931523239  4.0000000000
40  1.9175384259  0.6941538580  0.6941538580  0.6941538580  4.0000000000
41  1.9200923380  0.6933025540  0.6933025540  0.6933025540  4.0000000000
42  1.9179215127  0.6940261624  0.6940261624  0.6940261624  4.0000000000
43  1.9197667142  0.6934110953  0.6934110953  0.6934110953  4.0000000000
44  1.9181982929  0.6939339024  0.6939339024  0.6939339024  4.0000000000
45  1.9195314510  0.6934895163  0.6934895163  0.6934895163  4.0000000000
46  1.9183982666  0.6938672445  0.6938672445  0.6938672445  4.0000000000
47  1.9193614734  0.6935461755  0.6935461755  0.6935461755  4.0000000000
48  1.9185427476  0.6938190841  0.6938190841  0.6938190841  4.0000000000
49  1.9192386645  0.6935871118  0.6935871118  0.6935871118  4.0000000000
50  1.9186471352  0.6937842883  0.6937842883  0.6937842883  4.0000000000
51  1.9191499351  0.6936166883  0.6936166883  0.6936166883  4.0000000000
52  1.9187225552  0.6937591483  0.6937591483  0.6937591483  4.0000000000
53  1.9190858281  0.6936380573  0.6936380573  0.6936380573  4.0000000000
54  1.9187770461  0.6937409846  0.6937409846  0.6937409846  4.0000000000
55  1.9190395108  0.6936534964  0.6936534964  0.6936534964  4.0000000000
56  1.9188164158  0.6937278614  0.6937278614  0.6937278614  4.0000000000
57  1.9190060466  0.6936646511  0.6936646511  0.6936646511  4.0000000000
58  1.9188448604  0.6937183799  0.6937183799  0.6937183799  4.0000000000
59  1.9189818686  0.6936727105  0.6936727105  0.6936727105  4.0000000000
60  1.9188654117  0.6937115294  0.6937115294  0.6937115294  4.0000000000
61  1.9189644001  0.6936785333  0.6936785333  0.6936785333  4.0000000000
62  1.9188802599  0.6937065800  0.6937065800  0.6937065800  4.0000000000
63  1.9189517791  0.6936827403  0.6936827403  0.6936827403  4.0000000000
64  1.9188909878  0.6937030041  0.6937030041  0.6937030041  4.0000000000
65  1.9189426604  0.6936857799  0.6936857799  0.6936857799  4.0000000000
66  1.9188987387  0.6937004204  0.6937004204  0.6937004204  4.0000000000
67  1.9189360721  0.6936879760  0.6936879760  0.6936879760  4.0000000000
68  1.9189043387  0.6936985538  0.6936985538  0.6936985538  4.0000000000
69  1.9189313121  0.6936895626  0.6936895626  0.6936895626  4.0000000000
70  1.9189083847  0.6936972051  0.6936972051  0.6936972051  4.0000000000
71  1.9189278730  0.6936907090  0.6936907090  0.6936907090  4.0000000000
72  1.9189113080  0.6936962307  0.6936962307  0.6936962307  4.0000000000
73  1.9189253882  0.6936915373  0.6936915373  0.6936915373  4.0000000000
74  1.9189134200  0.6936955267  0.6936955267  0.6936955267  4.0000000000
75  1.9189235930  0.6936921357  0.6936921357  0.6936921357  4.0000000000
76  1.9189149459  0.6936950180  0.6936950180  0.6936950180  4.0000000000
77  1.9189222959  0.6936925680  0.6936925680  0.6936925680  4.0000000000
78  1.9189160484  0.6936946505  0.6936946505  0.6936946505  4.0000000000
79  1.9189213588  0.6936928804  0.6936928804  0.6936928804  4.0000000000
80  1.9189168450  0.6936943850  0.6936943850  0.6936943850  4.0000000000
81  1.9189206817  0.6936931061  0.6936931061  0.6936931061  4.0000000000
82  1.9189174205  0.6936941932  0.6936941932  0.6936941932  4.0000000000
83  1.9189201926  0.6936932691  0.6936932691  0.6936932691  4.0000000000
84  1.9189178363  0.6936940546  0.6936940546  0.6936940546  4.0000000000
85  1.9189198391  0.6936933870  0.6936933870  0.6936933870  4.0000000000
86  1.9189181367  0.6936939544  0.6936939544  0.6936939544  4.0000000000
87  1.9189195838  0.6936934721  0.6936934721  0.6936934721  4.0000000000
88  1.9189183538  0.6936938821  0.6936938821  0.6936938821  4.0000000000
89  1.9189193993  0.6936935336  0.6936935336  0.6936935336  4.0000000000
90  1.9189185106  0.6936938298  0.6936938298  0.6936938298  4.0000000000
91  1.9189192660  0.6936935780  0.6936935780  0.6936935780  4.0000000000
92  1.9189186239  0.6936937920  0.6936937920  0.6936937920  4.0000000000
93  1.9189191697  0.6936936101  0.6936936101  0.6936936101  4.0000000000
94  1.9189187058  0.6936937647  0.6936937647  0.6936937647  4.0000000000
95  1.9189191001  0.6936936333  0.6936936333  0.6936936333  4.0000000000
96  1.9189187649  0.6936937450  0.6936937450  0.6936937450  4.0000000000
97  1.9189190498  0.6936936501  0.6936936501  0.6936936501  4.0000000000
98  1.9189188077  0.6936937308  0.6936937308  0.6936937308  4.0000000000
99  1.9189190135  0.6936936622  0.6936936622  0.6936936622  4.0000000000
100  1.9189188385  0.6936937205  0.6936937205  0.6936937205  4.0000000000
101  1.9189189872  0.6936936709  0.6936936709  0.6936936709  4.0000000000
102  1.9189188608  0.6936937131  0.6936937131  0.6936937131  4.0000000000
103  1.9189189683  0.6936936772  0.6936936772  0.6936936772  4.0000000000
104  1.9189188770  0.6936937077  0.6936937077  0.6936937077  4.0000000000
105  1.9189189546  0.6936936818  0.6936936818  0.6936936818  4.0000000000
106  1.9189188886  0.6936937038  0.6936937038  0.6936937038  4.0000000000
107  1.9189189447  0.6936936851  0.6936936851  0.6936936851  4.0000000000
108  1.9189188970  0.6936937010  0.6936937010  0.6936937010  4.0000000000
109  1.9189189375  0.6936936875  0.6936936875  0.6936936875  4.0000000000
110  1.9189189031  0.6936936990  0.6936936990  0.6936936990  4.0000000000
111  1.9189189324  0.6936936892  0.6936936892  0.6936936892  4.0000000000
112  1.9189189075  0.6936936975  0.6936936975  0.6936936975  4.0000000000
113  1.9189189286  0.6936936905  0.6936936905  0.6936936905  4.0000000000
114  1.9189189107  0.6936936964  0.6936936964  0.6936936964  4.0000000000
115  1.9189189259  0.6936936914  0.6936936914  0.6936936914  4.0000000000
116  1.9189189130  0.6936936957  0.6936936957  0.6936936957  4.0000000000
117  1.9189189240  0.6936936920  0.6936936920  0.6936936920  4.0000000000
118  1.9189189146  0.6936936951  0.6936936951  0.6936936951  4.0000000000
119  1.9189189226  0.6936936925  0.6936936925  0.6936936925  4.0000000000
120  1.9189189158  0.6936936947  0.6936936947  0.6936936947  4.0000000000
121  1.9189189216  0.6936936928  0.6936936928  0.6936936928  4.0000000000
122  1.9189189167  0.6936936944  0.6936936944  0.6936936944  4.0000000000
123  1.9189189208  0.6936936931  0.6936936931  0.6936936931  4.0000000000
124  1.9189189173  0.6936936942  0.6936936942  0.6936936942  4.0000000000
125  1.9189189203  0.6936936932  0.6936936932  0.6936936932  4.0000000000
126  1.9189189177  0.6936936941  0.6936936941  0.6936936941  4.0000000000
127  1.9189189199  0.6936936934  0.6936936934  0.6936936934  4.0000000000
128  1.9189189181  0.6936936940  0.6936936940  0.6936936940  4.0000000000
129  1.9189189196  0.6936936935  0.6936936935  0.6936936935  4.0000000000
130  1.9189189183  0.6936936939  0.6936936939  0.6936936939  4.0000000000
131  1.9189189194  0.6936936935  0.6936936935  0.6936936935  4.0000000000
132  1.9189189185  0.6936936938  0.6936936938  0.6936936938  4.0000000000
133  1.9189189193  0.6936936936  0.6936936936  0.6936936936  4.0000000000
134  1.9189189186  0.6936936938  0.6936936938  0.6936936938  4.0000000000
135  1.9189189192  0.6936936936  0.6936936936  0.6936936936  4.0000000000
136  1.9189189187  0.6936936938  0.6936936938  0.6936936938  4.0000000000
137  1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
138  1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
139  1.9189189191  0.6936936936  0.6936936936  0.6936936936  4.0000000000
140  1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
141  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
142  1.9189189188  0.6936936937  0.6936936937  0.6936936937  4.0000000000
143  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
144  1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
145  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
146  1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
147  1.9189189190  0.6936936937  0.6936936937  0.6936936937  4.0000000000
148  1.9189189189  0.6936936937  0.6936936937  0.6936936937  4.0000000000
[Excel Example 17]
As you can see, it takes 148 iterations. Now, if we recalculate the normalization co-efficient at each iteration, we get the following:
Iteration  A  B  C  D  Sum
Start  1.0000000000  1.0000000000  1.0000000000  1.0000000000  4.0000000000
1  2.7000000000  0.4333333333  0.4333333333  0.4333333333  4.0000000000
2  1.5100000000  0.8300000000  0.8300000000  0.8300000000  4.0000000000
3  2.3430000000  0.5523333333  0.5523333333  0.5523333333  4.0000000000
4  1.7599000000  0.7467000000  0.7467000000  0.7467000000  4.0000000000
5  2.1680700000  0.6106433333  0.6106433333  0.6106433333  4.0000000000
6  1.8823510000  0.7058830000  0.7058830000  0.7058830000  4.0000000000
7  2.0823543000  0.6392152333  0.6392152333  0.6392152333  4.0000000000
8  1.9423519900  0.6858826700  0.6858826700  0.6858826700  4.0000000000
9  2.0403536070  0.6532154643  0.6532154643  0.6532154643  4.0000000000
10  1.9717524751  0.6760825083  0.6760825083  0.6760825083  4.0000000000
11  2.0197732674  0.6600755775  0.6600755775  0.6600755775  4.0000000000
12  1.9861587128  0.6712804291  0.6712804291  0.6712804291  4.0000000000
13  2.0096889010  0.6634370330  0.6634370330  0.6634370330  4.0000000000
14  1.9932177693  0.6689274102  0.6689274102  0.6689274102  4.0000000000
15  2.0047475615  0.6650841462  0.6650841462  0.6650841462  4.0000000000
16  1.9966767069  0.6677744310  0.6677744310  0.6677744310  4.0000000000
17  2.0023263051  0.6658912316  0.6658912316  0.6658912316  4.0000000000
18  1.9983715864  0.6672094712  0.6672094712  0.6672094712  4.0000000000
19  2.0011398895  0.6662867035  0.6662867035  0.6662867035  4.0000000000
20  1.9992020773  0.6669326409  0.6669326409  0.6669326409  4.0000000000
21  2.0005585459  0.6664804847  0.6664804847  0.6664804847  4.0000000000
22  1.9996090179  0.6667969940  0.6667969940  0.6667969940  4.0000000000
23  2.0002736875  0.6665754375  0.6665754375  0.6665754375  4.0000000000
24  1.9998084188  0.6667305271  0.6667305271  0.6667305271  4.0000000000
25  2.0001341069  0.6666219644  0.6666219644  0.6666219644  4.0000000000
26  1.9999061252  0.6666979583  0.6666979583  0.6666979583  4.0000000000
27  2.0000657124  0.6666447625  0.6666447625  0.6666447625  4.0000000000
28  1.9999540013  0.6666819996  0.6666819996  0.6666819996  4.0000000000
29  2.0000321991  0.6666559336  0.6666559336  0.6666559336  4.0000000000
30  1.9999774607  0.6666741798  0.6666741798  0.6666741798  4.0000000000
31  2.0000157775  0.6666614075  0.6666614075  0.6666614075  4.0000000000
32  1.9999889557  0.6666703481  0.6666703481  0.6666703481  4.0000000000
33  2.0000077310  0.6666640897  0.6666640897  0.6666640897  4.0000000000
34  1.9999945883  0.6666684706  0.6666684706  0.6666684706  4.0000000000
35  2.0000037882  0.6666654039  0.6666654039  0.6666654039  4.0000000000
36  1.9999973483  0.6666675506  0.6666675506  0.6666675506  4.0000000000
37  2.0000018562  0.6666660479  0.6666660479  0.6666660479  4.0000000000
38  1.9999987007  0.6666670998  0.6666670998  0.6666670998  4.0000000000
39  2.0000009095  0.6666663635  0.6666663635  0.6666663635  4.0000000000
40  1.9999993633  0.6666668789  0.6666668789  0.6666668789  4.0000000000
41  2.0000004457  0.6666665181  0.6666665181  0.6666665181  4.0000000000
42  1.9999996880  0.6666667707  0.6666667707  0.6666667707  4.0000000000
43  2.0000002184  0.6666665939  0.6666665939  0.6666665939  4.0000000000
44  1.9999998471  0.6666667176  0.6666667176  0.6666667176  4.0000000000
45  2.0000001070  0.6666666310  0.6666666310  0.6666666310  4.0000000000
46  1.9999999251  0.6666666916  0.6666666916  0.6666666916  4.0000000000
47  2.0000000524  0.6666666492  0.6666666492  0.6666666492  4.0000000000
48  1.9999999633  0.6666666789  0.6666666789  0.6666666789  4.0000000000
49  2.0000000257  0.6666666581  0.6666666581  0.6666666581  4.0000000000
50  1.9999999820  0.6666666727  0.6666666727  0.6666666727  4.0000000000
51  2.0000000126  0.6666666625  0.6666666625  0.6666666625  4.0000000000
52  1.9999999912  0.6666666696  0.6666666696  0.6666666696  4.0000000000
53  2.0000000062  0.6666666646  0.6666666646  0.6666666646  4.0000000000
54  1.9999999957  0.6666666681  0.6666666681  0.6666666681  4.0000000000
55  2.0000000030  0.6666666657  0.6666666657  0.6666666657  4.0000000000
56  1.9999999979  0.6666666674  0.6666666674  0.6666666674  4.0000000000
57  2.0000000015  0.6666666662  0.6666666662  0.6666666662  4.0000000000
58  1.9999999990  0.6666666670  0.6666666670  0.6666666670  4.0000000000
59  2.0000000007  0.6666666664  0.6666666664  0.6666666664  4.0000000000
60  1.9999999995  0.6666666668  0.6666666668  0.6666666668  4.0000000000
61  2.0000000004  0.6666666665  0.6666666665  0.6666666665  4.0000000000
62  1.9999999998  0.6666666667  0.6666666667  0.6666666667  4.0000000000
63  2.0000000002  0.6666666666  0.6666666666  0.6666666666  4.0000000000
64  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65  2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67  2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
The values settle after only 67 iterations instead of 148, and they approximate the earlier results closely (A converges to 2.0 rather than 1.9189, and B, C and D to 0.6667 rather than 0.6937). This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
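The comparison of the two tables can be reproduced with the following Python sketch. It again assumes the hierarchy in which A links to B, C and D and each links back to A. The stopping rule (largest change below $10^{-10}$) is our own choice, so the exact iteration counts it reports need not equal 148 and 67; the point is only that the accelerated variant needs markedly fewer iterations.

```python
# Assumed hierarchy: A links to B, C and D; B, C and D each link back to A.
OUT_LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85


def step(pr, accelerated):
    """One iteration with a constant or a recalculated normalization co-efficient."""
    new = {}
    for page in pr:
        inbound = sum(pr[src] / len(targets)
                      for src, targets in OUT_LINKS.items() if page in targets)
        coefficient = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = coefficient + D * inbound
    return new


def iterate_until_stable(accelerated, tolerance=1e-10, limit=10_000):
    """Iterate until no value changes by more than `tolerance`; return (count, values)."""
    pr = {page: 1.0 for page in OUT_LINKS}
    for count in range(1, limit + 1):
        new = step(pr, accelerated)
        if max(abs(new[page] - pr[page]) for page in pr) < tolerance:
            return count, new
        pr = new
    raise RuntimeError("did not converge within the iteration limit")


classic_count, classic_values = iterate_until_stable(accelerated=False)
accel_count, accel_values = iterate_until_stable(accelerated=True)
print("classic:", classic_count, classic_values)
print("accelerated:", accel_count, accel_values)
```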
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the co-efficient of the "About Us" page (Page B) higher and lower those of the rest of the pages, because Page B meets some criterion we define (e.g. it has the correct amount of text). So we could set Page B to
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to
$0.6 \times 0.15$
i.e. [$0.4n + 0.6 + 0.6(n-1) = n$, where n is the number of pages in the site]
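A quick Python check of this bookkeeping is sketched below. The function name `favoured_coefficients` and the choice of a four-page site with Page B at index 1 are assumptions made for illustration; the only property being demonstrated is that the co-efficients still add up to $0.15n$.

```python
def favoured_coefficients(n, favoured, base=0.15):
    """Per-page normalization co-efficients that favour one page, keeping the total at base * n."""
    coefficients = [0.6 * base] * n                    # every other page gets 0.6 * 0.15
    coefficients[favoured] = (0.4 * n + 0.6) * base    # the favoured page gets (0.4n + 0.6) * 0.15
    return coefficients


# Hypothetical four-page site, with Page B at index 1.
coefficients = favoured_coefficients(4, favoured=1)
print(coefficients, sum(coefficients))

# The total is unchanged for any site size.
totals_match = all(
    abs(sum(favoured_coefficients(n, favoured=0)) - 0.15 * n) < 1e-12
    for n in range(2, 50)
)
print(totals_match)
```

For four pages this gives 0.33 for Page B and 0.09 for each of the others, a total of 0.60.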
This strongly favours Page B. Let's give an example. Normally the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank. It is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization coefficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
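As a rough check of the directory case, the limiting values can be worked out by hand. The link structure used here is an assumption made for illustration (A links to B, C and D, and each of them links back to A), with d = 0.85, Page B's coefficient set to 0.06 and the others to 0.18:

```latex
\begin{aligned}
PR(A) &= 0.18 + 0.85\,\bigl(PR(B) + PR(C) + PR(D)\bigr) \\
PR(B) &= 0.06 + 0.85\,\frac{PR(A)}{3}, \qquad
PR(C) = PR(D) = 0.18 + 0.85\,\frac{PR(A)}{3} \\
PR(A) &= 0.18 + 0.85\,\bigl(0.42 + 0.85\,PR(A)\bigr)
\;\Longrightarrow\; PR(A) = \frac{0.537}{0.2775} \approx 1.9351 \\
PR(B) &\approx 0.6083, \qquad PR(C) = PR(D) \approx 0.7283
\end{aligned}
```

With a uniform coefficient of 0.15 the same assumed structure gives PR(A) ≈ 1.9189 and PR(B) = PR(C) = PR(D) ≈ 0.6937, so the list page B drops while the pages around it gain, and the total stays at 4.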
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ $(k = 1, \dots, n)$, $(m = 1, \dots, n)$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages (n is the number of pages).
The components $a_{km}$, which are at the intersection of row k and column m of the matrix of links, give the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …    Tn     Number of links off the page
T1    a11    a12    …    a1n    Cr(T1)
T2    a21    a22    …    a2n    Cr(T2)
…     …      …      …    …      …
Tn    an1    an2    …    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

where:
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \dots, n$);
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \dots, n$);
$Cr(T_k)$ is the total number of links off the page $T_k$;
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$);
$d$ is a dampening factor (usually 0.85);
$(1 - d)$ is a normalization coefficient;
$n$ is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account how complex the links between its pages may be.
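The scheme translates almost directly into code. The following is a minimal sketch, assuming a small hypothetical three-page link matrix; the function names `pagerank_step` and `pagerank`, and the choice to ignore pages with no outgoing links, are assumptions made for the illustration.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_next(T_m) = (1 - d) + d * sum_k a[k][m] * PR(T_k) / Cr(T_k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(T_k): total number of links off page k
    return [
        # Pages without outgoing links contribute nothing (an assumption of this sketch).
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]

def pagerank(a, d=0.85, iterations=100):
    """Start every page at 1 and repeat the step a fixed number of times."""
    pr = [1.0] * len(a)
    for _ in range(iterations):
        pr = pagerank_step(a, pr, d)
    return pr

# Hypothetical multigraph: a[k][m] is the number of links off page k to page m.
# a[0][1] = 2 means the first page links to the second page twice.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 1, 0],
]

first = pagerank_step(a, [1.0, 1.0, 1.0])
final = pagerank(a)
print([round(v, 6) for v in first])
print([round(v, 6) for v in final], round(sum(final), 6))
```

Because every page in this example has at least one outgoing link, the total of the estimating values stays equal to the number of pages at every iteration.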
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
[Figure: multigraph of the site. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
DIAGRAM OF THE MATRIX OF LINKS (2)

From \ To        A  B  C  D  E  F  G  H  I    Number of links off the page
A                0  1  0  0  0  0  0  0  0    Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0    Cr(B) = 3
C                0  1  0  0  0  0  0  0  0    Cr(C) = 1
D                0  1  0  0  0  0  0  0  0    Cr(D) = 1
E                1  1  1  1  0  1  0  0  0    Cr(E) = 5
F                1  1  1  1  0  0  1  0  0    Cr(F) = 5
G                1  1  1  1  0  1  0  1  1    Cr(G) = 7
H                1  1  1  1  1  0  1  0  0    Cr(H) = 6
I                1  1  1  1  0  0  1  0  0    Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on the (i+1)-th iteration) can be calculated by the formula $(1 - d)\,PR_i(T)$. In other words, the constant term $(1 - d)$ of the computation scheme is replaced by $(1 - d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
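The recognition of external pages can be reproduced with a short sketch that uses the matrix of links (2) above. The function name `iterate`, the 200-iteration limit and the `1e-6` threshold used to treat a value as zero are assumptions made for this illustration; the classic scheme with the constant coefficient (1 - d) is run alongside for comparison.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): row k, column m holds the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def iterate(a, d=0.85, accelerated=False, iterations=200):
    """Classic scheme uses the constant (1 - d); the accelerated one uses (1 - d) * PR_i(T)."""
    n = len(a)
    cr = [sum(row) for row in a]  # every page here has at least one outgoing link
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            ((1 - d) * pr[m] if accelerated else (1 - d))
            + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return pr

classic = iterate(LINKS)
fast = iterate(LINKS, accelerated=True)

# Pages whose accelerated value has (numerically) vanished are treated as external.
external = [p for p, v in zip(PAGES, fast) if v < 1e-6]
print("external pages:", external)
print("accelerated:", [round(v, 6) for v in fast])
print("classic:    ", [round(v, 6) for v in classic])
```

In this sketch the accelerated values of E, F, G, H and I shrink towards zero while A, B, C and D settle near 1.5, 4.5, 1.5 and 1.5; the classic scheme leaves every page with a non-zero value.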
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened in the original table; they are the values reached in the final row.
Having divided the estimating values by their total amount, which does not change during the iteration process, we receive the set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities:

Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn

$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let's calculate the terms $-p \log p$ of this formula for the set of probability values which correspond to the site pages. We'll receive the following set of values of PageRank for the pages of the site examined (logarithms are to be calculated to base 2):

                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
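The PR row can be checked with a few lines of Python. This is a minimal sketch: the input list simply copies the limiting values from the classic computation, and the variable names are assumptions made for the illustration.

```python
import math

# Limiting estimating values for pages A..I from the classic computation.
limits = [1.390055, 3.845218, 1.390055, 1.390055,
          0.175403, 0.209136, 0.241441, 0.179318, 0.179318]

total = sum(limits)
p = [v / total for v in limits]            # probability values, summing to one
terms = [-q * math.log2(q) for q in p]     # one term of Shannon's formula per page
entropy = sum(terms)                       # H(p1, ..., pn), the "Amount" column

for page, q, t in zip("ABCDEFGHI", p, terms):
    print(page, round(q, 6), round(t, 6))
print("H =", round(entropy, 3))
```

Each printed term corresponds to one entry of the PR row, and their sum is the value shown in the Amount column.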
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

(where P is the estimating value).
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
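To see what repeated encouragement does, the recurrence for the operator $Z^{+}$ can be solved in closed form. The starting value $P_0 = 0$ and $d = 0.85$ in the numerical line are assumptions chosen only for illustration:

```latex
\begin{aligned}
P_{n+1} - 1 &= d\,(P_n - 1)
\;\Longrightarrow\; P_n = 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1
\quad (n \to \infty,\; 0 < d < 1) \\
P_0 = 0,\ d = 0.85:\quad & P_1 = 0.15,\quad P_2 = 0.2775,\quad P_3 = 0.385875
\end{aligned}
```

So each application of the operator moves the estimating value a fixed fraction $(1 - d)$ of the remaining distance towards 1.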
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): http://www.supportforums.org/pagerank/sem
PageRank: Bringing Order to the Web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
A very good approximation of the values with only 67 iterations. This is known as the “accelerated technology of the computation of the PageRank estimating values”. So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
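The difference between the two schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the link structure (page A linking to B, C and D, each of which links back to A) is inferred from the table values rather than stated here, and the names `LINKS`, `step` and `iterate` are hypothetical. The only change between the two runs is whether the coefficient (1 - d) is multiplied by the page's current value.

```python
# Assumption: link structure inferred from the table values, not given in the text.
# Page A links to B, C and D; each of B, C and D links back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
DAMPING = 0.85  # dampening factor d


def step(ranks, accelerated):
    """Run one iteration; `accelerated` rescales (1 - d) by the page's current value."""
    new_ranks = {}
    for page in LINKS:
        # Share of PageRank passed on by every page that links to `page`.
        incoming = sum(
            ranks[src] / len(targets)
            for src, targets in LINKS.items()
            if page in targets
        )
        coefficient = (1 - DAMPING) * ranks[page] if accelerated else (1 - DAMPING)
        new_ranks[page] = coefficient + DAMPING * incoming
    return new_ranks


def iterate(accelerated, tolerance=1e-10, max_iterations=1000):
    """Iterate from all ones until no value moves by more than `tolerance`."""
    ranks = {page: 1.0 for page in LINKS}
    for count in range(1, max_iterations + 1):
        new_ranks = step(ranks, accelerated)
        if max(abs(new_ranks[p] - ranks[p]) for p in LINKS) < tolerance:
            return new_ranks, count
        ranks = new_ranks
    return ranks, max_iterations


classic, classic_count = iterate(accelerated=False)
fast, fast_count = iterate(accelerated=True)
print(f"classic:     A = {classic['A']:.6f} after {classic_count} iterations")
print(f"accelerated: A = {fast['A']:.6f} after {fast_count} iterations")
```

Under these assumptions the accelerated run stops after fewer iterations, and the two schemes settle on slightly different values for page A, so the accelerated figures are an approximation of the classic ones rather than a copy.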
How PageRank could be more than it seems
There’s a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site’s ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like “quantity of text” fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn’t seem to fit into the traditional factors, but it can be worked into PageRank. Let’s explain.
How we distribute the normalization coefficients is pretty much up to us, as long as the total value stays the same. So with our previous example we may set the coefficient of the “About Us” page (Page B) higher and lower those of the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. the multipliers still add up to n: (0.4n + 0.6) + 0.6(n − 1) = n, where n is the number of pages in the site.
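The arithmetic can be checked with a short Python sketch. It assumes a site of n = 4 pages (an assumption, chosen because it agrees with the coefficients 0.06 and 0.18 used in the directory example) and a hypothetical helper `coefficients`; the point is only that the boosted and reduced coefficients still add up to n × 0.15.

```python
def coefficients(n, favoured_share=0.4, d=0.85):
    """Return (coefficient for the favoured page, coefficient for every other page).

    `favoured_share` is the 0.4 in (0.4n + 0.6); the remaining 0.6 is what each
    page keeps of the usual (1 - d).
    """
    base = 1 - d
    favoured = (favoured_share * n + (1 - favoured_share)) * base
    others = (1 - favoured_share) * base
    return favoured, others


n = 4  # assumed number of pages in the example site
favoured, others = coefficients(n)
total = favoured + (n - 1) * others
print(f"Page B: {favoured:.2f}, other pages: {others:.2f}, total: {total:.2f}")
```

For n = 4 this gives 0.33 for Page B and 0.09 for each other page, which together make 0.6, the same total as four pages at 0.15 each.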
This strongly favours Page B. Let’s give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the coefficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we’ve boosted the PageRank of the About Us page at the expense of all other pages – simply because we’ve introduced a new arbitrary factor called “amount of text”.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page’s PageRank – it is just as easy to boost an entire site’s PageRank based on any factor we like. Examples of possible factors are “it’s popular” or “we like it”.
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let’s pretend that Page B is such a list, say a category page with links in ODP.
We’ll set Page B’s normalization coefficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won’t be racing each other to get links on such pages, because their PageRank will appear deceptively low.
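The effect of lowering one page's coefficient can be tried out with a small Python sketch. The link structure below is hypothetical (A links to B, C and D, and each of them links back to A); it is not the site behind the Excel example, so only the direction of the change is meaningful, not the exact figures. The function `pagerank` and the dictionary names are likewise illustrative.

```python
# Hypothetical four-page site: A links to B, C and D; B, C and D link back to A.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
DAMPING = 0.85


def pagerank(coefficients, iterations=200):
    """Classic iteration, but with a separate normalization coefficient per page."""
    ranks = {page: 1.0 for page in LINKS}
    for _ in range(iterations):
        ranks = {
            page: coefficients[page]
            + DAMPING * sum(
                ranks[src] / len(targets)
                for src, targets in LINKS.items()
                if page in targets
            )
            for page in LINKS
        }
    return ranks


uniform = pagerank({page: 0.15 for page in LINKS})
# Page B plays the role of the list of links: 0.06 for it, 0.18 for the rest.
directory = pagerank({"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for page in LINKS:
    print(f"{page}: uniform {uniform[page]:.4f}  directory {directory[page]:.4f}")
```

In this sketch Page B ends up lower than with uniform coefficients, the page it links to ends up higher, and the total over all pages stays at 4 because the coefficients still sum to 0.6.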
Examples
Throughout this document you’ve seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce that example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google’s spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let’s write down the elements of the matrix of links: a_km (k = 1…n), (m = 1…n).
What does that mean? Let us explain the designations:
T1, T2, …, Tn correspond to Web site pages (n is the number of pages).
Components a_km, which are on the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1    T2    …    Tn    Number of links off the page
T1    a11   a12   …    a1n   Cr(T1)
T2    a21   a22   …    a2n   Cr(T2)
…     …     …     …    …     …
Tn    an1   an2   …    ann   Cr(Tn)
Let’s write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$

where

PR_{I+1}(T_m) is the PageRank of the page Tm on iteration I+1 (the estimating value of the page Tm, m = 1…n);
PR_I(T_k) is the PageRank of the page Tk on iteration I (the estimating value of the page Tk, k = 1…n);
Cr(T_k) is the total number of links off the page Tk;
Cm(T_k) = a_km is the number of individual links off the page Tk to the page Tm (k = 1…n, m = 1…n);
d is a dampening factor (usually 0.85);
(1 − d) is a normalization coefficient;
n is the number of pages on the site.

This recursive computational scheme computes the estimating values for the pages of a Web site while taking into account how complex the links between the pages may be, including several links between the same pair of pages.
Take a look at the following example. It is a multigraph of a small Web site. Let’s apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, with pages A, B, C, D in subgraph 1 and pages E, F, G, H, I in subgraph 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let’s construct a matrix of links for this site. The labels A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term “base pages” to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
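As a rough check of the computation scheme against this matrix of links, the following Python sketch applies one classic iteration (d = 0.85, all pages starting at 1). The names `LINK_MATRIX` and `iterate_once` are illustrative only; the matrix entries are copied from the diagram above.

```python
PAGES = "ABCDEFGHI"
# LINK_MATRIX[k][m] is a_km, the number of links off page k to page m.
LINK_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
DAMPING = 0.85


def iterate_once(ranks):
    """PR_{I+1}(T_m) = (1 - d) + d * sum_k a_km * PR_I(T_k) / Cr(T_k)."""
    n = len(ranks)
    out_links = [sum(row) for row in LINK_MATRIX]  # Cr(T_k) for every page
    return [
        (1 - DAMPING)
        + DAMPING * sum(LINK_MATRIX[k][m] * ranks[k] / out_links[k] for k in range(n))
        for m in range(n)
    ]


first = iterate_once([1.0] * len(PAGES))
for page, value in zip(PAGES, first):
    print(f"{page}: {value:.6f}")
print(f"Amount: {sum(first):.3f}")
```

The row sums of the matrix reproduce the Cr column, and the total over all pages stays at 9 after the iteration.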
Let’s apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the PR_{i+1}(T) computation (on iteration i + 1) can be calculated by the formula (1 − d) · PR_i(T).
When the accelerated technology is applied to compute the estimating values of the site pages, the limiting values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
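The recognition of external pages can be sketched in Python as follows. This is a minimal illustration, not the authors' spreadsheet: the threshold `1e-6` for "zero limiting value" and the fixed count of 100 iterations are assumptions, and the names `accelerated_limits`, `external` and `base` are hypothetical. The matrix is the one from the matrix of links (2).

```python
PAGES = "ABCDEFGHI"
# LINK_MATRIX[k][m] is the number of links off page k to page m.
LINK_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
DAMPING = 0.85


def accelerated_limits(iterations=100):
    """Iterate with the coefficient (1 - d) * PR_i(T) instead of a constant (1 - d)."""
    n = len(PAGES)
    out_links = [sum(row) for row in LINK_MATRIX]
    ranks = [1.0] * n
    for _ in range(iterations):
        ranks = [
            (1 - DAMPING) * ranks[m]
            + DAMPING * sum(LINK_MATRIX[k][m] * ranks[k] / out_links[k] for k in range(n))
            for m in range(n)
        ]
    return dict(zip(PAGES, ranks))


limits = accelerated_limits()
# Assumed cut-off: anything below 1e-6 is treated as a zero limiting value.
external = [page for page, value in limits.items() if value < 1e-6]
base = [page for page in PAGES if page not in external]
print("base pages:    ", base)
print("external pages:", external)
```

With these assumptions the sketch separates A, B, C, D from E, F, G, H, I, and the base pages settle at the limiting values shown in the table (1.5 for A, C, D and 4.5 for B).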
Let’s calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 − d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are the values at which the rows stop changing (iterations 82 to 85).
Having divided the estimating values by their total amount, which is unchanged during the iteration process, we receive the set of probability values (the sum of all probability values is equal to one).

      A         B         C         D         E         F         G         H         I         Amount
P     0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000

Running a parallel demonstration, we’d like to use the famous classical Shannon’s formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the given probabilities.

Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let’s calculate the terms of this formula for the set of probability values which correspond to the site pages. We receive the following set of values, which is the set of PageRank values for the pages of the site examined (logarithms are calculated to base 2):

                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
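The per-page terms of Shannon's formula can be reproduced with a short Python sketch. The probabilities are copied from the P row; the variable names `P`, `terms` and `entropy` are illustrative, and `math.log2` supplies the base-2 logarithm the text asks for.

```python
import math

# Probability values for the pages, taken from the P row.
P = {
    "A": 0.154451, "B": 0.427246, "C": 0.154451, "D": 0.154451,
    "E": 0.019489, "F": 0.023237, "G": 0.026827, "H": 0.019924, "I": 0.019924,
}

# One term -p * log2(p) per page; their sum is H(p1, ..., pn).
terms = {page: -p * math.log2(p) for page, p in P.items()}
entropy = sum(terms.values())

for page, value in terms.items():
    print(f"{page}: {value:.6f}")
print(f"Amount: {entropy:.3f}")
```

Each printed term corresponds to one entry of the PR row, and their sum corresponds to the Amount column.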
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let’s assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

where P is the estimating value.
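A small Python sketch shows what repeated application of the operator does. The starting value 0.5, the choice d = 0.85 and the function name `encourage` are assumptions made for illustration; they are not taken from Bush and Mosteller.

```python
def encourage(p, d=0.85):
    """Apply Z+ once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


p = 0.5  # assumed starting estimating value
history = [p]
for _ in range(50):
    p = encourage(p)
    history.append(p)

print(f"after 1 step:   {history[1]:.4f}")
print(f"after 50 steps: {history[-1]:.4f}")
```

Since 1 − P_{n+1} = d (1 − P_n), every application shrinks the distance to 1 by the factor d, so the value rises steadily towards 1 without overshooting it.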
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon’s formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing – the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google’s), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you’d like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy-to-follow explanation of the power behind Google. No matter what level you’re at in the field of SEO, this is not something you should read, it’s something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
42
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the “accelerated technology of the computation of the PageRank estimating values”. So, by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
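The two schemes compared above can be sketched in a few lines. This is only an illustration under assumptions: the four-page link structure (A links to B, C and D, and each of those links back to A) is inferred from the first rows of the tables rather than stated in the text, and the function names and the stopping tolerance are hypothetical. The iteration counts depend on that tolerance, so they need not equal the 148 and 67 quoted above.

```python
# Assumed four-page site, inferred from the tables: A <-> B, C, D.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}
D = 0.85  # damping factor used throughout the document


def step(pr, accelerated):
    """One iteration; the accelerated variant uses (1 - d) * PR_i(T) as coefficient."""
    new = {}
    for page in LINKS:
        # PageRank arriving over links: each source splits its value over its links.
        inbound = sum(pr[src] / len(out) for src, out in LINKS.items() if page in out)
        coefficient = (1 - D) * pr[page] if accelerated else (1 - D)
        new[page] = coefficient + D * inbound
    return new


def iterate(accelerated, tol=1e-10, max_iter=1000):
    """Iterate from PR = 1 for every page until no value moves by more than tol."""
    pr = {page: 1.0 for page in LINKS}
    for count in range(1, max_iter + 1):
        nxt = step(pr, accelerated)
        if max(abs(nxt[p] - pr[p]) for p in LINKS) < tol:
            return nxt, count
        pr = nxt
    return pr, max_iter


classic, classic_steps = iterate(accelerated=False)
fast, fast_steps = iterate(accelerated=True)
print(classic_steps, {p: round(v, 6) for p, v in classic.items()})
print(fast_steps, {p: round(v, 6) for p, v in fast.items()})
```

In this sketch the accelerated scheme stops after noticeably fewer iterations. It settles on A = 2 and B = C = D = 2/3, which is close to, but not identical with, the limits of the classic scheme.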
How PageRank could be more than it seems
There’s a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site’s ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text, and so on. But where do the less obvious factors like “quantity of text” fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn’t seem to fit into the traditional factors, but it can be worked into PageRank. Let’s explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So, with our previous example, we may set the “About Us” page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to:
$(0.4n + 0.6) \times 0.15$
as long as we set the other pages to:
$0.6 \times 0.15$
i.e. [$(0.4n + 0.6) + 0.6\,(n - 1) = n$, where n is the number of pages in the site]
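The redistribution described above can be tried out directly. The sketch below assumes a four-page structure as a stand-in for the “previous example” (A links to B, C and D, and each links back to A), which this section does not spell out. `pagerank_with_coefficients` is a hypothetical helper written for this illustration, not part of any library.

```python
D = 0.85
# Assumed structure for the example site; not given explicitly in the text.
LINKS = {"A": ["B", "C", "D"], "B": ["A"], "C": ["A"], "D": ["A"]}


def pagerank_with_coefficients(links, coefficients, d=D, iterations=200):
    """Classic iteration, but with a per-page constant in place of (1 - d)."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: coefficients[page]
            + d * sum(pr[src] / len(out) for src, out in links.items() if page in out)
            for page in links
        }
    return pr


n = len(LINKS)
uniform = {page: 1 - D for page in LINKS}
favoured = {page: 0.6 * (1 - D) for page in LINKS}
favoured["B"] = (0.4 * n + 0.6) * (1 - D)  # Page B gets the larger share

plain = pagerank_with_coefficients(LINKS, uniform)
boosted = pagerank_with_coefficients(LINKS, favoured)
print(sum(favoured.values()), n * (1 - D))  # the total of the coefficients is unchanged
print({p: round(v, 4) for p, v in boosted.items()})
```

Because the coefficients still add up to n(1 − d), the total PageRank of the site stays at n. Only its distribution between the pages changes.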
This strongly favours Page B. Let’s give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients, in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we’ve boosted the PageRank of the About Us page at the expense of all other pages – simply because we’ve introduced a new, arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page’s PageRank – it is just as easy to boost an entire site’s PageRank based on any factor we like. Examples of possible factors are “it’s popular” or “we like it”.
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let’s pretend that Page B is such a list, say a category page with links in ODP.
We’ll set Page B’s normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won’t be racing each other to get links on such pages, because their PageRank will appear deceptively low.
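As a quick arithmetic check on the coefficients 0.06 and 0.18 used above, assuming the four-page example site (n = 4) and d = 0.85 as in the earlier tables, the total is the same as with the usual 0.15 per page:

```latex
\[
0.06 + 3 \times 0.18 = 0.60 = 4 \times 0.15 = n\,(1 - d)
\]
```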
Examples
Throughout this document you’ve seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google’s spider) explores a site, traces links between pages, and forms the matrix of links, which contains all information about the topology of that site.
Let’s write down the elements of the matrix of links: $a_{km}$ $(k = 1, \dots, n)$, $(m = 1, \dots, n)$.
What does that mean? Let us explain the designations:
$T_1, T_2, \dots, T_n$ correspond to Web site pages (n is the number of pages).
Components $a_{km}$, which are at the intersection of the k-th row and the m-th column of the matrix of links, give the number of links off $T_k$ to $T_m$.
$C_r(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    …    Tn    Number of links off the page
T1    a11   a12   …    a1n   Cr(T1)
T2    a21   a22   …    a2n   Cr(T2)
Tn    an1   an2   …    ann   Cr(Tn)
Let’s write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{C_r(T_k)}$$
where:
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration I+1 (the estimating value of the page $T_m$, m = 1…n);
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration I (the estimating value of the page $T_k$, k = 1…n);
$C_r(T_k)$ is the total number of links off the page $T_k$;
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1…n, m = 1…n);
d is a dampening factor (usually 0.85);
(1 − d) is a normalization coefficient;
n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account the possible complexity of the links between pages.
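The scheme above translates almost directly into code. The following is a minimal sketch rather than a definitive implementation: the function name `iterate_once`, the three-page matrix and the choice to skip pages without outgoing links are assumptions made for the illustration.

```python
def iterate_once(a, pr, d=0.85):
    """Apply PR_{I+1}(T_m) = (1 - d) + d * sum_k a[k][m] * PR_I(T_k) / Cr(T_k)."""
    n = len(a)
    out_links = [sum(row) for row in a]  # Cr(T_k): total number of links off T_k
    nxt = []
    for m in range(n):
        # Pages with no outgoing links are skipped (an assumption of this sketch).
        inbound = sum(
            a[k][m] * pr[k] / out_links[k] for k in range(n) if out_links[k] > 0
        )
        nxt.append((1 - d) + d * inbound)
    return nxt


# Hypothetical three-page site: T1 links twice to T2 and once to T3,
# while T2 and T3 each link once to T1.
a = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]
pr = [1.0, 1.0, 1.0]
for _ in range(50):
    pr = iterate_once(a, pr)
print([round(value, 4) for value in pr])
```

Because the matrix stores link counts rather than just 0 or 1, the double link from T1 to T2 passes on twice as much of T1's value as the single link to T3.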
Take a look at the following example. Generally, it is a multigraph of a small Web site. Let’s apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site; subgraph 1 contains pages A, B, C, D and subgraph 2 contains pages E, F, G, H, I]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let’s construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term “base pages” to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
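The matrix of links above is easy to check mechanically. The sketch below copies it row by row and recomputes the number of links off each page. The split into a first subgraph (A to D) and a second subgraph (E to I) is an assumption read off the figure and the surrounding text, and the variable names are hypothetical.

```python
PAGES = "ABCDEFGHI"
# Rows are source pages, columns are target pages, as in the diagram above.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T): the total number of links off each page is the sum of its row.
out_links = {page: sum(row) for page, row in zip(PAGES, MATRIX)}

first, second = "ABCD", "EFGHI"  # assumed subgraphs 1 and 2


def link(src, dst):
    """Number of links from page src to page dst."""
    return MATRIX[PAGES.index(src)][PAGES.index(dst)]


# Every page of the second subgraph links to every page of the first ...
covers = all(link(s, t) == 1 for s in second for t in first)
# ... while no page of the first subgraph links into the second.
closed = all(link(s, t) == 0 for s in first for t in second)
print(out_links, covers, closed)
```

The second check is what makes A, B, C and D a closed group: PageRank can flow into it from E to I, but nothing flows back out.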
Let’s apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration i+1) can be calculated by the formula $(1 - d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values on application of the accelerated technology of the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
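The accelerated table can be reproduced with a short script. This is a sketch under the assumption that “accelerated” means replacing the constant (1 − d) by (1 − d) · PR_i(T), as defined earlier. The function name and the 1e-6 threshold used to call a value zero are hypothetical choices.

```python
PAGES = "ABCDEFGHI"
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85
OUT_LINKS = [sum(row) for row in MATRIX]  # Cr(T) for each page


def accelerated_step(pr):
    """PR_{i+1}(T) = (1 - d) * PR_i(T) + d * (PageRank arriving over links)."""
    size = len(PAGES)
    return [
        (1 - D) * pr[m]
        + D * sum(MATRIX[k][m] * pr[k] / OUT_LINKS[k] for k in range(size))
        for m in range(size)
    ]


pr = [1.0] * len(PAGES)
for _ in range(39):  # the table above stops at iteration 39
    pr = accelerated_step(pr)

# Pages whose limiting value is (numerically) zero are the external ones.
external = [page for page, value in zip(PAGES, pr) if value < 1e-6]
print({page: round(value, 6) for page, value in zip(PAGES, pr)})
print("external pages:", external)
```

The total stays at 9 on every iteration, so all of the PageRank that drains out of E to I ends up on the base pages A to D.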
Let’s calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 − d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
Start      1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the last row (darkened in the original table).
Dividing the estimating values by their total, which does not change during the iteration process, gives a set of probability values (the sum of all probability values is equal to one):
    A         B         C         D         E         F         G         H         I         Amount
P   0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
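The classic computation and the division by the total can be sketched together. As before, this is an illustration rather than the authors' spreadsheet: the matrix is copied from the diagram of the matrix of links (2), and the function and variable names are hypothetical.

```python
PAGES = "ABCDEFGHI"
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
D = 0.85
OUT_LINKS = [sum(row) for row in MATRIX]  # Cr(T) for each page


def classic_step(pr):
    """PR_{i+1}(T) = (1 - d) + d * (PageRank arriving over links)."""
    size = len(PAGES)
    return [
        (1 - D) + D * sum(MATRIX[k][m] * pr[k] / OUT_LINKS[k] for k in range(size))
        for m in range(size)
    ]


pr = [1.0] * len(PAGES)
for _ in range(85):  # the classic table above runs to iteration 85
    pr = classic_step(pr)

total = sum(pr)  # stays at 9, the number of pages
probabilities = [value / total for value in pr]
print({page: round(p, 6) for page, p in zip(PAGES, probabilities)})
```

Unlike the accelerated run, the external pages E to I keep a small non-zero value here, because every page receives the constant 0.15 on each iteration.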
Running a parallel demonstration, we’d like to use the famous classical Shannon’s formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the set probabilities:
Experiment outcome   A1   A2   …   An
Probabilities        p1   p2   …   pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let’s calculate the values for the set of probability values which correspond to the site pages. Each entry below is the term $-p \log_2 p$ for that page (logarithms are to be calculated to base 2), and the amount is their sum. This is the set of PageRank values for the pages of the site examined:
                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows   1         1         1         1         1         1         1         1         1         1
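The values in the table above can be recomputed from the probabilities. The sketch below assumes, as the numbers suggest, that each table entry is one term −p · log2(p) of Shannon’s formula. `shannon_terms` is a hypothetical helper name.

```python
import math

# Probabilities for pages A..I, copied from the probability table above.
P = [0.154451, 0.427246, 0.154451, 0.154451,
     0.019489, 0.023237, 0.026827, 0.019924, 0.019924]


def shannon_terms(probabilities):
    """Return the individual -p * log2(p) terms of Shannon's formula."""
    return [-p * math.log2(p) for p in probabilities if p > 0]


terms = shannon_terms(P)
entropy = sum(terms)  # H(p1, ..., pn), the "Amount" column
print([round(t, 6) for t in terms], round(entropy, 3))
```

The home page B has by far the largest probability, yet its term is only modestly larger than those of A, C and D, because −p · log2(p) grows slowly once p becomes large.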
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let’s assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time an animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$$
(P is the estimating value.)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
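To see where repeated encouragement leads, the recurrence can be unrolled. This is a short derivation sketch that assumes the operator is applied n times in a row from a starting value $P_0$ (a symbol introduced here only for illustration):

```latex
\begin{align*}
1 - P_{n+1} &= 1 - d\,P_n - (1 - d) = d\,(1 - P_n) \\
1 - P_n &= d^{\,n}\,(1 - P_0) \\
P_n &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1 \quad (n \to \infty,\ 0 < d < 1)
\end{align*}
```

The constant term (1 − d) has the same form as the normalization coefficient in the PageRank equation, which is presumably the parallel the authors have in mind.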
We suppose that it was a variation of Shannon’s formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing – the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing Order to the Web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google’s), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you’d like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you’re at in the field of SEO, this is not something you should read, it’s something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
13 20096889010 06634370330 06634370330 06634370330 4000000000014 19932177693 06689274102 06689274102 06689274102 4000000000015 20047475615 06650841462 06650841462 06650841462 4000000000016 19966767069 06677744310 06677744310 06677744310 4000000000017 20023263051 06658912316 06658912316 06658912316 4000000000018 19983715864 06672094712 06672094712 06672094712 4000000000019 20011398895 06662867035 06662867035 06662867035 4000000000020 19992020773 06669326409 06669326409 06669326409 4000000000021 20005585459 06664804847 06664804847 06664804847 4000000000022 19996090179 06667969940 06667969940 06667969940 4000000000023 20002736875 06665754375 06665754375 06665754375 4000000000024 19998084188 06667305271 06667305271 06667305271 4000000000025 20001341069 06666219644 06666219644 06666219644 4000000000026 19999061252 06666979583 06666979583 06666979583 4000000000027 20000657124 06666447625 06666447625 06666447625 4000000000028 19999540013 06666819996 06666819996 06666819996 4000000000029 20000321991 06666559336 06666559336 06666559336 4000000000030 19999774607 06666741798 06666741798 06666741798 4000000000031 20000157775 06666614075 06666614075 06666614075 4000000000032 19999889557 06666703481 06666703481 06666703481 4000000000033 20000077310 06666640897 06666640897 06666640897 4000000000034 19999945883 06666684706 06666684706 06666684706 4000000000035 20000037882 06666654039 06666654039 06666654039 4000000000036 19999973483 06666675506 06666675506 06666675506 4000000000037 20000018562 06666660479 06666660479 06666660479 4000000000038 19999987007 06666670998 06666670998 06666670998 4000000000039 20000009095 06666663635 06666663635 06666663635 4000000000040 19999993633 06666668789 06666668789 06666668789 4000000000041 20000004457 06666665181 06666665181 06666665181 4000000000042 19999996880 06666667707 06666667707 06666667707 4000000000043 20000002184 06666665939 06666665939 06666665939 4000000000044 19999998471 06666667176 06666667176 06666667176 4000000000045 20000001070 
06666666310 06666666310 06666666310 4000000000046 19999999251 06666666916 06666666916 06666666916 4000000000047 20000000524 06666666492 06666666492 06666666492 4000000000048 19999999633 06666666789 06666666789 06666666789 4000000000049 20000000257 06666666581 06666666581 06666666581 4000000000050 19999999820 06666666727 06666666727 06666666727 4000000000051 20000000126 06666666625 06666666625 06666666625 4000000000052 19999999912 06666666696 06666666696 06666666696 4000000000053 20000000062 06666666646 06666666646 06666666646 4000000000054 19999999957 06666666681 06666666681 06666666681 4000000000055 20000000030 06666666657 06666666657 06666666657 4000000000056 19999999979 06666666674 06666666674 06666666674 4000000000057 20000000015 06666666662 06666666662 06666666662 4000000000058 19999999990 06666666670 06666666670 06666666670 4000000000059 20000000007 06666666664 06666666664 06666666664 4000000000060 19999999995 06666666668 06666666668 06666666668 4000000000061 20000000004 06666666665 06666666665 06666666665 4000000000062 19999999998 06666666667 06666666667 06666666667 4000000000063 20000000002 06666666666 06666666666 06666666666 40000000000
42
64 19999999999 06666666667 06666666667 06666666667 4000000000065 20000000001 06666666666 06666666666 06666666666 4000000000066 19999999999 06666666667 06666666667 06666666667 4000000000067 20000000000 06666666667 06666666667 06666666667 40000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations This is knownas the ldquoaccelerated technology of the computation of the Page Rank estimatingvaluesrdquo So by being a little bit clever Google can considerably reduce thecomputation time needed to calculate PageRank
How PageRank could be more than it seems
Therersquos a possibility just a possibility that PageRank could well be more than itseems If we think about the factors that could affect a sitersquos ranking in Googlewe come up with the obvious things such as PageRank keywords anchor textand so on But where do the less obvious factors like ldquoquantity of textrdquo fit Whatif for example some pages were preferred because they had the correctquantity of text This doesnrsquot seem to fit into the traditional factors but it can beworked into PageRank Letrsquos explain
How we distribute the normalization co-efficients is pretty much up to us as longas the total value stays the same So with our previous example we may set theldquoAbout Usrdquo page (Page B) higher and lower the rest of the pages because pageB meets some criteria we define (eg it has the correct amount of text) So wecould set page B to
(04n + 06)015
as long as we set the other pages to
06015
ie [01n + 09+09 (n-1)=n where n is the number of Pages in the Site]
This strongly favours Page B Letrsquos give an example Normally the final valuesfor the previous example site structure would be
43
Using our new values for the co-efficients in an attempt to show favour to Page B(that meets our arbitrary criteria) we get
[Excel Example 19]
As you can see wersquove boosted the PageRank of the About Us page at theexpense of all other pages ndash simply because wersquove introduced a new arbitraryfactor called amount of text
PageRank not only considers links but it may also consider certain other factors
44
There is of course no reason why we should stop at boosting a single pagersquosPageRank ndash it is just as easy to boost an entire sitersquos PageRank based on anyfactor we like Examples of possible factors are ldquoitrsquos popularrdquo or ldquowe like itrdquo
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a pageor pages at the expense of those it links to and from then we can also cause adecrease in its PageRank to the benefit of those it links to and from Why wouldwe want to do this Because the page represents a list of links that points tohigh-quality pages Letrsquos pretend that Page B is such a list say a category pagewith links in ODP
Wersquoll set Page Brsquos normalization co-efficient to 006 and the rest to 018 We endup with
[Excel Example 20]
And sure enough it does exactly what we want it to do
This has several advantages The page itself which is actually just a list of linkswill be pushed down in the rankings Whilst the pages it links to will get a boostThus the searcher is less likely to see pages that are just links and more likely tosee good quality pages In addition those with the Google Toolbar installed wonrsquotbe racing each other to get links on such pages because their PageRank willappear deceptively low
45
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce the example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ ($k = 1, \dots, n$; $m = 1, \dots, n$).
What does that mean? Let us explain the designations.
$T_1, T_2, \dots, T_n$ correspond to Web site pages ($n$ is the number of pages).
Components $a_{km}$, which are on the intersection of row $k$ and column $m$ in the matrix of links, determine the number of links off $T_k$ to $T_m$.
$C_r(T_k)$ is the total number of links off $T_k$.
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    ...   Tn    Number of links off the page
T1    a11   a12   ...   a1n   Cr(T1)
T2    a21   a22   ...   a2n   Cr(T2)
...
Tn    an1   an2   ...   ann   Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{C_r(T_k)}$$
where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \dots, n$);
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \dots, n$);
$C_r(T_k)$ is the total number of links off the page $T_k$;
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \dots, n$; $m = 1, \dots, n$);
$d$ is a dampening factor (usually 0.85);
$(1-d)$ is a normalization coefficient;
$n$ is the number of pages on the site.
This recursive computational scheme lets us compute the estimating values for the pages of a Web site while taking into account the possibly complex pattern of links between them.
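A single iteration of this scheme can be sketched in Python as follows. The function name and the three-page multigraph are hypothetical and chosen only for illustration; the matrix entry a[k][m] plays the role of the number of links off page k to page m, and the row sum plays the role of Cr.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR_next(m) = (1 - d) + d * sum_k a[k][m] * PR(k) / Cr(k)."""
    n = len(a)
    cr = [sum(row) for row in a]  # total number of links off each page
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k])
        for m in range(n)
    ]


# Hypothetical multigraph: page 0 links twice to page 1 and once to page 2;
# pages 1 and 2 each link once back to page 0.
example = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

first_step = pagerank_step(example, [1.0, 1.0, 1.0])
print([round(value, 6) for value in first_step])
```

Because page 0 links to page 1 twice, page 1 receives two thirds of what page 0 passes on, which is exactly what the multigraph entries are meant to capture.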
Take a look at the following example. It is the multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: directed multigraph of pages A, B, C, D, E, F, G, H, I arranged in two subgraphs labelled 1 and 2]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links of this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
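As a quick check, the matrix of links above can be written down in Python and the "number of links off the page" column recomputed from the rows. The variable names are our own; the data is copied from the table.

```python
PAGES = "ABCDEFGHI"

# One string per row of the matrix of links; character m of row k is a_km.
LINK_ROWS = [
    "010000000",  # A
    "101100000",  # B (Home/index)
    "010000000",  # C
    "010000000",  # D
    "111101000",  # E
    "111100100",  # F
    "111101011",  # G
    "111110100",  # H
    "111100100",  # I
]

matrix = [[int(ch) for ch in row] for row in LINK_ROWS]

# Cr(T): total number of links off each page (row sums).
links_off = {page: sum(row) for page, row in zip(PAGES, matrix)}

# Do any of the pages A-D link to any of the pages E-I?
base_links_out = any(matrix[k][m] for k in range(4) for m in range(4, 9))

print(links_off)
print("A-D link to E-I:", base_links_out)
```

The row sums reproduce the Cr column, and the last check shows that pages A to D never link to pages E to I, while every page E to I links to all of A to D.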
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page $T$ ($T$ = A, B, C, D, E, F, G, H, I) on iteration $i$. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration $i+1$) can then be calculated by the formula $(1-d)\,PR_i(T)$, so that the scheme becomes
$$PR_{i+1}(T_m) = (1-d)\,PR_i(T_m) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_i(T_k)}{C_r(T_k)}$$
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
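The accelerated computation can be reproduced with a short Python sketch. It assumes the update rule implied by the text, namely that the constant (1-d) is replaced by (1-d) times the page's own previous value; the function name and the 0.000001 threshold used to call a value "zero" are our own choices.

```python
PAGES = "ABCDEFGHI"
LINK_ROWS = [
    "010000000", "101100000", "010000000", "010000000", "111101000",
    "111100100", "111101011", "111110100", "111100100",
]
matrix = [[int(ch) for ch in row] for row in LINK_ROWS]


def accelerated_pagerank(a, d=0.85, iterations=39):
    """Iterate PR_next(m) = (1 - d) * PR(m) + d * sum_k a[k][m] * PR(k) / Cr(k)."""
    n = len(a)
    cr = [sum(row) for row in a]
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return pr


first = accelerated_pagerank(matrix, iterations=1)
limit = accelerated_pagerank(matrix, iterations=39)

# Pages whose limiting value is (numerically) zero are treated as external.
external = [page for page, value in zip(PAGES, limit) if value < 1e-6]

print([round(value, 6) for value in limit])
print("external pages:", external)
```

With these assumptions the sketch gives limiting values close to 1.5 for A, C and D and 4.5 for B, and it flags E, F, G, H and I as external, in line with the table.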
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened.
Having divided the estimating values by their total amount, which is unchanged throughout the iteration process, we receive the set of probability values (the sum of all probability values is equal to one).
     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
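The classic computation and the division by the total can be sketched in Python as follows. The function and variable names are our own; the matrix of links, d = 0.85 and the 85 iterations are taken from the tables above.

```python
PAGES = "ABCDEFGHI"
LINK_ROWS = [
    "010000000", "101100000", "010000000", "010000000", "111101000",
    "111100100", "111101011", "111110100", "111100100",
]
matrix = [[int(ch) for ch in row] for row in LINK_ROWS]


def classic_pagerank(a, d=0.85, iterations=85):
    """Iterate PR_next(m) = (1 - d) + d * sum_k a[k][m] * PR(k) / Cr(k)."""
    n = len(a)
    cr = [sum(row) for row in a]
    pr = [1.0] * n
    for _ in range(iterations):
        pr = [
            (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return pr


pr = classic_pagerank(matrix)
total = sum(pr)  # stays at 9 because every page has at least one link off it
probabilities = [value / total for value in pr]

for page, value, p in zip(PAGES, pr, probabilities):
    print(page, round(value, 6), round(p, 6))
```

Dividing by the total only rescales the values, so the ordering of the pages is the same in both rows.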
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with the given probabilities.
Experiment outcome   A1   A2   ...   An
Probabilities        p1   p2   ...   pn
$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$
Let's calculate the values for the set of probability values which correspond to the site pages, and we'll receive the following set of values. Each value is the term $-p \log_2 p$ for the corresponding page, and the Amount column is their sum.
This is the set of PageRank values for the pages of the site examined (logarithms are calculated to base 2).
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows  1         1         1         1         1         1         1         1         1         1
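The values in the PR row can be checked with a few lines of Python. The probabilities are copied from the P row; the variable names are our own, and each term is computed as -p * log2(p).

```python
import math

# Probability values from the P row.
probabilities = {
    "A": 0.154451, "B": 0.427246, "C": 0.154451, "D": 0.154451,
    "E": 0.019489, "F": 0.023237, "G": 0.026827, "H": 0.019924,
    "I": 0.019924,
}

# One Shannon term per page: -p * log2(p).
terms = {page: -p * math.log2(p) for page, p in probabilities.items()}

# The sum of the terms is the Shannon value H for the whole site.
entropy = sum(terms.values())

for page, value in terms.items():
    print(page, round(value, 6))
print("Amount:", round(entropy, 3))
```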
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
(P is the estimating value.)
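As a short worked step (our own derivation from the operator as written above, not taken from Bush and Mosteller), repeated application of Z+ pulls the estimating value towards 1 at a rate set by d:

```latex
% Fixed point: P^* = d P^* + (1 - d)  =>  P^* = 1.
% Distance to the fixed point shrinks by a factor d at every step:
\begin{align*}
1 - P_{n+1} &= 1 - d\,P_n - (1 - d) = d\,(1 - P_n), \\
1 - P_n     &= d^{\,n}\,(1 - P_0), \\
P_n         &= 1 - d^{\,n}\,(1 - P_0) \;\longrightarrow\; 1
              \quad \text{as } n \to \infty \quad (0 < d < 1).
\end{align*}
```

The operator has the same shape as the PageRank update, a constant (1 - d) plus d times the previous value, which is the parallel the text draws.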
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
This strongly favours Page B Letrsquos give an example Normally the final valuesfor the previous example site structure would be
43
Using our new values for the co-efficients in an attempt to show favour to Page B(that meets our arbitrary criteria) we get
[Excel Example 19]
As you can see wersquove boosted the PageRank of the About Us page at theexpense of all other pages ndash simply because wersquove introduced a new arbitraryfactor called amount of text
PageRank not only considers links but it may also consider certain other factors
44
There is of course no reason why we should stop at boosting a single pagersquosPageRank ndash it is just as easy to boost an entire sitersquos PageRank based on anyfactor we like Examples of possible factors are ldquoitrsquos popularrdquo or ldquowe like itrdquo
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a pageor pages at the expense of those it links to and from then we can also cause adecrease in its PageRank to the benefit of those it links to and from Why wouldwe want to do this Because the page represents a list of links that points tohigh-quality pages Letrsquos pretend that Page B is such a list say a category pagewith links in ODP
Wersquoll set Page Brsquos normalization co-efficient to 006 and the rest to 018 We endup with
[Excel Example 20]
And sure enough it does exactly what we want it to do
This has several advantages The page itself which is actually just a list of linkswill be pushed down in the rankings Whilst the pages it links to will get a boostThus the searcher is less likely to see pages that are just links and more likely tosee good quality pages In addition those with the Google Toolbar installed wonrsquotbe racing each other to get links on such pages because their PageRank willappear deceptively low
45
Examples
Throughout this document yoursquove seen the phrase [Excel Example x] This refersto the actual excel spreadsheet that contains the calculations used to producethis example (so you can look at it in more depth if you want to)
You can get the examples from httpwwwsupportforumsorgpagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directedmultigraphs with pages on their tops Googlebot (Googlersquos spider) explores asite traces links between pages and forms the matrix of links which contains allinformation about the topology of that site
Letrsquos write down the elements of the matrix of links Dkm N n) (m=1n)
What does that mean Let us explain the designations
T1 T2hellipTn correspond to Web site pages (n is the number of pages)
Components akm which are on the intersection of the k-line and m-columnin the matrix of links determine the number of links off Tk to Tm
Cr (Tk) is the total number of links off Tk
46
DIAGRAM OF THE MATRIX OF LINKS (1)
T1 T2 Tn Num
ber o
f lin
ks o
ffth
e pa
ge
T1 a 11 a 12 hellip hellip a 1n Cr(T1)
T2 a 21 a 22 hellip hellip a 2n Cr(T2)
Tn a n1 a n2 hellip hellip a nn Cr(Tn)
Letrsquos write down the equation for the computation scheme
n
PR I+1 (Tm) = (1-d) + d Σ akmPR I (Tk) Cr(Tk) k=1Where
PRI+1 (Tm) is Page Rank of the Page Tm on (I+1) iteration (the estimating value ofthe page Tm (m=1hellip n) )
PR I (Tk) is the Page Rank of the Page Tk on the I iteration (the estimating valueof the Page Tk (k=1hellip n) )
Cr(Tk) is the total number of links off the page Tk
Cm(Tk) = akm is the number of the individual links off the page Tk to the page Tm(k=1n) (m=1 n)
47
d is a dampening factor (usually 085)
(1-d) is a normalization coefficient
n is the number of the Pages on the Site
This equation of a recursive computational scheme allows taking intoconsideration and then computing the estimating values of a Web site payingattention to the probable complexity of links between pages
Take a look at the following example Generally it is a multigraph of a small Website Letrsquos apply PageRank methods to this site
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links toevery page of the first subgraph
Letrsquos construct a matrix of links of this site The values A B C D E F G H Irefer to the corresponding pages
A
D
CB
H
IGFE
1 2
48
DIAGRAM OF THE MATRIX OF LINKS (2)
A B C D E F G H I Num
ber o
f lin
ks o
ff th
e pa
ge
A 0 1 0 0 0 0 0 0 0 Cr(A) 1
Homeindex B 1 0 1 1 0 0 0 0 0 Cr(B) 3
C 0 1 0 0 0 0 0 0 0 Cr(C) 1
D 0 1 0 0 0 0 0 0 0 Cr(D) 1
E 1 1 1 1 0 1 0 0 0 Cr(E) 5
F 1 1 1 1 0 0 1 0 0 Cr(F 5
G 1 1 1 1 0 1 0 1 1 Cr(G) 7
H 1 1 1 1 1 0 1 0 0 Cr(H) 6
I 1 1 1 1 0 0 1 0 0 Cr(I) 5
Note the wonderful ability of PageRank to distinguish the most important pagesand the relationship of external pages to them Later we will use the term ldquobasepagesrdquo to describe those important pages The main feature of the base pages isthat they are traversed by a directed cycle
Letrsquos apply the so-called accelerated technology of computation of the estimatingvalues of the site pages
Suppose PR i (T) is the estimating value for the page T (T=A B C D E F GH I) page on i-iterationThe normalization coefficient for the PR i+1(T) computation (on (i + 1)-iteration)can be calculated by formula (1-d) PR i (T)
49
Applying the accelerated technology the computation of the estimating values ofthe site pages the limiting values of the estimating values of external pages inrelation to the base sites are equal to zero
This approach permits us to distinguish the external pages in relation to the basepages Thus we can introduce the following definition
The external pages in relation to the base pages are ones that have zero limitingvalues on application of the accelerated technology of the computation of theestimating values of the site pages
ACCELERATED COMPUTATION OF PAGERANK for recognition of externallinks
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
Let's calculate the estimating values for the pages of the same Web site with the constant normalization coefficient (1 - d) = 0.15, i.e. d = 0.85.
CLASSIC COMPUTATION OF PAGERANK
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are those in the final row of the table (iteration 85).
Dividing the estimating values by their total amount, which does not change during the iteration process, gives the set of probability values (the sum of all probability values is equal to one):
    A         B         C         D         E         F         G         H         I         Amount
P   0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:
Experiment outcome  A1  A2  ...  An
Probabilities       p1  p2  ...  pn
$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \cdots - p(A_n)\log p(A_n)$$
Let's calculate the individual terms $-p \log p$ for the set of probability values that correspond to the site pages (logarithms are taken to base 2). This gives the following set of PageRank values for the pages of the site examined:
                   A         B         C         D         E         F         G         H         I         Amount
PR                 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar shows  1         1         1         1         1         1         1         1         1         1
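The normalisation and the Shannon terms above can be reproduced with a few lines of Python. This is only a sketch: the input values are copied from the last row of the classic computation table, and the function name `shannon_terms` is an illustrative choice, not something taken from the original spreadsheets.

```python
import math

# Limiting estimating values for pages A..I (last row of the classic table).
limiting = {
    "A": 1.390055, "B": 3.845218, "C": 1.390055, "D": 1.390055,
    "E": 0.175403, "F": 0.209136, "G": 0.241441, "H": 0.179318, "I": 0.179318,
}

def shannon_terms(values):
    """Return (probabilities, per-page terms -p * log2(p))."""
    total = sum(values.values())
    probs = {page: v / total for page, v in values.items()}
    terms = {page: -p * math.log2(p) for page, p in probs.items()}
    return probs, terms

probs, terms = shannon_terms(limiting)
entropy = sum(terms.values())  # the "Amount" column of the PR row

for page in limiting:
    print(f"{page}  P = {probs[page]:.6f}  PR = {terms[page]:.6f}")
print(f"Amount = {entropy:.3f}")
```

The per-page terms correspond to the PR row, and their sum is the value of H for the whole site.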
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$
where P is the estimating value.
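To see what repeated encouragement does, the operator can be iterated numerically. The sketch below assumes d = 0.85 and a starting value of 0.2, both chosen only for illustration; the function names are hypothetical. Because the operator is linear, n applications give 1 - d^n (1 - P_0), so the value climbs towards 1.

```python
def z_plus(p, d=0.85):
    """One application of the encouragement operator: d * p + (1 - d)."""
    return d * p + (1 - d)

def apply_repeatedly(p0, steps, d=0.85):
    """Apply Z+ to the starting value p0 the given number of times."""
    p = p0
    for _ in range(steps):
        p = z_plus(p, d)
    return p

for steps in (1, 5, 20, 100):
    print(steps, round(apply_repeatedly(0.2, steps), 6))
```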
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The anatomy of a Large-Scale HyperTextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
64  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
65  2.0000000001  0.6666666666  0.6666666666  0.6666666666  4.0000000000
66  1.9999999999  0.6666666667  0.6666666667  0.6666666667  4.0000000000
67  2.0000000000  0.6666666667  0.6666666667  0.6666666667  4.0000000000
[Excel Example 18]
A very good approximation of the values with only 67 calculations. This is known as the "accelerated technology of the computation of the PageRank estimating values". So by being a little bit clever, Google can considerably reduce the computation time needed to calculate PageRank.
How PageRank could be more than it seems
There's a possibility, just a possibility, that PageRank could well be more than it seems. If we think about the factors that could affect a site's ranking in Google, we come up with the obvious things such as PageRank, keywords, anchor text and so on. But where do the less obvious factors like "quantity of text" fit? What if, for example, some pages were preferred because they had the correct quantity of text? This doesn't seem to fit into the traditional factors, but it can be worked into PageRank. Let's explain.
How we distribute the normalization co-efficients is pretty much up to us, as long as the total value stays the same. So with our previous example, we may set the "About Us" page (Page B) higher and lower the rest of the pages, because Page B meets some criteria we define (e.g. it has the correct amount of text). So we could set Page B to
(0.4n + 0.6) × 0.15
as long as we set the other pages to
0.6 × 0.15
i.e. [0.4n + 0.6 + 0.6(n - 1) = n, where n is the number of pages in the site].
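The constraint that the total stays the same is easy to check numerically. The sketch below is an illustration only: the function name, the page labels and the choice of n = 4 are assumptions, not details taken from the Excel examples.

```python
def boosted_coefficients(pages, favoured, d=0.85):
    """Per-page normalization co-efficients that favour one page.

    The favoured page gets (0.4n + 0.6)(1 - d); every other page gets 0.6(1 - d).
    """
    n = len(pages)
    base = 1 - d
    return {
        page: ((0.4 * n + 0.6) if page == favoured else 0.6) * base
        for page in pages
    }

pages = ["A", "B", "C", "D"]           # hypothetical four-page site
coeffs = boosted_coefficients(pages, favoured="B")
total = sum(coeffs.values())           # should equal n * (1 - d)
print(coeffs, total)
```

With four pages, Page B receives 0.33 and each of the other pages 0.09, so the total is still 4 × 0.15 = 0.60.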
This strongly favours Page B. Let's give an example. Normally, the final values for the previous example site structure would be:
Using our new values for the co-efficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called amount of text.
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank. It is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization co-efficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
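The effect of per-page co-efficients can be sketched in Python. The link structure below is hypothetical (Page B links to A, C and D, and each of them links back to B), because the example site's structure is not reproduced in this part of the document, and the function name `pagerank` is an illustrative choice. Only the co-efficients 0.06 and 0.18 come from the text.

```python
def pagerank(links, coefficients, d=0.85, iterations=200):
    """Iterate PR(m) = c_m + d * sum(PR(k) / outlinks(k)) over pages k linking to m."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_pr = dict(coefficients)             # start from each page's co-efficient
        for source, targets in links.items():
            share = d * pr[source] / len(targets)
            for target in targets:
                new_pr[target] += share
        pr = new_pr
    return pr

# Hypothetical site: B is a list of links; A, C and D each link back to B.
links = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}

uniform = pagerank(links, {page: 0.15 for page in links})
directory = pagerank(links, {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})
print(uniform)
print(directory)
```

In this sketch the total PageRank stays at 4 in both runs; lowering B's co-efficient moves some of its value to the pages it links to.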
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce this example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $D = \{a_{km}\}$, $k = 1, \ldots, n$; $m = 1, \ldots, n$.
What does that mean? Let us explain the designations:
$T_1, T_2, \ldots, T_n$ correspond to Web site pages (n is the number of pages).
The components $a_{km}$, which are on the intersection of the k-th row and the m-th column of the matrix of links, determine the number of links off $T_k$ to $T_m$.
$Cr(T_k)$ is the total number of links off $T_k$.
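As an illustration of these designations, the matrix of links and the totals Cr can be built from a plain list of links. The page names and links below are hypothetical, as is the function name; the point is only that a repeated link between the same two pages raises the corresponding element above 1, which is why the site is treated as a multigraph.

```python
def link_matrix(pages, links):
    """Build a[k][m] (number of links off page k to page m) and Cr (row totals)."""
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    a = [[0] * n for _ in range(n)]
    for source, target in links:
        a[index[source]][index[target]] += 1
    cr = [sum(row) for row in a]
    return a, cr

pages = ["T1", "T2", "T3"]
# Hypothetical links: T1 links to T2 twice and to T3 once; T2 links to T1.
links = [("T1", "T2"), ("T1", "T2"), ("T1", "T3"), ("T2", "T1")]

a, cr = link_matrix(pages, links)
print(a)   # matrix of links
print(cr)  # total number of links off each page
```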
DIAGRAM OF THE MATRIX OF LINKS (1)
      T1    T2    ...   Tn    Number of links off the page
T1    a11   a12   ...   a1n   Cr(T1)
T2    a21   a22   ...   a2n   Cr(T2)
...
Tn    an1   an2   ...   ann   Cr(Tn)
Let's write down the equation for the computation scheme:
$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{Cr(T_k)}$$
Where:
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration I+1 (the estimating value of the page $T_m$, m = 1, ..., n)
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration I (the estimating value of the page $T_k$, k = 1, ..., n)
$Cr(T_k)$ is the total number of links off the page $T_k$
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ (k = 1, ..., n; m = 1, ..., n)
d is a dampening factor (usually 0.85)
(1 - d) is a normalization coefficient
n is the number of pages on the site
This recursive computation scheme makes it possible to compute the estimating values of a Web site's pages while taking into account links of any complexity between the pages.
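A direct transcription of the computation scheme into Python might look like the sketch below. The function names and the three-page matrix are hypothetical, and skipping pages that have no outgoing links (Cr = 0) is an assumption made here to avoid dividing by zero; the text does not say how such pages are handled.

```python
def pagerank_step(a, pr, d=0.85):
    """One iteration: PR(Tm) = (1 - d) + d * sum_k a[k][m] * PR(Tk) / Cr(Tk)."""
    n = len(a)
    cr = [sum(row) for row in a]           # total number of links off each page
    return [
        (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n) if cr[k] > 0)
        for m in range(n)
    ]

def pagerank(a, d=0.85, iterations=300):
    """Start every page at 1.0 and apply the scheme a fixed number of times."""
    pr = [1.0] * len(a)
    for _ in range(iterations):
        pr = pagerank_step(a, pr, d)
    return pr

# Hypothetical site: T1 and T2 link to each other, T3 links to T1.
a = [
    [0, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
]
pr = pagerank(a)
print([round(value, 6) for value in pr])
```

T3 has no inbound links, so it settles at the normalization coefficient 0.15, while the total over the three pages stays at 3.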
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: the site drawn as two subgraphs. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)
                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later, we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of $PR_{i+1}(T)$ (on iteration i + 1) is then calculated by the formula $(1 - d)\,PR_i(T)$.
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus, we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of the computation of the estimating values of the site pages is applied.
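The definition can be tried out on the nine-page site whose matrix of links (2) is given above. The sketch below replaces the constant coefficient (1 - d) with (1 - d)·PR_i(T), as described; the function names, the iteration count of 200 and the threshold of 1e-9 used to decide that a limiting value is "zero" are assumptions made for illustration.

```python
PAGES = "ABCDEFGHI"

# Matrix of links (2): row = linking page, column = page linked to.
A_MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def accelerated_step(a, pr, d=0.85):
    """One iteration with the normalization coefficient (1 - d) * PR_i(T)."""
    n = len(a)
    cr = [sum(row) for row in a]   # every page of this site has outgoing links
    return [
        (1 - d) * pr[m] + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

def accelerated_pagerank(a, d=0.85, iterations=200):
    pr = [1.0] * len(a)
    for _ in range(iterations):
        pr = accelerated_step(a, pr, d)
    return pr

limits = accelerated_pagerank(A_MATRIX)
external = [page for page, value in zip(PAGES, limits) if value < 1e-9]
print([round(value, 6) for value in limits])
print("External pages:", external)
```

Under these assumptions the estimating values of A, C and D approach 1.5, B approaches 4.5, and E to I fall to zero, so they are reported as external.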
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links
Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
Letrsquos calculate the estimating values for the pages of the Web site examined withthe constant normalization coefficient (1-d) =015 d = 085
CLASSIC COMPUTATION OF PAGERANK
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
51
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
52
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
Amount
P 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of Page Rank for the pages of the site examined (Logarithmsare to be calculated on base 2)
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
PageRank is a successful application of the directed multigraph. Multigraphs are also used to describe and build self-learning systems that model the behavior of animals.
Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

where P is the estimating value.
The theory of this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning (Wiley, New York, 1955).
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
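To see what repeated application of the operator Z+ does, here is a minimal Python sketch. The value d = 0.85 and the starting value 0 are illustrative assumptions of ours (the text only requires 0 < d < 1), and the function name `reinforce` is hypothetical.

```python
def reinforce(p, d=0.85):
    """Apply the operator Z+ once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)

# Start from an estimating value of 0 and apply Z+ fifty times.
p = 0.0
trajectory = [p]
for _ in range(50):
    p = reinforce(p)
    trajectory.append(p)

# With P_0 = 0 the closed form is P_n = 1 - d**n, so the value rises towards 1.
print(round(trajectory[1], 6), round(trajectory[10], 6), round(trajectory[50], 6))
```

Under these assumptions the estimating value grows with every encouragement and approaches the fixed point P = 1, which is the only solution of P = dP + (1 - d).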
Further Reading and Resources
For those interested in doing further PageRank research, we recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing Order to the Web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' Credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and we provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
Using our new values for the coefficients in an attempt to show favour to Page B (which meets our arbitrary criteria), we get:
[Excel Example 19]
As you can see, we've boosted the PageRank of the About Us page at the expense of all other pages, simply because we've introduced a new arbitrary factor called "amount of text".
PageRank not only considers links, but it may also consider certain other factors.
There is, of course, no reason why we should stop at boosting a single page's PageRank; it is just as easy to boost an entire site's PageRank based on any factor we like. Examples of possible factors are "it's popular" or "we like it".
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a page or pages at the expense of those it links to and from, then we can also cause a decrease in its PageRank to the benefit of those it links to and from. Why would we want to do this? Because the page represents a list of links that points to high-quality pages. Let's pretend that Page B is such a list, say a category page with links in ODP.
We'll set Page B's normalization coefficient to 0.06 and the rest to 0.18. We end up with:
[Excel Example 20]
And sure enough, it does exactly what we want it to do.
This has several advantages. The page itself, which is actually just a list of links, will be pushed down in the rankings, whilst the pages it links to will get a boost. Thus the searcher is less likely to see pages that are just links and more likely to see good-quality pages. In addition, those with the Google Toolbar installed won't be racing each other to get links on such pages, because their PageRank will appear deceptively low.
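The effect of a per-page normalization coefficient can be illustrated with a short Python sketch. The four-page link structure below is a hypothetical stand-in of ours (it is not the site used in Excel Example 20), and the function name `pagerank` and its parameters are assumptions made for illustration; only the coefficients 0.06 and 0.18 and the usual d = 0.85 come from the text.

```python
def pagerank(links, coeff, d=0.85, iterations=200):
    """Iterate PR(p) = coeff[p] + d * sum(PR(q) / outlinks(q)) over pages q linking to p."""
    pages = list(links)
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new_pr = {}
        for p in pages:
            incoming = sum(pr[q] / len(links[q]) for q in pages if p in links[q])
            new_pr[p] = coeff[p] + d * incoming
        pr = new_pr
    return pr

# Hypothetical site: B is a list of links to A, C and D, and each of them links back to B.
links = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}

uniform = pagerank(links, {p: 0.15 for p in links})
favoured = pagerank(links, {"A": 0.18, "B": 0.06, "C": 0.18, "D": 0.18})

for p in links:
    print(p, round(uniform[p], 6), round(favoured[p], 6))
```

In this sketch the coefficients still average 0.15, so the site's total PageRank is unchanged; Page B's value falls while the pages it links to rise, which is the behaviour described above.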
Examples
Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce that example (so you can look at it in more depth if you want to).
You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between pages and forms the matrix of links, which contains all information about the topology of that site.
Let's write down the elements of the matrix of links: $a_{km}$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$).
What does that mean? Let us explain the notation:
T1, T2, …, Tn correspond to the Web site pages (n is the number of pages).
The components $a_{km}$, at the intersection of row k and column m of the matrix of links, give the number of links off Tk to Tm.
Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)

      T1     T2     …    Tn     Number of links off the page
T1    a11    a12    …    a1n    Cr(T1)
T2    a21    a22    …    a2n    Cr(T2)
…
Tn    an1    an2    …    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{\mathrm{Cr}(T_k)}$$

where
PR_{I+1}(Tm) is the PageRank of the page Tm on iteration I+1 (the estimating value of the page Tm, m = 1, …, n),
PR_I(Tk) is the PageRank of the page Tk on iteration I (the estimating value of the page Tk, k = 1, …, n),
Cr(Tk) is the total number of links off the page Tk,
Cm(Tk) = a_km is the number of individual links off the page Tk to the page Tm (k = 1, …, n; m = 1, …, n),
d is a damping factor (usually 0.85),
(1 - d) is a normalization coefficient,
n is the number of pages on the site.
This recursive computation scheme lets us compute the estimating values for the pages of a Web site while taking into account the possibly complex structure of links between them.
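The computation scheme above can be sketched directly in Python. The three-page link matrix is a hypothetical example of ours, chosen so that one page links twice to another (which is what makes the site a multigraph); the function name `pagerank_step` is likewise an assumption, not something taken from the authors' spreadsheets.

```python
def pagerank_step(pr, links, d=0.85):
    """One iteration: PR_{I+1}(T_m) = (1 - d) + d * sum_k a_km * PR_I(T_k) / Cr(T_k)."""
    n = len(pr)
    out_links = [sum(row) for row in links]  # Cr(T_k): total number of links off page k
    new_pr = []
    for m in range(n):
        incoming = sum(
            links[k][m] * pr[k] / out_links[k]
            for k in range(n)
            if out_links[k] > 0  # pages without outgoing links pass nothing on
        )
        new_pr.append((1 - d) + d * incoming)
    return new_pr

# links[k][m] = a_km, the number of links off page k to page m.
# Hypothetical site: page 0 links twice to page 1 and once to page 2; both link back.
links = [
    [0, 2, 1],
    [1, 0, 0],
    [1, 0, 0],
]

first = pagerank_step([1.0, 1.0, 1.0], links)
pr = [1.0, 1.0, 1.0]
for _ in range(200):
    pr = pagerank_step(pr, links)

print([round(v, 6) for v in first])
print([round(v, 6) for v in pr])
```

Because page 0 sends two of its three links to page 1, page 1 receives twice the share that page 2 does. As long as every page has at least one outgoing link, the sum of the estimating values stays equal to the number of pages.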
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: multigraph of the site, drawn as two subgraphs. Subgraph 1 contains pages A, B, C and D; subgraph 2 contains pages E, F, G, H and I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The labels A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the ability of PageRank to distinguish the most important pages and the relationship of external pages to them. From here on we use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated method of computing the estimating values of the site pages.
Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for the computation of PR_{i+1}(T) on iteration i+1 is then calculated by the formula (1 - d) PR_i(T), so that the scheme becomes

$$PR_{i+1}(T_m) = (1 - d)\,PR_i(T_m) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_i(T_k)}{\mathrm{Cr}(T_k)}$$

When the accelerated method is applied, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are those that have zero limiting values when the accelerated method is used to compute the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
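The accelerated computation can be reproduced with a short Python sketch. The link matrix is the one given in the diagram of the matrix of links (2); the function name `accelerated_step` and the threshold of 1e-6 used to decide that a limiting value is zero are assumptions of ours.

```python
PAGES = "ABCDEFGHI"

# LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def accelerated_step(pr, links, d=0.85):
    """One accelerated iteration: the constant (1 - d) is replaced by (1 - d) * PR_i(T)."""
    n = len(pr)
    cr = [sum(row) for row in links]  # total links off each page
    return [
        (1 - d) * pr[m] + d * sum(links[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

pr = [1.0] * len(PAGES)
for _ in range(39):  # the same number of iterations as the table
    pr = accelerated_step(pr, LINKS)

limits = dict(zip(PAGES, pr))
external = [page for page in PAGES if limits[page] < 1e-6]
print({page: round(value, 6) for page, value in limits.items()})
print("external pages:", external)
```

With these inputs the sketch classifies E, F, G, H and I as external, while the base pages settle at 1.5 (A, C, D) and 4.5 (B).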
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
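The classic computation tabulated above can be reproduced with the following Python sketch. The link matrix is the one from the diagram of the matrix of links (2); the function name `classic_step` and the `history` list are hypothetical names of ours, and all pages are updated simultaneously from the previous iteration's values.

```python
PAGES = "ABCDEFGHI"

# LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

def classic_step(pr, links, d=0.85):
    """One classic iteration with the constant normalization coefficient (1 - d)."""
    n = len(pr)
    cr = [sum(row) for row in links]  # total links off each page
    return [
        (1 - d) + d * sum(links[k][m] * pr[k] / cr[k] for k in range(n))
        for m in range(n)
    ]

# history[i] holds the estimating values after iteration i (row i of the table).
history = [[1.0] * len(PAGES)]
for _ in range(85):
    history.append(classic_step(history[-1], LINKS))

for i in (1, 2, 85):
    print(i, [round(v, 6) for v in history[i]], round(sum(history[i]), 3))
```

The home page B ends up with the largest value, and the total stays at 9.000 on every iteration because each page has at least one outgoing link.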
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
Amount
P 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of Page Rank for the pages of the site examined (Logarithmsare to be calculated on base 2)
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
AmountP 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of PageRank for the pages of the site examined (Logarithmsare to be calculated on base 2)
AmountPR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
The ToolbarShows 1 1 1 1 1 1 1 1 1 1
54
PageRank is a successful representation of the directed multigraph It is widelyknown that multigraphs are used to describe and build self-learning systemswhich model the behavior of animals
Letrsquos assume that every time an incentive is engaged an animal must make achoice eg to turn to left or right So every time an animal behaves as we want itto we can encourage it by applying operator Z+
P n+1 = Z+ (P n) = d P n + (1 ndash d) 0 lt d lt 1 (P ndash the estimating value)
The theory for this subject can be found at the classic work of Bush and Mosteller-- Stochastic Models for Learning Wiley New York 1955
We suppose that it was a variation of Shannonrsquos formula that led to birth ofPageRank
Further Reading and Resources
For those interested in doing further PageRank research we would recommendvisiting the following resources
PageRank Question and Answer Areahttpwwwsupportforumsorgpagerankqampa
Search Engine Marketing ndash the essential best practice guide (an in depth view ofthe other aspects of how search engine work)httpwwwsupportforumsorgpageranksem
PageRank Bringing order to the webhttphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Enginehttpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Googlersquos) by Mark Horrellhttpwwwmarkhorrellcomseopagerankasp
55
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields ofsearch engines mathematics and search engine optimization We feel thisfeedback is important and provide the following comments and links as a thanksfor the feedback of the people mentioned If yoursquod like to make a comment onthis document then please email chrissupportforumsorg
In the field of information retrieval on the web PageRank has emerged as the primary(and most widely discussed) hyperlink analysis algorithm But how it works is stilllargely a mystery to many in the SEO online community PageRank Uncovered by ChrisRidings and Mike Shishigin is an illuminating document which unveils the mystery andprovides the most detailed in-depth yet easy to follow explanation of the power behindGoogle No matter what level yoursquore at in the field of SEO this is not something youshould read itrsquos something you must read
Mike GrehanAuthorSearch Engine Marketing The essential best practice guide
44
There is of course no reason why we should stop at boosting a single pagersquosPageRank ndash it is just as easy to boost an entire sitersquos PageRank based on anyfactor we like Examples of possible factors are ldquoitrsquos popularrdquo or ldquowe like itrdquo
Favouring pages in directories
It stands to reason that if we can cause an increase in the PageRank of a pageor pages at the expense of those it links to and from then we can also cause adecrease in its PageRank to the benefit of those it links to and from Why wouldwe want to do this Because the page represents a list of links that points tohigh-quality pages Letrsquos pretend that Page B is such a list say a category pagewith links in ODP
Wersquoll set Page Brsquos normalization co-efficient to 006 and the rest to 018 We endup with
[Excel Example 20]
And sure enough it does exactly what we want it to do
This has several advantages The page itself which is actually just a list of linkswill be pushed down in the rankings Whilst the pages it links to will get a boostThus the searcher is less likely to see pages that are just links and more likely tosee good quality pages In addition those with the Google Toolbar installed wonrsquotbe racing each other to get links on such pages because their PageRank willappear deceptively low
45
Examples
Throughout this document yoursquove seen the phrase [Excel Example x] This refersto the actual excel spreadsheet that contains the calculations used to producethis example (so you can look at it in more depth if you want to)
You can get the examples from httpwwwsupportforumsorgpagerank
Mathematical aspects of PageRank for advanced readers
The theoretical concept of PageRank interprets Web sites as directedmultigraphs with pages on their tops Googlebot (Googlersquos spider) explores asite traces links between pages and forms the matrix of links which contains allinformation about the topology of that site
Letrsquos write down the elements of the matrix of links Dkm N n) (m=1n)
What does that mean Let us explain the designations
T1 T2hellipTn correspond to Web site pages (n is the number of pages)
Components akm which are on the intersection of the k-line and m-columnin the matrix of links determine the number of links off Tk to Tm
Cr (Tk) is the total number of links off Tk
46
DIAGRAM OF THE MATRIX OF LINKS (1)
T1 T2 Tn Num
ber o
f lin
ks o
ffth
e pa
ge
T1 a 11 a 12 hellip hellip a 1n Cr(T1)
T2 a 21 a 22 hellip hellip a 2n Cr(T2)
Tn a n1 a n2 hellip hellip a nn Cr(Tn)
Letrsquos write down the equation for the computation scheme
n
PR I+1 (Tm) = (1-d) + d Σ akmPR I (Tk) Cr(Tk) k=1Where
PRI+1 (Tm) is Page Rank of the Page Tm on (I+1) iteration (the estimating value ofthe page Tm (m=1hellip n) )
PR I (Tk) is the Page Rank of the Page Tk on the I iteration (the estimating valueof the Page Tk (k=1hellip n) )
Cr(Tk) is the total number of links off the page Tk
Cm(Tk) = akm is the number of the individual links off the page Tk to the page Tm(k=1n) (m=1 n)
47
d is a dampening factor (usually 085)
(1-d) is a normalization coefficient
n is the number of the Pages on the Site
This equation of a recursive computational scheme allows taking intoconsideration and then computing the estimating values of a Web site payingattention to the probable complexity of links between pages
Take a look at the following example Generally it is a multigraph of a small Website Letrsquos apply PageRank methods to this site
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links toevery page of the first subgraph
Letrsquos construct a matrix of links of this site The values A B C D E F G H Irefer to the corresponding pages
A
D
CB
H
IGFE
1 2
48
DIAGRAM OF THE MATRIX OF LINKS (2)
A B C D E F G H I Num
ber o
f lin
ks o
ff th
e pa
ge
A 0 1 0 0 0 0 0 0 0 Cr(A) 1
Homeindex B 1 0 1 1 0 0 0 0 0 Cr(B) 3
C 0 1 0 0 0 0 0 0 0 Cr(C) 1
D 0 1 0 0 0 0 0 0 0 Cr(D) 1
E 1 1 1 1 0 1 0 0 0 Cr(E) 5
F 1 1 1 1 0 0 1 0 0 Cr(F 5
G 1 1 1 1 0 1 0 1 1 Cr(G) 7
H 1 1 1 1 1 0 1 0 0 Cr(H) 6
I 1 1 1 1 0 0 1 0 0 Cr(I) 5
Note the wonderful ability of PageRank to distinguish the most important pagesand the relationship of external pages to them Later we will use the term ldquobasepagesrdquo to describe those important pages The main feature of the base pages isthat they are traversed by a directed cycle
Letrsquos apply the so-called accelerated technology of computation of the estimatingvalues of the site pages
Suppose PR i (T) is the estimating value for the page T (T=A B C D E F GH I) page on i-iterationThe normalization coefficient for the PR i+1(T) computation (on (i + 1)-iteration)can be calculated by formula (1-d) PR i (T)
49
Applying the accelerated technology the computation of the estimating values ofthe site pages the limiting values of the estimating values of external pages inrelation to the base sites are equal to zero
This approach permits us to distinguish the external pages in relation to the basepages Thus we can introduce the following definition
The external pages in relation to the base pages are ones that have zero limitingvalues on application of the accelerated technology of the computation of theestimating values of the site pages
ACCELERATED COMPUTATION OF PAGERANK for recognition of externallinks
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.

Let's now calculate the estimating values for the pages of the same Web site with the constant normalization coefficient (1 - d) = 0.15, that is, d = 0.85.
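The classic computation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the link matrix is copied from the "DIAGRAM OF THE MATRIX OF LINKS (2)", every page starts at 1.0, all pages are updated together from the previous iteration's values, and the names `LINKS`, `classic_step` and `classic_pagerank` are our own, not part of the original spreadsheet.

```python
# LINKS[k][m] is the number of links from page k to page m (pages A..I).
PAGES = "ABCDEFGHI"
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
OUT = [sum(row) for row in LINKS]  # Cr(T): total links off each page


def classic_step(pr, d=0.85):
    """One iteration with the constant normalization coefficient (1 - d)."""
    n = len(pr)
    return [
        (1 - d) + d * sum(LINKS[k][m] * pr[k] / OUT[k] for k in range(n))
        for m in range(n)
    ]


def classic_pagerank(iterations, d=0.85):
    """Start every page at 1.0 and apply classic_step repeatedly."""
    pr = [1.0] * len(PAGES)
    for _ in range(iterations):
        pr = classic_step(pr, d)
    return pr


if __name__ == "__main__":
    for page, value in zip(PAGES, classic_pagerank(85)):
        print(f"{page} {value:.6f}")
```

Under these assumptions, one iteration gives 1.206429 for page A and 3.473095 for page B, which matches the first row of the table, and the total over all pages stays at 9 on every iteration.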
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages are darkened in the table (they are the values in its last row).

Dividing the estimating values by their total amount, which does not change during the iteration process, gives a set of probability values (the sum of all the probability values is equal to one):

           A         B         C         D         E         F         G         H         I         Amount
P          0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:

Experiment outcome   A1  A2  …  An
Probabilities        p1  p2  …  pn

$$H(p_1, p_2, \dots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \dots - p(A_n)\log p(A_n)$$

Let's calculate the individual terms -p log p of this formula for the probability values that correspond to the site pages. We obtain the following set of PageRank values for the pages of the site examined (logarithms are taken to base 2):
           A         B         C         D         E         F         G         H         I         Amount
PR         0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375

The Toolbar shows: 1 1 1 1 1 1 1 1 1 1
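The two rows above can be checked with a short Python sketch. It assumes the limiting values from the last row of the classic table as input; the names `LIMIT`, `probabilities` and `shannon_terms` are ours and are used only for illustration.

```python
import math

# Limiting estimating values for pages A..I (last row of the classic table).
LIMIT = [1.390055, 3.845218, 1.390055, 1.390055,
         0.175403, 0.209136, 0.241441, 0.179318, 0.179318]


def probabilities(values):
    """Divide each estimating value by the total so that the results sum to one."""
    total = sum(values)
    return [v / total for v in values]


def shannon_terms(probs):
    """Per-page terms -p * log2(p) of Shannon's formula."""
    return [-p * math.log2(p) for p in probs]


if __name__ == "__main__":
    p = probabilities(LIMIT)
    pr = shannon_terms(p)
    print(" ".join(f"{x:.6f}" for x in p))
    print(" ".join(f"{x:.6f}" for x in pr), f"{sum(pr):.3f}")
```

The sum of the individual terms is the entropy H of the whole probability set, which is the figure shown in the Amount column of the PR row.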
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.

Let's assume that every time an incentive is applied, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

where P is the estimating value.

The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
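To see what repeated encouragement does to the estimating value, here is a minimal Python sketch of the operator. The function names `z_plus` and `reinforce` and the starting value 0.2 are hypothetical and chosen only for illustration.

```python
def z_plus(p, d):
    """Apply the operator Z+ once: P(n+1) = d * P(n) + (1 - d), with 0 < d < 1."""
    if not 0 < d < 1:
        raise ValueError("d must lie strictly between 0 and 1")
    return d * p + (1 - d)


def reinforce(p0, d, times):
    """Apply Z+ `times` times, starting from the estimating value p0."""
    p = p0
    for _ in range(times):
        p = z_plus(p, d)
    return p


if __name__ == "__main__":
    for n in (0, 1, 5, 20, 100):
        print(n, round(reinforce(0.2, 0.85, n), 6))
```

Unrolling the recursion gives P(n) = 1 - d^n (1 - P(0)), so each application closes a fixed fraction (1 - d) of the remaining gap to 1 and the estimating value approaches 1 without ever exceeding it.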
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources

For those interested in doing further PageRank research, we would recommend visiting the following resources:

PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a

Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem

PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm

The anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html

The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' credits

The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and we provide the following comments and links as thanks to the people mentioned for their feedback. If you'd like to make a comment on this document, please email chris@supportforums.org

"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."

Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
Examples

Throughout this document you've seen the phrase [Excel Example x]. This refers to the actual Excel spreadsheet that contains the calculations used to produce that example (so you can look at it in more depth if you want to).

You can get the examples from http://www.supportforums.org/pagerank
Mathematical aspects of PageRank for advanced readers

The theoretical concept of PageRank interprets Web sites as directed multigraphs with pages as their vertices. Googlebot (Google's spider) explores a site, traces the links between its pages and forms the matrix of links, which contains all the information about the topology of that site.

Let's write down the elements of the matrix of links:

$$a_{km} \in \mathbb{N}, \qquad k = 1, \dots, n, \quad m = 1, \dots, n$$

What does that mean? Let us explain the designations:

T1, T2, …, Tn correspond to the Web site pages (n is the number of pages).

The components a_km, which lie at the intersection of row k and column m of the matrix of links, give the number of links off Tk to Tm.

Cr(Tk) is the total number of links off Tk.
DIAGRAM OF THE MATRIX OF LINKS (1)

       T1     T2     …    Tn     Number of links off the page
T1     a11    a12    …    a1n    Cr(T1)
T2     a21    a22    …    a2n    Cr(T2)
…
Tn     an1    an2    …    ann    Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1 - d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{C_r(T_k)}$$

where:

PR_{I+1}(Tm) is the PageRank of the page Tm on iteration (I+1), i.e. the estimating value of the page Tm (m = 1, …, n).

PR_I(Tk) is the PageRank of the page Tk on iteration I, i.e. the estimating value of the page Tk (k = 1, …, n).

Cr(Tk) is the total number of links off the page Tk.

Cm(Tk) = a_km is the number of individual links off the page Tk to the page Tm (k = 1, …, n; m = 1, …, n).

d is a damping factor (usually 0.85).

(1 - d) is a normalization coefficient.

n is the number of pages on the site.
This recursive computational scheme makes it possible to compute the estimating values of the pages of a Web site while taking into account links of any complexity between the pages, including several links from one page to another.
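As a rough illustration of the scheme, the following Python sketch applies the equation to an arbitrary matrix of links. The function name `pagerank_iterate`, the starting value of 1.0 per page and the three-page example matrix are our own assumptions; the sketch also assumes that every page has at least one outgoing link, so that Cr(Tk) is never zero.

```python
def pagerank_iterate(a, d=0.85, iterations=50):
    """Iterate PR(Tm) = (1 - d) + d * sum_k a[k][m] * PR(Tk) / Cr(Tk).

    a[k][m] is the number of links off page k to page m, so a[k][m] > 1
    describes several links between the same two pages (a multigraph).
    """
    n = len(a)
    cr = [sum(row) for row in a]  # Cr(Tk): total number of links off page k
    if any(c == 0 for c in cr):
        raise ValueError("this sketch assumes every page has an outgoing link")
    pr = [1.0] * n  # assumed starting estimate for every page
    for _ in range(iterations):
        # All pages are updated together from the previous iteration's values.
        pr = [
            (1 - d) + d * sum(a[k][m] * pr[k] / cr[k] for k in range(n))
            for m in range(n)
        ]
    return pr


if __name__ == "__main__":
    # Hypothetical site: T1 links twice to T2 and once to T3; T2 and T3 link back to T1.
    example = [
        [0, 2, 1],
        [1, 0, 0],
        [1, 0, 0],
    ]
    print([round(v, 6) for v in pagerank_iterate(example)])
```

In the hypothetical example, T2 receives two of the three links off T1 and therefore ends up with a higher estimating value than T3, which is exactly the effect of the a_km weights in the sum.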
Take a look at the following example. It is the multigraph of a small Web site. Let's apply PageRank methods to this site.

MULTIGRAPH OF A SIMPLE WEB SITE

[Figure: multigraph of the site. Subgraph 1 contains the pages A, B, C and D; subgraph 2 contains the pages E, F, G, H and I.]

You may have already noticed that every page of the second subgraph links to every page of the first subgraph.

Let's construct the matrix of links of this site. The letters A, B, C, D, E, F, G, H and I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                 A  B  C  D  E  F  G  H  I   Number of links off the page
A                0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)   1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                1  1  1  1  0  0  1  0  0   Cr(I) = 5
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of the external pages to them. From here on we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
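The structure of the matrix can be explored with a small Python sketch. The reachability check below is our own illustration rather than a procedure from the original text, and the names `LINKS`, `out_links` and `reachable` are hypothetical. It shows that the links followed from the home page B never leave the set A, B, C, D, whereas the pages E to I link into that set.

```python
PAGES = "ABCDEFGHI"
# LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T): the row sums reproduce the last column of the diagram.
out_links = {page: sum(row) for page, row in zip(PAGES, LINKS)}


def reachable(start):
    """Return the set of pages that can be reached from `start` by following links."""
    seen, stack = set(), [PAGES.index(start)]
    while stack:
        k = stack.pop()
        for m, count in enumerate(LINKS[k]):
            if count and m not in seen:
                seen.add(m)
                stack.append(m)
    return {PAGES[m] for m in seen}


if __name__ == "__main__":
    print(out_links)
    print(sorted(reachable("B")))
    print(sorted(reachable("E")))
```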
Let's apply the so-called accelerated technology of computing the estimating values of the site pages.

Suppose PR_i(T) is the estimating value of the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. In the accelerated scheme, the normalization coefficient for computing PR_{i+1}(T) on iteration (i + 1) is not the constant (1 - d) but is calculated by the formula (1 - d) PR_i(T).
When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.

This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:

The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology of computing the estimating values of the site pages is applied.
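The accelerated scheme differs from the classic one only in its first term, which a short Python sketch makes explicit. As before, this is an illustration under assumptions: the matrix is copied from the "DIAGRAM OF THE MATRIX OF LINKS (2)", every page starts at 1.0, and the names `accelerated_step`, `accelerated_pagerank` and `external_pages` (with its tolerance `eps`) are ours.

```python
PAGES = "ABCDEFGHI"
# LINKS[k][m] is the number of links off page k to page m.
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
OUT = [sum(row) for row in LINKS]  # Cr(T)


def accelerated_step(pr, d=0.85):
    """One iteration with the normalization coefficient (1 - d) * PR_i(T)."""
    n = len(pr)
    return [
        (1 - d) * pr[m] + d * sum(LINKS[k][m] * pr[k] / OUT[k] for k in range(n))
        for m in range(n)
    ]


def accelerated_pagerank(iterations, d=0.85):
    """Start every page at 1.0 and apply accelerated_step repeatedly."""
    pr = [1.0] * len(PAGES)
    for _ in range(iterations):
        pr = accelerated_step(pr, d)
    return pr


def external_pages(iterations=100, eps=1e-6):
    """Pages whose estimating value has dropped below eps (an assumed tolerance)."""
    values = accelerated_pagerank(iterations)
    return [page for page, v in zip(PAGES, values) if v < eps]


if __name__ == "__main__":
    for page, value in zip(PAGES, accelerated_pagerank(39)):
        print(f"{page} {value:.6f}")
    print("external:", external_pages())
```

Because the values of the external pages shrink on every iteration while the total stays at 9, the whole amount ends up on the base pages: 4.5 on B and 1.5 on each of A, C and D.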
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
46
DIAGRAM OF THE MATRIX OF LINKS (1)
T1 T2 Tn Num
ber o
f lin
ks o
ffth
e pa
ge
T1 a 11 a 12 hellip hellip a 1n Cr(T1)
T2 a 21 a 22 hellip hellip a 2n Cr(T2)
Tn a n1 a n2 hellip hellip a nn Cr(Tn)
Let's write down the equation for the computation scheme:

$$PR_{I+1}(T_m) = (1-d) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_I(T_k)}{C_r(T_k)}$$

where
$PR_{I+1}(T_m)$ is the PageRank of the page $T_m$ on iteration $I+1$ (the estimating value of the page $T_m$, $m = 1, \ldots, n$);
$PR_I(T_k)$ is the PageRank of the page $T_k$ on iteration $I$ (the estimating value of the page $T_k$, $k = 1, \ldots, n$);
$C_r(T_k)$ is the total number of links off the page $T_k$;
$C_m(T_k) = a_{km}$ is the number of individual links off the page $T_k$ to the page $T_m$ ($k = 1, \ldots, n$; $m = 1, \ldots, n$);
$d$ is a dampening factor (usually 0.85);
$(1-d)$ is a normalization coefficient;
$n$ is the number of pages on the site.
This recursive computation scheme produces the estimating values for the pages of a Web site, however complex the links between those pages may be.
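As a minimal sketch, one iteration of the scheme above might be coded as follows. The function name `pagerank_step`, the dictionary layout and the three-page site X, Y, Z are assumptions made for illustration only; they do not come from the original document. Pages with no outgoing links are simply skipped, which is also an assumption.

```python
def pagerank_step(links, pr, d=0.85):
    """One iteration: PR_{I+1}(T_m) = (1 - d) + d * sum_k a_km * PR_I(T_k) / Cr(T_k).

    links[k][m] holds a_km, the number of links from page k to page m.
    """
    pages = list(links)
    # Cr(T_k): total number of links off page k.
    out_count = {k: sum(links[k].values()) for k in pages}
    new_pr = {}
    for m in pages:
        incoming = sum(
            links[k].get(m, 0) * pr[k] / out_count[k]
            for k in pages
            if out_count[k] > 0  # assumption: pages without links pass nothing on
        )
        new_pr[m] = (1 - d) + d * incoming
    return new_pr


# Hypothetical site: X links to Y and Z; Y and Z each link back to X.
links = {"X": {"Y": 1, "Z": 1}, "Y": {"X": 1}, "Z": {"X": 1}}
pr = {page: 1.0 for page in links}
pr = pagerank_step(links, pr)
print(pr)
```

Because every page here has at least one outgoing link, the total of the estimating values stays equal to the number of pages after the step.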
Take a look at the following example. It is a multigraph of a small Web site. Let's apply PageRank methods to this site.
MULTIGRAPH OF A SIMPLE WEB SITE
[Figure: directed multigraph of the site. Subgraph 1 contains pages A, B, C, D; subgraph 2 contains pages E, F, G, H, I.]
You may have already noticed that every page of the second subgraph links to every page of the first subgraph.
Let's construct a matrix of links for this site. The values A, B, C, D, E, F, G, H, I refer to the corresponding pages.
DIAGRAM OF THE MATRIX OF LINKS (2)

                  A  B  C  D  E  F  G  H  I   Number of links off the page
A                 0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)    1  0  1  1  0  0  0  0  0   Cr(B) = 3
C                 0  1  0  0  0  0  0  0  0   Cr(C) = 1
D                 0  1  0  0  0  0  0  0  0   Cr(D) = 1
E                 1  1  1  1  0  1  0  0  0   Cr(E) = 5
F                 1  1  1  1  0  0  1  0  0   Cr(F) = 5
G                 1  1  1  1  0  1  0  1  1   Cr(G) = 7
H                 1  1  1  1  1  0  1  0  0   Cr(H) = 6
I                 1  1  1  1  0  0  1  0  0   Cr(I) = 5
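The matrix can be checked mechanically. The sketch below copies the rows of the matrix of links (2) into a Python list (the names `PAGES`, `MATRIX`, `out_links` and `in_links` are illustrative, not from the original) and recomputes the number of links off each page as a row sum.

```python
PAGES = "ABCDEFGHI"
# Rows are source pages, columns are target pages.
MATRIX = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (Home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(T): links off each page (row sums); in_links: links pointing at each page (column sums).
out_links = {page: sum(row) for page, row in zip(PAGES, MATRIX)}
in_links = {page: sum(row[j] for row in MATRIX) for j, page in enumerate(PAGES)}

# Pages A-D never link into E-I, while every one of E-I links to all of A-D.
first_is_closed = all(MATRIX[i][j] == 0 for i in range(4) for j in range(4, 9))
second_links_to_first = all(MATRIX[i][j] == 1 for i in range(4, 9) for j in range(4))

print(out_links)
print(in_links)
print(first_is_closed, second_links_to_first)
```

The row sums reproduce the Cr column of the table, and the two boolean checks confirm the one-way relationship between the two subgraphs.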
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.
Let's apply the so-called accelerated technology of computation of the estimating values of the site pages.
Suppose $PR_i(T)$ is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on iteration i. The normalization coefficient for computing $PR_{i+1}(T)$ (on iteration i+1) is then given by the formula $(1-d)\,PR_i(T)$, so the scheme becomes

$$PR_{i+1}(T_m) = (1-d)\,PR_i(T_m) + d \sum_{k=1}^{n} \frac{a_{km}\, PR_i(T_k)}{C_r(T_k)}$$

When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.
This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:
The external pages in relation to the base pages are the ones that have zero limiting values when the accelerated technology of computation of the estimating values of the site pages is applied.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H, I are external in relation to the base pages A, B, C, D.
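The accelerated computation can be reproduced with a short script. This is a sketch under stated assumptions: the matrix rows are copied from the matrix of links (2), the per-page normalization term is taken to be (1-d) times the previous estimating value, d = 0.85, and names such as `accelerated_step` and `external` are illustrative only.

```python
PAGES = "ABCDEFGHI"
# Rows of the matrix of links (2): one string per source page, one digit per target page.
ROWS = ["010000000", "101100000", "010000000", "010000000",
        "111101000", "111100100", "111101011", "111110100", "111100100"]
MATRIX = [[int(c) for c in row] for row in ROWS]
OUT = [sum(row) for row in MATRIX]  # Cr(T): number of links off each page


def accelerated_step(pr, d=0.85):
    """PR_{i+1}(T_m) = (1 - d) * PR_i(T_m) + d * sum_k a_km * PR_i(T_k) / Cr(T_k)."""
    n = len(pr)
    return [
        (1 - d) * pr[m] + d * sum(MATRIX[k][m] * pr[k] / OUT[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(100):  # more iterations than the table, to be safely converged
    pr = accelerated_step(pr)

limits = {page: round(value, 6) for page, value in zip(PAGES, pr)}
# Pages whose limiting value is zero are external in relation to the base pages.
external = [page for page, value in limits.items() if value == 0.0]
print(limits)
print(external)
```

The pages collected in `external` are the ones whose estimating values vanish in the limit, which is the recognition criterion the definition describes.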
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1-d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
The limiting estimating values for the site pages (the final row of the table) are darkened.
Dividing the estimating values by their total amount, which does not change during the iteration process, gives a set of probability values (the sum of all the probability values is equal to one):

     A         B         C         D         E         F         G         H         I         Amount
P    0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924  1.000
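As a sketch, the classic computation and the division by the total amount can be reproduced as follows. The matrix rows are copied from the matrix of links (2) and d = 0.85; the names `classic_step` and `probabilities`, and the choice of 300 iterations, are assumptions made for illustration.

```python
PAGES = "ABCDEFGHI"
# Rows of the matrix of links (2): one string per source page, one digit per target page.
ROWS = ["010000000", "101100000", "010000000", "010000000",
        "111101000", "111100100", "111101011", "111110100", "111100100"]
MATRIX = [[int(c) for c in row] for row in ROWS]
OUT = [sum(row) for row in MATRIX]  # Cr(T): number of links off each page


def classic_step(pr, d=0.85):
    """PR_{I+1}(T_m) = (1 - d) + d * sum_k a_km * PR_I(T_k) / Cr(T_k)."""
    n = len(pr)
    return [
        (1 - d) + d * sum(MATRIX[k][m] * pr[k] / OUT[k] for k in range(n))
        for m in range(n)
    ]


pr = [1.0] * len(PAGES)
for _ in range(300):  # assumption: far more iterations than needed, for safety
    pr = classic_step(pr)

total = sum(pr)  # stays at 9 throughout the iteration process
probabilities = {page: value / total for page, value in zip(PAGES, pr)}
print(round(total, 3))
print({page: round(p, 6) for page, p in probabilities.items()})
```

The total stays constant because every page in this site has at least one outgoing link, so each page hands on its whole estimating value at every step.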
Running a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the volume of information contained in an experiment that has n outcomes with the given probabilities:

Experiment outcome   A1   A2   ...   An
Probabilities        p1   p2   ...   pn

$$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$$

Let's calculate the values for the set of probability values which correspond to the site pages. We receive the following set of PageRank values for the pages of the site examined (logarithms are calculated to base 2). Each value is the corresponding term $-p \log_2 p$, and the Amount is their sum:

                    A         B         C         D         E         F         G         H         I         Amount
PR                  0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558  2.375
The Toolbar Shows   1         1         1         1         1         1         1         1         1         1
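The per-page values in the PR row can be checked with a few lines of Python. This is a minimal sketch: the probabilities are copied from the P row given earlier, and the names `terms` and `entropy` are illustrative.

```python
import math

# Probability values for the site pages (the P row).
probabilities = {
    "A": 0.154451, "B": 0.427246, "C": 0.154451, "D": 0.154451, "E": 0.019489,
    "F": 0.023237, "G": 0.026827, "H": 0.019924, "I": 0.019924,
}

# One term of Shannon's formula per page, with logarithms to base 2.
terms = {page: -p * math.log2(p) for page, p in probabilities.items()}
entropy = sum(terms.values())  # H(p1, ..., pn), the "Amount" column

print({page: round(value, 6) for page, value in terms.items()})
print(round(entropy, 3))
```

Note that the term for a page is not monotonic in its probability over the whole range from 0 to 1, although for this site the ordering of the pages is preserved.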
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is engaged, an animal must make a choice, e.g. to turn left or right. So every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1-d), \qquad 0 < d < 1$$

(P is the estimating value).
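To see what repeated application of the operator does, here is a minimal sketch. The starting estimate 0.2, the value d = 0.85 and the function name `z_plus` are assumptions chosen for illustration. Solving P = dP + (1 - d) gives the fixed point P = 1, and the distance to it shrinks by the factor d at each step.

```python
def z_plus(p, d=0.85):
    """Apply the encouragement operator once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1 - d)


p = 0.2  # hypothetical starting estimate
history = [p]
for _ in range(50):
    p = z_plus(p)
    history.append(p)

# The gap to the fixed point after n steps is d**n times the initial gap.
gap_after_one_step = 1 - history[1]
print(round(history[1], 6), round(history[-1], 6))
```

With these assumed values the estimate rises steadily towards 1 without ever exceeding it.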
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area
http://www.supportforums.org/pagerank/q&a
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
http://www.supportforums.org/pagerank/sem
PageRank: Bringing order to the web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as a thanks for the feedback of the people mentioned. If you'd like to make a comment on this document then please email chris@supportforums.org.
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered, by Chris Ridings and Mike Shishigin, is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
47
d is a dampening factor (usually 085)
(1-d) is a normalization coefficient
n is the number of the Pages on the Site
This equation of a recursive computational scheme allows taking intoconsideration and then computing the estimating values of a Web site payingattention to the probable complexity of links between pages
Take a look at the following example Generally it is a multigraph of a small Website Letrsquos apply PageRank methods to this site
MULTIGRAPH OF A SIMPLE WEB SITE
You may have already noticed that every page of the second subgraph links toevery page of the first subgraph
Letrsquos construct a matrix of links of this site The values A B C D E F G H Irefer to the corresponding pages
A
D
CB
H
IGFE
1 2
48
DIAGRAM OF THE MATRIX OF LINKS (2)
A B C D E F G H I Num
ber o
f lin
ks o
ff th
e pa
ge
A 0 1 0 0 0 0 0 0 0 Cr(A) 1
Homeindex B 1 0 1 1 0 0 0 0 0 Cr(B) 3
C 0 1 0 0 0 0 0 0 0 Cr(C) 1
D 0 1 0 0 0 0 0 0 0 Cr(D) 1
E 1 1 1 1 0 1 0 0 0 Cr(E) 5
F 1 1 1 1 0 0 1 0 0 Cr(F 5
G 1 1 1 1 0 1 0 1 1 Cr(G) 7
H 1 1 1 1 1 0 1 0 0 Cr(H) 6
I 1 1 1 1 0 0 1 0 0 Cr(I) 5
Note the wonderful ability of PageRank to distinguish the most important pagesand the relationship of external pages to them Later we will use the term ldquobasepagesrdquo to describe those important pages The main feature of the base pages isthat they are traversed by a directed cycle
Letrsquos apply the so-called accelerated technology of computation of the estimatingvalues of the site pages
Suppose PR i (T) is the estimating value for the page T (T=A B C D E F GH I) page on i-iterationThe normalization coefficient for the PR i+1(T) computation (on (i + 1)-iteration)can be calculated by formula (1-d) PR i (T)
49
Applying the accelerated technology the computation of the estimating values ofthe site pages the limiting values of the estimating values of external pages inrelation to the base sites are equal to zero
This approach permits us to distinguish the external pages in relation to the basepages Thus we can introduce the following definition
The external pages in relation to the base pages are ones that have zero limitingvalues on application of the accelerated technology of the computation of theestimating values of the site pages
ACCELERATED COMPUTATION OF PAGERANK for recognition of externallinks
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
50
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined the pages E F G I H are externalin relation to the base pages A B C D
Letrsquos calculate the estimating values for the pages of the Web site examined withthe constant normalization coefficient (1-d) =015 d = 085
CLASSIC COMPUTATION OF PAGERANK
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
51
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
52
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
Amount
P 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of Page Rank for the pages of the site examined (Logarithmsare to be calculated on base 2)
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
PageRank is a successful representation of a directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.

Let's assume that every time a stimulus is presented, an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:

$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \qquad 0 < d < 1$$

where P is the estimating value.

The theory of this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
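To see what repeated encouragement does, here is a minimal Python sketch of the operator Z+. The value d = 0.85 and the starting value 0.2 are assumptions chosen for illustration (the text only requires 0 < d < 1), and the function names are ours.

```python
def z_plus(p, d=0.85):
    """Apply the operator Z+ once: P_(n+1) = d * P_n + (1 - d)."""
    return d * p + (1.0 - d)


def apply_repeatedly(p0, n, d=0.85):
    """Apply Z+ to the starting value p0 a total of n times."""
    p = p0
    for _ in range(n):
        p = z_plus(p, d)
    return p


# Each application moves the estimating value closer to 1, since
# 1 - P_(n+1) = d * (1 - P_n); after n steps P_n = 1 - d**n * (1 - P_0).
for n in (0, 1, 5, 20, 100):
    print(n, round(apply_repeatedly(0.2, n), 6))
```

The fixed point of the operator is P = 1 for any d between 0 and 1, so the choice of d only affects how quickly the value approaches it.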
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:

PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa

Search Engine Marketing - the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem

PageRank: Bringing order to the web
httphcistanfordedu~pagepaperspagerankppframehtm

The anatomy of a Large-Scale HyperTextual Web Search Engine
httpwww-dbstanfordedu~backrubgooglehtml

The first PageRank calculator (with the exception of Google's), by Mark Horrell
httpwwwmarkhorrellcomseopagerankasp
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, then please email chris@supportforums.org.

"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read; it's something you must read."

Mike Grehan, Author, Search Engine Marketing: The essential best practice guide
DIAGRAM OF THE MATRIX OF LINKS (2)

                A  B  C  D  E  F  G  H  I   Number of links off the page
A               0  1  0  0  0  0  0  0  0   Cr(A) = 1
B (Home/index)  1  0  1  1  0  0  0  0  0   Cr(B) = 3
C               0  1  0  0  0  0  0  0  0   Cr(C) = 1
D               0  1  0  0  0  0  0  0  0   Cr(D) = 1
E               1  1  1  1  0  1  0  0  0   Cr(E) = 5
F               1  1  1  1  0  0  1  0  0   Cr(F) = 5
G               1  1  1  1  0  1  0  1  1   Cr(G) = 7
H               1  1  1  1  1  0  1  0  0   Cr(H) = 6
I               1  1  1  1  0  0  1  0  0   Cr(I) = 5
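The matrix can be written down directly in Python. The sketch below assumes the reading that a 1 in row X, column Y means page X links to page Y; this reading is ours, but it is consistent with the row sums matching the Cr values in the diagram. The names `PAGES`, `LINKS`, `out_links` and `in_links` are illustrative.

```python
PAGES = "ABCDEFGHI"

# Row X, column Y is 1 when page X links to page Y (assumed reading).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B (home/index)
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]

# Cr(X): the number of links off page X is the sum of row X.
out_links = {page: sum(row) for page, row in zip(PAGES, LINKS)}

# The pages linking to Y are the rows with a 1 in column Y.
in_links = {
    page: [PAGES[i] for i in range(len(PAGES)) if LINKS[i][j]]
    for j, page in enumerate(PAGES)
}

print(out_links)
print(in_links)
```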
Note the wonderful ability of PageRank to distinguish the most important pages and the relationship of external pages to them. Later we will use the term "base pages" to describe those important pages. The main feature of the base pages is that they are traversed by a directed cycle.

Let's apply the so-called accelerated technology for computing the estimating values of the site pages.

Suppose PR_i(T) is the estimating value for the page T (T = A, B, C, D, E, F, G, H, I) on the i-th iteration. The normalization coefficient for computing PR_{i+1}(T) (on the (i+1)-th iteration) can be calculated by the formula (1 - d) PR_i(T).

When the accelerated technology is applied to the computation of the estimating values of the site pages, the limiting estimating values of the pages that are external in relation to the base pages are equal to zero.

This approach permits us to distinguish the external pages in relation to the base pages. Thus we can introduce the following definition:

The external pages in relation to the base pages are those that have zero limiting values when the accelerated technology is applied to the computation of the estimating values of the site pages.
ACCELERATED COMPUTATION OF PAGERANK for recognition of external links

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.450932  3.883281  1.450932  1.450932  0.082202  0.192500  0.254388  0.117417  0.117417  9.000
3          1.432087  4.396552  1.432087  1.432087  0.028964  0.073739  0.107478  0.048502  0.048502  9.000
4          1.506130  4.356932  1.506130  1.506130  0.011216  0.029036  0.043774  0.020326  0.020326  9.000
5          1.478877  4.512665  1.478877  1.478877  0.004562  0.011577  0.017837  0.008364  0.008364  9.000
6          1.507936  4.455552  1.507936  1.507936  0.001869  0.004678  0.007251  0.003421  0.003421  9.000
7          1.491656  4.516630  1.491656  1.491656  0.000765  0.001900  0.002949  0.001394  0.001394  9.000
8          1.504706  4.482464  1.504706  1.504706  0.000312  0.000773  0.001200  0.000567  0.000567  9.000
9          1.496244  4.509876  1.496244  1.496244  0.000127  0.000315  0.000488  0.000231  0.000231  9.000
10         1.502441  4.492110  1.502441  1.502441  0.000052  0.000128  0.000199  0.000094  0.000094  9.000
11         1.498215  4.505125  1.498215  1.498215  0.000021  0.000052  0.000081  0.000038  0.000038  9.000
12         1.501219  4.496251  1.501219  1.501219  0.000009  0.000021  0.000033  0.000016  0.000016  9.000
13         1.499134  4.502559  1.499134  1.499134  0.000003  0.000009  0.000013  0.000006  0.000006  9.000
14         1.500601  4.498182  1.500601  1.500601  0.000001  0.000004  0.000005  0.000003  0.000003  9.000
15         1.499577  4.501262  1.499577  1.499577  0.000001  0.000001  0.000002  0.000001  0.000001  9.000
16         1.500295  4.499112  1.500295  1.500295  0.000000  0.000001  0.000001  0.000000  0.000000  9.000
17         1.499793  4.500620  1.499793  1.499793  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
18         1.500145  4.499566  1.500145  1.500145  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
19         1.499899  4.500304  1.499899  1.499899  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
20         1.500071  4.499787  1.500071  1.500071  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
21         1.499950  4.500149  1.499950  1.499950  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
22         1.500035  4.499896  1.500035  1.500035  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
23         1.499976  4.500073  1.499976  1.499976  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
24         1.500017  4.499949  1.500017  1.500017  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
25         1.499988  4.500036  1.499988  1.499988  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
26         1.500008  4.499975  1.500008  1.500008  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
27         1.499994  4.500018  1.499994  1.499994  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
28         1.500004  4.499988  1.500004  1.500004  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
29         1.499997  4.500009  1.499997  1.499997  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
30         1.500002  4.499994  1.500002  1.500002  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
31         1.499999  4.500004  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
32         1.500001  4.499997  1.500001  1.500001  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
33         1.499999  4.500002  1.499999  1.499999  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
34         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
35         1.500000  4.500001  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
36         1.500000  4.499999  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
37         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
38         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
39         1.500000  4.500000  1.500000  1.500000  0.000000  0.000000  0.000000  0.000000  0.000000  9.000
For the Web site that we have just examined, the pages E, F, G, H and I are external in relation to the base pages A, B, C and D.
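The accelerated computation can be sketched in Python as follows. This is our reading of the method, not the authors' code: we assume the update PR_{i+1}(T) = (1 - d) PR_i(T) + d * sum of PR_i(S) / Cr(S) over the pages S that link to T, with d = 0.85 and all values starting at 1. Under these assumptions the sketch reproduces the first rows of the table. The threshold `1e-6` used to treat a value as zero, and all names, are illustrative choices.

```python
PAGES = "ABCDEFGHI"

# Row X, column Y is 1 when page X links to page Y (matrix of links).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
OUT = [sum(row) for row in LINKS]  # Cr(X) for each page


def accelerated_step(pr, d=0.85):
    """One iteration with the normalization coefficient (1 - d) * PR_i(T)."""
    n = len(pr)
    new = []
    for t in range(n):
        # Share of value passed to page t by every page s that links to it.
        inbound = sum(pr[s] / OUT[s] for s in range(n) if LINKS[s][t])
        new.append((1 - d) * pr[t] + d * inbound)
    return new


pr = [1.0] * len(PAGES)
first = accelerated_step(pr)   # values after iteration 1
for _ in range(39):            # the table stops at iteration 39
    pr = accelerated_step(pr)

# Pages whose limiting value is (numerically) zero are external.
external = [page for page, value in zip(PAGES, pr) if value < 1e-6]
base = [page for page in PAGES if page not in external]
print("external:", external)
print("base:", base)
```

Since every page has at least one outgoing link, each step only redistributes value, which is why the "Amount" column stays at 9.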
Let's calculate the estimating values for the pages of the Web site examined with the constant normalization coefficient (1 - d) = 0.15, d = 0.85.
CLASSIC COMPUTATION OF PAGERANK

Iteration  A         B         C         D         E         F         G         H         I         Amount
0          1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000  9.000
1          1.206429  3.473095  1.206429  1.206429  0.291667  0.441429  0.631667  0.271429  0.271429  9.000
2          1.419967  3.512317  1.419967  1.419967  0.188452  0.276286  0.309638  0.226702  0.226702  9.000
3          1.332416  3.958177  1.332416  1.332416  0.182116  0.219636  0.267624  0.187599  0.187599  9.000
4          1.430747  3.706925  1.430747  1.430747  0.176577  0.213457  0.245806  0.182497  0.182497  9.000
5          1.353327  3.951436  1.353327  1.353327  0.175854  0.209866  0.243166  0.179848  0.179848  9.000
6          1.420726  3.752137  1.420726  1.420726  0.175478  0.209422  0.241730  0.179527  0.179527  9.000
7          1.363844  3.923590  1.363844  1.363844  0.175433  0.209184  0.241554  0.179353  0.179353  9.000
8          1.412299  3.778418  1.412299  1.412299  0.175408  0.209155  0.241460  0.179332  0.179332  9.000
9          1.371139  3.901949  1.371139  1.371139  0.175405  0.209140  0.241448  0.179320  0.179320  9.000
10         1.406132  3.796985  1.406132  1.406132  0.175404  0.209138  0.241442  0.179319  0.179319  9.000
11         1.376390  3.886213  1.376390  1.376390  0.175403  0.209137  0.241441  0.179318  0.179318  9.000
12         1.401671  3.810371  1.401671  1.401671  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
13         1.380182  3.874838  1.380182  1.380182  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
14         1.398448  3.820041  1.398448  1.398448  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
15         1.382922  3.866618  1.382922  1.382922  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
16         1.396119  3.827028  1.396119  1.396119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
17         1.384901  3.860680  1.384901  1.384901  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
18         1.394436  3.832076  1.394436  1.394436  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
19         1.386332  3.856389  1.386332  1.386332  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
20         1.393220  3.835723  1.393220  1.393220  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
21         1.387365  3.853289  1.387365  1.387365  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
22         1.392342  3.838358  1.392342  1.392342  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
23         1.388112  3.851049  1.388112  1.388112  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
24         1.391708  3.840261  1.391708  1.391708  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
25         1.388651  3.849431  1.388651  1.388651  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
26         1.391249  3.841637  1.391249  1.391249  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
27         1.389041  3.848262  1.389041  1.389041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
28         1.390918  3.842631  1.390918  1.390918  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
29         1.389322  3.847417  1.389322  1.389322  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
30         1.390678  3.843349  1.390678  1.390678  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
31         1.389526  3.846807  1.389526  1.389526  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
32         1.390506  3.843867  1.390506  1.390506  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
33         1.389673  3.846366  1.389673  1.389673  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
34         1.390381  3.844242  1.390381  1.390381  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
35         1.389779  3.846048  1.389779  1.389779  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
36         1.390290  3.844513  1.390290  1.390290  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
37         1.389856  3.845817  1.389856  1.389856  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
38         1.390225  3.844709  1.390225  1.390225  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
39         1.389911  3.845651  1.389911  1.389911  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
40         1.390178  3.844850  1.390178  1.390178  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
41         1.389951  3.845531  1.389951  1.389951  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
42         1.390144  3.844952  1.390144  1.390144  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
43         1.389980  3.845444  1.389980  1.389980  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
44         1.390119  3.845026  1.390119  1.390119  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
45         1.390001  3.845381  1.390001  1.390001  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
46         1.390102  3.845079  1.390102  1.390102  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
47         1.390016  3.845336  1.390016  1.390016  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
48         1.390089  3.845118  1.390089  1.390089  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
49         1.390027  3.845303  1.390027  1.390027  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
50         1.390080  3.845146  1.390080  1.390080  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
51         1.390035  3.845280  1.390035  1.390035  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
52         1.390073  3.845166  1.390073  1.390073  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
53         1.390041  3.845263  1.390041  1.390041  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
54         1.390068  3.845180  1.390068  1.390068  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
55         1.390045  3.845250  1.390045  1.390045  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
56         1.390064  3.845191  1.390064  1.390064  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
57         1.390048  3.845241  1.390048  1.390048  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
58         1.390062  3.845198  1.390062  1.390062  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
59         1.390050  3.845235  1.390050  1.390050  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
60         1.390060  3.845204  1.390060  1.390060  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
61         1.390051  3.845230  1.390051  1.390051  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
62         1.390059  3.845208  1.390059  1.390059  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
63         1.390052  3.845227  1.390052  1.390052  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
64         1.390058  3.845211  1.390058  1.390058  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
65         1.390053  3.845224  1.390053  1.390053  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
66         1.390057  3.845213  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
67         1.390054  3.845223  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
68         1.390057  3.845214  1.390057  1.390057  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
69         1.390054  3.845221  1.390054  1.390054  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
70         1.390056  3.845215  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
71         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
72         1.390056  3.845216  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
73         1.390055  3.845220  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
74         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
75         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
76         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
77         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
78         1.390056  3.845217  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
79         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
80         1.390056  3.845218  1.390056  1.390056  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
81         1.390055  3.845219  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
82         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
83         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
84         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
85         1.390055  3.845218  1.390055  1.390055  0.175403  0.209136  0.241441  0.179318  0.179318  9.000
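For comparison, the classic computation can be sketched the same way. This is a hedged illustration rather than the authors' implementation: we assume the update PR_{i+1}(T) = (1 - d) + d * sum of PR_i(S) / Cr(S) over the pages S linking to T, with d = 0.85 and every page starting at 1, which matches the first rows of the table. The names are ours.

```python
PAGES = "ABCDEFGHI"

# Row X, column Y is 1 when page X links to page Y (matrix of links).
LINKS = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # A
    [1, 0, 1, 1, 0, 0, 0, 0, 0],  # B
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # C
    [0, 1, 0, 0, 0, 0, 0, 0, 0],  # D
    [1, 1, 1, 1, 0, 1, 0, 0, 0],  # E
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # F
    [1, 1, 1, 1, 0, 1, 0, 1, 1],  # G
    [1, 1, 1, 1, 1, 0, 1, 0, 0],  # H
    [1, 1, 1, 1, 0, 0, 1, 0, 0],  # I
]
OUT = [sum(row) for row in LINKS]  # Cr(X) for each page


def classic_step(pr, d=0.85):
    """One iteration with the constant normalization coefficient (1 - d)."""
    n = len(pr)
    return [
        (1 - d) + d * sum(pr[s] / OUT[s] for s in range(n) if LINKS[s][t])
        for t in range(n)
    ]


pr = [1.0] * len(PAGES)
history = [pr]
for _ in range(85):  # the table stops at iteration 85
    pr = classic_step(pr)
    history.append(pr)

for page, value in zip(PAGES, pr):
    print(f"{page}: {value:.6f}")
print(f"Amount: {sum(pr):.3f}")
```

Unlike the accelerated variant, every page keeps at least the constant share (1 - d), so the external pages E to I settle at small non-zero values instead of falling to zero.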
49
Applying the accelerated technology the computation of the estimating values ofthe site pages the limiting values of the estimating values of external pages inrelation to the base sites are equal to zero
This approach permits us to distinguish the external pages in relation to the basepages Thus we can introduce the following definition
The external pages in relation to the base pages are ones that have zero limitingvalues on application of the accelerated technology of the computation of theestimating values of the site pages
ACCELERATED COMPUTATION OF PAGERANK for recognition of externallinks
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1450932 3883281 1450932 1450932 0082202 0192500 0254388 0117417 0117417 9000
3 1432087 4396552 1432087 1432087 0028964 0073739 0107478 0048502 0048502 9000
4 1506130 4356932 1506130 1506130 0011216 0029036 0043774 0020326 0020326 9000
5 1478877 4512665 1478877 1478877 0004562 0011577 0017837 0008364 0008364 9000
6 1507936 4455552 1507936 1507936 0001869 0004678 0007251 0003421 0003421 9000
7 1491656 4516630 1491656 1491656 0000765 0001900 0002949 0001394 0001394 9000
8 1504706 4482464 1504706 1504706 0000312 0000773 0001200 0000567 0000567 9000
9 1496244 4509876 1496244 1496244 0000127 0000315 0000488 0000231 0000231 9000
10 1502441 4492110 1502441 1502441 0000052 0000128 0000199 0000094 0000094 9000
11 1498215 4505125 1498215 1498215 0000021 0000052 0000081 0000038 0000038 9000
12 1501219 4496251 1501219 1501219 0000009 0000021 0000033 0000016 0000016 9000
13 1499134 4502559 1499134 1499134 0000003 0000009 0000013 0000006 0000006 9000
14 1500601 4498182 1500601 1500601 0000001 0000004 0000005 0000003 0000003 9000
15 1499577 4501262 1499577 1499577 0000001 0000001 0000002 0000001 0000001 9000
16 1500295 4499112 1500295 1500295 0000000 0000001 0000001 0000000 0000000 9000
17 1499793 4500620 1499793 1499793 0000000 0000000 0000000 0000000 0000000 9000
18 1500145 4499566 1500145 1500145 0000000 0000000 0000000 0000000 0000000 9000
19 1499899 4500304 1499899 1499899 0000000 0000000 0000000 0000000 0000000 9000
20 1500071 4499787 1500071 1500071 0000000 0000000 0000000 0000000 0000000 9000
21 1499950 4500149 1499950 1499950 0000000 0000000 0000000 0000000 0000000 9000
22 1500035 4499896 1500035 1500035 0000000 0000000 0000000 0000000 0000000 9000
23 1499976 4500073 1499976 1499976 0000000 0000000 0000000 0000000 0000000 9000
24 1500017 4499949 1500017 1500017 0000000 0000000 0000000 0000000 0000000 9000
25 1499988 4500036 1499988 1499988 0000000 0000000 0000000 0000000 0000000 9000
26 1500008 4499975 1500008 1500008 0000000 0000000 0000000 0000000 0000000 9000
50
27 1499994 4500018 1499994 1499994 0000000 0000000 0000000 0000000 0000000 9000
28 1500004 4499988 1500004 1500004 0000000 0000000 0000000 0000000 0000000 9000
29 1499997 4500009 1499997 1499997 0000000 0000000 0000000 0000000 0000000 9000
30 1500002 4499994 1500002 1500002 0000000 0000000 0000000 0000000 0000000 9000
31 1499999 4500004 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
32 1500001 4499997 1500001 1500001 0000000 0000000 0000000 0000000 0000000 9000
33 1499999 4500002 1499999 1499999 0000000 0000000 0000000 0000000 0000000 9000
34 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
35 1500000 4500001 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
36 1500000 4499999 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
37 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
38 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
39 1500000 4500000 1500000 1500000 0000000 0000000 0000000 0000000 0000000 9000
For the Web site that we have just examined the pages E F G I H are externalin relation to the base pages A B C D
Letrsquos calculate the estimating values for the pages of the Web site examined withthe constant normalization coefficient (1-d) =015 d = 085
CLASSIC COMPUTATION OF PAGERANK
A B C D E F G H I Amount
Itera
tion
1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 1000000 9000
1 1206429 3473095 1206429 1206429 0291667 0441429 0631667 0271429 0271429 9000
2 1419967 3512317 1419967 1419967 0188452 0276286 0309638 0226702 0226702 9000
3 1332416 3958177 1332416 1332416 0182116 0219636 0267624 0187599 0187599 9000
4 1430747 3706925 1430747 1430747 0176577 0213457 0245806 0182497 0182497 9000
5 1353327 3951436 1353327 1353327 0175854 0209866 0243166 0179848 0179848 9000
6 1420726 3752137 1420726 1420726 0175478 0209422 0241730 0179527 0179527 9000
7 1363844 3923590 1363844 1363844 0175433 0209184 0241554 0179353 0179353 9000
8 1412299 3778418 1412299 1412299 0175408 0209155 0241460 0179332 0179332 9000
9 1371139 3901949 1371139 1371139 0175405 0209140 0241448 0179320 0179320 9000
10 1406132 3796985 1406132 1406132 0175404 0209138 0241442 0179319 0179319 9000
11 1376390 3886213 1376390 1376390 0175403 0209137 0241441 0179318 0179318 9000
12 1401671 3810371 1401671 1401671 0175403 0209136 0241441 0179318 0179318 9000
13 1380182 3874838 1380182 1380182 0175403 0209136 0241441 0179318 0179318 9000
14 1398448 3820041 1398448 1398448 0175403 0209136 0241441 0179318 0179318 9000
15 1382922 3866618 1382922 1382922 0175403 0209136 0241441 0179318 0179318 9000
16 1396119 3827028 1396119 1396119 0175403 0209136 0241441 0179318 0179318 9000
17 1384901 3860680 1384901 1384901 0175403 0209136 0241441 0179318 0179318 9000
18 1394436 3832076 1394436 1394436 0175403 0209136 0241441 0179318 0179318 9000
19 1386332 3856389 1386332 1386332 0175403 0209136 0241441 0179318 0179318 9000
20 1393220 3835723 1393220 1393220 0175403 0209136 0241441 0179318 0179318 9000
51
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
52
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
Amount
P 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of Page Rank for the pages of the site examined (Logarithmsare to be calculated on base 2)
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
PageRank is a successful representation of a directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that each time a stimulus is applied, an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator $Z^{+}$:
$$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$$
where P is the estimating value.
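To see what repeated encouragement does, the operator can be iterated numerically. This is a minimal sketch: the starting value 0.2 and the choice d = 0.85 are assumptions made for illustration (the text only requires 0 < d < 1), and `z_plus` is a hypothetical name.

```python
def z_plus(p, d=0.85):
    """Apply the operator Z+ once: P_{n+1} = d * P_n + (1 - d)."""
    return d * p + (1.0 - d)


p0 = 0.2  # hypothetical starting estimate
history = [p0]
for _ in range(50):
    history.append(z_plus(history[-1]))

# Closed form for comparison: P_n = 1 - d**n * (1 - P_0).
closed_form_10 = 1.0 - 0.85 ** 10 * (1.0 - p0)
print(history[10], closed_form_10)
print(history[-1])
```

Because 0 < d < 1, each application shrinks the distance to 1 by the factor d, so the estimate rises steadily towards the fixed point P = 1.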
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we recommend visiting the following resources:
PageRank Question and Answer Area
httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work)
httpwwwsupportforumsorgpageranksem
PageRank: Bringing Order to the Web
http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine
http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's), by Mark Horrell
http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' Credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and we provide the following comments and links as thanks for the feedback of the people mentioned. If you'd like to make a comment on this document, please email chris@supportforums.org.
In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read.
Mike Grehan
Author, Search Engine Marketing: The essential best practice guide
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
Amount
P 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of Page Rank for the pages of the site examined (Logarithmsare to be calculated on base 2)
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
AmountP 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of PageRank for the pages of the site examined (Logarithmsare to be calculated on base 2)
AmountPR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
The ToolbarShows 1 1 1 1 1 1 1 1 1 1
54
PageRank is a successful representation of the directed multigraph It is widelyknown that multigraphs are used to describe and build self-learning systemswhich model the behavior of animals
Letrsquos assume that every time an incentive is engaged an animal must make achoice eg to turn to left or right So every time an animal behaves as we want itto we can encourage it by applying operator Z+
P n+1 = Z+ (P n) = d P n + (1 ndash d) 0 lt d lt 1 (P ndash the estimating value)
The theory for this subject can be found at the classic work of Bush and Mosteller-- Stochastic Models for Learning Wiley New York 1955
We suppose that it was a variation of Shannonrsquos formula that led to birth ofPageRank
Further Reading and Resources
For those interested in doing further PageRank research we would recommendvisiting the following resources
PageRank Question and Answer Areahttpwwwsupportforumsorgpagerankqampa
Search Engine Marketing ndash the essential best practice guide (an in depth view ofthe other aspects of how search engine work)httpwwwsupportforumsorgpageranksem
PageRank Bringing order to the webhttphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Enginehttpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Googlersquos) by Mark Horrellhttpwwwmarkhorrellcomseopagerankasp
55
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields ofsearch engines mathematics and search engine optimization We feel thisfeedback is important and provide the following comments and links as a thanksfor the feedback of the people mentioned If yoursquod like to make a comment onthis document then please email chrissupportforumsorg
In the field of information retrieval on the web PageRank has emerged as the primary(and most widely discussed) hyperlink analysis algorithm But how it works is stilllargely a mystery to many in the SEO online community PageRank Uncovered by ChrisRidings and Mike Shishigin is an illuminating document which unveils the mystery andprovides the most detailed in-depth yet easy to follow explanation of the power behindGoogle No matter what level yoursquore at in the field of SEO this is not something youshould read itrsquos something you must read
Mike GrehanAuthorSearch Engine Marketing The essential best practice guide
51
21 1387365 3853289 1387365 1387365 0175403 0209136 0241441 0179318 0179318 9000
22 1392342 3838358 1392342 1392342 0175403 0209136 0241441 0179318 0179318 9000
23 1388112 3851049 1388112 1388112 0175403 0209136 0241441 0179318 0179318 9000
24 1391708 3840261 1391708 1391708 0175403 0209136 0241441 0179318 0179318 9000
25 1388651 3849431 1388651 1388651 0175403 0209136 0241441 0179318 0179318 9000
26 1391249 3841637 1391249 1391249 0175403 0209136 0241441 0179318 0179318 9000
27 1389041 3848262 1389041 1389041 0175403 0209136 0241441 0179318 0179318 9000
28 1390918 3842631 1390918 1390918 0175403 0209136 0241441 0179318 0179318 9000
29 1389322 3847417 1389322 1389322 0175403 0209136 0241441 0179318 0179318 9000
30 1390678 3843349 1390678 1390678 0175403 0209136 0241441 0179318 0179318 9000
31 1389526 3846807 1389526 1389526 0175403 0209136 0241441 0179318 0179318 9000
32 1390506 3843867 1390506 1390506 0175403 0209136 0241441 0179318 0179318 9000
33 1389673 3846366 1389673 1389673 0175403 0209136 0241441 0179318 0179318 9000
34 1390381 3844242 1390381 1390381 0175403 0209136 0241441 0179318 0179318 9000
35 1389779 3846048 1389779 1389779 0175403 0209136 0241441 0179318 0179318 9000
36 1390290 3844513 1390290 1390290 0175403 0209136 0241441 0179318 0179318 9000
37 1389856 3845817 1389856 1389856 0175403 0209136 0241441 0179318 0179318 9000
38 1390225 3844709 1390225 1390225 0175403 0209136 0241441 0179318 0179318 9000
39 1389911 3845651 1389911 1389911 0175403 0209136 0241441 0179318 0179318 9000
40 1390178 3844850 1390178 1390178 0175403 0209136 0241441 0179318 0179318 9000
41 1389951 3845531 1389951 1389951 0175403 0209136 0241441 0179318 0179318 9000
42 1390144 3844952 1390144 1390144 0175403 0209136 0241441 0179318 0179318 9000
43 1389980 3845444 1389980 1389980 0175403 0209136 0241441 0179318 0179318 9000
44 1390119 3845026 1390119 1390119 0175403 0209136 0241441 0179318 0179318 9000
45 1390001 3845381 1390001 1390001 0175403 0209136 0241441 0179318 0179318 9000
46 1390102 3845079 1390102 1390102 0175403 0209136 0241441 0179318 0179318 9000
47 1390016 3845336 1390016 1390016 0175403 0209136 0241441 0179318 0179318 9000
48 1390089 3845118 1390089 1390089 0175403 0209136 0241441 0179318 0179318 9000
49 1390027 3845303 1390027 1390027 0175403 0209136 0241441 0179318 0179318 9000
50 1390080 3845146 1390080 1390080 0175403 0209136 0241441 0179318 0179318 9000
51 1390035 3845280 1390035 1390035 0175403 0209136 0241441 0179318 0179318 9000
52 1390073 3845166 1390073 1390073 0175403 0209136 0241441 0179318 0179318 9000
53 1390041 3845263 1390041 1390041 0175403 0209136 0241441 0179318 0179318 9000
54 1390068 3845180 1390068 1390068 0175403 0209136 0241441 0179318 0179318 9000
55 1390045 3845250 1390045 1390045 0175403 0209136 0241441 0179318 0179318 9000
56 1390064 3845191 1390064 1390064 0175403 0209136 0241441 0179318 0179318 9000
57 1390048 3845241 1390048 1390048 0175403 0209136 0241441 0179318 0179318 9000
58 1390062 3845198 1390062 1390062 0175403 0209136 0241441 0179318 0179318 9000
59 1390050 3845235 1390050 1390050 0175403 0209136 0241441 0179318 0179318 9000
60 1390060 3845204 1390060 1390060 0175403 0209136 0241441 0179318 0179318 9000
61 1390051 3845230 1390051 1390051 0175403 0209136 0241441 0179318 0179318 9000
62 1390059 3845208 1390059 1390059 0175403 0209136 0241441 0179318 0179318 9000
63 1390052 3845227 1390052 1390052 0175403 0209136 0241441 0179318 0179318 9000
64 1390058 3845211 1390058 1390058 0175403 0209136 0241441 0179318 0179318 9000
65 1390053 3845224 1390053 1390053 0175403 0209136 0241441 0179318 0179318 9000
66 1390057 3845213 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
67 1390054 3845223 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
68 1390057 3845214 1390057 1390057 0175403 0209136 0241441 0179318 0179318 9000
69 1390054 3845221 1390054 1390054 0175403 0209136 0241441 0179318 0179318 9000
70 1390056 3845215 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
52
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
Amount
P 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of Page Rank for the pages of the site examined (Logarithmsare to be calculated on base 2)
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
AmountP 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of PageRank for the pages of the site examined (Logarithmsare to be calculated on base 2)
AmountPR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
The ToolbarShows 1 1 1 1 1 1 1 1 1 1
54
PageRank is a successful representation of the directed multigraph It is widelyknown that multigraphs are used to describe and build self-learning systemswhich model the behavior of animals
Letrsquos assume that every time an incentive is engaged an animal must make achoice eg to turn to left or right So every time an animal behaves as we want itto we can encourage it by applying operator Z+
P n+1 = Z+ (P n) = d P n + (1 ndash d) 0 lt d lt 1 (P ndash the estimating value)
The theory for this subject can be found at the classic work of Bush and Mosteller-- Stochastic Models for Learning Wiley New York 1955
We suppose that it was a variation of Shannonrsquos formula that led to birth ofPageRank
Further Reading and Resources
For those interested in doing further PageRank research we would recommendvisiting the following resources
PageRank Question and Answer Areahttpwwwsupportforumsorgpagerankqampa
Search Engine Marketing ndash the essential best practice guide (an in depth view ofthe other aspects of how search engine work)httpwwwsupportforumsorgpageranksem
PageRank Bringing order to the webhttphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Enginehttpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Googlersquos) by Mark Horrellhttpwwwmarkhorrellcomseopagerankasp
55
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields ofsearch engines mathematics and search engine optimization We feel thisfeedback is important and provide the following comments and links as a thanksfor the feedback of the people mentioned If yoursquod like to make a comment onthis document then please email chrissupportforumsorg
In the field of information retrieval on the web PageRank has emerged as the primary(and most widely discussed) hyperlink analysis algorithm But how it works is stilllargely a mystery to many in the SEO online community PageRank Uncovered by ChrisRidings and Mike Shishigin is an illuminating document which unveils the mystery andprovides the most detailed in-depth yet easy to follow explanation of the power behindGoogle No matter what level yoursquore at in the field of SEO this is not something youshould read itrsquos something you must read
Mike GrehanAuthorSearch Engine Marketing The essential best practice guide
52
71 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
72 1390056 3845216 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
73 1390055 3845220 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
74 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
75 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
76 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
77 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
78 1390056 3845217 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
79 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
80 1390056 3845218 1390056 1390056 0175403 0209136 0241441 0179318 0179318 9000
81 1390055 3845219 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
82 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
83 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
84 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
85 1390055 3845218 1390055 1390055 0175403 0209136 0241441 0179318 0179318 9000
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
Amount
P 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of Page Rank for the pages of the site examined (Logarithmsare to be calculated on base 2)
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
AmountP 0154451 0427246 0154451 0154451 0019489 0023237 0026827 0019924 00199241000
Running a parallel demonstration wersquod like to use the famous classicalShannonrsquos formula which was normally used for calculation of the volume of theinformation which an experiment contains and which has n outcomes with theset probabilities
Experiment outcome A1 A2 hellip An
Probabilities p1 p2 hellip pn
H(p1 p2hellip pn) = - p(A1)log p(A1) - p(A2)log p(A2) -hellip- p(An)log p(An)
Letrsquos calculate the values for a set of probability values which correspond to thesite pages and wersquoll receive the following set of the values
This set of values of PageRank for the pages of the site examined (Logarithmsare to be calculated on base 2)
AmountPR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
The ToolbarShows 1 1 1 1 1 1 1 1 1 1
54
PageRank is a successful representation of the directed multigraph It is widelyknown that multigraphs are used to describe and build self-learning systemswhich model the behavior of animals
Letrsquos assume that every time an incentive is engaged an animal must make achoice eg to turn to left or right So every time an animal behaves as we want itto we can encourage it by applying operator Z+
P n+1 = Z+ (P n) = d P n + (1 ndash d) 0 lt d lt 1 (P ndash the estimating value)
The theory for this subject can be found at the classic work of Bush and Mosteller-- Stochastic Models for Learning Wiley New York 1955
We suppose that it was a variation of Shannonrsquos formula that led to birth ofPageRank
Further Reading and Resources
For those interested in doing further PageRank research we would recommendvisiting the following resources
PageRank Question and Answer Areahttpwwwsupportforumsorgpagerankqampa
Search Engine Marketing ndash the essential best practice guide (an in depth view ofthe other aspects of how search engine work)httpwwwsupportforumsorgpageranksem
PageRank Bringing order to the webhttphcistanfordedu~pagepaperspagerankppframehtm
The anatomy of a Large-Scale HyperTextual Web Search Enginehttpwww-dbstanfordedu~backrubgooglehtml
The first PageRank calculator (with the exception of Googlersquos) by Mark Horrellhttpwwwmarkhorrellcomseopagerankasp
55
Technical Proof Readers credits
The authors of this document welcome comments from experts in the fields ofsearch engines mathematics and search engine optimization We feel thisfeedback is important and provide the following comments and links as a thanksfor the feedback of the people mentioned If yoursquod like to make a comment onthis document then please email chrissupportforumsorg
In the field of information retrieval on the web PageRank has emerged as the primary(and most widely discussed) hyperlink analysis algorithm But how it works is stilllargely a mystery to many in the SEO online community PageRank Uncovered by ChrisRidings and Mike Shishigin is an illuminating document which unveils the mystery andprovides the most detailed in-depth yet easy to follow explanation of the power behindGoogle No matter what level yoursquore at in the field of SEO this is not something youshould read itrsquos something you must read
Mike GrehanAuthorSearch Engine Marketing The essential best practice guide
53
Amount
PR 0416211 0524171 0416211 0416211 0110722 0126119 0140040 0112558 0112558 2375
TheToolbarShows
1 1 1 1 1 1 1 1 1 1
The limiting estimating values for the site pages are darkened
Having divided the estimating values to their total amount which isunchangeable upon execution of the iteration process wersquoll receive the set ofprobability values (the amount of all probability values is equal to one)
P: 0.154451  0.427246  0.154451  0.154451  0.019489  0.023237  0.026827  0.019924  0.019924
Amount: 1.000
As a parallel demonstration, we'd like to use Shannon's famous classical formula, which is normally used to calculate the amount of information contained in an experiment that has n outcomes with given probabilities:
Experiment outcome: A1, A2, ..., An
Probabilities: p1, p2, ..., pn
$H(p_1, p_2, \ldots, p_n) = -p(A_1)\log p(A_1) - p(A_2)\log p(A_2) - \ldots - p(A_n)\log p(A_n)$
Let's calculate these values for the set of probability values that correspond to the site pages. We obtain the following set of values.
This is the set of values for the pages of the site examined (logarithms are taken to base 2). Each PR entry is the term -p log2 p for the corresponding page, and the Amount is their sum.
PR: 0.416211  0.524171  0.416211  0.416211  0.110722  0.126119  0.140040  0.112558  0.112558
Amount: 2.375
The Toolbar Shows: 1 1 1 1 1 1 1 1 1 1
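As a quick check on the table, the per-page terms can be recomputed from the probability values P listed earlier. What follows is a minimal Python sketch rather than the authors' own calculation. The names P and entropy_terms are illustrative choices, and the decimal point in each probability is assumed to sit directly after the leading zero.

```python
import math

# Probability values P for the nine site pages, as listed in the text.
# Assumption: the decimal point follows the leading zero (0.154451, ...).
P = [0.154451, 0.427246, 0.154451, 0.154451, 0.019489,
     0.023237, 0.026827, 0.019924, 0.019924]

def entropy_terms(probs):
    """Return the per-outcome terms -p * log2(p) of Shannon's formula."""
    return [-p * math.log2(p) for p in probs if p > 0]

terms = entropy_terms(P)
H = sum(terms)  # Shannon entropy in bits

for p, t in zip(P, terms):
    print(f"p = {p:.6f}   -p*log2(p) = {t:.6f}")
print(f"Amount (sum of terms) = {H:.3f}")
```

With these inputs the sum comes out at about 2.375, in line with the Amount shown for the PR row. Small differences in the last digit of individual terms are expected because the probabilities are themselves rounded.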
PageRank is a successful representation of the directed multigraph. It is widely known that multigraphs are used to describe and build self-learning systems which model the behavior of animals.
Let's assume that every time an incentive is applied, an animal must make a choice, e.g. to turn left or right. Every time the animal behaves as we want it to, we can encourage it by applying the operator Z+:
$P_{n+1} = Z^{+}(P_n) = d\,P_n + (1 - d), \quad 0 < d < 1$ (where P is the estimating value)
The theory for this subject can be found in the classic work of Bush and Mosteller, Stochastic Models for Learning, Wiley, New York, 1955.
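The behavior of the operator Z+ can be explored numerically. The sketch below is only an illustration under stated assumptions: the starting value 0.2, the choice d = 0.85 and the function names z_plus and iterate are hypothetical and do not come from the text. Solving P = d P + (1 - d) gives the fixed point P = 1, so repeated encouragement should push the estimating value towards 1.

```python
def z_plus(p, d):
    """Apply the operator once: P_{n+1} = d * P_n + (1 - d)."""
    if not 0 < d < 1:
        raise ValueError("d must satisfy 0 < d < 1")
    return d * p + (1 - d)

def iterate(p0, d, steps):
    """Return [P_0, P_1, ..., P_steps] under repeated application of Z+."""
    values = [p0]
    for _ in range(steps):
        values.append(z_plus(values[-1], d))
    return values

# Assumed illustrative inputs: start at 0.2 and use d = 0.85.
history = iterate(p0=0.2, d=0.85, steps=50)

for n in (0, 1, 2, 5, 10, 50):
    print(f"P_{n} = {history[n]:.6f}")
```

The gap to the fixed point shrinks by a factor of d at every step, since 1 - P_{n+1} = d (1 - P_n). A smaller d therefore means faster convergence towards 1.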
We suppose that it was a variation of Shannon's formula that led to the birth of PageRank.
Further Reading and Resources
For those interested in doing further PageRank research, we would recommend visiting the following resources:
PageRank Question and Answer Area: httpwwwsupportforumsorgpagerankqampa
Search Engine Marketing: the essential best practice guide (an in-depth view of the other aspects of how search engines work): httpwwwsupportforumsorgpageranksem
PageRank: Bringing Order to the Web: http://hci.stanford.edu/~page/papers/pagerank/ppframe.htm
The Anatomy of a Large-Scale Hypertextual Web Search Engine: http://www-db.stanford.edu/~backrub/google.html
The first PageRank calculator (with the exception of Google's) by Mark Horrell: http://www.markhorrell.com/seo/pagerank.asp
Technical Proof Readers' Credits
The authors of this document welcome comments from experts in the fields of search engines, mathematics and search engine optimization. We feel this feedback is important, and we provide the following comments and links as thanks to the people mentioned for their feedback. If you'd like to comment on this document, please email chris@supportforums.org.
"In the field of information retrieval on the web, PageRank has emerged as the primary (and most widely discussed) hyperlink analysis algorithm. But how it works is still largely a mystery to many in the SEO online community. PageRank Uncovered by Chris Ridings and Mike Shishigin is an illuminating document which unveils the mystery and provides the most detailed, in-depth, yet easy to follow explanation of the power behind Google. No matter what level you're at in the field of SEO, this is not something you should read, it's something you must read."
Mike Grehan, Author, Search Engine Marketing: The essential best practice guide