Saturday, 5 November 2011

Mathematical Logic behind the Google PageRank


A Rough Estimation of How PageRank is Assigned
Around 200-250 factors are said to influence how any webpage ranks. PageRank itself can be read as a probability, which is expressed as a numeric value between 0 and 1. A 0.5 probability is normally expressed as a "50% chance" of an event occurring. On the same basis, a PageRank of 0.5 implies there is a 50% chance that a person clicking on a random link will be taken to the document that has a PageRank of 0.5.
Let's consider a small system of four web pages: M, N, O and P. The initial estimate is a simple uniform probability distribution: PageRank is divided evenly among the four pages, so each page begins with an estimated PageRank of 0.25.
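If pages N, O and P each link only to M, each of them transfers its 0.25 PageRank to M. All PageRank PR() in this simple system thus accumulates at M, because all the links point to M:

                                           PR(M) = PR(N) + PR(O) + PR(P) = 0.75

Now suppose instead that N has two outbound links, O has one and P has three, with each of them still linking to M. Each page divides its PageRank evenly among its outbound links, so M receives:

                                           PR(M) = PR(N)/2 + PR(O)/1 + PR(P)/3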

In other words, the PageRank conferred by an outbound link is equal to the linking document's own PageRank score divided by its normalized number of outbound links L() (links to a specific URL are assumed to count only once per document).

                                   PR(M) = PR(N)/L(N) + PR(O)/L(O)+ PR(P)/L(P)
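
As a quick check of the formula above, here is a small Python sketch that plugs in the starting value of 0.25 for each page and the outbound-link counts 2, 1 and 3 taken from the divisors in the earlier equation. The variable names are my own and are there only for illustration.

```python
# Starting PageRank of the three pages that link to M (0.25 each).
pr = {"N": 0.25, "O": 0.25, "P": 0.25}

# Outbound-link counts L(), taken from the divisors 2, 1 and 3 above.
out_links = {"N": 2, "O": 1, "P": 3}

# PR(M) = PR(N)/L(N) + PR(O)/L(O) + PR(P)/L(P)
pr_m = sum(pr[page] / out_links[page] for page in pr)

print(round(pr_m, 3))  # 0.458
```

So after this single step M would hold roughly 0.458 of the total PageRank, which is more than its starting share of 0.25.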

In general, the PageRank of any page u can be written as:

                                   PR(u) = Σ_{v ∈ Bu} PR(v)/L(v)

i.e. the PageRank value of a page u depends on the PageRank value of each page v in the set Bu (the set of all pages linking to page u), divided by the number L(v) of links from page v.
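
The general rule can be sketched in Python as one update pass over a small link graph. This is only a minimal sketch of the simplified formula described here, not Google's actual calculation: the function name `pagerank_step` and the link structure are hypothetical, chosen so that N, O and P have 2, 1 and 3 outbound links, and the full algorithm also repeats the step many times and applies a damping factor, which is left out.

```python
def pagerank_step(links, pr):
    """One pass of PR(u) = sum of PR(v)/L(v) over all pages v linking to u."""
    new_pr = {page: 0.0 for page in links}
    for v, targets in links.items():
        unique_targets = set(targets)  # a link to a given URL counts only once
        for u in unique_targets:
            # Page v shares its PageRank evenly among its outbound links.
            new_pr[u] += pr[v] / len(unique_targets)
    return new_pr


# Hypothetical link structure: N -> M, O;  O -> M;  P -> M, N, O;  M links nowhere.
links = {"M": [], "N": ["M", "O"], "O": ["M"], "P": ["M", "N", "O"]}

# Every page starts with an equal share of PageRank.
pr = {page: 1 / len(links) for page in links}

new_pr = pagerank_step(links, pr)
for page, value in new_pr.items():
    print(page, round(value, 3))
```

Here Bu is never built explicitly: looping over each page v and adding PR(v)/L(v) to every page it links to gives the same sums. Because M has no outbound links in this made-up graph, its own share is not passed on to anyone in this step.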

Factors That Influence PageRank:

  • How frequent the keyword is within the Web page: if the keyword appears only rarely in the body of a page, the page will receive a low score for that keyword.
  • How long the Web page has existed: new Web pages are created every day, and not all of them last long. Google places more value on older pages.
  • How many Web pages link to the page: Google looks at how many Web pages link to a particular site to determine its relevance.
  • What the PageRank and domain of the linking page are: by domain, I mean whether the linking page is related to the subject of this page or not.
Among these factors, the third is the most significant. Let's try to understand it with an example: a search for the terms "Computer your best friend". As more Web pages link to the "Computer your best friend" page, its rank increases. When it ranks higher than competing pages, it shows up at the top of the Google search results page.
Google treats a link to a Web page as a vote, which makes it very difficult to cheat the Google ranking system. The best way to rank high in Google's search results is to provide quality content in quantity so that people link back to your page. The more links your page gets, the higher its PageRank will be. If the pages linking to your site have a high PageRank score, the chances of your score rising are high.
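
To make the "links as votes" idea concrete, here is a rough Python sketch using the simplified formula from earlier. The helper `inbound_score` and all the numbers are made up for illustration; each linking page is given as a pair of its PageRank and its number of outbound links.

```python
def inbound_score(linking_pages):
    """Sum PR(v)/L(v) over (pagerank, outbound_link_count) pairs."""
    return sum(pr / out_count for pr, out_count in linking_pages)


# One low-ranked page links to us.
few_links = inbound_score([(0.1, 2)])

# Three low-ranked pages link to us.
more_links = inbound_score([(0.1, 2), (0.1, 2), (0.1, 2)])

# Three highly ranked pages link to us.
strong_links = inbound_score([(0.4, 2), (0.4, 2), (0.4, 2)])

print(few_links < more_links < strong_links)  # True
```

More inbound links raise the score, and links from pages that themselves have a high PageRank raise it further.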
  1. Too complicated for anyone to understand this formula. By the way Nice Post.

    1. Thanks Akshat for stopping by..

      This is the simplest interpretation of the PageRank calculation; the actual calculation is way too complex to be understood ...

      However, the general logic is simple... build a good website with useful content, people will link to you, and you will get a good PageRank.. :)

