Hyperlink Exploration


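This approach analyzes hyperlinks to uncover two types of pages: authorities, which are the best sources of information on a given topic, and hubs, which collect links to authorities.

Finding Authorities and Hubs
The algorithm has two major steps: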
  1. A sampling component, which constructs a focused collection of several thousand pages likely to be rich in relevant authorities; and

  2. A weight-propagation component, which determines numerical estimates of hub and authority weights by an iterative procedure.

    • Authority weight update: If a page p is pointed to by many good hubs, we would like to increase its authority weight. We therefore update x_p to be the sum of y_q over all pages q that link to p (q→p).

    • Hub weight update: In a strictly dual fashion, if a page p points to many good authorities, we increase its hub weight: y_p becomes the sum of x_q over all pages q that p links to (p→q).

    Here x_p and y_p denote the authority weight and hub weight of page p, and the notation q→p indicates that q links to p.
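The two update rules can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names hits and normalize, the dictionary-of-sets graph format, the fixed iteration count, and the normalization step after each round (commonly used to keep the weights bounded) are all assumptions that the notes above do not specify.

```python
import math


def normalize(weights):
    """Scale weights so their squares sum to 1 (assumed normalization)."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    if norm == 0:
        return dict(weights)
    return {page: w / norm for page, w in weights.items()}


def hits(links, iterations=50):
    """Estimate authority (x) and hub (y) weights.

    links maps each page to the set of pages it links to,
    so q -> p holds when p is in links[q].
    """
    # Every page that appears as a link source or a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    auth = {p: 1.0 for p in pages}  # x_p, uniform starting weights
    hub = {p: 1.0 for p in pages}   # y_p, uniform starting weights

    for _ in range(iterations):
        # Authority update: x_p = sum of y_q over all q with q -> p.
        new_auth = {p: 0.0 for p in pages}
        for q, targets in links.items():
            for p in targets:
                new_auth[p] += hub[q]

        # Hub update: y_p = sum of x_q over all q with p -> q,
        # using the authority weights computed just above.
        new_hub = {
            p: sum(new_auth[q] for q in links.get(p, ()))
            for p in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return auth, hub


# Toy graph: three hub-like pages pointing at two authority-like pages.
example_links = {
    "h1": {"a1", "a2"},
    "h2": {"a1", "a2"},
    "h3": {"a1"},
}
example_auth, example_hub = hits(example_links)
```

In the toy graph, a1 is linked from all three hubs and so ends up with a larger authority weight than a2, while h1 and h2, which point to both authorities, receive larger hub weights than h3.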
Implementation
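Collect the t highest-ranked pages for the query q from a search engine; these pages form the root set. Expanding the root set with the pages it links to and the pages that link to it gives the base set. Both the root set and the base set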
  • are relatively small, and
  • are rich in relevant pages.
Additionally, the base set contains most of the strongest authorities.
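The sampling step can be sketched as follows. Everything named here is hypothetical: build_root_set and build_base_set are illustrative helpers, the ranked result list stands in for a real search engine, the link tables are toy dictionaries, and the max_in_links cap on pages pointing to each root page is an assumed way of keeping the base set small.

```python
def build_root_set(ranked_results, t):
    """Take the t highest-ranked pages from an already ranked result list."""
    return list(ranked_results[:t])


def build_base_set(root_set, out_links, in_links, max_in_links=50):
    """Expand the root set by one link step in each direction.

    out_links maps a page to the pages it links to.
    in_links maps a page to the pages that link to it.
    max_in_links caps how many in-linking pages are added per root page
    (an assumed safeguard, since popular pages can have very many).
    """
    base = set(root_set)
    for page in root_set:
        # Pages that the root page points to.
        base.update(out_links.get(page, ()))
        # A bounded number of pages pointing to the root page.
        # Sorting only makes the choice deterministic in this sketch.
        base.update(sorted(in_links.get(page, ()))[:max_in_links])
    return base


# Toy data standing in for search results and a crawled link graph.
ranked = ["r1", "r2", "r3"]
toy_out_links = {"r1": {"a"}, "r2": {"a", "b"}}
toy_in_links = {"r1": {"h1", "h2", "h3"}, "r2": {"h1"}}

root = build_root_set(ranked, t=2)
base = build_base_set(root, toy_out_links, toy_in_links, max_in_links=2)
```

The resulting base set is the collection on which the weight-propagation step would then run.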