This approach analyzes hyperlinks to uncover two types of pages:
Authorities:
provide the best source of information on a given topic.
Hubs:
provide collections of links to authorities.
Finding Authorities and Hubs
This algorithm uses two major steps:
A sampling component,
which constructs a focused collection of several thousand pages likely to be rich in relevant authorities; and
A weight-propagation component,
which determines numerical estimates of hub and authority weights by an iterative procedure.
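Authority weight update:
Each page p carries an authority weight x_p and a hub weight y_p. If p is pointed to by many good hubs, we would like to increase its authority weight, so x_p is updated to be the sum of y_q over all pages q that link to p:
$$x_p \leftarrow \sum_{q \to p} y_q$$
Hub weight update: In a strictly dual fashion, if a page p points to many good authorities, we increase its hub weight, so y_p is updated to be the sum of x_q over all pages q that p links to:
$$y_p \leftarrow \sum_{p \to q} x_q$$
The notation q→p indicates that q links to p.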
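The two updates above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `hits` and `normalize`, the dictionary-based graph format, the fixed iteration count, and the choice to rescale the weights to unit Euclidean length after each round are all assumptions made for the example.

```python
import math


def normalize(weights):
    """Rescale weights so their squares sum to 1 (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in weights.values()))
    if norm == 0.0:
        return dict(weights)
    return {page: v / norm for page, v in weights.items()}


def hits(links, iterations=50):
    """Iterate the authority and hub updates on a small link graph.

    links maps each page to the collection of pages it links to.
    Returns (authority_weights, hub_weights) as dictionaries.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: x_p becomes the sum of y_q over all q -> p.
        new_auth = {p: 0.0 for p in pages}
        for q, targets in links.items():
            for p in targets:
                new_auth[p] += hub[q]

        # Hub update: y_p becomes the sum of x_q over all p -> q,
        # computed from the freshly updated authority weights.
        new_hub = {
            p: sum(new_auth[t] for t in links.get(p, ())) for p in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return auth, hub


if __name__ == "__main__":
    # Hypothetical toy graph: two hub-like pages pointing at two targets.
    graph = {"h1": {"a1", "a2"}, "h2": {"a1"}}
    authority, hub_weight = hits(graph)
    print(sorted(authority.items(), key=lambda kv: -kv[1]))
    print(sorted(hub_weight.items(), key=lambda kv: -kv[1]))
```

In the toy graph, the page linked from both hubs ends up with the larger authority weight, and the page that links to both targets ends up with the larger hub weight, which is the mutual reinforcement the updates are meant to capture.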
Implementation
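Collect the t highest-ranked pages for the query from a search engine; these pages form the root set. Expand the root set into the base set by adding pages that link to, or are linked from, pages in the root set.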
Both the root set and the base set
are relatively small, and
are rich in relevant pages.
Additionally, the base set contains most of the strongest authorities.
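A minimal sketch of how a base set might be grown from a root set, following the usual formulation in which pages linked to or from root pages are added. The function name `build_base_set`, the `out_links` and `in_links` dictionaries, and the `max_in_links` cap are hypothetical; a real system would obtain link data from a crawler or a search engine's link index.

```python
def build_base_set(root_set, out_links, in_links, max_in_links=50):
    """Expand a root set into a base set (illustrative only).

    out_links maps a page to the pages it links to.
    in_links maps a page to the pages that link to it.
    max_in_links caps how many in-linking pages are added per root page,
    so that one very popular page cannot flood the base set (assumed cap).
    """
    base = set(root_set)
    for page in root_set:
        # Add every page that this root page points to.
        base.update(out_links.get(page, ()))
        # Add a bounded number of pages pointing to this root page;
        # sorting only makes the selection deterministic for the example.
        base.update(sorted(in_links.get(page, ()))[:max_in_links])
    return base


if __name__ == "__main__":
    # Hypothetical link data for a one-page root set.
    root = ["r1"]
    outgoing = {"r1": ["a", "b"]}
    incoming = {"r1": ["x", "y", "z"]}
    print(sorted(build_base_set(root, outgoing, incoming, max_in_links=2)))
```

The cap keeps the base set relatively small while still pulling in pages one link away from the root set, which is where the strongest authorities are expected to be found.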