Slide 7.8: Dangling node fix

Several options exist for modeling the behavior of a random web surfer after landing on a dangling node. One option replaces each dangling-node row of H, which is a row of zeros, with the same probability distribution vector w, a row vector with nonnegative elements that sum to 1. The resulting matrix is S=H+dw, where

d is a column vector that identifies dangling nodes, meaning d_i=1 if l_i=0 (node i has no outlinks) and d_i=0 otherwise; and

w=(w₁ w₂ ... w_n) is a row vector with w_j≥0 for all 1≤j≤n and w₁+w₂+...+w_n=1.

The most popular choice for w is the uniform row vector, w=(1/n 1/n ... 1/n).
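
The construction S=H+dw can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Google's implementation: the function name dangling_fix and the 4-node graph are hypothetical (the graph is not the one in the slide's figure), and exact fractions are used only so that row sums can be checked without rounding error.

```python
from fractions import Fraction


def dangling_fix(H, w=None):
    """Return S = H + d w for a hyperlink matrix H given as a list of rows.

    A row of all zeros in H marks a dangling node.
    w defaults to the uniform distribution (1/n, ..., 1/n).
    """
    n = len(H)
    if w is None:
        w = [Fraction(1, n)] * n
    if len(w) != n or any(x < 0 for x in w) or abs(sum(w) - 1) > 1e-12:
        raise ValueError("w must be a probability distribution of length n")
    # d_i = 1 exactly when row i of H has no outlinks (all zeros).
    d = [1 if all(x == 0 for x in row) else 0 for row in H]
    # Rank-one update: S_ij = H_ij + d_i * w_j.
    return [[H[i][j] + d[i] * w[j] for j in range(n)] for i in range(n)]


# Hypothetical 4-node graph (an assumption for illustration):
# node 1 -> 2, 3; node 2 is dangling; node 3 -> 1; node 4 -> 1, 2, 3.
half, third = Fraction(1, 2), Fraction(1, 3)
H = [
    [0, half, half, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [third, third, third, 0],
]
S = dangling_fix(H)
```

Only the dangling row changes: row 2 of S becomes w, the other rows equal the corresponding rows of H, and every row of S now sums to 1.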

This amounts to adding artificial links from each dangling node to every webpage. With n=4 and w=(1/4 1/4 1/4 1/4), the directed graph in the previous figure changes to the figure on the right.

The new matrix S=H+dw is given below. Regardless of the option chosen to deal with dangling nodes, Google creates a new matrix S that models the tendency of random web surfers to leave a dangling node; however, the model is not yet complete. Even when webpages have links to other webpages, a random web surfer might grow tired of continually selecting links and decide to move to a different webpage some other way.

For the above graph, there is no directed edge from node 2 to node 1. On the Web, though, a surfer can move directly from node 2 to node 1 by entering the URL for node 1 in the address bar of a web browser. The matrix S does not account for this possibility.
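
The limitation can be checked numerically: because dw is nonzero only in dangling rows, every zero in a non-dangling row of H is still a zero in S. The sketch below uses a hypothetical 4-node graph (an assumption, not the slide's figure) in which node 2 has outlinks but none to node 1.

```python
from fractions import Fraction

n = 4
w = [Fraction(1, n)] * n  # uniform choice of w

# Hypothetical graph: node 1 -> 2; node 2 -> 3; node 3 -> 1, 2; node 4 dangling.
H = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [Fraction(1, 2), Fraction(1, 2), 0, 0],
    [0, 0, 0, 0],
]

# d_i = 1 when row i of H is all zeros (dangling node), else 0.
d = [1 if not any(row) else 0 for row in H]
S = [[H[i][j] + d[i] * w[j] for j in range(n)] for i in range(n)]

# Transitions out of non-dangling nodes that S still rules out (1-based pairs).
blocked = [
    (i + 1, j + 1)
    for i in range(n)
    for j in range(n)
    if d[i] == 0 and S[i][j] == 0
]
```

In this sketch the pair (2, 1) appears in blocked, so a surfer following S can never jump from node 2 directly to node 1.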