Using Python's NetworkX to compute personalized PageRank

Question

I am trying to build a directed graph and compute personalized PageRank over it. Suppose I have a graph with vertices {1, 2, 3, 4} and edges going from 2, 3, and 4 to vertex 1. I would like to:

(1) compute the personalized PageRank of every vertex with respect to vertex 1;

(2) compute the personalized PageRank of every vertex with respect to vertex 2.

The question is how to pass this option to the personalized PageRank function. The following code does not seem to do what I want:

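import networkx as nx

# Directed graph with edges 2 -> 1, 3 -> 1 and 4 -> 1
G = nx.DiGraph()
G.add_nodes_from([1, 2, 3, 4])
G.add_edges_from([(2, 1), (3, 1), (4, 1)])

# Attempted personalization vectors (every node gets the same weight)
ppr1 = nx.pagerank(G, personalization={1: 1, 2: 1, 3: 1, 4: 1})
ppr2 = nx.pagerank(G, personalization={1: 2, 2: 2, 3: 2, 4: 2})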

Right now ppr1 == ppr2, even though the two results should differ.

================================================================== UPDATE.

In response to the comment below, my understanding of personalized PageRank comes from the following:

An equivalent definition is in terms of the terminal node of a random walk starting from $s$. Let $(X_0, X_1, \ldots, X_L)$ be a random walk starting from $X_0 = s$ of length $L \sim \mathrm{Geometric}(\alpha)$. Here by $L \sim \mathrm{Geometric}(\alpha)$ we mean $\Pr[L = \ell] = (1-\alpha)^{\ell}\,\alpha$. This walk starts at $s$ and does the following at each step: with probability $\alpha$, terminate; and with the remaining probability $1-\alpha$, continue to a random out-neighbor of the current node. Here if the current node is $u$, the random neighbor $v \in N^{out}(u)$ is chosen with probability $w_{u,v}$ if the graph is weighted or with uniform probability $1/d^{out}(u)$ if the graph is unweighted. Then the PPR of any node $t$ is the probability that this walk stops at $t$: $\Pr[X_L = t]$.

Found on page 6 of this thesis: https://cs.stanford.edu/people/plofgren/bidirectional_ppr_thesis.pdf

So what I am looking for when computing "the personalized PageRank of t with respect to s" is this: if we start a random walk from s according to the process described above, what is the probability that the walk terminates at t?
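The random-walk definition quoted above can be checked numerically on the four-node example. The following is only a rough Monte Carlo sketch using the standard library, not code from the thesis or from NetworkX. The function name `ppr_by_random_walks`, the adjacency-dict representation, and the parameter defaults are all made up for illustration. One assumption matters for this graph: the quoted definition does not say what happens at a node with no out-edges (vertex 1 here), so the sketch sends the walk back to the source in that case.

```python
import random
from collections import Counter


def ppr_by_random_walks(out_neighbors, source, alpha=0.15, walks=50_000, seed=0):
    """Estimate PPR with respect to `source` by simulating terminating walks.

    `out_neighbors` maps each node to a list of its out-neighbors
    (unweighted graph). `alpha` is the per-step stopping probability.
    """
    rng = random.Random(seed)
    stops = Counter()
    for _ in range(walks):
        node = source
        # At each step: stop with probability alpha, otherwise keep walking.
        while rng.random() >= alpha:
            neighbors = out_neighbors[node]
            # Assumption (not from the quoted text): a node without
            # out-edges sends the walk back to the source.
            node = rng.choice(neighbors) if neighbors else source
        stops[node] += 1
    # Fraction of walks that terminated at each node.
    return {node: stops[node] / walks for node in out_neighbors}


# The graph from the question: edges 2 -> 1, 3 -> 1, 4 -> 1.
graph = {1: [], 2: [1], 3: [1], 4: [1]}

ppr_from_1 = ppr_by_random_walks(graph, source=1)
ppr_from_2 = ppr_by_random_walks(graph, source=2)
```

Under that dangling-node assumption, a walk started at vertex 1 can never leave it, while a walk started at vertex 2 alternates between vertices 2 and 1, so only those two vertices receive any probability.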

Recommended answer

In the conceptualization of PageRank, a random surfer moves around by following links. At each step there is a nonzero probability that the surfer jumps to a random page instead of following a link. If the choice of that random page is weighted, the result is referred to as personalized PageRank.

In your case you want that jump to always land on a single specific page. So you need to tell it that all the other pages have zero probability of being selected in those steps when the surfer jumps rather than following an edge.

ppr1 = nx.pagerank(G, personalization={1: 1, 2: 0, 3: 0, 4: 0})
ppr2 = nx.pagerank(G, personalization={1: 0, 2: 1, 3: 0, 4: 0})
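For completeness, here is a self-contained sketch that runs both the original attempt and the calls above side by side. It assumes a NetworkX installation in which `nx.pagerank` is available with its default settings. The variable names `uniform_a` and `uniform_b` are introduced here for illustration only. As far as I understand the implementation, the personalization weights are rescaled to sum to 1, which would explain why all-ones and all-twos gave identical results in the question.

```python
import networkx as nx

# The graph from the question: edges 2 -> 1, 3 -> 1, 4 -> 1.
G = nx.DiGraph()
G.add_nodes_from([1, 2, 3, 4])
G.add_edges_from([(2, 1), (3, 1), (4, 1)])

# Original attempt: both dicts describe the same uniform distribution
# once the weights are rescaled to sum to 1.
uniform_a = nx.pagerank(G, personalization={1: 1, 2: 1, 3: 1, 4: 1})
uniform_b = nx.pagerank(G, personalization={1: 2, 2: 2, 3: 2, 4: 2})

# Personalized PageRank with respect to a single vertex:
# all of the jump probability goes to that vertex.
ppr1 = nx.pagerank(G, personalization={1: 1, 2: 0, 3: 0, 4: 0})
ppr2 = nx.pagerank(G, personalization={1: 0, 2: 1, 3: 0, 4: 0})

print(uniform_a == uniform_b)
print(ppr1)
print(ppr2)
```

Note that the `alpha` parameter of `nx.pagerank` (default 0.85) is the probability of following a link, so it corresponds to 1 − α in the random-walk definition quoted in the question. With the default treatment of dangling nodes, `ppr1` should put essentially all of its mass on vertex 1, and `ppr2` should split its mass between vertices 2 and 1 only.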
