How to get a normalised slope of a trend


Question


I am analysing the distances of users to userx over 6 weeks in a social network.


Note: 'No path' means the two users are not connected yet (not even through friends of friends).

          week1    week2    week3    week4    week5  week6
user1     No path  No path  No path  No path  3      1
user2     No path  No path  No path  5        3      1
user3     5        4        4        4        4      3
userN     ...


I want to see how well the users connect with userx.


For that, I initially thought of using the regression slope for the interpretation (i.e. the lower the regression slope, the better).


For example, consider user1 and user2. Their regression slopes are calculated as follows.

user1:

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
X = [[5], [6]] #distance available only for week5 and week6
y = [3, 1]
regressor.fit(X, y)
print(regressor.coef_)

The output is -2.

user2:

from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
X = [[4], [5], [6]] #distance available only for week4, week5 and week6
y = [5, 3, 1]
regressor.fit(X, y)
print(regressor.coef_)

The output is -2.


As you can see, both users get the same slope value. However, user2 connected with userx a week earlier than user1. Hence, user2 should be rewarded in some way.


Therefore, I am wondering if there is a better way of computing this measure for my problem.


I am happy to provide more details if needed.

Recommended answer


Well, if you want to reward the duration of the connection, you probably need to take time into account in the calculation. The easiest and most straightforward way is to multiply the coefficient by the number of weeks for which a distance is available:

outcome_measure = regressor.coef_[0] * len(y)  # slope times number of observed weeks


And if you divide it by 2, it is conceptually the same as the area under the curve (AUC):

outcome_measure = (regressor.coef_[0] * len(y)) / 2


So you would get -4 (user1) and -6 (user2) with the first method, or -2 and -3 with the second.
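To make the arithmetic easy to check, here is a minimal sketch that applies the same idea to the three users from the question's table. It is only an illustration: the helper names `ols_slope` and `duration_weighted_slope` are made up for this example, 'No path' is assumed to be encoded as `None`, and the slope is computed with a plain least-squares formula rather than scikit-learn so the snippet has no dependencies.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def duration_weighted_slope(distances, halve=False):
    """Slope multiplied by the number of weeks with a known distance.

    `distances` holds one entry per week; None stands for 'No path'
    (an assumed encoding). Returns None when fewer than two weeks
    have a distance, because a slope cannot be fitted then.
    """
    points = [(week, d) for week, d in enumerate(distances, start=1)
              if d is not None]
    if len(points) < 2:
        return None
    xs, ys = zip(*points)
    score = ols_slope(xs, ys) * len(points)
    return score / 2 if halve else score


users = {
    "user1": [None, None, None, None, 3, 1],
    "user2": [None, None, None, 5, 3, 1],
    "user3": [5, 4, 4, 4, 4, 3],
}

scores = {name: duration_weighted_slope(d) for name, d in users.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
# user1: -4.00
# user2: -6.00
# user3: -1.71
```

With this measure user2 scores lower (better) than user1, even though both have the same raw slope of -2. Passing `halve=True` gives the second variant (-2 and -3).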


Slightly off-topic, but if you use linear regression for statistical analysis (not just to get the coefficient), I would probably add some kind of check to confirm that its assumptions hold.
