How to calculate the mean of ratings of each user?
Assume I have a dataset like this:
userID productID rating
a i 5
b i 4
c i 4
a j 3
b j 5
The question is, how can I calculate the mean rating of each user? I saw this answer, but I didn't quite understand it. I would really appreciate some guidance.
I work in an IPython Notebook.
Let's assume you have this file, user_ratings.csv:
userID productID rating
a i 5
b i 4
c i 4
a j 3
b j 5
The example in the link uses pandas. So import pandas:
In [1]: import pandas as pd
Read your file into a dataframe:
In [2]: df = pd.read_csv('user_ratings.csv', sep=r'\s+')
df
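If the file is not on disk, the same dataframe can be rebuilt from an in-memory string. This is only a minimal sketch: the variable name raw is made up for illustration, and the data is copied from the listing above.

```python
import io

import pandas as pd

# Same whitespace-separated data as user_ratings.csv, kept in memory.
# "raw" is a hypothetical name used only for this sketch.
raw = """userID productID rating
a i 5
b i 4
c i 4
a j 3
b j 5
"""

# sep=r"\s+" splits the columns on any run of whitespace.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(df)
```

Wrapping the string in io.StringIO lets read_csv treat it like an open file, so the rest of the steps work unchanged.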
Group by the user and calculate the mean for each:
In [2]: df.groupby('userID').mean(numeric_only=True)
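As a self-contained sketch of this step, the dataframe below is built inline from the sample data, and the rating column is selected before averaging so that the string column productID is left out. The name user_means is only illustrative.

```python
import pandas as pd

# Sample data from the question, built inline so the sketch runs on its own.
df = pd.DataFrame({
    "userID": ["a", "b", "c", "a", "b"],
    "productID": ["i", "i", "i", "j", "j"],
    "rating": [5, 4, 4, 3, 5],
})

# Group the rows by user, then average only the numeric rating column.
user_means = df.groupby("userID")["rating"].mean()
print(user_means)
# userID
# a    4.0
# b    4.5
# c    4.0
# Name: rating, dtype: float64
```

The result has one entry per user: a rated 5 and 3, so the mean is 4.0; b rated 4 and 5, so the mean is 4.5; c has a single rating of 4.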
You can also create a new column in df named user_avg_rating and assign each user's mean rating to it:
In [3]: df['user_avg_rating'] = df.groupby('userID')['rating'].transform('mean')
df
The method transform takes your grouped object and creates a series with one value per original row:
In [4]: df.groupby('userID')['rating'].transform('mean')
0 4.0
1 4.5
2 4.0
3 4.0
4 4.5
dtype: float64
This series is assigned to the column user_avg_rating.
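To make the difference between the two approaches concrete, here is a small sketch that puts them side by side. The names per_user and per_row are made up for illustration; the data is the sample from the question.

```python
import pandas as pd

# Sample data from the question, built inline so the sketch runs on its own.
df = pd.DataFrame({
    "userID": ["a", "b", "c", "a", "b"],
    "productID": ["i", "i", "i", "j", "j"],
    "rating": [5, 4, 4, 3, 5],
})

grouped = df.groupby("userID")["rating"]

# mean() aggregates: one value per user, indexed by userID.
per_user = grouped.mean()

# transform("mean") broadcasts: one value per original row, same index as df.
per_row = grouped.transform("mean")

# Because per_row lines up with df row by row, it can be assigned directly.
df["user_avg_rating"] = per_row

print(len(per_user), len(per_row))
print(df)
```

Use the aggregated form when a summary table per user is enough, and transform when each rating row needs its user's average next to it.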