pandas: sort a column by values in another column
Problem description
I have a dataset that I want to sort and then assign ranks to.
Suppose it has two columns: one is the year, and the other is the column I want to sort.
import pandas as pd
data = {'year': pd.Series([2006, 2006, 2007, 2007]),
'value': pd.Series([5, 10, 4, 1])}
df = pd.DataFrame(data)
I want to sort the column 'value' within each year, largest first, and then rank it. What I would like to have is:
data2= {'year': pd.Series([2006, 2006, 2007, 2007]),
'value': pd.Series([10, 5, 4, 1]),
'rank': pd.Series([1, 2, 1, 2])}
df2=pd.DataFrame(data2)
>>> df2
   rank  value  year
0     1     10  2006
1     2      5  2006
2     1      4  2007
3     2      1  2007
Recommended answer
You can use groupby and then rank (with ascending=False so that the largest value in each group gets rank 1). You don't need to sort in the groupby, because the result is aligned to the index of the original DataFrame; passing sort=False is slightly faster.
df['yearly_rank'] = df.groupby('year', sort=False)['value'].rank(ascending=False)
>>> df.sort_values(['year', 'yearly_rank'])
   value  year  yearly_rank
1     10  2006            1
0      5  2006            2
2      4  2007            1
3      1  2007            2
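To get the exact layout asked for in the question (rows ordered within each year, integer ranks, and a fresh 0-based index), the answer can be extended roughly as follows. This is a minimal sketch, not part of the original answer. rank() returns floats, so the result is cast to int here. The choice of method='first' is an assumption: it breaks ties by order of appearance so that the cast is always safe. The name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'year': [2006, 2006, 2007, 2007],
                   'value': [5, 10, 4, 1]})

# Rank 'value' within each year, largest first.
# rank() returns floats; method='first' (an assumption, see above)
# yields whole-number ranks, so casting to int is safe.
df['rank'] = (df.groupby('year', sort=False)['value']
                .rank(method='first', ascending=False)
                .astype(int))

# Order rows by year and rank, then renumber the index from 0.
result = df.sort_values(['year', 'rank']).reset_index(drop=True)
print(result)
```

If tied values should share a rank instead, method='min' or method='dense' could be used in place of method='first'.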