Python pandas dataframe sort_values does not work
Question
I have the following pandas data frame, which I want to sort by 'test_type':
test_type tps mtt mem cpu 90th
0 sso_1000 205.263559 4139.031090 24.175933 34.817701 4897.4766
1 sso_1500 201.127133 5740.741266 24.599400 34.634209 6864.9820
2 sso_2000 203.204082 6610.437558 24.466267 34.831947 8005.9054
3 sso_500 189.566836 2431.867002 23.559557 35.787484 2869.7670
My code to load the dataframe and sort it is below. The first print statement prints the data frame shown above.
import pandas as pd

df = pd.read_csv(file)  # reads from a CSV file
print(df)
df = df.sort_values(by=['test_type'], ascending=True)
print('\nAfter sort...')
print(df)
After doing the sort and printing the dataframe content, the data frame still looks like below.
Program output:
After sort...
test_type tps mtt mem cpu 90th
0 sso_1000 205.263559 4139.031090 24.175933 34.817701 4897.4766
1 sso_1500 201.127133 5740.741266 24.599400 34.634209 6864.9820
2 sso_2000 203.204082 6610.437558 24.466267 34.831947 8005.9054
3 sso_500 189.566836 2431.867002 23.559557 35.787484 2869.7670
I expect row 3 (test type: sso_500 row) to be on top after sorting. Can someone help me figure out why it's not working as it should?
Recommended answer
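Presumably, what you're trying to do is sort by the numerical value after sso_. sort_values did run, but test_type holds strings, so the values are compared character by character, and 'sso_1000' sorts before 'sso_500' because '1' comes before '5'. To sort numerically, you can do the following: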
import numpy as np
# .ix was removed in pandas 1.0; argsort returns positions, so .iloc is used here
df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]
This:
- splits the strings at _
- converts what's after this character to the numerical value
- finds the indices sorted according to the numerical values
- reorders the DataFrame according to these indices
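The steps above can be sketched one at a time, which makes each intermediate result easy to inspect. This is only an illustration: the variable names (suffix, numbers, order, sorted_df) are made up for the example, only two of the question's columns are reproduced, and .iloc is used on the assumption of a pandas version where .ix is no longer available.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only two columns kept for brevity).
df = pd.DataFrame({
    'test_type': ['sso_1000', 'sso_1500', 'sso_2000', 'sso_500'],
    'tps': [205.263559, 201.127133, 203.204082, 189.566836],
})

# Step 1: split each string at '_' and keep the last piece, e.g. '500'.
suffix = df['test_type'].str.split('_').str[-1]

# Step 2: convert the pieces to integers so they compare numerically.
numbers = suffix.astype(int)

# Step 3: positions that would put the numbers in ascending order.
order = np.argsort(numbers.values)

# Step 4: reorder the rows by position.
sorted_df = df.iloc[order]
print(sorted_df)
```

Because np.argsort returns positions rather than index labels, .iloc is the matching indexer; the original index labels (3, 0, 1, 2) are kept on the reordered rows.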
Example
In [15]: df = pd.DataFrame({'test_type': ['sso_1000', 'sso_500']})
In [16]: df.sort_values(by=['test_type'], ascending=True)
Out[16]:
test_type
0 sso_1000
1 sso_500
In [17]: df.iloc[np.argsort(df.test_type.str.split('_').str[-1].astype(int).values)]
Out[17]:
test_type
1 sso_500
0 sso_1000
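As a possible alternative that is not part of the original answer, newer pandas versions (1.1 and later) accept a key argument in sort_values, so the numeric conversion can be passed to the sort directly. The helper name numeric_suffix below is hypothetical, and the sketch assumes every value ends in an underscore followed by an integer.

```python
import pandas as pd

df = pd.DataFrame({'test_type': ['sso_1000', 'sso_1500', 'sso_2000', 'sso_500']})

def numeric_suffix(col):
    # Hypothetical helper: map 'sso_500' -> 500 for every value in the column.
    return col.str.split('_').str[-1].astype(int)

# Plain string sort: compares character by character, so 'sso_500' ends up last.
plain = df.sort_values(by='test_type')

# Keyed sort: the rows are ordered by the integer suffix instead.
result = df.sort_values(by='test_type', key=numeric_suffix)

print(plain)
print(result)
```

The key function receives the whole column as a Series and must return a Series of the same length, which is why the vectorized .str accessors are used rather than a per-element function.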