ValueError in rank method in pandas without more explanation
Question
I have a pandas DataFrame like this:
   year  week            city  avg_rank
0  2016    52           Paris         1
1  2016    52  Gif-sur-Yvette         2
2  2016    52           Paris         1
3  2017     1           Paris         4
4  2016    52           Paris         3
5  2016    52           Paris         5
6  2016    52           Paris         2
But this line of code:
df['real_index']=df.groupby(by=['year', 'week', 'city']).avg_rank.rank(method='first')
produces this stack trace:
/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in rank(self, axis, method, numeric_only, na_option, ascending, pct)
/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in wrapper(*args, **kwargs)
590 *args, **kwargs)
591 except(AttributeError):
592 raise ValueError
593
594 return wrapper
ValueError:
I have no NaN values in those columns of my DataFrame.
I am using Python 2.7 along with pandas 0.18.1 and numpy 1.11.0.
My DataFrame has about 9,000,000 rows and 15 columns.
What is more intriguing is that when I run this line of code on subsets of my DataFrame (each subset of 1,000,000 rows), no ValueError is raised.
Is it a known behavior that pandas does not handle very large DataFrames well, or did I miss something?
Any help is welcome!
Recommended answer
Since my DataFrame came from several files, I noticed that some index values were duplicated.
Using
df.index = np.arange(df.shape[0])
just after loading the data, it now works.
Indeed, my hypothesis is that some groups in the groupby contained rows with the same index label.
When I tried with subsets of my DataFrame, this case fortunately/unfortunately never happened.
However, the error message is not very informative.
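The check and the fix described above can be sketched roughly as follows. This is only an illustration: the answer does not say how the files were loaded, so the two small frames (part_a, part_b) and the use of pd.concat are assumptions standing in for the real data. The sketch does not try to reproduce the ValueError itself, which was reported on pandas 0.18.1 and may not occur on other versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for two input files; each frame gets its own 0..n-1 index.
part_a = pd.DataFrame({
    "year": [2016, 2016],
    "week": [52, 52],
    "city": ["Paris", "Paris"],
    "avg_rank": [1, 3],
})
part_b = pd.DataFrame({
    "year": [2016, 2016],
    "week": [52, 52],
    "city": ["Paris", "Gif-sur-Yvette"],
    "avg_rank": [2, 2],
})

# Concatenating without resetting the index keeps the labels 0, 1, 0, 1.
df = pd.concat([part_a, part_b])

# Check whether any index label appears more than once.
has_duplicates = df.index.duplicated().any()

# The fix from the answer: replace the index with a unique 0..n-1 range.
df.index = np.arange(df.shape[0])

# With a unique index, rank within each (year, week, city) group.
df["real_index"] = (
    df.groupby(by=["year", "week", "city"]).avg_rank.rank(method="first")
)
```

If the frames are combined with pd.concat, passing ignore_index=True at that point, or calling df.reset_index(drop=True) afterwards, should give the same unique index without needing numpy.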