Why did 'reset_index(drop=True)' unexpectedly remove a column?


Problem description

I have a pandas DataFrame named data_match. It contains the columns '_worker_id', '_unit_id', and 'caption'. (Please see the attached screenshot for some of the rows in this DataFrame.)

Let's say the index is not in ascending order, and I want it to be 0, 1, 2, 3, 4...n. So I ran the following statement to reset the index:
data_match = data_match.reset_index(drop=True)

This returned the correct output on my computer using Python 3.6. However, when my coworker ran the same statement on his computer, also using Python 3.6, the '_worker_id' column was removed.

Is this due to the 'drop=True' argument passed to 'reset_index'? I don't understand why it worked on my computer but not on my coworker's. Can anybody advise?

Recommended answer

As the saying goes, "What happens in your interpreter stays in your interpreter". It's impossible to explain the discrepancy without seeing the full history of commands entered into both Python interactive sessions.

However, to venture a guess:

df.reset_index(drop=True) drops the current index of the DataFrame and replaces it with an index of increasing integers. It never drops columns.

So, in your interactive session, _worker_id was a column. In your co-worker's interactive session, _worker_id must have been an index level.

The visual difference can be somewhat subtle. For example, below, df has a _worker_id column while df2 has a _worker_id index level:

In [190]: df = pd.DataFrame({'foo':[1,2,3], '_worker_id':list('ABC')}); df
Out[190]: 
  _worker_id  foo
0          A    1
1          B    2
2          C    3

In [191]: df2 = df.set_index('_worker_id', append=True); df2
Out[191]: 
              foo
  _worker_id     
0 A             1
1 B             2
2 C             3

Notice that the name _worker_id appears one line below foo when it is an index level, and on the same line as foo when it is a column. That is the only visual clue you get when looking at the str or repr of a DataFrame.
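Since the printed output gives only that one clue, it can be easier to ask the DataFrame directly. The following is a minimal sketch that rebuilds df and df2 from the example above; the helper name where_is is hypothetical and not part of pandas.

```python
import pandas as pd

# Rebuild the two frames from the example above.
df = pd.DataFrame({'foo': [1, 2, 3], '_worker_id': list('ABC')})
df2 = df.set_index('_worker_id', append=True)


def where_is(frame, name):
    """Hypothetical helper: report whether `name` is a column or an index level."""
    if name in frame.columns:
        return 'column'
    if name in frame.index.names:
        return 'index level'
    return 'missing'


print(where_is(df, '_worker_id'))   # column
print(where_is(df2, '_worker_id'))  # index level
print(list(df2.index.names))        # [None, '_worker_id']
```

Running a check like this in both sessions before calling reset_index should show whether the two DataFrames really have the same structure.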

So to repeat: When _worker_id is a column, the column is unaffected by df.reset_index(drop=True):

In [194]: df.reset_index(drop=True)
Out[194]: 
  _worker_id  foo
0          A    1
1          B    2
2          C    3

But _worker_id is dropped when it is part of the index:

In [195]: df2.reset_index(drop=True)
Out[195]: 
   foo
0    1
1    2
2    3
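If _worker_id does turn out to be an index level and the goal is to keep it while still renumbering the rows, one possible approach is to move that level back into the columns first and only then drop the remaining index. This is a sketch under the assumption that the coworker's DataFrame resembles df2 above; the non-sequential index values and the name restored are made up for illustration.

```python
import pandas as pd

# A frame with an out-of-order index, where _worker_id is an index level (like df2).
df = pd.DataFrame({'foo': [1, 2, 3], '_worker_id': list('ABC')}, index=[2, 0, 1])
df2 = df.set_index('_worker_id', append=True)

# Step 1: move only the _worker_id level back into the columns.
restored = df2.reset_index(level='_worker_id')

# Step 2: discard the remaining (out-of-order) index and renumber from 0.
restored = restored.reset_index(drop=True)

print(restored)
```

After these two steps, _worker_id is an ordinary column again and the index runs 0, 1, 2.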
