Set value to an entire column of a pandas dataframe

Problem description

I'm trying to set the entire column of a dataframe to a specific value.

In  [1]: df
Out [1]: 
     issueid   industry
0        001        xxx
1        002        xxx
2        003        xxx
3        004        xxx
4        005        xxx

From what I've seen, loc is the best practice when replacing values in a dataframe (or isn't it?):

In  [2]: df.loc[:,'industry'] = 'yyy'

However, I still received this much talked-about warning message:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead

If I do

In  [3]: df['industry'] = 'yyy'

I get the same warning message.

Any ideas? Working with Python 3.5.2 and pandas 0.18.1.

Recommended answer

pandas can do unexpected things when a new dataframe is derived from an existing one. You stated in a comment above that your dataframe is defined along the lines of df = df_all.loc[df_all['issueid']==specific_id,:]. In this case, pandas treats df as a selection taken from df_all rather than as a clearly independent object. When you then assign to df, pandas cannot tell whether you also meant to change df_all, so it emits the SettingWithCopyWarning, even when the assignment itself uses .loc.

To avoid these issues altogether, I often have to remind myself to use the copy module, which explicitly forces objects to be copied in memory so that methods called on the new objects are not applied to the source object. I had the same problem as you, and avoided it using the deepcopy function.

In your case, this should get rid of the warning message:

from copy import deepcopy
df = deepcopy(df_all.loc[df_all['issueid']==specific_id,:])
df['industry'] = 'yyy'
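
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. The contents of df_all and the value of specific_id are made up for illustration; only the column names come from the question.

```python
from copy import deepcopy

import pandas as pd

# Made-up stand-in for the asker's df_all (values are illustrative only).
df_all = pd.DataFrame({
    "issueid": ["001", "002", "003", "004", "005"],
    "industry": ["xxx"] * 5,
})
specific_id = "002"  # hypothetical id, not from the question

# deepcopy gives df its own data, detached from df_all.
df = deepcopy(df_all.loc[df_all["issueid"] == specific_id, :])
df["industry"] = "yyy"

print(df["industry"].tolist())      # values in the selected row(s) after the assignment
print(df_all["industry"].tolist())  # values in the source frame, for comparison
```

Printing both columns side by side makes it easy to confirm that the assignment only touched df and left df_all alone.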


EDIT: Also see David M.'s excellent comment below!

df = df_all.loc[df_all['issueid']==specific_id,:].copy()
df['industry'] = 'yyy'
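
If you want to confirm that the .copy() version really is quiet on your own pandas version, one option is to record warnings while running it. This is only a sketch: the data and specific_id are invented, and the warning is matched by class name because where that class lives (or whether it exists at all) differs between pandas releases.

```python
import warnings

import pandas as pd

# Made-up stand-in for the asker's df_all (values are illustrative only).
df_all = pd.DataFrame({
    "issueid": ["001", "002", "003", "004", "005"],
    "industry": ["xxx"] * 5,
})
specific_id = "003"  # hypothetical id, not from the question

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning instead of hiding repeats
    # Explicit copy: df is unambiguously independent of df_all.
    df = df_all.loc[df_all["issueid"] == specific_id, :].copy()
    df["industry"] = "yyy"

# Keep only the warning this question is about, matched by class name.
copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]
print(len(copy_warnings))
print(df_all["industry"].tolist())
```

An empty copy_warnings list, together with an unchanged df_all, is what you would expect if the explicit copy did its job.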
