Pandas: SettingWithCopyWarning, trying to understand how to write the code better, not just whether to ignore the warning

Question

I am trying to change every value in a spreadsheet's DATE column whose year is earlier than 1900 to today's date, so I am working with a slice of the column.

The first few lines of code:

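import datetime as dt
import pandas as pd

df = pd.read_excel(filename)  # ,usecols=['NAME','DATE','EMAIL']
# regex to remove weird characters (regex=True is explicit: the default became False in pandas 2.0)
df['DATE'] = df['DATE'].str.replace(r'[^a-zA-Z0-9._/-]', '', regex=True)
df['DATE'] = pd.to_datetime(df['DATE'])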

sample row in dataframe: name, date, email
[u'Public, Jane Q.\xa0' u'01/01/2016\xa0' u'jqpublic@email.com\xa0'] 
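To make the cleanup step concrete, here is a minimal sketch that rebuilds the sample row as a one-row frame and runs the same regex over it. The frame is an assumption standing in for the spreadsheet, and the explicit `format` argument is an assumption about how the dates are written (month/day/year).

```python
import pandas as pd

# Hypothetical one-row frame built from the sample row above;
# the real data comes from pd.read_excel(filename).
df = pd.DataFrame(
    {
        "NAME": ["Public, Jane Q.\xa0"],
        "DATE": ["01/01/2016\xa0"],
        "EMAIL": ["jqpublic@email.com\xa0"],
    }
)

# Drop every character that is not a letter, digit, '.', '_', '/' or '-'.
# This removes the trailing non-breaking space (\xa0).
df["DATE"] = df["DATE"].str.replace(r"[^a-zA-Z0-9._/-]", "", regex=True)

# Assumed date layout: month/day/four-digit year.
df["DATE"] = pd.to_datetime(df["DATE"], format="%m/%d/%Y")

print(df["DATE"].iloc[0])
```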

This line of code works:

df["DATE"][df["DATE"].dt.year < 1900] = dt.datetime.today()

Then, all date values are formatted:

df["DATE"] = df["DATE"].map(lambda x: x.strftime("%m/%d/%y"))
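As a side note, the same formatting can be sketched with the vectorized `.dt.strftime` accessor instead of `map` with a lambda. The two-element series below is made-up sample data, not the asker's spreadsheet.

```python
import pandas as pd

# Made-up sample dates for illustration only.
dates = pd.Series(pd.to_datetime(["2016-01-01", "2020-12-31"]))

# Vectorized equivalent of .map(lambda x: x.strftime("%m/%d/%y")).
formatted = dates.dt.strftime("%m/%d/%y")

print(formatted.tolist())  # ['01/01/16', '12/31/20']
```

Either way, the column holds strings afterwards, so any comparison on `.dt.year` has to happen before this step.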

But I get an error:

SettingWithCopyWarning:  A value is trying to be set on a copy of a
slice from a DataFrame

See the caveats in the documentation:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

I have read the documentation and other posts, which suggest using .loc.

The following is the recommended solution:

df.loc[row_indexer,col_indexer] = value

But df["DATE"].loc[df["DATE"].dt.year < 1900] = dt.datetime.today() gives me the same error, except that the reported line number is the one after the last line in the script.

I just don't understand what the documentation is trying to tell me as it relates to my example.

I started experimenting with pulling out the slice and assigning it to a separate dataframe, but then I would have to bring the pieces together again.

Recommended answer

When you select df["DATE"], then apply the selector [df["DATE"].dt.year < 1900] to the result and assign to it, you are using chained indexing: two separate indexing operations, the second of which writes to an intermediate object that may be a view or a copy of df.

df["DATE"][df["DATE"].dt.year < 1900] is the expression that pandas is complaining about.
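The point about two separate operations can be illustrated without pandas at all. The sketch below uses a toy class, `Traced`, which is purely hypothetical and only logs which indexing method Python calls on which object; the string keys `"mask"` and `"today"` are placeholders for the boolean selector and the new value.

```python
class Traced:
    """Toy container (hypothetical, not a pandas class) that logs indexing calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __getitem__(self, key):
        self.log.append(f"{self.name}.__getitem__({key!r})")
        # Return a new intermediate object, much as df["DATE"] returns a Series.
        return Traced(f"{self.name}[{key!r}]", self.log)

    def __setitem__(self, key, value):
        self.log.append(f"{self.name}.__setitem__({key!r})")


log = []
df_like = Traced("df", log)

# Chained form: a read on df_like, then a write on the intermediate object.
df_like["DATE"]["mask"] = "today"

# Single-call form: one write that goes directly to df_like.
df_like["mask", "DATE"] = "today"

for entry in log:
    print(entry)
```

In the chained form the write lands on whatever the first call returned, which is why pandas cannot promise that the original frame is updated. In the single-call form the frame itself receives the assignment.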

Fix it with loc like this:

# pd.datetime was removed in pandas 2.0; pd.Timestamp.today() is used here instead
df.loc[df.DATE.dt.year < 1900, "DATE"] = pd.Timestamp.today()
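For a version that can be run end to end, here is a minimal sketch of the same .loc assignment on a small made-up frame. The sample rows, the `mask` and `today` variable names, and the call to `normalize()` (which drops the time of day) are assumptions added for illustration.

```python
import pandas as pd

# Made-up data: two dates before 1900 and one valid date.
df = pd.DataFrame(
    {
        "NAME": ["a", "b", "c"],
        "DATE": pd.to_datetime(["1899-12-31", "2016-01-01", "1850-06-15"]),
    }
)

# Today's date at midnight; normalize() is optional and only drops the time.
today = pd.Timestamp.today().normalize()

# Boolean row selector: True where the year is earlier than 1900.
mask = df["DATE"].dt.year < 1900

# One indexing call on df itself: rows picked by mask, column picked by label.
df.loc[mask, "DATE"] = today

print(df)
```

This also suggests why df["DATE"].loc[...] in the question still warns: df["DATE"] is evaluated first, so .loc is applied to the intermediate Series and not to df.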
