Replacing a value with np.nan using regex

Views: 57 Published: 2020/5/23 21:43:33 Tags: python pandas


Question

I have a DataFrame as below:

import numpy as np
import pandas as pd

data1 = {"first": ["alice", "bob", "carol"],
         "last": ["foo", "bar", "baz"]}
df = pd.DataFrame(data1)

For example, I want to replace every 'o' character with 'a', so I run:

df.replace({"o":"a"},regex=True)
Out[668]: 
   first last
0  alice  faa
1    bab  bar
2  caral  baz

This gives back what I need.

However, when I replace 'o' with np.nan, the entire string is changed to np.nan. Is there any explanation for this in the pandas documentation? I can find some information by reading the source code.

More information (the whole string is changed to np.nan):

df.replace({"o":np.nan},regex=True)
Out[669]: 
   first last
0  alice  NaN
1    NaN  bar
2    NaN  baz

Recommended answer

NaN is consistently used as a placeholder for missing data. Replacing part of a string with "missing" can only mean that the entire entry is compromised. I've heard this called NaN pollution (or similar; I'll see if I can find some references): if NaN touches a value, that value is compromised.
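To make that concrete, here is a minimal sketch, assuming a reasonably recent pandas. The names `contains_o` and `masked` are mine, not from the question. It suggests that replacing a pattern with NaN ends up at the same place as finding every cell that contains the pattern and blanking the whole cell, since a cell holds either a complete string or a missing value:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"first": ["alice", "bob", "carol"],
                   "last": ["foo", "bar", "baz"]})

# Replacing the pattern with NaN blanks every cell that contains a match.
replaced = df.replace({"o": np.nan}, regex=True)

# The same outcome, spelled out: flag the cells containing "o", then mask them.
contains_o = df.apply(lambda col: col.str.contains("o"))
masked = df.mask(contains_o)

print(replaced)
print(masked)
```

Both frames should keep "alice", "bar" and "baz" and show NaN everywhere else.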

That said, that's not always the case:

In [11]: s = pd.Series([1, 2, np.nan, 4])

In [12]: s.sum()
Out[12]: 7.0

In [13]: s.sum(skipna=False)
Out[13]: nan

In some languages skipna=False is the default behaviour, and some people argue vehemently that NaN should always pollute all data. pandas takes a somewhat more pragmatic approach...

The real question is: what do you expect it to do in the case of NaN?
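Depending on the answer, one of the two options below may be closer to what you want. This is only a sketch of the two intents I can think of, using a hypothetical Series `s` with the same names as in the question:

```python
import numpy as np
import pandas as pd

s = pd.Series(["alice", "bob", "carol"])

# If the goal is to drop the character, replace it with an empty string.
dropped = s.str.replace("o", "", regex=False)

# If the goal is to mark the whole entry as missing, NaN is the right value.
missing = s.replace(to_replace="o", value=np.nan, regex=True)

print(dropped.tolist())
print(missing.tolist())
```

The first keeps a (shorter) string in every row; the second keeps "alice" and turns the other two entries into NaN.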
