Replace invalid values with None in Pandas DataFrame


Question

Is there any method to replace values with None in Pandas in Python?

You can use df.replace('pre', 'post') to replace one value with another, but this does not work if you want to replace a value with None; if you try, you get a strange result.

Here is an example:

import pandas as pd
df = pd.DataFrame(['-', 3, 2, 5, 1, -5, -1, '-', 9])
df.replace('-', 0)

This returns the expected result.

But

df.replace('-', None)

returns the following result:

0
0   - // this isn't replaced
1   3
2   2
3   5
4   1
5  -5
6  -1
7  -1 // this is changed to `-1`...
8   9

Why is such a strange result returned?

Since I want to load this data frame into a MySQL database, I can't have NaN values in any element of the data frame and want to use None instead. Of course, you can first change '-' to NaN and then convert NaN to None, but I want to know why the DataFrame behaves in such a surprising way.

Tested on pandas 0.12.0 dev with Python 2.7 on OS X 10.8. For reference, Python is the version pre-installed on OS X, and I installed pandas using the SciPy Superpack script.

Recommended answer

Actually, in later versions of pandas this will give a TypeError:

df.replace('-', None)
TypeError: If "to_replace" and "value" are both None then regex must be a mapping

You can do it by passing either a list or a dictionary:

In [11]: df.replace(['-'], [None])  # or .replace('-', {0: None})
Out[11]:
      0
0  None
1     3
2     2
3     5
4     1
5    -5
6    -1
7  None
8     9
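
As a minimal sketch of the same idea, the mapping can also be written as a single dictionary from old value to new value. The variable names here are illustrative, and how replace treats None has changed between pandas releases, so it is worth checking the result on your installed version:

```python
import pandas as pd

df = pd.DataFrame(['-', 3, 2, 5, 1, -5, -1, '-', 9])

# Map the placeholder '-' to None using a dict of {old: new}.
cleaned = df.replace({'-': None})

# List the row labels whose value is now missing (None or NaN).
missing_rows = cleaned.index[cleaned[0].isna()].tolist()
print(missing_rows)
```

Checking with isna() rather than comparing against None keeps the check valid whether the replaced cells end up as None or as NaN.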

But I recommend using NaNs rather than None:

In [12]: df.replace('-', np.nan)
Out[12]:
     0
0  NaN
1    3
2    2
3    5
4    1
5   -5
6   -1
7  NaN
8    9
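
If the database driver needs None rather than NaN, one option is to keep NaN while working in pandas and convert only when exporting the rows. The following is a sketch under that assumption, not a definitive recipe; the helper name nan_to_none is made up for this example, and the conversion is done in plain Python so it does not depend on how a particular pandas version stores None:

```python
import numpy as np
import pandas as pd

def nan_to_none(frame):
    """Return the rows as tuples, with every missing value turned into None."""
    return [
        tuple(None if pd.isna(value) else value for value in row)
        for row in frame.itertuples(index=False, name=None)
    ]

df = pd.DataFrame(['-', 3, 2, 5, 1, -5, -1, '-', 9])

# Step 1: mark the invalid entries as NaN, as recommended above.
with_nan = df.replace('-', np.nan)

# Step 2: convert NaN to None only when building the rows to export.
rows = nan_to_none(with_nan)
print(rows[0], rows[1])
```

The resulting list of tuples has the shape that a DB-API style executemany call typically expects, though the exact insert code depends on the MySQL driver in use.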
