Replace invalid values with None in Pandas DataFrame
Problem description
Is there any method to replace values with None in Pandas in Python?
You can use df.replace('pre', 'post') to replace one value with another, but this does not work if you want to replace a value with None. If you try, you get a strange result.
So here is an example:
import pandas as pd
df = pd.DataFrame(['-', 3, 2, 5, 1, -5, -1, '-', 9])
df.replace('-', 0)
returns the expected result.
But
df.replace('-', None)
returns the following result:
0
0 - // this isn't replaced
1 3
2 2
3 5
4 1
5 -5
6 -1
7 -1 // this is changed to `-1`...
8 9
Why is such a strange result returned?
Since I want to load this data frame into a MySQL database, I can't put NaN values into any element of my data frame and want to put None instead. Of course, you can first change '-' to NaN and then convert NaN to None, but I want to know why the data frame behaves in such a surprising way.
Tested on pandas 0.12.0 dev on Python 2.7 and OS X 10.8. For your information, Python is the version pre-installed on OS X, and I installed pandas using the SciPy Superpack script.
Recommended answer
Actually, in later versions of pandas this will give a TypeError:
df.replace('-', None)
TypeError: If "to_replace" and "value" are both None then regex must be a mapping
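As for the result shown in the question: it is consistent with a forward fill. In older pandas versions, replace documented a method parameter (padding by default) that was used when to_replace was a scalar or list and value was None, so each '-' was overwritten with the value from the previous row, and the first row, which has no previous row, was left alone. The following is a minimal sketch that reproduces that pattern by hand, assuming this is what happened on 0.12.0. The names s, masked and padded are illustrative only.

```python
import numpy as np
import pandas as pd

# Same values as the question, held in a Series for brevity.
s = pd.Series(['-', 3, 2, 5, 1, -5, -1, '-', 9])

# Treat the placeholder as missing, then forward-fill from the previous row.
masked = s.where(s != '-', np.nan)
filled = masked.ffill()

# A leading '-' has nothing before it to copy, so it keeps its original value.
padded = filled.where(filled.notna(), s)

print(padded.tolist())
```

Row 0 keeps '-' and row 7 takes the value from row 6, which matches the output in the question.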
You can do it by passing either a list or a dictionary:
In [11]: df.replace(['-'], [None])  # or .replace('-', {0: None})
Out[11]:
0
0 None
1 3
2 2
3 5
4 1
5 -5
6 -1
7 None
8 9
But I recommend using NaNs rather than None:
In [12]: df.replace('-', np.nan)
Out[12]:
0
0 NaN
1 3
2 2
3 5
4 1
5 -5
6 -1
7 NaN
8 9
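If None is still needed at the point where the rows are handed to a database driver, as in the question, one option is to keep NaN while working in pandas and convert only at the end. The following is a minimal sketch under that assumption. The helper name nan_to_none is hypothetical, and casting to object first is a choice made here so that the column is able to hold None.

```python
import numpy as np
import pandas as pd

def nan_to_none(frame):
    """Return a copy of frame in which every missing cell holds None."""
    # Cast to object first so the columns can store None next to numbers.
    as_object = frame.astype(object)
    # Keep cells that are present; put None where the original was missing.
    return as_object.where(frame.notna(), None)

df = pd.DataFrame(['-', 3, 2, 5, 1, -5, -1, '-', 9])
with_nan = df.replace('-', np.nan)   # the recommended in-pandas representation
with_none = nan_to_none(with_nan)    # convert just before exporting

print(with_none[0].tolist())
```

The conversion is kept as a separate final step so that the intermediate frame can still use the NaN-aware methods such as isna and fillna.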