How to replace None only with empty string using pandas?
Problem description
The code below generates a df:
import pandas as pd
from datetime import datetime as dt
import numpy as np
dates = [dt(2014, 1, 2, 2), dt(2014, 1, 2, 3), dt(2014, 1, 2, 4), None]
strings1 = ['A', 'B',None, 'C']
strings2 = [None, 'B','C', 'C']
strings3 = ['A', 'B','C', None]
vals = [1.,2.,np.nan, 4.]
df = pd.DataFrame(dict(zip(['A','B','C','D','E'],
[strings1, dates, strings2, strings3, vals])))
+---+------+---------------------+------+------+-----+
| | A | B | C | D | E |
+---+------+---------------------+------+------+-----+
| 0 | A | 2014-01-02 02:00:00 | None | A | 1 |
| 1 | B | 2014-01-02 03:00:00 | B | B | 2 |
| 2 | None | 2014-01-02 04:00:00 | C | C | NaN |
| 3 | C | NaT | C | None | 4 |
+---+------+---------------------+------+------+-----+
I would like to replace every None in it (a real Python None, not the string 'None') with '' (an empty string).
The expected df is:
+---+---+---------------------+---+---+-----+
| | A | B | C | D | E |
+---+---+---------------------+---+---+-----+
| 0 | A | 2014-01-02 02:00:00 | | A | 1 |
| 1 | B | 2014-01-02 03:00:00 | B | B | 2 |
| 2 | | 2014-01-02 04:00:00 | C | C | NaN |
| 3 | C | NaT | C | | 4 |
+---+---+---------------------+---+---+-----+
What I did is:
df = df.replace([None], [''], regex=True)
But I got:
+---+---+---------------------+---+------+---+
| | A | B | C | D | E |
+---+---+---------------------+---+------+---+
| 0 | A | 1388628000000000000 | | A | 1 |
| 1 | B | 1388631600000000000 | B | B | 2 |
| 2 | | 1388635200000000000 | C | C | |
| 3 | C | | C | | 4 |
+---+---+---------------------+---+------+---+
- All the dates become big numbers.
- Even NaT and NaN are replaced, which I don't want.
How can I achieve that correctly and efficiently?
Recommended answer
It looks like None is being promoted to NaN, so you cannot use replace as usual. The following works:
In [126]:
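# Boolean mask marking the cells that hold a real None
# (on pandas 2.1+, DataFrame.map is the non-deprecated name for applymap)
mask = df.applymap(lambda x: x is None)
# Keep only the columns that contain at least one None
cols = df.columns[mask.any()]
for col in df[cols]:
    # Overwrite the None cells of this column with an empty string
    df.loc[mask[col], col] = ''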
df
Out[126]:
   A                   B  C  D    E
0  A 2014-01-02 02:00:00     A    1
1  B 2014-01-02 03:00:00  B  B    2
2    2014-01-02 04:00:00  C  C  NaN
3  C                 NaT  C       4
So we generate a mask of the None values using applymap, then iterate over each column of interest and use that boolean mask to set the values.
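The same idea can also be written as a small helper that returns a new frame instead of modifying df in place. This is only a sketch under stated assumptions, not part of the original answer: the function name replace_none_with_empty is hypothetical, the string columns are built with dtype=object so that None is stored as-is, and only object columns are inspected because datetime and float columns hold NaT and NaN rather than None.

```python
from datetime import datetime as dt

import numpy as np
import pandas as pd


def replace_none_with_empty(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which real None values in object columns become ''.

    Hypothetical helper for illustration; it leaves NaN and NaT untouched.
    """
    out = frame.copy()
    for col in out.columns:
        if out[col].dtype != object:
            # Datetime and float columns store NaT/NaN, not None; skip them.
            continue
        # Per-column boolean mask: True where the cell is exactly None.
        is_none = out[col].map(lambda x: x is None)
        out.loc[is_none, col] = ''
    return out


# dtype=object is set explicitly (an assumption of this sketch) so that
# None is kept as None rather than converted to a missing-value marker.
df = pd.DataFrame({
    'A': pd.Series(['A', 'B', None, 'C'], dtype=object),
    'B': [dt(2014, 1, 2, 2), dt(2014, 1, 2, 3), dt(2014, 1, 2, 4), None],
    'C': pd.Series([None, 'B', 'C', 'C'], dtype=object),
    'D': pd.Series(['A', 'B', 'C', None], dtype=object),
    'E': [1., 2., np.nan, 4.],
})

result = replace_none_with_empty(df)
print(result)
```

Restricting the loop to object columns avoids touching the datetime and float columns at all, which is what keeps NaT and NaN intact.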