pandas read_csv and setting na_values to any string in the csv file
Question
data.csv
1,22,3432
1,23,\N
2,24,54335
2,25,3928
I have a csv file of data collected from a device. Every now and then the device doesn't relay information and outputs '\N' instead. I want to treat these as NaN, and did so with
pd.read_csv('data.csv', na_values=['\\N'])
which worked fine. However, I would prefer that not only this string but any string in the csv file is turned into NaN, in case the data I get in the future contains a different string.
Is it possible for me to change the argument so that it covers all strings?
Recommended answer
na_values has no option that matches every string. You have to pass each string that should be treated as NaN explicitly, as a list or dict:
na_values : list-like or dict, default None
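To illustrate the list and dict forms, here is a minimal sketch. It inlines the contents of data.csv from the question so that it runs on its own; the column names and the extra placeholder strings ("n/a", "ERR") are hypothetical and only stand in for whatever the device might emit.

```python
import io

import pandas as pd

# Contents of data.csv from the question, inlined so the sketch is self-contained.
csv_text = "1,22,3432\n1,23,\\N\n2,24,54335\n2,25,3928\n"

# Hypothetical column names; the file in the question has no header row.
names = ["device", "reading", "value"]

# Hypothetical placeholder strings. Only '\N' appears in the question.
placeholders = ["\\N", "n/a", "ERR"]

# List form: the strings are treated as NaN in every column.
df_list = pd.read_csv(io.StringIO(csv_text), names=names, na_values=placeholders)

# Dict form: the strings are treated as NaN only in the named column.
df_dict = pd.read_csv(
    io.StringIO(csv_text), names=names, na_values={"value": placeholders}
)

print(df_list)
```

Either form only catches strings that are listed in advance, which is why it does not cover an unknown string in future data.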
Alternatively, use pd.to_numeric with errors set to 'coerce' to convert all values to numeric after reading the csv file. Any value that cannot be parsed as a number becomes NaN, whatever string it is.
Sample input df:
    A    B
0   1    2
1   0   \N
2  \N    8
3  11    5
4  11  Kud
df = df.apply(pd.to_numeric, errors='coerce')
Output:
      A    B
0   1.0  2.0
1   0.0  NaN
2   NaN  8.0
3  11.0  5.0
4  11.0  NaN
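The whole approach, from reading the file to coercing, can be sketched end to end as follows. The sample frame above is inlined as CSV text so the snippet is self-contained. The `coerced` mask is an addition that is not part of the original answer; it is one possible way to see which raw strings were turned into NaN.

```python
import io

import pandas as pd

# The sample input from the answer, written out as CSV text.
csv_text = "A,B\n1,2\n0,\\N\n\\N,8\n11,5\n11,Kud\n"

# Read without na_values: '\N' and 'Kud' stay as strings for now.
df = pd.read_csv(io.StringIO(csv_text))

# Convert every column; anything that is not a number becomes NaN.
cleaned = df.apply(pd.to_numeric, errors="coerce")

# Cells that held a value in the raw frame but are NaN after coercion.
coerced = cleaned.isna() & df.notna()

print(cleaned)
print(df.loc[coerced["B"], "B"].tolist())  # raw strings dropped from column B
```

Note that this also turns genuinely textual columns into NaN, so it is only suitable when every column is expected to be numeric.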