pandas read_csv and setting na_values to any string in the csv file


Question


data.csv

1,22,3432
1,23,\N
2,24,54335
2,25,3928

I have a CSV file of data collected from a device. Every now and then the device fails to relay a reading and outputs '\N' instead. I want to treat these values as NaN, which I did with

pd.read_csv('data.csv', na_values=['\\N'])

This worked fine. However, I would prefer that not only this string but any string in the CSV file be converted to NaN, in case the data I get in the future contains a different string.

Is it possible for me to change the argument so that it covers all strings?

Recommended answer

You have to manually pass every such string to na_values, as a list or dict. The documentation describes the parameter as:


na_values : list-like or dict, default None
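
As a rough illustration of the list and dict forms, the sketch below reads the question's sample rows from an in-memory string. The column names a, b, c and the second marker 'ERR' are hypothetical, added only for the example; they do not come from the original data.

```python
import io
import pandas as pd

# Sample rows from the question, with a hypothetical extra marker "ERR".
csv_text = "1,22,3432\n1,23,\\N\n2,24,ERR\n2,25,3928\n"
names = ["a", "b", "c"]  # hypothetical column names; the file has no header

# List form: each listed string is treated as NaN in every column.
df_list = pd.read_csv(io.StringIO(csv_text), header=None, names=names,
                      na_values=["\\N", "ERR"])

# Dict form: the markers are applied only to the named column.
df_dict = pd.read_csv(io.StringIO(csv_text), header=None, names=names,
                      na_values={"c": ["\\N", "ERR"]})

print(df_list)
```

Either form still requires knowing the marker strings in advance, which is the limitation the question is asking about.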

Alternatively, after reading the CSV file, use pd.to_numeric with errors set to 'coerce' to convert all values to numeric. Any value that cannot be parsed as a number becomes NaN.

Sample input df

    A   B        
0   1   2         
1   0  \N      
2  \N   8       
3  11   5       
4  11  Kud   

df = df.apply(pd.to_numeric, errors='coerce')

Output (columns that contain NaN are converted to float):

      A    B
0   1.0  2.0
1   0.0  NaN
2   NaN  8.0
3  11.0  5.0
4  11.0  NaN
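
To tie this back to the question's file, here is a minimal end-to-end sketch, assuming the data has no header row and that every column is meant to be numeric. Reading with dtype=str first is a choice made for this example, not part of the original answer, and the string 'Kud' stands in for an unknown future marker.

```python
import io
import pandas as pd

# Sample rows from the question; "Kud" stands in for an unexpected string.
csv_text = "1,22,3432\n1,23,\\N\n2,24,Kud\n2,25,3928\n"

# Read every column as text so that no value is interpreted prematurely.
raw = pd.read_csv(io.StringIO(csv_text), header=None, dtype=str)

# Coerce each column: anything that is not a valid number becomes NaN.
clean = raw.apply(pd.to_numeric, errors="coerce")

# Count how many values were turned into NaN by the coercion.
n_coerced = int(clean.isna().sum().sum() - raw.isna().sum().sum())

print(clean)
print("coerced values:", n_coerced)
```

Checking the number of coerced values is a cheap way to notice when a column contains far more non-numeric entries than expected, since errors='coerce' otherwise discards them silently.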
