Replace numeric values in a pandas dataframe

Problem description

Problem: Polluted dataframe.
Details: The frame consists of NaNs, string values whose meaning I know, and numeric values.
Task: Replace the numeric values with NaNs.
Example:

import numpy as np
import pandas as pd
df = pd.DataFrame([['abc', 'cdf', 1], ['k', 'sum', 'some'], [1000, np.nan, 'nothing']])

Out:

      0    1        2
0   abc  cdf        1
1     k  sum     some
2  1000  NaN  nothing

Attempt 1 (does not work, because regex replacement only looks at string cells)

df.replace({r'\d+': np.nan}, regex=True)

Out:

      0    1        2
0   abc  cdf        1
1     k  sum     some
2  1000  NaN  nothing
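
To see why the regex attempt leaves the frame unchanged, it can help to look at the Python type of each cell. The sketch below rebuilds the example frame and records the type name per cell; the name `cell_types` is only illustrative and not part of the original question. The numbers show up as `int`, not `str`, so a regex replacement has nothing to match against in those cells.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['abc', 'cdf', 1], ['k', 'sum', 'some'], [1000, np.nan, 'nothing']])

# Record the Python type name of every cell (cell_types is an illustrative name).
cell_types = df.apply(lambda col: col.map(lambda v: type(v).__name__))
print(cell_types)
```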

Preliminary solution

# Collect every distinct cell value in the frame.
val_set = set()
for row in df.values:
    val_set.update(row)

def dis_nums(myset):
    """Split values into a set of strings and a {non-string: NaN} mapping."""
    str_s = set()
    num_replace_dict = {}
    for val in myset:
        if isinstance(val, str):
            str_s.add(val)
        else:
            num_replace_dict[val] = np.nan
    return str_s, num_replace_dict

strs, rpl_dict = dis_nums(val_set)

# Replace every non-string value with NaN, in place.
df.replace(rpl_dict, inplace=True)

Out:

     0    1        2
0  abc  cdf      NaN
1    k  sum     some
2  NaN  NaN  nothing

Question: Is there an easier or more pleasant solution?

Recommended answer

You can do a round-trip conversion: cast everything to str, replace the values, and cast back to object.

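# Cast to str so the regex can see the numbers, replace, then cast back to object.
# The 'nan' pattern makes sure already existing np.nan values are not lost
# (astype('str') turns them into the string 'nan').
df.astype('str').replace({r'\d+': np.nan, 'nan': np.nan}, regex=True).astype('object')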

Output

     0    1        2
0  abc  cdf      NaN
1    k  sum     some
2  NaN  NaN  nothing
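
As a possible alternative that avoids the string round trip, here is a minimal sketch that selects cells by type instead of by pattern. It assumes, as in the example, that every value worth keeping is a Python `str`; the names `is_str` and `cleaned` are illustrative and not from the original answer. One reason to consider it: the patterns above are not anchored, so a string that merely contains digits or the letters "nan" may also end up replaced, whereas a type check leaves such strings alone.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([['abc', 'cdf', 1], ['k', 'sum', 'some'], [1000, np.nan, 'nothing']])

# True where the cell holds a Python str, False for numbers and NaN.
# (is_str is an illustrative name; astype(bool) guarantees a boolean mask.)
is_str = df.apply(lambda col: col.map(lambda v: isinstance(v, str))).astype(bool)

# DataFrame.where keeps cells where the mask is True and fills the rest with NaN.
cleaned = df.where(is_str)
print(cleaned)
```

This returns a new frame and leaves `df` untouched; assign the result back if the replacement should persist.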
