Python Pandas:如何删除nan和-inf值 [英] Python pandas: how to remove nan and -inf values

查看:3370
本文介绍了Python Pandas:如何删除nan和-inf值的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下数据框

           time       X    Y  X_t0     X_tp0  X_t1     X_tp1  X_t2     X_tp2
0         0.002876    0   10     0       NaN   NaN       NaN   NaN       NaN
1         0.002986    0   10     0       NaN     0       NaN   NaN       NaN
2         0.037367    1   10     1  1.000000     0       NaN     0       NaN
3         0.037374    2   10     2  0.500000     1  1.000000     0       NaN
4         0.037389    3   10     3  0.333333     2  0.500000     1  1.000000
5         0.037393    4   10     4  0.250000     3  0.333333     2  0.500000

....
1030308   9.962213  256  268   256  0.000000   256  0.003906   255  0.003922
1030309  10.041799    0  268     0      -inf   256  0.000000   256  0.003906
1030310  10.118960    0  268     0       NaN     0      -inf   256  0.000000

我尝试了以下方法

df.dropna(inplace=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.40)

X_train = X_train.drop('time', axis=1)
X_train = X_train.drop('X_t1', axis=1)
X_train = X_train.drop('X_t2', axis=1)
X_test = X_test.drop('time', axis=1)
X_test = X_test.drop('X_t1', axis=1)
X_test = X_test.drop('X_t2', axis=1)
X_test.fillna(X_test.mean(), inplace=True)
X_train.fillna(X_train.mean(), inplace=True)
y_train.fillna(y_train.mean(), inplace=True)

但是,每当我尝试拟合回归模型fit(X_train, y_train)

However, I am still getting this error ValueError: Input contains NaN, infinity or a value too large for dtype('float32'). whenever i try to fit a regression model fit(X_train, y_train)

如何同时删除NaN-inf值?

推荐答案

使用pd.DataFrame.isin并检查带有pd.DataFrame.any的行.最后,使用布尔数组对数据帧进行切片.

Use pd.DataFrame.isin and check for rows that have any with pd.DataFrame.any. Finally, use the boolean array to slice the dataframe.

df[~df.isin([np.nan, np.inf, -np.inf]).any(1)]

             time    X    Y  X_t0     X_tp0   X_t1     X_tp1   X_t2     X_tp2
4        0.037389    3   10     3  0.333333    2.0  0.500000    1.0  1.000000
5        0.037393    4   10     4  0.250000    3.0  0.333333    2.0  0.500000
1030308  9.962213  256  268   256  0.000000  256.0  0.003906  255.0  0.003922

这篇关于Python Pandas:如何删除nan和-inf值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆