TypeError: float() argument must be a string or a number, not 'function' – Python/Sklearn


Question

I have the following code snippet from a program called Flights.py:

...
#Load the Dataset
df = dataset
df.isnull().any()
df = df.fillna(lambda x: x.median())

# Define X and Y
X = df.iloc[:, 2:124].values
y = df.iloc[:, 136].values
X_tolist = X.tolist()

# Splitting the dataset into the Training set and Test set
from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

The second-to-last line throws the following error:

Traceback (most recent call last):

  File "<ipython-input-14-d4add2ccf5ab>", line 3, in <module>
    X_train = sc.fit_transform(X_train)

  File "/Users/<username>/anaconda/lib/python3.6/site-packages/sklearn/base.py", line 494, in fit_transform
    return self.fit(X, **fit_params).transform(X)

  File "/Users/<username>/anaconda/lib/python3.6/site-packages/sklearn/preprocessing/data.py", line 560, in fit
    return self.partial_fit(X, y)

  File "/Users/<username>/anaconda/lib/python3.6/site-packages/sklearn/preprocessing/data.py", line 583, in partial_fit
    estimator=self, dtype=FLOAT_DTYPES)

  File "/Users/<username>/anaconda/lib/python3.6/site-packages/sklearn/utils/validation.py", line 382, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)

TypeError: float() argument must be a string or a number, not 'function'

My dataframe df is of size (22587, 138).

I was taking a look at the following question for inspiration:

TypeError: float() argument must be a string or a number, not 'method' in Geocoder

I tried the following adjustment:

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train.as_matrix)
X_test = sc.transform(X_test.as_matrix)

This resulted in the following error:

AttributeError: 'numpy.ndarray' object has no attribute 'as_matrix'

I'm currently at a loss for how to scan through the dataframe and find/convert the offending entries.

Recommended answer

As this answer explains, fillna isn't designed to work with a callback. If you pass one, it is taken as the literal fill value, so each NaN is replaced by the lambda object itself, and that object is what float() later fails to convert:

df

      col1  col2  col3  col4
row1  65.0    24  47.0   NaN
row2  33.0    48   NaN  89.0
row3   NaN    34  67.0   NaN
row4  24.0    12  52.0  17.0

df.fillna(lambda x: x.median())

                                    col1  col2  \
row1                                  65    24   
row2                                  33    48   
row3  <function <lambda> at 0x10bc47730>    34   
row4                                  24    12   

                                    col3                                col4  
row1                                  47  <function <lambda> at 0x10bc47730>  
row2  <function <lambda> at 0x10bc47730>                                  89  
row3                                  67  <function <lambda> at 0x10bc47730>  
row4                                  52                                  17 
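
To locate such entries in a frame that has already been filled this way, one option is to test every cell with the built-in callable. The following is only a minimal sketch under stated assumptions: the small frame and the placeholder function are made up for illustration and stand in for the lambda shown above.

```python
import pandas as pd

def placeholder(x):
    # Hypothetical stand-in for the lambda that fillna wrote into the cells
    return x

# Made-up frame: col1 holds a function object where a number should be
df = pd.DataFrame(
    {"col1": [65.0, 33.0, placeholder, 24.0], "col2": [24, 48, 34, 12]},
    index=["row1", "row2", "row3", "row4"],
)

# True wherever a cell holds a function instead of a value
is_func = df.apply(lambda col: col.map(callable))
print(is_func.sum())  # number of offending cells per column

# Blank out the offending cells, restore numeric dtypes, then fill with medians
cleaned = df.mask(is_func).apply(pd.to_numeric)
cleaned = cleaned.fillna(cleaned.median())
print(cleaned)
```

The mask step turns each function cell back into NaN, so the frame can then be repaired the same way as one that was never filled incorrectly.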


If you want to fill with the median, the solution is to compute the median of each column and pass those medians to fillna:

df
      col1  col2  col3  col4
row1  65.0    24  47.0   NaN
row2  33.0    48   NaN  89.0
row3   NaN    34  67.0   NaN
row4  24.0    12  52.0  17.0

df = df.fillna(df.median())  # fillna returns a new frame, so assign it back
df
      col1  col2  col3  col4
row1  65.0    24  47.0  53.0
row2  33.0    48  52.0  89.0
row3  33.0    34  67.0  53.0
row4  24.0    12  52.0  17.0
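
Applied to the pipeline from the question, the fix might look like the sketch below. The random data, the column names, and the column slices are assumptions for illustration, not the asker's real dataset. The import uses sklearn.model_selection because sklearn.cross_validation was removed in later scikit-learn releases.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the asker's dataset, with a few NaNs injected
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 5)), columns=list("abcde"))
df.iloc[::7, 1] = np.nan

# Fill each NaN with the median of its column (a Series, not a callback)
df = df.fillna(df.median(numeric_only=True))

# Define X and y (slices chosen for this toy frame only)
X = df.iloc[:, :4].values
y = df.iloc[:, 4].values

# Split, then scale: every cell is now a plain float, so float() succeeds
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```

With the medians passed as values, StandardScaler receives an array of floats and the TypeError no longer occurs.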
