Python ValueError: ColumnTransformer, Column Ordering is Not Equal

Views: 190 | Published: 2020/10/17 22:21:31 | Tags: python, pandas, scikit-learn, data-science


Problem description

I put together the following function, which reads a CSV, trains the model, and predicts on the request data.

I get the following ValueError: Column ordering must be equal for fit and for transform when using the remainder keyword

The training data and the data used for prediction have exactly the same number of columns, e.g., 15. I am not sure how the "ordering" of the columns could have changed.

~/.local/lib/python3.5/site-packages/sklearn/pipeline.py in predict(self, X, **predict_params)
    417         Xt = X
    418         for _, name, transform in self._iter(with_final=False):
--> 419             Xt = transform.transform(Xt)
    420         return self.steps[-1][-1].predict(Xt, **predict_params)
    421 

~/.local/lib/python3.5/site-packages/sklearn/compose/_column_transformer.py in transform(self, X)
    581             if (n_cols_transform >= n_cols_fit and
    582                     any(X.columns[:n_cols_fit] != self._df_columns)):
--> 583                 raise ValueError('Column ordering must be equal for fit '
    584                                  'and for transform when using the '
    585                                  'remainder keyword')

ValueError: Column ordering must be equal for fit and for transform when using the remainder keyword
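
The comparison at line 582 of the traceback is positional: it checks the column names of the incoming frame against the names recorded at fit time, one position at a time, so two frames with the same set of columns in a different order still fail. Below is a minimal sketch of that kind of comparison on plain lists. The helper `first_order_mismatch` and the column names are hypothetical, chosen only for illustration; they are not part of scikit-learn.

```python
def first_order_mismatch(fit_columns, transform_columns):
    """Return (position, fit_name, transform_name) for the first
    position where the two column lists differ, or None if they agree."""
    for pos, (fit_name, new_name) in enumerate(zip(fit_columns, transform_columns)):
        if fit_name != new_name:
            return pos, fit_name, new_name
    return None


# Illustrative column names: same set, different order.
fit_columns = ["B", "A", "C"]        # order seen when the pipeline was fitted
request_columns = ["A", "B", "C"]    # order of the frame passed to predict

mismatch = first_order_mismatch(fit_columns, request_columns)
print(mismatch)
```

In the setting of the question, one could call it with `list(X_train.columns)` and `list(df_resp.columns)` just before `rf.predict` to see where the two orders diverge.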

Function:

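import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# numeric_features, categorical_features, X_train, y_train and request
# are defined elsewhere in the original function (not shown in the question).
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])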

categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])

# Put the data transformation and the model in a single pipeline
rf = Pipeline(steps=[('preprocessor', preprocessor),
                     ('classifier', RandomForestClassifier(
                                        n_estimators=500,
                                        criterion="gini",
                                        max_features="sqrt",
                                        min_samples_leaf=4))])

rf.fit(X_train, y_train)

request_data = {'A': [request.A],
                'B': [request.B],
                'C': [request.C],
                'D': [request.D],
                'E': [request.E],
                'F': [request.F],
                'G': [request.G],
                'H': [request.H],
                'I': [request.I],
                'J': [request.J],
                'K': [request.K],
                'L': [request.L],
                'M': [request.M],
                'N': [request.N],
                'O': [request.O]}

df_resp = pd.DataFrame(data=request_data)
response = rf.predict(df_resp)

output = {"Safety Rating": response[0]}

return output
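
One possible cause, offered as an assumption rather than a confirmed diagnosis: the paths in the traceback show Python 3.5, where a plain dict does not guarantee insertion order, so the columns of `pd.DataFrame(data=request_data)` may not come out in the same order as those of `X_train`. A minimal sketch of selecting the request columns by name in the training order before predicting; the three-column frames here are made-up stand-ins for the real 15-column data.

```python
import pandas as pd

# Illustrative stand-in for X_train; the real frame has 15 columns.
X_train = pd.DataFrame(
    {"B": [1.0, 2.0], "A": ["x", "y"], "C": [0.5, 0.7]},
    columns=["B", "A", "C"],
)

# A one-row request frame whose columns arrive in a different order.
request_data = {"A": ["x"], "B": [3.0], "C": [0.9]}
df_resp = pd.DataFrame(data=request_data, columns=["A", "B", "C"])

# Reorder the request columns to match the order used at fit time.
df_resp = df_resp[list(X_train.columns)]

print(list(df_resp.columns))
```

Selecting by name this way raises a `KeyError` if a training column is missing from the request, which surfaces a malformed request early instead of passing misaligned data to the model.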
