Items of feature_columns must be a _FeatureColumn


Problem description

I am getting this error:

ValueError: Items of feature_columns must be a _FeatureColumn. Given (type ): Index(['CreditScore', 'Age', 'Tenure', 'Balance', 'NumOfProducts', 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Exited'], dtype='object').

I am using the TensorFlow library. I want to get prediction results, but I cannot run m.train(input_fn=get_input_fn, steps=5000). I always get the same error whatever I do. I also tried the input functions below, but nothing changed.

def input_fn_train():
     x=tf.constant(df_train.astype(np.float64)),
     y=tf.constant(df_train[LABEL].astype(np.float64))
     return x, y

def get_input_fn(data_set, num_epochs=None, shuffle=False):
     return tf.estimator.inputs.pandas_input_fn(
      x=pd.DataFrame({k: data_set[k].values for k in data_set.columns}),
      y=pd.Series(data_set[LABEL].values), num_epochs=num_epochs,
                  shuffle=shuffle)

I do not understand what I should do. What is the error about? I have been searching but have not found anything useful. How can I handle this error? The code is below. Thanks!

import pandas as pd
import tensorflow as tf
import numpy as np
import tempfile

COLS= ["RowNumber","CustomerId","Surname","CreditScore","Geography",
"Gender","Age","Tenure","Balance","NumOfProducts","HasCrCard",
"IsActiveMember","EstimatedSalary","Exited"]


FEATURES = ["CreditScore","Age","Tenure","Balance","NumOfProducts",
       "HasCrCard","IsActiveMember", "EstimatedSalary"]

LABEL="Exited"

df_train = pd.read_csv("Churn_Modelling.csv", skipinitialspace=True, 
header=0)
df_test = pd.read_csv("Churn_Modelling.csv", skipinitialspace=True, 
header=0)
test_label = df_test[LABEL].astype(float)
df_test.drop("Surname", axis = 1, inplace=True)
df_test.drop("RowNumber", axis = 1, inplace=True)
df_test.drop("CustomerId", axis = 1, inplace=True)
df_train.drop("CustomerId", axis = 1, inplace=True)
df_train.drop("Surname", axis = 1, inplace=True)
df_train.drop("RowNumber", axis = 1, inplace=True)
df_train.drop("Geography", axis = 1, inplace=True)
df_train.drop("Gender", axis = 1, inplace=True)

def get_input_fn():
    return {'x': tf.constant(df_train[FEATURES].as_matrix(), tf.float32, 
           df_train.shape),
           'y': tf.constant(df_train[LABEL].as_matrix(), tf.float32, 
            df_train.shape)
           }

df=df_train.select_dtypes(exclude=['object'])
numeric_cols=df.columns

m = tf.estimator.LinearClassifier(model_dir=model_dir, feature_columns=
[numeric_cols])

m.train(input_fn=get_input_fn ,steps=5000)
results = m.evaluate(input_fn= get_input_fn(df_test, num_epochs=1, 
shuffle=False),steps=None)

y = m.predict(input_fn=get_input_fn(df_test, num_epochs=1, shuffle=False))
pred = list(y)

rowNumber=0
for i in pred:
    print(str(rowNumber)+': '+str(pred[i]))
    rowNumber=rowNumber+1

Recommended answer

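The script in this answer runs correctly. The original error is raised because feature_columns was given a pandas Index of column names (wrapped in a list) rather than feature column objects; the script infers the feature columns from the input function instead.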
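To make the failure concrete, here is a pure-Python sketch of the kind of type check behind the message. FeatureColumn, numeric_column, and validate_feature_columns are hypothetical stand-ins written for this illustration, not TensorFlow's actual classes or internals, and a plain tuple plays the role of the pandas Index.

```python
class FeatureColumn:
    """Hypothetical stand-in for a feature column object (not TensorFlow's class)."""

    def __init__(self, key):
        self.key = key


def numeric_column(key):
    """Hypothetical factory: wrap one feature name in a column object."""
    return FeatureColumn(key)


def validate_feature_columns(feature_columns):
    """Simplified check: every item must be a FeatureColumn instance."""
    for item in feature_columns:
        if not isinstance(item, FeatureColumn):
            raise ValueError(
                "Items of feature_columns must be a FeatureColumn. "
                "Given (type {}): {}.".format(type(item).__name__, item)
            )
    return list(feature_columns)


# A tuple of names plays the role of df.columns in the question.
names = ("CreditScore", "Age", "Tenure")

# What the question did: a list with ONE item, and that item is the whole
# collection of names rather than a column object.
try:
    validate_feature_columns([names])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# The shape the check accepts: one column object per feature name.
columns = validate_feature_columns([numeric_column(n) for n in names])
print([c.key for c in columns])
```

With a real estimator the equivalent requirement is a list of feature column objects, one per feature. The working script is given next.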

import pandas as pd
import tensorflow as tf
import tempfile
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import cohen_kappa_score
from sklearn.metrics import f1_score
from sklearn.metrics import recall_score


def split_data(data, rate, label):
    data = data.dropna()

    train_data, test_data = train_test_split(data, test_size=rate)

    train_label = train_data[label]
    train_data = train_data.drop(label, axis=1)

    test_label = test_data[label]
    test_data = test_data.drop(label, axis=1)
    return train_data, train_label, test_data, test_label



LABEL = "Exited"

data = pd.read_csv("Churn_Modelling.csv", skipinitialspace=True, 
    header=0)

data.drop("Surname", axis=1, inplace=True)
data.drop("RowNumber", axis=1, inplace=True)
data.drop("CustomerId", axis=1, inplace=True)
data.drop("Geography", axis=1, inplace=True)
data.drop("Gender", axis=1, inplace=True)
x_train, y_train, x_test, y_test = split_data(data, 0.20, LABEL)



def get_input_fn_train():
    input_fn = tf.estimator.inputs.pandas_input_fn(
        x=x_train,
        y=y_train,
        shuffle=False
    )
    return input_fn

def get_input_fn_test():
    input_fn = tf.estimator.inputs.pandas_input_fn(
        x=x_test,
        y=y_test,
        shuffle=False
    )
    return input_fn


# Infer one real-valued feature column per input feature.
feature_columns = tf.contrib.learn.infer_real_valued_columns_from_input_fn(
    get_input_fn_train())


model_dir = tempfile.mkdtemp()
m = tf.estimator.LinearClassifier(model_dir=model_dir, 
feature_columns=feature_columns)

# train data
m.train(input_fn=get_input_fn_train(), steps=5000)

# evaluate() reports accuracy, accuracy_baseline, auc, auc_precision_recall,
# average_loss, global_step, label/mean, loss, prediction/mean

results = m.evaluate(input_fn=get_input_fn_test(), steps=None)

print("model directory = %s" % model_dir)
for key in sorted(results):
    print("%s: %s" % (key, results[key]))

# get prediction results
y = m.predict(input_fn=get_input_fn_test())
predictions = list(y)
pred1=pd.DataFrame(data=predictions)
prediction=pd.DataFrame(data=pred1['class_ids'])
pred=[]
for row in prediction["class_ids"]:
    pred.append(row[0])

rowNumber = 0
for i in pred:
    print(str(rowNumber) + ': ' + str(i))
    rowNumber = rowNumber + 1


def calculate(prediction, label):
    # scikit-learn metrics expect (y_true, y_pred), so the true labels go first
    arr = {"accuracy": accuracy_score(label, prediction),
           "report": classification_report(label, prediction),
           "Confusion_Matrix": confusion_matrix(label, prediction),
           "F1 score": f1_score(label, prediction),
           "Recall Score": recall_score(label, prediction),
           "cohen_kappa": cohen_kappa_score(label, prediction)
           }
    return arr


pred2 = pd.DataFrame(data=pred)

print(calculate(pred2.round(), y_test))
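As an alternative to inferring the columns, they can be declared explicitly. The following is a minimal sketch, assuming a TensorFlow release that provides tf.feature_column.numeric_column and assuming every feature is numeric, as in the FEATURES list from the question. It only builds the column list; passing it to the estimator is left as in the script above.

```python
import tensorflow as tf

# Feature names copied from the FEATURES list in the question.
FEATURES = ["CreditScore", "Age", "Tenure", "Balance", "NumOfProducts",
            "HasCrCard", "IsActiveMember", "EstimatedSalary"]

# One feature column object per feature name. This list is what
# feature_columns expects, as opposed to the raw column names.
feature_columns = [tf.feature_column.numeric_column(name) for name in FEATURES]

for column in feature_columns:
    print(column.key)
```

The resulting list could then be passed as feature_columns=feature_columns when constructing the LinearClassifier.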
