Google Cloud ML with Scikit-Learn raises: 'dict' object has no attribute 'lower'

Views: 119  Published: 2020/5/4 9:19:14  Tags: python machine-learning scikit-learn gcloud

Problem description

I followed this tutorial to use my Scikit-learn sentiment-analysis model in Google Cloud: https://cloud.google.com/ml-engine/docs/scikit/quickstart

My model is defined as follows:

import csv
import os
from collections import defaultdict
import sys
import re
import numpy as np
import random
import math

import sklearn.datasets
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.linear_model import SGDClassifier
from sklearn import metrics
from sklearn.model_selection import GridSearchCV
from sklearn.externals import joblib
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score

def build_data_cv(datafile, trait_number):
    """
    Loads data
    """

    with open(datafile, "rb") as csvf:
        csvreader = csv.reader(csvf,delimiter=',',quotechar='"')
        data = []
        target = np.array([])

        for index, line in enumerate(csvreader):

            # skip the header row
            if index < 1:
                continue

            document = unicode(line[1], errors='replace')

            data.append(document)

            index_of_trait = trait_number + 2

            if line[index_of_trait].lower()=='y':
                target = np.append(target, 1.0)
            else:
                target = np.append(target, 0.0)

        dataset = sklearn.datasets.base.Bunch(data=data, target=target)
        dataset.target_names = ["positive", "negative"]

        return dataset

# main program
if __name__=="__main__":
    current_directory = os.getcwd() + "/"
    data_file = current_directory + "essays.csv"

    class_labels = ['EXT','NEU','AGR','CON','OPN']

    for index, selected_trait in enumerate(class_labels):
        print selected_trait

        dataset = build_data_cv(data_file, index)

        X_train, X_test, y_train, y_test = train_test_split(dataset.data, dataset.target, test_size=0.2, random_state=0)

        clf = Pipeline([('vect', CountVectorizer()),
                      ('tfidf', TfidfTransformer()),
                      ('clf', SGDClassifier(loss='hinge', penalty='l2',
                                            alpha=1e-3, random_state=42,
                                            max_iter=5, tol=None)),
        ])

        # clf.fit(X_train, y_train)
        parameters = {'vect__ngram_range': [(1, 1), (1, 2)],
               'tfidf__use_idf': (True, False),
               'clf__alpha': (1e-2, 1e-3),
        }
        gs_clf = GridSearchCV(clf, parameters, n_jobs=-1)

        # fit the model
        gs_clf.fit(X_train, y_train)

        # simple test score
        # print clf.score(X_test, y_test)

        # 10-fold cross-validation score
        scores = cross_val_score(gs_clf, dataset.data, dataset.target, cv=10)
        print("Accuracy: %0.4f (+/- %0.4f)" % (scores.mean(), scores.std() * 2))

        # Export the classifier to a file
        joblib.dump(gs_clf, 'svm_gs_'+selected_trait+'.joblib')

        print "______________________"

The input file is available here: https://github.com/novinfard/profiler-sentiment-analysis/blob/master/model/input_dataset/essays.csv

When I try to run a local prediction with gcloud ml-engine using the following shell commands:

    MODEL_DIR="gs://MY_BUCKET/"
    INPUT_FILE="input.json"
    FRAMEWORK="SCIKIT_LEARN"

    gcloud ml-engine local predict --model-dir=$MODEL_DIR \
        --json-instances $INPUT_FILE \
        --framework $FRAMEWORK

It raises the following error:

  File "/Users/XXX/google-cloud-sdk/lib/third_party/ml_sdk/cloud/ml/prediction/frameworks/sk_xg_prediction_lib.py", line 57, in predict
    "Exception during sklearn prediction: " + str(e))
cloud.ml.prediction.prediction_utils.PredictionError: Failed to run the provided model: Exception during sklearn prediction: 'dict' object has no attribute 'lower' (Error code: 2)

The input.json is defined as below:

{"instances": [["the quick brown fox"],["another test"]]}

What is the issue, and how can it be solved?
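A likely cause, offered as a sketch and not a confirmed diagnosis: the `--json-instances` flag reads newline-delimited JSON, one instance per line, so the single line `{"instances": [...]}` would reach the pipeline as one dict. `CountVectorizer` lowercases each document by default, which would fail on a dict with exactly this message. The snippet below uses only the standard library. `parse_instances`, `lowercase_all`, and `to_newline_delimited` are hypothetical helpers written for illustration; `lowercase_all` merely stands in for the vectorizer's lowercasing step and is not the library's code.

```python
import json

# The single line stored in input.json in the question.
wrapped = '{"instances": [["the quick brown fox"],["another test"]]}'


def parse_instances(text):
    """Treat every non-empty line as one JSON instance.

    Assumption: this mirrors how a newline-delimited instances file is read.
    """
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def lowercase_all(instances):
    """Stand-in for a text vectorizer's default lowercasing step."""
    return [doc.lower() for doc in instances]


def to_newline_delimited(wrapped_text):
    """Rewrite {"instances": [[text], ...]} as one bare JSON string per line."""
    body = json.loads(wrapped_text)
    lines = []
    for instance in body["instances"]:
        # Unwrap single-element lists so each instance is a plain string.
        text = instance[0] if isinstance(instance, list) else instance
        lines.append(json.dumps(text))
    return "\n".join(lines) + "\n"


# Reproduce the message: the whole wrapper is parsed as one dict instance.
try:
    lowercase_all(parse_instances(wrapped))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)

# Convert to one string per line and check that lowercasing now succeeds.
fixed = to_newline_delimited(wrapped)
print(fixed)
print(lowercase_all(parse_instances(fixed)))
```

If this assumption holds, saving the converted text as input.json (one quoted string per line, no `instances` wrapper and no inner lists) should let the same `gcloud ml-engine local predict` command pass plain strings to the pipeline.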
