Logistic regression in SageMaker

Question

I am using AWS SageMaker for logistic regression. To validate the model on test data, I use the following code:

import json

import boto3
import numpy as np

runtime = boto3.client('runtime.sagemaker')

# np2csv is a helper defined elsewhere (not shown) that serializes the array to CSV text.
payload = np2csv(test_X)
response = runtime.invoke_endpoint(EndpointName=linear_endpoint,
                                   ContentType='text/csv',
                                   Body=payload)
result = json.loads(response['Body'].read().decode())
test_pred = np.array([r['score'] for r in result['predictions']])

The result contains the predicted values and the probability scores. I want to know how to run the model so that it predicts the outcome from two specific features. For example, the model has 30 features and was trained on all of them. For my prediction, I want to know the outcome when feature1='x' and feature2='y'. But when I filter the data down to those columns and pass it to the same code, I get the following error:

Customer Error: The feature dimension of the input: 4 does not match the feature dimension of the model: 30. Please fix the input and try again.

What is the AWS SageMaker equivalent of, say, glm.predict('feature1', 'feature2') in R?

Recommended answer

When you train a regression model on data, you're learning a mapping from the input features to the response variable. You then use that mapping to make predictions by feeding new input features to the model.

If you trained a model on 30 features, it's not possible to use that same model to predict with only 2 of the features. You would have to supply values for the other 28 features.
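
One way to supply the other 28 values is to start from a baseline row and overwrite only the two features of interest. The sketch below is illustrative only: `build_full_row`, the random `train_X` stand-in, the column positions, and the choice of training means as the baseline are all assumptions, not part of the original answer, and categorical features would first need the same numeric encoding used in training.

```python
import numpy as np


def build_full_row(baseline, overrides):
    """Return a copy of `baseline` with the given column indices replaced."""
    row = np.array(baseline, dtype=float)  # np.array makes a copy
    for index, value in overrides.items():
        row[index] = value
    return row


# Hypothetical stand-in for the 30-feature training matrix.
rng = np.random.default_rng(0)
train_X = rng.normal(size=(100, 30))

# Assumption: each feature's training mean serves as its "neutral" value.
baseline = train_X.mean(axis=0)

# Assumption: feature1 is column 0 and feature2 is column 1.
row = build_full_row(baseline, {0: 1.5, 1: -0.3})

# A single CSV line carrying all 30 values, matching the model's feature dimension.
payload = ",".join(f"{v:g}" for v in row)
```

The resulting `payload` has the same width as the training data, so it would not trigger the feature-dimension error. The prediction then describes the chosen baseline with two features changed, not the two features in isolation.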

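If you just want to know how those two features affect the predictions, you can look at the weights (a.k.a. 'parameters' or 'coefficients') of your trained model. If the weight for feature 1 is x, then the model's linear output increases by x when feature 1 increases by 1.0 and the other features are held fixed. For a linear regression that output is the predicted response itself; for a logistic regression it is the log-odds of the positive class, not the probability.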
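
A small numeric check can make the weight interpretation concrete for the logistic case. This is a sketch with made-up weights, bias, and input values (none come from the original post); it shows that raising one feature by 1.0 shifts the log-odds by exactly that feature's weight.

```python
import numpy as np


def sigmoid(z):
    """Map a linear score to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def log_odds(p):
    """Inverse of the sigmoid: probability back to log-odds."""
    return np.log(p / (1.0 - p))


# Hypothetical weights and bias for a 3-feature logistic model.
weights = np.array([0.8, -1.2, 0.05])
bias = -0.4

x = np.array([1.0, 2.0, 3.0])
x_plus = x.copy()
x_plus[0] += 1.0  # raise the first feature by one unit, others unchanged

p_before = sigmoid(weights @ x + bias)
p_after = sigmoid(weights @ x_plus + bias)

# The change in log-odds equals the first feature's weight.
log_odds_change = log_odds(p_after) - log_odds(p_before)
```

The change in probability (`p_after - p_before`) depends on where the starting point sits on the sigmoid curve, which is why the weight is read on the log-odds scale.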

To view the weights of a model trained with the linear learner algorithm in Amazon SageMaker, you can download the model.tar.gz artifact and open it locally. The artifact is written to the S3 location you specified in the output_path argument of the sagemaker.estimator.Estimator constructor.

import os
import mxnet as mx
import boto3

bucket = "<your_bucket>"
key = "<your_model_prefix>"
boto3.resource('s3').Bucket(bucket).download_file(key, 'model.tar.gz')

os.system('tar -zxvf model.tar.gz')

# The linear learner model file is itself a zip archive containing an MXNet model and other metadata.
# Unzip it first.
os.system('unzip model_algo-1')

# Load the mxnet module
mod = mx.module.Module.load("mx-mod", 0)

# model weights
weights = mod._arg_params['fc0_weight'].asnumpy().flatten()

# model bias
bias = mod._arg_params['fc0_bias'].asnumpy().flatten()

# weight for the first feature
weights[0]
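
To read off the weights for two particular features, it can help to pair the weight vector with the column names in training order. The following is a sketch under assumptions: `weights_by_name` is a hypothetical helper, the feature names are placeholders, and the `weights` array here is a synthetic stand-in for the one loaded above.

```python
import numpy as np


def weights_by_name(feature_names, weights):
    """Pair each feature name with its weight; names must be in training column order."""
    if len(feature_names) != len(weights):
        raise ValueError("need exactly one name per weight")
    return {name: float(w) for name, w in zip(feature_names, weights)}


# Synthetic stand-ins; in practice `weights` comes from the loaded model.
feature_names = [f"feature{i}" for i in range(1, 31)]
weights = np.linspace(-1.0, 1.0, 30)

named = weights_by_name(feature_names, weights)

# Weights for the two features of interest.
selected = {name: named[name] for name in ("feature1", "feature2")}

# Features ordered by absolute weight, largest first.
ranked = sorted(named, key=lambda name: abs(named[name]), reverse=True)
```

Comparing magnitudes across features is only meaningful when the features are on comparable scales, so treat the ranking as a rough guide.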
