How to score a linear model using a PMML file and Augustus in Python

Question

I am new to Python, PMML, and Augustus, so this is a beginner question. I have a PMML file that I want to use for scoring each time a new batch of data arrives. I can use only Python with Augustus for this exercise. I have read various articles; these two are worth mentioning:

( http://augustusdocs.appspot.com/docs/v06/model_abstraction/augustus_and_pmml.html http://augustus.googlecode.com/svn-history/r191/trunk/augustus/modellib/regression/producer/Producer.py )

I have read the Augustus documentation relevant to scoring to understand how it works, but I am unable to solve this problem.

A sample PMML file was generated from the cars data in R, where "dist" is the dependent variable and "speed" is the independent variable. I want to predict dist whenever I receive a new value for speed, using the equation dist = -17.5790948905109 + speed * 3.93240875912408. I know this is easy in R with the predict function, but I don't have R on the back end; only Python with Augustus is available for scoring. Any help is much appreciated. Thanks in advance.
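For reference, the fitted equation can be evaluated directly in plain Python. This is only a sketch of the arithmetic a scorer has to reproduce; it does not read the PMML file or use Augustus, and the names `INTERCEPT`, `SLOPE`, and `predict_dist` are illustrative, not part of any library.

```python
# Coefficients copied from the equation in the question.
INTERCEPT = -17.5790948905109
SLOPE = 3.93240875912408


def predict_dist(speed):
    """Return the predicted stopping distance for one speed value."""
    return INTERCEPT + SLOPE * speed


# Score a few incoming speed values, one at a time.
for speed in (4.0, 10.0, 25.0):
    print(speed, predict_dist(speed))
```

Hard-coding the coefficients defeats the purpose of a PMML file, so this is useful mainly as a sanity check for whatever scorer ends up being used.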

Sample PMML file:

     <?xml version="1.0"?>
     <PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1"     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_1 http://www.dmg.org/v4-1/pmml-4-1.xsd">
         <Header copyright="Copyright (c) 2013 user" description="Linear Regression Model">
          <Extension name="user" value="user" extender="Rattle/PMML"/>
          <Application name="Rattle/PMML" version="1.4"/>
          <Timestamp>2013-11-07 09:24:06</Timestamp>
         </Header>
        <DataDictionary numberOfFields="2">
         <DataField name="dist" optype="continuous" dataType="double"/>
         <DataField name="speed" optype="continuous" dataType="double"/>
        </DataDictionary>
        <RegressionModel modelName="Linear_Regression_Model" functionName="regression"   algorithmName="least squares">
         <MiningSchema>
          <MiningField name="dist" usageType="predicted"/>
          <MiningField name="speed" usageType="active"/>
         </MiningSchema>
         <Output>
          <OutputField name="Predicted_dist" feature="predictedValue"/>
         </Output>
         <RegressionTable intercept="-17.5790948905109">
          <NumericPredictor name="speed" exponent="1" coefficient="3.93240875912408"/>
         </RegressionTable>
        </RegressionModel>
     </PMML>

Recommended answer

You could use PyPMML to score the PMML model in Python, for example:

from pypmml import Model

model = Model.fromString('''<?xml version="1.0"?>
     <PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1"     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_1 http://www.dmg.org/v4-1/pmml-4-1.xsd">
         <Header copyright="Copyright (c) 2013 user" description="Linear Regression Model">
          <Extension name="user" value="user" extender="Rattle/PMML"/>
          <Application name="Rattle/PMML" version="1.4"/>
          <Timestamp>2013-11-07 09:24:06</Timestamp>
         </Header>
        <DataDictionary numberOfFields="2">
         <DataField name="dist" optype="continuous" dataType="double"/>
         <DataField name="speed" optype="continuous" dataType="double"/>
        </DataDictionary>
        <RegressionModel modelName="Linear_Regression_Model" functionName="regression"   algorithmName="least squares">
         <MiningSchema>
          <MiningField name="dist" usageType="predicted"/>
          <MiningField name="speed" usageType="active"/>
         </MiningSchema>
         <Output>
          <OutputField name="Predicted_dist" feature="predictedValue"/>
         </Output>
         <RegressionTable intercept="-17.5790948905109">
          <NumericPredictor name="speed" exponent="1" coefficient="3.93240875912408"/>
         </RegressionTable>
        </RegressionModel>
     </PMML>''')
result = model.predict({'speed': 1.0})

The result is a dict keyed by the output field name, Predicted_dist:

{'Predicted_dist': -13.646686131386819}
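If installing an extra package is not an option, a model this simple can also be checked with the standard library alone. The sketch below is not Augustus and not PyPMML: it assumes the file contains a single RegressionTable with only NumericPredictor terms, and it ignores everything else in the PMML specification. The helper names `load_regression` and `score` are hypothetical, and `PMML_TEXT` is a trimmed copy of the sample file with the Header and Output elements left out.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <PMML> root element of the sample file.
PMML_NS = {"pmml": "http://www.dmg.org/PMML-4_1"}

# Trimmed copy of the sample model (Header and Output omitted).
PMML_TEXT = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <DataDictionary numberOfFields="2">
    <DataField name="dist" optype="continuous" dataType="double"/>
    <DataField name="speed" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="Linear_Regression_Model" functionName="regression">
    <MiningSchema>
      <MiningField name="dist" usageType="predicted"/>
      <MiningField name="speed" usageType="active"/>
    </MiningSchema>
    <RegressionTable intercept="-17.5790948905109">
      <NumericPredictor name="speed" exponent="1" coefficient="3.93240875912408"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def load_regression(pmml_text):
    """Extract the intercept and numeric terms from the first RegressionTable."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", PMML_NS)
    intercept = float(table.get("intercept"))
    terms = [
        (p.get("name"), float(p.get("coefficient")), int(p.get("exponent", "1")))
        for p in table.findall("pmml:NumericPredictor", PMML_NS)
    ]
    return intercept, terms


def score(model, record):
    """Evaluate intercept + sum(coefficient * value ** exponent) for one record."""
    intercept, terms = model
    return intercept + sum(coef * record[name] ** exp for name, coef, exp in terms)


model = load_regression(PMML_TEXT)
print(score(model, {"speed": 1.0}))
```

Parsing the model once and calling `score` for each new record fits the "score after every new iteration of data" requirement, but anything beyond a plain linear regression (categorical predictors, transformations, multiple tables) needs a real PMML scorer.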
