Get actual field name from JPMML model's InputField

Question

I have a scikit-learn model that I'm using in my Java app via JPMML. I'm trying to set the InputFields using the names of the columns that were used during training, but "inField.getName().getValue()" returns obfuscated names of the form "x{#}". Is there any way I could map "x{#}" back to the original feature/attribute names?

Map<FieldName, FieldValue> arguments = new LinkedHashMap<>();
for (InputField inField : patternEvaluator.getInputFields()) {
    // 1 if the feature is active, 0 otherwise; the lookup key is the PMML field name
    int value = activeFeatures.contains(inField.getName().getValue()) ? 1 : 0;
    FieldValue inputFieldValue = inField.prepare(value);
    arguments.put(inField.getName(), inputFieldValue);
}
Map<FieldName, ?> results = patternEvaluator.evaluate(arguments);

This is how I generate the model:

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn2pmml import PMMLPipeline, sklearn2pmml

data = pd.read_csv('/pydata/training.csv')
# Every column except the last is a feature; .values yields plain NumPy arrays
X = data[data.keys()[:-1]].values
y = data['classname'].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

estimators = [("read", RandomForestClassifier(n_jobs=5, n_estimators=200, max_features='auto'))]
pipe = PMMLPipeline(estimators)
pipe.fit(X_train, y_train)
# Attempt to attach the original column names to the fitted pipeline
pipe.active_fields = np.array(data.columns)
sklearn2pmml(pipe, "/pydata/model.pmml", with_repr = True)

Thanks

Recommended answer

Does the PMML document contain the actual field names at all? Open it in a text editor and check the values of the /PMML/DataDictionary/DataField@name attributes.
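If opening the file by hand is inconvenient, the same check can be scripted. The following is a minimal sketch using only the Python standard library; the helper names `local_name` and `data_field_names` and the trimmed `SAMPLE_PMML` document are made up for illustration and are not part of JPMML or sklearn2pmml.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as '{http://www.dmg.org/PMML-4_3}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def data_field_names(pmml_text):
    """Return the name attribute of every /PMML/DataDictionary/DataField element."""
    root = ET.fromstring(pmml_text)
    names = []
    for child in root:
        if local_name(child.tag) != "DataDictionary":
            continue
        for field in child:
            if local_name(field.tag) == "DataField":
                names.append(field.get("name"))
    return names


# Hypothetical, heavily trimmed document used only to exercise the function.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary>
    <DataField name="classname" optype="categorical" dataType="string"/>
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="x2" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>"""

print(data_field_names(SAMPLE_PMML))
```

To inspect the real model, the text of "/pydata/model.pmml" could be read and passed to `data_field_names`. If the result contains only names like x1, x2, ..., the original column names never made it into the PMML document.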

Your question indicates that the conversion from Scikit-Learn to PMML was incomplete, because it didn't include information about active field (aka input field) names. In that case they are assumed to be x1, x2, .., xn.
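As a stopgap, the placeholders can be mapped back by position. The sketch below assumes that xi corresponds to the i-th feature column in training order (1-based), which is an assumption worth verifying against your own model. The function name `placeholder_to_column` and the example column names are hypothetical.

```python
def placeholder_to_column(feature_columns):
    """Map 'x1', 'x2', ... to the training columns, assuming 1-based positional order."""
    return {f"x{i}": name for i, name in enumerate(feature_columns, start=1)}


# Hypothetical column names standing in for the header of training.csv.
columns = ["age", "income", "has_account", "classname"]
feature_columns = columns[:-1]  # the last column is the target, as in the question

mapping = placeholder_to_column(feature_columns)
print(mapping["x2"])
```

A table like this could be exported and loaded by the Java app to translate inField.getName().getValue() before the activeFeatures lookup. Regenerating the PMML so that it carries the real field names is likely the cleaner fix, since it removes the dependence on column order.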
