Interpreting basic output from Vowpal Wabbit


Problem description

I have a couple of questions about the output from a simple run of Vowpal Wabbit (VW). I have read around the internet and the wiki, but I am still unsure about a couple of basics.

I ran the following on the Boston housing data:

vw -d housing.vm --progress 1

where the housing.vm file is set up as (partially):

and output is (partially):

Questions:

1) Is it correct to think of the average loss column as the result of the following steps?

a) Predict zero, so the first average loss is the squared error of the first example (with zero as the prediction).

b) Build a model on example 1 and predict example 2. Average the two squared losses so far.

c) Build a model on examples 1-2 and predict example 3. Average the three squared losses so far.

d) ...

Repeat this until the end of the data (assuming a single pass).

2) What is the current features column? It appears to be the number of non-zero features plus an intercept. The output suggests that a feature is not counted if its value is zero. Is that true? For instance, the second record has a value of zero for 'ZN'. Does VW really treat that numeric feature as missing?

Solution

Your statements are basically correct. By default, VW does online learning, so in step c) it does not rebuild a model from examples 1-2. It takes the current model (the weights) and updates it with the current example, rather than learning from all the previous examples again.
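The predict-then-update loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it uses a plain gradient step on squared loss with a fixed learning rate, whereas VW's default update rule is more elaborate, so the numbers will not match real VW output. The function name, the "Constant" key and the learning rate are hypothetical choices made for this sketch.

```python
def progressive_average_loss(examples, learning_rate=0.5):
    """Return the running average squared loss after each example.

    `examples` is a list of (label, features) pairs, where `features`
    maps feature names to numeric values. Every example is predicted
    with the current weights *before* those weights are updated on it.
    """
    weights = {}          # all weights start at zero, so the first prediction is 0
    total_loss = 0.0
    averages = []
    for count, (label, features) in enumerate(examples, start=1):
        x = dict(features)
        x["Constant"] = 1.0   # hypothetical name for the intercept feature

        # 1. Predict with the model learned from the earlier examples only.
        prediction = sum(weights.get(name, 0.0) * value for name, value in x.items())

        # 2. Accumulate the squared loss and record the running average.
        total_loss += (prediction - label) ** 2
        averages.append(total_loss / count)

        # 3. Update the current weights with this one example
        #    (plain gradient step; the factor 2 is folded into the learning rate).
        error = prediction - label
        for name, value in x.items():
            weights[name] = weights.get(name, 0.0) - learning_rate * error * value
    return averages
```

Because each loss is measured before the model has seen the example, the final value of this running average behaves like an out-of-sample error estimate for a single pass.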

As you supposed, the current features column is the number of (non-zero) features for the current example. The intercept feature is included automatically, unless you specify --noconstant.

There is no difference between a missing feature and a feature with a value of zero. Either way, the feature contributes nothing to the prediction and the corresponding weight is not updated for that example.
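A small Python sketch of both points, the feature count and the equivalence of zero and missing values. It mirrors the behaviour described in this answer rather than VW's actual code; the function names and the `noconstant` parameter (named after the --noconstant flag) are hypothetical, and the record values are illustrative only.

```python
def current_features(features, noconstant=False):
    """Count non-zero features, plus one for the intercept unless disabled."""
    nonzero = sum(1 for value in features.values() if value != 0.0)
    return nonzero if noconstant else nonzero + 1


def weight_updates(features, error, learning_rate=0.5):
    """Per-feature change from a plain gradient step on squared loss.

    The change is proportional to the feature value, so a zero-valued
    feature yields a zero update, exactly as if it were absent.
    """
    return {name: -learning_rate * error * value for name, value in features.items()}


# Illustrative record with a zero value for 'ZN'.
record = {"CRIM": 0.02731, "ZN": 0.0, "INDUS": 7.07}
print(current_features(record))                   # two non-zero features + intercept
print(current_features(record, noconstant=True))  # non-zero features only
print(weight_updates(record, error=1.0)["ZN"] == 0.0)
```

One practical consequence: if zero is a meaningful value that should be distinguishable from "absent", it has to be encoded differently, for example as a separate indicator feature.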
