How to retrain the model for a sequence of files in Vowpal Wabbit

Problem description

I am trying to run Vowpal Wabbit on a set of files (approximately 10 as of now). My experiment is as follows:

  1. Convert the first training file to VW format.

  2. Train the VW model with this first training file and store the model.

  3. Validate the accuracy on the test file with the stored model.

  4. Take the second file, convert it to VW format, retrain the model stored in step 2 with this second file, and store the updated model.

  5. Validate the test file on the retrained model and report the accuracy.

  6. Repeat steps 4-5 for the remaining files using a for loop (the test file is the same in each iteration).
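The steps above can be sketched as a shell loop. This is only an illustration, not the asker's script: the file names follow the pattern in the commands quoted further down, while the `retrain_sequence` function and the `VW` variable are hypothetical names introduced here. By default the sketch is a dry run that prints each `vw` command instead of executing it. It passes the test file with an explicit `-d` and omits `-i` on the first iteration, since no earlier model exists yet.

```shell
#!/bin/sh
# Sketch of steps 1-6. retrain_sequence and VW are illustrative names.
# VW defaults to "echo vw" (dry run); set VW=vw to actually run Vowpal Wabbit.
set -eu

retrain_sequence() {
  n="$1"                                   # number of daily training files
  test_file="10_processed_binary_compressed_test.vw"
  i=1
  while [ "$i" -le "$n" ]; do
    data="${i}_processed_binary_compressed.vw"
    model="${i}_processed_binary_compressed.model.vw"

    # Only load a previous model from the second file onwards.
    init=""
    if [ "$i" -gt 1 ]; then
      init="-i $((i - 1))_processed_binary_compressed.model.vw"
    fi

    # Train (or continue training) and save the updated model.
    ${VW:-echo vw} -d "$data" --loss_function logistic $init \
      --save_resume --quiet -f "$model"

    # Test-only pass (-t) over the fixed test file with the current model.
    ${VW:-echo vw} -t -d "$test_file" --loss_function logistic -i "$model" --quiet \
      -p "10_processed_binary_compressed_test_${i}_day_result.csv" \
      -r "10_processed_binary_compressed_test_${i}_day_raw.txt"

    i=$((i + 1))
  done
}

retrain_sequence 10
```

Running the script as-is prints the twenty commands for inspection; running it with `VW=vw` in the environment executes them, assuming `vw` is on the `PATH` and the data files exist.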

When I ran this experiment I got an error. Below are the train, retrain, and validation commands, along with the error.

Can anyone help me run this workflow without the error?

Commands:

Here i ranges from 1 to 10.

idec = i - 1 (the index of the previous model)

vw -d ${i}_processed_binary_compressed.vw --loss_function logistic -i ${idec}_processed_binary_compressed.model.vw --quiet --save_resume -f ${i}_processed_binary_compressed.model.vw

echo
echo "Model training completed for day_$i"

echo "${i}_day model validation is under progress..."
echo

vw 10_processed_binary_compressed_test.vw -t -i ${i}_processed_binary_compressed.model.vw --quiet --hash strings -p 10_processed_binary_compressed_test_${i}_day_result.csv -r 10_processed_binary_compressed_test_${i}_day_raw.txt

Error:

vw: option '--data' cannot be specified more than once

Solution

I cannot replicate the problem (but TOC_cmi asked me to paste the commands I used):

git clone https://github.com/JohnLangford/vowpal_wabbit.git
cd vowpal_wabbit
make
cd test/train-sets

vw -d rcv1_smaller.dat --loss_function=logistic --save_resume -f day1.model
vw -d rcv1_small.dat --loss_function=logistic --save_resume -i day1.model -f day2.model
vw -t -d rcv1_smaller.dat --loss_function=logistic -i day2.model -p day2.predictions -r day2.raw
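The last command writes predictions to a file but does not report the accuracy that steps 3 and 5 of the question ask for. A minimal sketch of one way to compute it follows, under several assumptions that are not stated in the original: labels are -1 or 1 and appear as the first whitespace-separated field of each example line, the predictions file holds one raw score per line in the same order, and a score above 0 counts as class 1. The `accuracy` function name is hypothetical.

```shell
#!/bin/sh
# accuracy TEST_FILE PREDICTIONS_FILE
# Assumes -1/1 labels in field 1 of TEST_FILE and raw scores in field 1 of
# PREDICTIONS_FILE, thresholded at 0. Prints the fraction of matching lines.
accuracy() {
  awk '
    NR == FNR { label[FNR] = $1 + 0; next }      # first file: remember labels
    {                                            # second file: score -> class
      pred = ($1 > 0) ? 1 : -1
      if (pred == label[FNR]) ok++
    }
    END { if (FNR) printf "%.4f\n", ok / FNR }
  ' "$1" "$2"
}
```

With the files from the commands above, this would be called as `accuracy rcv1_smaller.dat day2.predictions`. If the model is trained or tested with a link function that maps scores to probabilities, the threshold would need to change accordingly.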
