IndexError when fitting SSVM model in PyStruct

Problem description

I'm using the pystruct Python module for a structured learning problem, classifying posts in discussion threads, and I've run into an issue when trying to train the OneSlackSSVM for use with the LinearChainCRF. I'm following the OCR example from the docs, but can't seem to call the .fit() method on the SSVM. Here is the error I'm getting:

Traceback (most recent call last):

  File "<ipython-input-47-da804d135818>", line 1, in <module>
    ssvm.fit(X_train, y_train)

  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/learners/one_slack_ssvm.py", line 429, in fit
    joint_feature_gt = self.model.batch_joint_feature(X, Y)

  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/base.py", line 40, in batch_joint_feature
    joint_feature_ += self.joint_feature(x, y)

  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/graph_crf.py", line 197, in joint_feature
    unary_marginals[gx, y] = 1

IndexError: index 7 is out of bounds for axis 1 with size 7

Below is the code I've written. I've tried to structure the data as in the docs example, where the overall data structure is a dict with keys for data, labels, and folds.

from pystruct.models import LinearChainCRF
from pystruct.learners import OneSlackSSVM

# Printing out keys of overall data structure
print threads.keys()
# Output: ['folds', 'labels', 'data']

# Creating instances of models
crf = LinearChainCRF()
ssvm = OneSlackSSVM(model=crf)

# Splitting up data into training and test sets as in example
X, y, folds = threads['data'], threads['labels'], threads['folds']
X_train, X_test = X[folds == 1], X[folds != 1]
y_train, y_test = y[folds == 1], y[folds != 1]

# Print out dimensions of first element in data and labels
print X[0].shape, y[0].shape
# Output: (8, 211), (8,)

# Fitting the ssvm model
ssvm.fit(X_train, y_train)
# Raises the IndexError shown above

Directly after trying to fit the model, I get the above error. All instances of X_train, X_test, y_train, and y_test have 211 columns and all the label dimensions appear to match up with their corresponding training and testing data. Any help would be greatly appreciated.

Recommended answer

I think everything you are doing is right; this is pystruct issue #114 (https://github.com/pystruct/pystruct/issues/114). Your labels y need to run from 0 to n_labels - 1. I think yours start at 1, which is why label 7 is out of bounds for an axis of size 7.
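
If the labels do start at 1, one way to fix this is to remap them onto a zero-based range before fitting. The following is only a minimal sketch using NumPy, not code from pystruct or from the linked issue: the helper name remap_labels and the y_demo sequences are hypothetical, and it assumes each element of y is a 1-D array of integer labels, as in the question.

```python
import numpy as np


def remap_labels(y_sequences):
    """Map the labels in a collection of 1-D label arrays onto 0..n_labels-1.

    Hypothetical helper for illustration; returns the remapped sequences
    and the sorted array of original label values.
    """
    # Collect every distinct label value across all sequences (sorted).
    classes = np.unique(np.concatenate([np.asarray(y) for y in y_sequences]))
    # Each label's position in the sorted class array is its new index.
    remapped = [np.searchsorted(classes, np.asarray(y)) for y in y_sequences]
    return remapped, classes


# Made-up label sequences that start at 1, standing in for threads['labels'].
y_demo = [np.array([1, 3, 7]), np.array([2, 2, 5, 7])]
y_zero_based, classes = remap_labels(y_demo)

# The smallest remapped label is 0 and the largest is n_labels - 1.
lowest = min(int(y.min()) for y in y_zero_based)
highest = max(int(y.max()) for y in y_zero_based)
```

With the question's data, the same helper would be applied to all of y before splitting into y_train and y_test, so that both splits share one mapping. Indexing classes with a predicted label array maps predictions back to the original label values.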
