IndexError when fitting SSVM model in PyStruct
Question
I'm using the pystruct Python module for a structured learning problem: classifying posts in discussion threads. I've run into an issue when trying to train the OneSlackSSVM for use with the LinearChainCRF. I'm following the OCR example from the docs, but can't seem to call the .fit() method on the SSVM. Here is the error I'm getting:
Traceback (most recent call last):
  File "<ipython-input-47-da804d135818>", line 1, in <module>
    ssvm.fit(X_train, y_train)
  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/learners/one_slack_ssvm.py", line 429, in fit
    joint_feature_gt = self.model.batch_joint_feature(X, Y)
  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/base.py", line 40, in batch_joint_feature
    joint_feature_ += self.joint_feature(x, y)
  File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/graph_crf.py", line 197, in joint_feature
    unary_marginals[gx, y] = 1
IndexError: index 7 is out of bounds for axis 1 with size 7
Below is the code I've written. I've tried to structure the data as in the docs example, where the overall data structure is a dict with keys for data, labels, and folds.
from pystruct.models import LinearChainCRF
from pystruct.learners import OneSlackSSVM

# Printing out keys of overall data structure
print threads.keys()
# output: ['folds', 'labels', 'data']

# Creating instances of models
crf = LinearChainCRF()
ssvm = OneSlackSSVM(model=crf)

# Splitting up data into training and test sets as in example
X, y, folds = threads['data'], threads['labels'], threads['folds']
X_train, X_test = X[folds == 1], X[folds != 1]
y_train, y_test = y[folds == 1], y[folds != 1]

# Print out dimensions of first element in data and labels
print X[0].shape, y[0].shape
# output: (8, 211), (8,)

# Fitting the ssvm model
ssvm.fit(X_train, y_train)
# output: see error above
Directly after trying to fit the model, I get the above error. All instances of X_train, X_test, y_train, and y_test have 211 columns, and all the label dimensions appear to match up with their corresponding training and testing data. Any help would be greatly appreciated.
Recommended answer
I think everything you are doing is right; this is https://github.com/pystruct/pystruct/issues/114. Your labels y need to run from 0 to n_labels - 1. I think yours start at 1, which would explain the error: a label of 7 cannot index an axis of size 7.
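One way to get zero-based labels is sketched below. This is only an illustration, not part of pystruct: the helper name relabel_zero_based and the toy data are made up for this example, and it assumes each label sequence is an integer array, as in threads['labels'].

```python
import numpy as np


def relabel_zero_based(y_sequences):
    """Map the labels of all sequences onto 0 .. n_labels - 1.

    Returns the relabelled sequences and the sorted original labels, so
    classes[new_label] recovers the original label.
    """
    sequences = [np.asarray(seq) for seq in y_sequences]
    classes = np.unique(np.concatenate(sequences))
    # classes is sorted, so searchsorted gives each label's rank in it.
    relabelled = [np.searchsorted(classes, seq) for seq in sequences]
    return relabelled, classes


# Hypothetical toy labels standing in for threads['labels'].
y_toy = [np.array([1, 3, 7]), np.array([2, 2, 7, 1])]
y_zero_based, classes = relabel_zero_based(y_toy)
```

If the labels are simply 1 to n with no gaps, subtracting 1 from every sequence has the same effect. The mapping should be computed over all sequences (training and test together) so that both folds share the same label indices.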