Caret objecting to outcome labels: Error: At least one of the class levels is not a valid R variable name
Problem description
caret gives me the error below. I'm training an SVM for prediction starting from a bag of words and want to use caret to tune the C parameter, however:
bow.model.svm.tune <- train(Training.match ~ ., data = data.frame(
    Training.match = factor(Training.Data.old$Training.match, labels = c('no match', 'match')),
    Text.features.dtm.df) %>%
    filter(Training.Data.old$Data.tipe == 'train'),
  method = 'svmRadial',
  tuneLength = 9,
  preProc = c("center", "scale"),
  metric = "ROC",
  trControl = trainControl(
    method = "repeatedcv",
    repeats = 5,
    summaryFunction = twoClassSummary,
    classProbs = TRUE))
Error: At least one of the class levels is not a valid R variable name; This will cause errors when class probabilities are generated because the variables names will be converted to no.match, match . Please use factor levels that can be used as valid R variable names (see ?make.names for help).
The original e1071::svm() function doesn't cause any problems, so I suppose the error arises in the tuning phase:
bow.model.svm.tune <- svm(Training.match ~ ., data = data.frame(
    Training.match = factor(Training.Data.old$Training.match, labels = c('no match', 'match')),
    Text.features.dtm.df) %>%
    filter(Training.Data.old$Data.tipe == 'train'))
The data is simply an outcome factor variable plus a set of TF-IDF-transformed word vectors:
'data.frame': 1796 obs. of 1697 variables:
$ Training.match : Factor w/ 2 levels "no match","match": 2 1 1 1 1 1 1 1 2 1 ...
$ azienda : num 0.12 0 0 0 0 ...
$ bus : num 0.487 0 0 0 0 ...
$ locale : num 0.275 0 0 0 0 ...
$ martini : num 0.852 0.741 0.947 0.947 0.501 ...
$ osp : num 0.339 0 0 0 0 ...
$ ospedale : num 0.0389 0.0676 0.0864 0.0864 0.0915 ...
Recommended answer
When predicting (internally in train or when you call predict.train yourself), the functions create a new column for each class probability. If your code expects a column called "no match", it won't see "no.match" (which is what data.frame converts it to) and will throw an error.
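The renaming described above can be reproduced in base R without caret. The following is a minimal sketch, not the asker's actual pipeline: the object names outcome, probs and bad_levels are hypothetical, and the toy values only stand in for the real training data. It shows how data.frame() rewrites the "no match" column name and one possible way to relabel the factor before calling train().

```r
# Toy outcome with the same labels as in the question (values are made up).
outcome <- factor(c(1, 0, 0, 1), levels = c(0, 1),
                  labels = c("no match", "match"))

# make.names() shows what each level becomes as a syntactically valid R name.
make.names(levels(outcome))
# "no.match" "match"

# data.frame() applies the same conversion to column names by default
# (check.names = TRUE), so a column requested as "no match" is stored
# as "no.match".
probs <- data.frame("no match" = c(0.7, 0.2), match = c(0.3, 0.8))
names(probs)
# "no.match" "match"

# Levels that differ from their make.names() form are the ones the
# error message is about.
bad_levels <- levels(outcome)[levels(outcome) != make.names(levels(outcome))]

# One possible fix: relabel the factor with valid names before training.
levels(outcome) <- make.names(levels(outcome))
```

Equivalently, the labels could be chosen as valid names from the start, for example labels = c('no_match', 'match') in the factor() call from the question.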