Logistic regression - eval(family$initialize) : y values must be 0 <= y <= 1
Problem description
I am trying to perform logistic regression in R on the breast cancer dataset provided here: http://archive.ics.uci.edu/ml/machine-learning-databases/00451/ The dataset contains a column Classification whose only values are 1 (if the patient doesn't have cancer) and 2 (if the patient has cancer).
library(ISLR)
dataCancer <- read.csv("~/Desktop/Isep/Machine Leaning/TD/Project_Cancer/dataR2.csv")
attach(dataCancer)
names(dataCancer)
summary(dataCancer)
cor(dataCancer[,-11])
pairs(dataCancer[,-11])
#Step : Split data into training and testing data
training = (BMI>25)
testing = !training
training_data = dataCancer[training,]
testing_data = dataCancer[testing,]
Classification_testing = Classification[testing]
#Step : Fit a logistic regression model using training data
as.factor(dataCancer$Classification)
classification_model = glm(Classification ~ ., data =
training_data,family = binomial )
summary(classification_model)
When running my script I get :
> classification_model = glm(Classification ~ ., data = training_data,family = binomial )
Error in eval(family$initialize) : y values must be 0 <= y <= 1
> summary(classification_model)
Error in summary(classification_model) : object 'classification_model' not found .
I added as.factor(dataCancer$Classification), as seen in other posts, but it has not solved my problem. Can you suggest a way to get Classification values between 0 and 1, given what this column contains? Thanks for your help.
Recommended answer
You added as.factor(dataCancer$Classification) to the script, but even with the dataset dataCancer attached, a command like that does not convert the dataset variable Classification into a factor. It only returns a factor, which is printed on the console; the result is never assigned back to the data frame.
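To make the point above concrete, here is a minimal sketch on a made-up data frame. The name toy and its values are assumptions for illustration only and are not taken from the real dataset.

```r
# Toy data frame standing in for dataCancer (values are made up).
toy <- data.frame(
  Classification = c(1, 2, 2, 1),
  BMI = c(23.5, 27.1, 30.2, 21.8)
)

# Calling as.factor() without assigning only produces a value;
# the column inside toy is left untouched.
as.factor(toy$Classification)
class_before <- class(toy$Classification)  # still "numeric"

# Assigning the result back is what actually changes the column.
toy$Classification <- as.factor(toy$Classification)
class_after <- class(toy$Classification)   # now "factor"

print(class_before)
print(class_after)
```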
Since you want to fit the model on the training dataset, you can either convert the column first
training_data$Classification <- as.factor(training_data$Classification)
classification_model <- glm(Classification ~ ., data =
training_data, family = binomial)
or use the as.factor function directly in the glm line of code
classification_model <- glm(as.factor(Classification) ~ ., data =
training_data, family = binomial)
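As background on why this works: the binomial family rejects a numeric response outside the range 0 to 1, which is what the values 1 and 2 trigger. With a factor response, glm treats the first level as failure and any other level as success. Recoding the labels to 0/1 by subtracting 1 is therefore another option. The sketch below uses simulated data; toy, y01, fit_numeric and fit_factor are hypothetical names, and the single BMI predictor is only a stand-in for the real columns.

```r
set.seed(1)

# Simulated stand-in for training_data (not the real dataset).
n <- 100
toy <- data.frame(BMI = rnorm(n, mean = 27, sd = 4))
# Labels coded 1/2 as in the question, loosely tied to BMI.
toy$Classification <- ifelse(toy$BMI + rnorm(n, sd = 4) > 27, 2, 1)

# A numeric 1/2 response reproduces the error from the question.
err <- tryCatch(
  glm(Classification ~ BMI, data = toy, family = binomial),
  error = function(e) conditionMessage(e)
)
print(err)

# Option A: recode 1/2 to 0/1, so that 1 now means the original label 2.
toy$y01 <- toy$Classification - 1
fit_numeric <- glm(y01 ~ BMI, data = toy, family = binomial)

# Option B: factor response; the first level ("1") is treated as failure.
fit_factor <- glm(as.factor(Classification) ~ BMI, data = toy,
                  family = binomial)

# Both codings describe the same model, so the coefficients agree.
same_fit <- isTRUE(all.equal(unname(coef(fit_numeric)),
                             unname(coef(fit_factor))))
print(same_fit)
```

Either way, the fitted probabilities refer to the second label (Classification equal to 2), which is worth keeping in mind when interpreting the coefficients.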