Purpose of test data in supervised learning?


Question


This question may seem a little stupid, but I can't wrap my head around it. What is the purpose of test data? Is it only to calculate the accuracy of the classifier? I'm using Naive Bayes for sentiment analysis of tweets. Once I train my classifier on the training data, I use the test data just to calculate its accuracy. How can I use the test data to improve the classifier's performance?

Recommended answer


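In supervised machine learning, the test data set plays a critical role in determining how well your model is performing. You typically build a model with, say, 90% of your input data, leaving 10% aside for testing. You then check the accuracy of that model by seeing how well it does against the held-out 10% test set. The performance of the model against the test data is meaningful because the model has never "seen" this data. If the model is statistically valid, it should perform well on both the training and test data sets; a model that does well on the training data but poorly on the test data has most likely overfit. Note that the test data is there to measure performance, not to improve it: if you tune the classifier against the test set, the accuracy you measure becomes optimistically biased. This hold-out split is closely related to cross validation, which repeats the split several times over different partitions of the data, and you can read more about it here.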
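The 90/10 procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the asker's actual setup: the tweets in `TWEETS` are made up, and `holdout_split`, `NaiveBayes`, and `accuracy` are hypothetical names for a small from-scratch multinomial Naive Bayes with add-one smoothing.

```python
import math
import random
from collections import Counter, defaultdict

# Made-up labelled tweets, for illustration only.
TWEETS = [
    ("love this phone so much", "pos"), ("what a great day", "pos"),
    ("best coffee ever", "pos"), ("really happy with the service", "pos"),
    ("this movie was amazing", "pos"), ("great game last night", "pos"),
    ("i love my new job", "pos"), ("such a wonderful surprise", "pos"),
    ("happy and relaxed today", "pos"), ("amazing food and great people", "pos"),
    ("hate waiting in traffic", "neg"), ("what a terrible day", "neg"),
    ("worst coffee ever", "neg"), ("really angry about the service", "neg"),
    ("this movie was awful", "neg"), ("terrible game last night", "neg"),
    ("i hate my commute", "neg"), ("such a horrible mess", "neg"),
    ("sad and tired today", "neg"), ("awful food and rude people", "neg"),
]

def holdout_split(examples, test_fraction=0.1, seed=0):
    """Shuffle the examples and return (train, test)."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, examples):
        self.label_counts = Counter(label for _, label in examples)
        self.word_counts = defaultdict(Counter)
        for text, label in examples:
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(model.predict(text) == label for text, label in examples)
    return correct / len(examples)

# Train on 90% of the data; the remaining 10% is only used for scoring.
train_set, test_set = holdout_split(TWEETS, test_fraction=0.1)
model = NaiveBayes().fit(train_set)
print("train accuracy:", accuracy(model, train_set))
print("test accuracy:", accuracy(model, test_set))
```

With a data set this small the test accuracy is very noisy; the point is only that the model is fit on `train_set` alone and that `test_set` is touched once, for evaluation.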
