How to find the optimal values for splitting the data into test and train?

Views: 49 Published: 2021/7/7 18:55:58 Tags: python pandas regression


Problem description

I am building a Python application in which I want to forecast the values of PM2.5 over a month. I am using polynomial regression, and I split the data into 30% test data and 70% training data before training the model. I am using this line of code to split the data:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, shuffle=True)
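For context, the split above can be placed in a small end-to-end sketch. The original post does not show the data or the model code, so everything except the `train_test_split` call is an assumption: the synthetic `X` and `y` stand in for the PM2.5 data, and the degree-2 pipeline is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the PM2.5 data (assumption: a single feature,
# such as a day index, and a noisy quadratic target).
rng = np.random.default_rng(0)
X = np.linspace(0, 30, 200).reshape(-1, 1)
y = 50 + 2.0 * X.ravel() - 0.05 * X.ravel() ** 2 + rng.normal(0, 5, size=200)

# The split from the question: 70% train, 30% test, shuffled with a fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, shuffle=True
)

# Polynomial regression as a pipeline; degree=2 is only an illustrative choice.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X_train, y_train)

# Mean squared error on the held-out 30%.
mse = mean_squared_error(y_test, model.predict(X_test))
print(f"test MSE: {mse:.2f}")
```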

However, I have noticed that if I pass different integers as random_state, the mean squared error changes, and so does the accuracy of the forecast. How can I find the optimal parameters for the train_test_split method so that the forecast is as accurate as possible?
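The effect described in the question can be reproduced on synthetic data. In `train_test_split`, `random_state` only determines which rows land in the training and test parts, so it is not a model parameter to tune; choosing the seed with the lowest error would make the reported error optimistic. A common alternative is to average the error over several splits with cross-validation. The sketch below assumes the same hypothetical data and degree-2 model as a stand-in for the real PM2.5 setup, and the helper `make_model` is introduced only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the PM2.5 data (assumption, not the author's data).
rng = np.random.default_rng(0)
X = np.linspace(0, 30, 200).reshape(-1, 1)
y = 50 + 2.0 * X.ravel() - 0.05 * X.ravel() ** 2 + rng.normal(0, 5, size=200)


def make_model():
    """Hypothetical helper: a fresh degree-2 polynomial regression pipeline."""
    return make_pipeline(PolynomialFeatures(degree=2), LinearRegression())


# 1) Same data, same model, different seeds: the test MSE moves around
#    simply because different rows end up in the test set.
seed_mse = []
for seed in range(20):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, shuffle=True
    )
    model = make_model().fit(X_tr, y_tr)
    seed_mse.append(mean_squared_error(y_te, model.predict(X_te)))
seed_mse = np.array(seed_mse)
print(f"single splits: min={seed_mse.min():.2f} max={seed_mse.max():.2f}")

# 2) 5-fold cross-validation: every row is used for testing exactly once,
#    and the mean over folds is a steadier estimate than any single split.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    make_model(), X, y, cv=cv, scoring="neg_mean_squared_error"
)
cv_mse = -scores  # scikit-learn reports negated MSE so that higher is better
print(f"5-fold CV: mean MSE={cv_mse.mean():.2f} std={cv_mse.std():.2f}")
```

Because PM2.5 readings are ordered in time, a shuffled split may let the model see observations from after the period it is tested on. If that matters for the forecast, scikit-learn's `TimeSeriesSplit` can be passed as `cv` instead of `KFold`; whether it fits this dataset is an assumption to check against the real data.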
