Train-test Split of a CSV file in Python

Views: 681 Published: 2020/5/4 10:28:22 python python-3.x machine-learning data-science

Problem description

I have a .csv file that contains my data. I would like to run Logistic Regression, Naive Bayes and Decision Trees on it. I already know how to implement these.

However, my teacher wants me to train on 80% of the data in my .csv file and let my algorithms predict the other 20%. I would like to know how to actually split the data in that way.

import pandas as pd

diabetes_df = pd.read_csv("diabetes.csv")
diabetes_df.head()

with open("diabetes.csv", "rb") as f:
    data = f.read().split()
    train_data = data[:80]
    test_data = data[20:]

I tried to split it like this, but I'm sure it isn't working.

Recommended answer

Workflow

Load the data (see "How do I read and write CSV files with Python?")
Preprocess the data (e.g. filtering / creating new features)
Make the train-test split (plus a validation / dev set if you need one)
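
The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: the inline CSV text stands in for diabetes.csv, and the column names (glucose, bmi, outcome) and the derived feature are hypothetical, not taken from the original file.

```python
import io

import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for diabetes.csv; these column names are hypothetical.
csv_text = """glucose,bmi,outcome
148,33.6,1
85,26.6,0
183,23.3,1
89,28.1,0
137,43.1,1
116,25.6,0
78,31.0,1
115,,0
197,30.5,1
125,27.9,0
110,37.6,0
"""

# 1. Load the data (pass a path such as "diabetes.csv" instead of StringIO).
df = pd.read_csv(io.StringIO(csv_text))

# 2. Preprocess: drop rows with missing values, then add a derived feature.
df = df.dropna().reset_index(drop=True)
df["glucose_bmi_ratio"] = df["glucose"] / df["bmi"]

# 3. Split: separate the features from the label, then hold out 20% of rows.
X = df.drop(columns=["outcome"])
y = df["outcome"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

print(len(X_train), len(X_test))
```

Splitting after preprocessing keeps the row filtering consistent across both parts; the rows are shuffled before the split, so the result does not depend on the order of the file.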

Code

scikit-learn's sklearn.model_selection.train_test_split is what you are looking for:

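from sklearn.model_selection import train_test_split

# X holds the feature columns and y the labels, both loaded from the CSV.
# test_size is the fraction held out for testing; use 0.2 for an 80/20 split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)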
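
For the 80/20 case in the question, a minimal sketch might look like the one below. It uses synthetic NumPy data in place of the CSV (an assumption made so the example runs on its own), sets test_size=0.2, and fits a LogisticRegression as one of the models mentioned in the question. The stratify=y argument is an optional addition that keeps the class proportions similar in both parts.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV data: 100 rows, 3 features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)

# test_size=0.2 keeps 80% of the rows for training and holds out 20%.
# stratify=y keeps the class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Fit on the 80% and predict the held-out 20%.
model = LogisticRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = model.score(X_test, y_test)

print(X_train.shape, X_test.shape)
print(f"accuracy on the held-out 20%: {accuracy:.2f}")
```

The same split can be reused for Naive Bayes and Decision Trees by swapping the model class, so that all three are evaluated on the same held-out rows.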
