Convert .data file to .csv


Problem description

I found a data set named ecoli.data, available at:

https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli/

I would like to open it in R for a classification task, but I would prefer to convert the file to CSV first. When I open it in Word, I notice that it is not tab-delimited: there are about three spaces between the values in each row. So the bottom-line question is how to convert this file to CSV using Excel or maybe Python.

Solution

Here are two ways to do that in R (both work):

library(readr)

url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/ecoli/ecoli.data"

With base R:

df <- read.table(url)
dplyr::glimpse(df)

## Observations: 336
## Variables:
## $ V1 (fctr) AAT_ECOLI, ACEA_ECOLI, ACEK_ECOLI, ACKA_ECOLI, ADI_ECOLI, ...
## $ V2 (dbl) 0.49, 0.07, 0.56, 0.59, 0.23, 0.67, 0.29, 0.21, 0.20, 0.42,...
## $ V3 (dbl) 0.29, 0.40, 0.40, 0.49, 0.32, 0.39, 0.28, 0.34, 0.44, 0.40,...
## $ V4 (dbl) 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48,...
## $ V5 (dbl) 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,...
## $ V6 (dbl) 0.56, 0.54, 0.49, 0.52, 0.55, 0.36, 0.44, 0.51, 0.46, 0.56,...
## $ V7 (dbl) 0.24, 0.35, 0.37, 0.45, 0.25, 0.38, 0.23, 0.28, 0.51, 0.18,...
## $ V8 (dbl) 0.35, 0.44, 0.46, 0.36, 0.35, 0.46, 0.34, 0.39, 0.57, 0.30,...
## $ V9 (fctr) cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp, cp...

write.csv(df, "ecoli.csv", row.names=FALSE)

With readr functions:

df <- read_table(url, col_names=FALSE)
dplyr::glimpse(df)

## Observations: 336
## Variables:
## $ X1 (chr) "AAT_ECOLI", "ACEA_ECOLI", "ACEK_ECOLI", "ACKA_ECOLI", "ADI...
## $ X2 (dbl) 0.49, 0.07, 0.56, 0.59, 0.23, 0.67, 0.29, 0.21, 0.20, 0.42,...
## $ X3 (dbl) 0.29, 0.40, 0.40, 0.49, 0.32, 0.39, 0.28, 0.34, 0.44, 0.40,...
## $ X4 (dbl) 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48, 0.48,...
## $ X5 (dbl) 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,...
## $ X6 (dbl) 0.56, 0.54, 0.49, 0.52, 0.55, 0.36, 0.44, 0.51, 0.46, 0.56,...
## $ X7 (dbl) 0.24, 0.35, 0.37, 0.45, 0.25, 0.38, 0.23, 0.28, 0.51, 0.18,...
## $ X8 (dbl) 0.35, 0.44, 0.46, 0.36, 0.35, 0.46, 0.34, 0.39, 0.57, 0.30,...
## $ X9 (chr) "cp", "cp", "cp", "cp", "cp", "cp", "cp", "cp", "cp", "cp",...

write_csv(df, "ecoli.csv")
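Since the question also mentions Python, here is a minimal sketch of the same conversion using only the standard library. It assumes the file is plain text with fields separated by runs of spaces and no spaces inside a field, which matches what the R output above suggests. The helper name whitespace_to_csv and the two sample rows are illustrative, not part of the original answer; the sample rows are built from the first two values of each column shown in the output above.

```python
import csv
import io


def whitespace_to_csv(src, dst):
    """Copy whitespace-delimited rows from src to dst as CSV; return the row count."""
    writer = csv.writer(dst)
    count = 0
    for line in src:
        fields = line.split()  # no argument: split on any run of spaces or tabs
        if not fields:         # skip blank lines
            continue
        writer.writerow(fields)
        count += 1
    return count


# Illustrative sample in the same layout as ecoli.data (hypothetical spacing).
sample = (
    "AAT_ECOLI   0.49  0.29  0.48  0.5  0.56  0.24  0.35   cp\n"
    "ACEA_ECOLI  0.07  0.40  0.48  0.5  0.54  0.35  0.44   cp\n"
)
out = io.StringIO()
n_rows = whitespace_to_csv(io.StringIO(sample), out)
csv_text = out.getvalue()
print(csv_text)
```

To convert a downloaded copy of the real file, the same function can be called with open("ecoli.data") as the source and open("ecoli.csv", "w", newline="") as the destination. Like the R versions above, this writes no meaningful header row, because the data file itself does not contain column names.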
