How to read multiple lines of a file into one row of a dataframe
Question
I have a data file where individual samples are separated by a blank line and each field is on its own line:
age 20
weight 185
height 72

age 87
weight 109
height 60

age 15
weight 109
height 58
...
How can I read this file into a dataframe such that each row represents a sample, with columns for age, weight, and height?
  age weight height
1  20    185     72
2  87    109     60
3  15    109     58
...
Recommended answer
@user1317221_G showed the approach I would take, but resorted to loading an extra package and explicitly generating the groups. The groups (the ID variable) are the key to getting any reshape-type answer to work. The matrix answers don't have that limitation.
Here's a closely related approach in base R:
mydf <- read.table(header = FALSE, stringsAsFactors=FALSE,
text = "age 20
weight 185
height 72
age 87
weight 109
height 60
age 15
weight 109
height 58
")
# Create your id variable: number each occurrence of a field name,
# so the n-th "age", "weight" and "height" all get id n
mydf <- within(mydf, {
  id <- ave(V1, V1, FUN = seq_along)
})
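To make the grouping explicit, here is a minimal sketch that rebuilds the same data by hand and shows the ids that ave() produces. It assumes every sample lists each field exactly once, so that the n-th occurrence of a field name belongs to the n-th sample. The as.integer() call is only for display, since ave() returns the ids as character when V1 is character.

```r
# Same data as above, built directly instead of via read.table()
mydf <- data.frame(
  V1 = rep(c("age", "weight", "height"), times = 3),
  V2 = c(20, 185, 72, 87, 109, 60, 15, 109, 58),
  stringsAsFactors = FALSE
)

# Within each field name, count occurrences: 1st, 2nd, 3rd, ...
ids <- ave(mydf$V1, mydf$V1, FUN = seq_along)

as.integer(ids)
# [1] 1 1 1 2 2 2 3 3 3
```

If a sample were missing a field, this counting would drift out of step with the real samples, so the assumption is worth checking on real data.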
With an id variable, your transformation is easy:
reshape(mydf, direction = "wide",
        idvar = "id", timevar = "V1")
#   id V2.age V2.weight V2.height
# 1  1     20       185        72
# 4  2     87       109        60
# 7  3     15       109        58
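The reshape() result keeps the "V2." prefix and the original row numbers. If plain column names are preferred, one possible cleanup is sketched below; the object name wide is just an illustrative choice, and the data is rebuilt by hand so the snippet runs on its own.

```r
# Rebuild the long-format data with its id variable
mydf <- data.frame(
  V1 = rep(c("age", "weight", "height"), times = 3),
  V2 = c(20, 185, 72, 87, 109, 60, 15, 109, 58),
  stringsAsFactors = FALSE
)
mydf$id <- ave(mydf$V1, mydf$V1, FUN = seq_along)

# Reshape to wide, then tidy the names and row names
wide <- reshape(mydf, direction = "wide",
                idvar = "id", timevar = "V1")
names(wide) <- sub("^V2\\.", "", names(wide))  # "V2.age" becomes "age"
rownames(wide) <- NULL                         # renumber rows 1, 2, 3
wide
```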
Alternatively:
# Your ids become the "rownames" with this approach
as.data.frame.matrix(xtabs(V2 ~ id + V1, mydf))
#   age height weight
# 1  20     72    185
# 2  87     60    109
# 3  15     58    109
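For comparison, the matrix idea mentioned at the start of this answer could be sketched roughly as follows. This is not the code from the other answers, only an illustration of the technique, and it assumes every sample has the same fields in the same order with none missing. The names fields, m and result are illustrative.

```r
# Long-format data as read by read.table() above
mydf <- data.frame(
  V1 = rep(c("age", "weight", "height"), times = 3),
  V2 = c(20, 185, 72, 87, 109, 60, 15, 109, 58),
  stringsAsFactors = FALSE
)

# Field names in order of first appearance
fields <- unique(mydf$V1)

# Fill a matrix row by row: one row per sample, one column per field
m <- matrix(mydf$V2, ncol = length(fields), byrow = TRUE,
            dimnames = list(NULL, fields))

result <- as.data.frame(m)
result
```

No id variable is needed here because the position of each value in the file already determines its row and column, which is also why this approach breaks down if a sample is missing a field.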