How to convert multiple columns to individual rows in R

Problem description

I have a data frame in R that has many rows (over 3000) with F0 (fundamental frequency) tracks of an utterance in it. The rows have the following information in them: speaker ID, group #, repetition #, accent type, sex, and then 50 columns of F0 points. The data looks like this:

Speaker Sex Group Repetition Accent    Word         1         2         3        4
    105   M     1          1      N AILMENT 102.31030 102.31030 102.31030 102.31127 
    105   M     1          1      N COLLEGE 111.80641 111.80313 111.68612 111.36020
    105   M     1          1      N  FATHER 124.06655 124.06655 124.06655 124.06655 

But instead of stopping at X4, each row has 50 points, so I have a 3562x56 data frame. I want to reshape it so that each data point in the F0 track (the columns after Word, 1:50) gets its own row, with the associated column number in another column. I also want to keep all of the information in the first six columns with each data point, so it would look like this:

Speaker Sex Group Repetition Accent    Word       Num        F0
    105   M     1          1      N AILMENT         1 102.31030
    105   M     1          1      N AILMENT         2 102.31030
    105   M     1          1      N AILMENT         3 102.31030
    105   M     1          1      N AILMENT         4 102.31127
    ...
    105   M     1          1      N COLLEGE         1 111.80641
    105   M     1          1      N COLLEGE         2 111.80313
    105   M     1          1      N COLLEGE         3 111.68612
    105   M     1          1      N COLLEGE         4 111.36020
    ...

The code I tried to use, while tedious, is as follows:

x = 1
for (i in 1:dim(normrangef0)[1]) {
     for (j in 1:50) {
             norm.all$Speaker[x] <- normrangef0$Speaker[i]
             norm.all$Sex[x] <- normrangef0$Sex[i]
             norm.all$Group[x] <- normrangef0$Group[i]
             norm.all$Repetition[x] <- normrangef0$Repetition[i]
             norm.all$Word[x] <- normrangef0$Word[i]
             norm.all$Accent[x] <- normrangef0$Accent[i]
             norm.all$Time[x] <- j
             norm.all$F0[x] <- normrangef0[i,j+6]
             x = x+1    
    }
}

However, when I do this with norm.all as a NULL object (just defined by norm.all = c()), I end up with a list of over 200k items, many of which are NAs. When I define norm.all as a data frame (either an empty one or a 178100x8 data frame of all 0s), I get an error:

Error in $<-.data.frame(*tmp*, "Speaker", value = 105L) : replacement has 1 row, data has 0

Is my code just totally off? Is there another way to do this?

Recommended answer

Use melt from the "reshape2" package:

library(reshape2)
melt(mydf, id.vars=c("Speaker", "Sex", "Group", "Repetition", "Accent", "Word"))
#    Speaker Sex Group Repetition Accent    Word variable    value
# 1      105   M     1          1      N AILMENT        1 102.3103
# 2      105   M     1          1      N COLLEGE        1 111.8064
# 3      105   M     1          1      N  FATHER        1 124.0666
# 4      105   M     1          1      N AILMENT        2 102.3103
# 5      105   M     1          1      N COLLEGE        2 111.8031
# 6      105   M     1          1      N  FATHER        2 124.0666
# 7      105   M     1          1      N AILMENT        3 102.3103
# 8      105   M     1          1      N COLLEGE        3 111.6861
# 9      105   M     1          1      N  FATHER        3 124.0666
# 10     105   M     1          1      N AILMENT        4 102.3113
# 11     105   M     1          1      N COLLEGE        4 111.3602
# 12     105   M     1          1      N  FATHER        4 124.0666
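The output above uses the default column names variable and value and is ordered column by column. A minimal sketch of getting closer to the layout asked for in the question, assuming the F0 columns are literally named "1" to "4" as in the sample (the toy data frame below is reconstructed from the question, and the sort order is an illustrative choice):

```r
library(reshape2)

# Toy data mirroring the question; only 4 of the 50 F0 columns are shown.
mydf <- data.frame(
  Speaker = 105L, Sex = "M", Group = 1L, Repetition = 1L, Accent = "N",
  Word = c("AILMENT", "COLLEGE", "FATHER"),
  `1` = c(102.31030, 111.80641, 124.06655),
  `2` = c(102.31030, 111.80313, 124.06655),
  `3` = c(102.31030, 111.68612, 124.06655),
  `4` = c(102.31127, 111.36020, 124.06655),
  check.names = FALSE  # keep "1".."4" as column names instead of "X1".."X4"
)

# Name the two new columns directly instead of the defaults.
long <- melt(
  mydf,
  id.vars = c("Speaker", "Sex", "Group", "Repetition", "Accent", "Word"),
  variable.name = "Num",
  value.name = "F0"
)

# melt() stores the former column names as a factor; convert them to integers.
long$Num <- as.integer(as.character(long$Num))

# Sort so that the points of each word appear consecutively.
# With the full data you would likely sort by all identifier columns.
long <- long[order(long$Word, long$Num), ]
rownames(long) <- NULL
```

If the real columns are named X1 to X50 instead, the integer conversion would first need the prefix removed, for example with sub("^X", "", ...).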

In base R, you can also use stack to stack the columns named 1 through 4, and cbind that with the first group of columns. Alternatively, unlist will also do this.
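A rough sketch of both base R routes follows. It is a variation on the description above rather than the exact code of the answer: the identifier rows are repeated explicitly before binding, and the toy data frame and the names id_cols, f0_cols, long_stack and long_unlist are assumptions made for illustration.

```r
# Toy data mirroring the question; only 4 of the 50 F0 columns are shown.
mydf <- data.frame(
  Speaker = 105L, Sex = "M", Group = 1L, Repetition = 1L, Accent = "N",
  Word = c("AILMENT", "COLLEGE", "FATHER"),
  `1` = c(102.31030, 111.80641, 124.06655),
  `2` = c(102.31030, 111.80313, 124.06655),
  `3` = c(102.31030, 111.68612, 124.06655),
  `4` = c(102.31127, 111.36020, 124.06655),
  check.names = FALSE
)

id_cols <- 1:6           # Speaker ... Word
f0_cols <- 7:ncol(mydf)  # the F0 track columns

# Both stack() and unlist() go column by column, so the identifier rows
# are repeated once per F0 column, in the same order.
ids <- mydf[rep(seq_len(nrow(mydf)), times = length(f0_cols)), id_cols]

# Route 1: stack() returns a data frame with columns "values" and "ind".
stacked <- stack(mydf[f0_cols])
long_stack <- cbind(
  ids,
  Num = as.integer(as.character(stacked$ind)),
  F0 = stacked$values
)
rownames(long_stack) <- NULL

# Route 2: unlist() flattens the same columns; Num comes from the position.
long_unlist <- cbind(
  ids,
  Num = rep(seq_along(f0_cols), each = nrow(mydf)),
  F0 = unlist(mydf[f0_cols], use.names = FALSE)
)
rownames(long_unlist) <- NULL
```

The unlist route derives Num from the column position, so it does not depend on how the F0 columns happen to be named.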

You may also want to look into the "data.table" package to get a bit of a speed boost.
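For reference, a minimal sketch of the same reshape with data.table's own melt method. The toy table is reconstructed from the question, the sort order is an illustrative choice, and no timing claim is made here for data of this size.

```r
library(data.table)

# Toy data mirroring the question; only 4 of the 50 F0 columns are shown.
dt <- data.table(
  Speaker = 105L, Sex = "M", Group = 1L, Repetition = 1L, Accent = "N",
  Word = c("AILMENT", "COLLEGE", "FATHER"),
  `1` = c(102.31030, 111.80641, 124.06655),
  `2` = c(102.31030, 111.80313, 124.06655),
  `3` = c(102.31030, 111.68612, 124.06655),
  `4` = c(102.31127, 111.36020, 124.06655)
)

long <- melt(
  dt,
  id.vars = c("Speaker", "Sex", "Group", "Repetition", "Accent", "Word"),
  variable.name = "Num",
  value.name = "F0",
  variable.factor = FALSE  # keep the old column names as character, not factor
)

# Convert "1".."4" to integers in place, then sort by reference.
long[, Num := as.integer(Num)]
setorder(long, Speaker, Group, Repetition, Word, Num)
```

An existing data frame can be converted first with as.data.table(mydf) before calling melt.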