Convert a data frame of N columns into a data frame of two 'stacked' columns


Question

Hello Stack Community.

I am doing work with network analytics and have a data reshaping question.

My original data comes in as a series of columns, where each pair of adjacent columns is a "source" and "target" pair. The final data frame needs to be made up of two columns, "source" and "target". Note that these pairs are staggered, because the sources and targets are linked as in a directed network: each column is the target of the column before it and the source for the column after it. (See final_output in the code example for the desired output.)

I created a very hacky method producing the output I need (see code below) but it does not accommodate differing numbers of columns without me adding variables and whatnot. Also, please note in some cases the number of column pairs will be an odd number, i.e. one "source" with no "target" at the end of the data frame. In this case the missing "target" column is created with NAs.

I feel there is a smooth way to produce this without all the handwork. I have been searching and searching and have not come across anything. Thank you so much for your help.

Tim

# Create example DF
mydf <- data.frame(id = 1:6, varA = "A",
               varB = "B",
               varC = "C",
               varD = "D",
               varE = "E",
               varF = "F")
#Remove the ID value for DF build. This variable is not in real DF
mydf$id <-NULL

#Begin inelegant hack. 
#Please note: the incoming DF has an indeterminate number of columns that vary with project

counter <-ncol(mydf)
   for (i in 1:counter){
   t1 <-mydf[(counter-counter+1):(counter-counter+2)] 
   t2 <-mydf[(counter-counter+2):(counter-counter+3)]
   t3 <-mydf[(counter-counter+3):(counter-counter+4)]
   t4 <-mydf[(counter-counter+4):(counter-counter+5)]
   t5 <-mydf[(counter-counter+5):(counter-counter+6)]
    }

#Rename for the rbind
names(t1) <-c("Source", "Target")
names(t2) <-c("Source", "Target")
names(t3) <-c("Source", "Target")
names(t4) <-c("Source", "Target")
names(t5) <-c("Source", "Target")

#This is the shape I need but the process is super manual and does not accommodate differing numbers of columns.
final_output <-rbind(t1,t2,t3,t4,t5)


Recommended answer

If I understand correctly, you can just use unlist and manually create your data.frame:

mydf[] <- lapply(mydf, as.character)  # Convert factors to characters
final_output <- data.frame(Source = unlist(mydf[-length(mydf)]), 
                           Target = unlist(mydf[-1]))
head(final_output, 15)
#       Source Target
# varA1      A      B
# varA2      A      B
# varA3      A      B
# varA4      A      B
# varA5      A      B
# varA6      A      B
# varB1      B      C
# varB2      B      C
# varB3      B      C
# varB4      B      C
# varB5      B      C
# varB6      B      C
# varC1      C      D
# varC2      C      D
# varC3      C      D
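
The question also mentions a trailing "source" with no "target", which should be filled with NAs. The answer above does not cover that case explicitly, so here is a minimal sketch of one way to handle it, assuming the padding is wanted only on request. The helper name stack_pairs, the pad_last argument, and the temporary .pad column are hypothetical and do not come from the original answer.

```r
# Hypothetical helper (not part of the original answer): stack every pair of
# adjacent columns into two columns, Source and Target.
stack_pairs <- function(df, pad_last = FALSE) {
  # Convert to character so unlist() does not return factor codes
  df[] <- lapply(df, as.character)

  # Assumption: a final source without a target gets NA as its target.
  # A single-column input is always padded, since it has no target at all.
  if (pad_last || ncol(df) == 1) {
    df[[".pad"]] <- rep(NA_character_, nrow(df))
  }

  n <- ncol(df)
  data.frame(
    Source = unlist(df[-n], use.names = FALSE),  # every column except the last
    Target = unlist(df[-1], use.names = FALSE),  # every column except the first
    stringsAsFactors = FALSE
  )
}

# Small example: three columns, two rows
example_df <- data.frame(varA = rep("A", 2),
                         varB = rep("B", 2),
                         varC = rep("C", 2))

res     <- stack_pairs(example_df)                   # A->B and B->C edges
res_pad <- stack_pairs(example_df, pad_last = TRUE)  # additionally C->NA edges
```

Because the columns are selected by position, the same call should work for any number of input columns. Dropping the names with use.names = FALSE gives plain integer row names instead of the varA1-style labels shown in the output above.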
