Concatenate rows in a dataframe


Question

I have a dataframe structured as follows:

Column A  Column B

1          A  
1          B  
1          C  
1          D  
2          B  
2          C  
2          D  
2          E 

For each value in Column A, I want to concatenate the Column B values of all rows that share that value.

I want the final output to look something like this:

Column A Column B Column C  
1        A        ABCD    
1        B        ABCD  
1        C        ABCD  
1        D        ABCD  
2        B        BCDE  
2        C        BCDE  
2        D        BCDE  
2        E        BCDE   

How would I perform this operation in R or Python?

Thanks

Recommended answer

In R, we can use dplyr. After grouping by 'ColumnA', paste the contents of 'ColumnB' together (collapse="" joins them without a separator) and create a new column with mutate:

library(dplyr)
df1 %>%
     group_by(ColumnA) %>% 
     mutate(ColumnC = paste(ColumnB, collapse=""))
# A tibble: 8 x 3
# Groups:   ColumnA [2]
#  ColumnA ColumnB ColumnC
#    <int>   <chr>   <chr>
#1       1       A    ABCD
#2       1       B    ABCD
#3       1       C    ABCD
#4       1       D    ABCD
#5       2       B    BCDE
#6       2       C    BCDE
#7       2       D    BCDE
#8       2       E    BCDE


Another option is data.table, which adds the new column by reference:

library(data.table)
setDT(df1)[,  ColumnC := paste(ColumnB, collapse=""), by = ColumnA]

Data

df1 <- structure(list(ColumnA = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), ColumnB = c("A", 
 "B", "C", "D", "B", "C", "D", "E")), .Names = c("ColumnA", "ColumnB"
 ), class = "data.frame", row.names = c(NA, -8L))


If we need Python, then:

>>> import pandas as pd
>>> df1 = pd.read_clipboard()
>>> df1
#   ColumnA ColumnB
#1        1       A
#2        1       B
#3        1       C
#4        1       D
#5        2       B
#6        2       C
#7        2       D
#8        2       E
>>> df1['ColumnC'] = df1.groupby('ColumnA')['ColumnB'].transform(lambda x: ''.join(x))
>>> df1
#   ColumnA ColumnB ColumnC
#1        1       A    ABCD
#2        1       B    ABCD
#3        1       C    ABCD
#4        1       D    ABCD
#5        2       B    BCDE
#6        2       C    BCDE
#7        2       D    BCDE
#8        2       E    BCDE
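
To make clear what the grouped paste/transform is doing in all three solutions above, here is a minimal plain-Python sketch with no pandas dependency. The sample rows and the function name add_group_concat are hypothetical, chosen only to mirror the table in the question; this illustrates the logic and is not how dplyr, data.table, or pandas implement it internally.

```python
from collections import defaultdict

# Hypothetical sample rows mirroring the question's table: (ColumnA, ColumnB)
rows = [(1, "A"), (1, "B"), (1, "C"), (1, "D"),
        (2, "B"), (2, "C"), (2, "D"), (2, "E")]

def add_group_concat(rows):
    """Append the concatenation of each group's ColumnB values to every row."""
    # Step 1: collect ColumnB values per ColumnA key, preserving row order
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # Step 2: join each group's values into a single string (no separator)
    joined = {key: "".join(values) for key, values in groups.items()}
    # Step 3: broadcast the joined string back onto every row of its group
    return [(key, value, joined[key]) for key, value in rows]

result = add_group_concat(rows)
for row in result:
    print(row)
```

The number of rows is unchanged; each row simply gains a third field holding its group's joined string, which is the same shape that mutate, := and transform produce.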
