Concatenate rows in a dataframe
Question
I have a dataframe structured like this:
Column A Column B
1 A
1 B
1 C
1 D
2 B
2 C
2 D
2 E
I want to concatenate the Column B values of all rows that share the same value in Column A.
I want the final output to look something like this:
Column A Column B Column C
1 A ABCD
1 B ABCD
1 C ABCD
1 D ABCD
2 B BCDE
2 C BCDE
2 D BCDE
2 E BCDE
How would I perform this operation in R or Python?
Thanks.
Recommended answer
In R, we can use dplyr. After grouping by 'ColumnA', paste the contents of 'ColumnB' together (with collapse="" so that each group yields a single string) and create a new column with mutate:
library(dplyr)
df1 %>%
group_by(ColumnA) %>%
mutate(ColumnC = paste(ColumnB, collapse=""))
# A tibble: 8 x 3
# Groups: ColumnA [2]
# ColumnA ColumnB ColumnC
# <int> <chr> <chr>
#1 1 A ABCD
#2 1 B ABCD
#3 1 C ABCD
#4 1 D ABCD
#5 2 B BCDE
#6 2 C BCDE
#7 2 D BCDE
#8 2 E BCDE
Another option is data.table, which adds the column by reference:
library(data.table)
setDT(df1)[, ColumnC := paste(ColumnB, collapse=""), by = ColumnA]
Data
df1 <- structure(list(ColumnA = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), ColumnB = c("A",
"B", "C", "D", "B", "C", "D", "E")), .Names = c("ColumnA", "ColumnB"
), class = "data.frame", row.names = c(NA, -8L))
If we need Python, then pandas can do the same with groupby and transform:
>>> import pandas as pd
>>> df1 = pd.read_clipboard()
>>> df1
# ColumnA ColumnB
#1 1 A
#2 1 B
#3 1 C
#4 1 D
#5 2 B
#6 2 C
#7 2 D
#8 2 E
>>> df1['ColumnC'] = df1.groupby('ColumnA')['ColumnB'].transform(lambda x: ''.join(x))
>>> df1
# ColumnA ColumnB ColumnC
#1 1 A ABCD
#2 1 B ABCD
#3 1 C ABCD
#4 1 D ABCD
#5 2 B BCDE
#6 2 C BCDE
#7 2 D BCDE
#8 2 E BCDE
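The pandas session above depends on whatever happens to be on the clipboard, so it is hard to reproduce. A minimal self-contained sketch of the same groupby/transform approach follows, assuming the same data as the R `df1`; building the frame inline with `pd.DataFrame` is a substitution for `read_clipboard`, not part of the original answer.

```python
import pandas as pd

# Same data as the R `df1` above, built inline instead of read from the clipboard.
df1 = pd.DataFrame({
    "ColumnA": [1, 1, 1, 1, 2, 2, 2, 2],
    "ColumnB": ["A", "B", "C", "D", "B", "C", "D", "E"],
})

# transform calls the function once per group and broadcasts the single
# joined string back to every row of that group, so the row count is unchanged.
df1["ColumnC"] = df1.groupby("ColumnA")["ColumnB"].transform("".join)

print(df1)
```

Passing `"".join` directly is equivalent to the `lambda x: ''.join(x)` used above. Note that `str.join` only accepts strings, so a non-string ColumnB would need converting first (for example with `.astype(str)`).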