Combining data frame based on data in each column in dplyr

Question

Say I have some network data as shown below:

col_a <- c("A","B","C")
col_b <- c("B","A","A")
val <- c(1,3,7)
df <- data.frame(col_a, col_b, val)
df

  col_a col_b val
1     A     B   1
2     B     A   3
3     C     A   7

This could be a network, with val as the weight of the edge between the two nodes. However, I want to treat A-B and B-A as the same edge and add their weights together, to get the following:

new_col_a <- c("A", "A")
new_col_b <- c("B", "C")
new_val <- c(4,7)
want_df <- data.frame(new_col_a, new_col_b, new_val)
want_df

  new_col_a new_col_b new_val
1         A         B       4
2         A         C       7

Is there a way to do this in dplyr?

Recommended answer

You can use dplyr for this (together with purrr and tidyr):

df <- data.frame(col_a, col_b, val, stringsAsFactors = FALSE)

library(dplyr)
library(tidyr)

df %>%
  mutate(
    # Build an order-independent key per edge: sort the two endpoints,
    # so that A-B and B-A both become "A_B"
    pair = purrr::pmap_chr(
      .l = list(from = col_a, to = col_b),
      .f = function(from, to) paste(sort(c(from, to)), collapse = "_")
    )
  ) %>%
  # Sum the weights of all edges that share a key
  group_by(pair) %>%
  summarise(new_val = sum(val)) %>%
  # Split the key back into the two endpoint columns
  separate(pair, c("new_col_a", "new_col_b"), sep = "_")

# A tibble: 2 x 3
  new_col_a new_col_b new_val
  <chr>     <chr>       <dbl>
1 A         B               4
2 A         C               7

This is similar to one of my previous answers: https://stackoverflow.com/questions/55645739/select-the-most-common-value-of-a-column-based-on-matched-pairs-from-two-columns/55646183#55646183
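
A possible variation, offered as a sketch rather than as part of the original answer: since the goal is only to put the two endpoints of each edge in a fixed order, base R's vectorised pmin() and pmax() can do that directly on character columns. This keeps the endpoints in two separate columns the whole time, so it should also work when node names contain an underscore. It assumes dplyr 1.0.0 or later for the .groups argument of summarise(); the name result is only for illustration.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  col_a = c("A", "B", "C"),
  col_b = c("B", "A", "A"),
  val   = c(1, 3, 7),
  stringsAsFactors = FALSE
)

result <- df %>%
  mutate(
    new_col_a = pmin(col_a, col_b),  # endpoint that sorts first
    new_col_b = pmax(col_a, col_b)   # endpoint that sorts last
  ) %>%
  # A-B and B-A now share the same (new_col_a, new_col_b) pair
  group_by(new_col_a, new_col_b) %>%
  summarise(new_val = sum(val), .groups = "drop")

result
```

For the example data this gives the same two rows as want_df in the question: A-B with weight 4 and A-C with weight 7.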
