Add new variable to list of data frames with purrr and mutate() from dplyr

Problem description

I know that there are many related questions here on SO, but I am looking for a purrr solution, please, not one from the apply family of functions or cbind/rbind (I want to take this opportunity to get to know purrr better).

I have a list of dataframes and I would like to add a new column to each dataframe in the list. The value of the column will be the name of the dataframe, i.e. the name of each element in the list.

There is something similar here, but it involves the use of a function and mutate_each(), whereas I need just mutate().

To give you an idea of the list (called comentarios), here is the first line of str() on the first element:

> str(comentarios[1])
List of 1
 $ 166860353356903_661400323902901:'data.frame':    13 obs. of  7 variables:

So I would like my new variable to contain 166860353356903_661400323902901 in all 13 rows of the result, as an ID for each dataframe.

What I am trying is:

dff <- map_df(comentarios, 
              ~ mutate(ID = names(comentarios)),
              .id = "Group"
              )

However, mutate() needs the name of the dataframe in order to work:

Error in mutate_(.data, .dots = lazyeval::lazy_dots(...)) : 
  argument ".data" is missing, with no default

It doesn't make sense to put in each name, I'd be straying into loop territory and losing the advantages of purrr (and R, more generally). If the list was smaller, I'd use reshape::merge_all(), but it has over 2000 elements. Thanks in advance for any help.

Following alistaire's comment, here is some data to make the problem reproducible:

# install.packages("tidyverse")
library(tidyverse)
df <- data_frame(one = rep("hey", 10), two = seq(1:10), etc = "etc")

list_df <- list(df, df, df, df, df)
names(list_df) <- c("first", "second", "third", "fourth", "fifth")
dfs <- map_df(list_df, 
              ~ mutate(id = names(list_df)),
              .id = "Group"
              )

Recommended answer

Your issue is that you have to explicitly provide a reference to the data when you're not using mutate() with piping. In your lambda, mutate() never receives the data frame, which is why .data is reported as missing. To fix this, I'd suggest using map2_df, which passes each data frame as .x and its name as .y:

dff <- map2_df(comentarios, names(comentarios), ~ mutate(.x, ID = .y)) 
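
For reference, here is a minimal sketch of the same approach applied to the reproducible data from the question. The object names dfs, dfs_imap and dfs_bind are only illustrative, and tibble() stands in for the older data_frame() constructor. It also shows two variants that should give the same ID column: imap(), which supplies the element name as .y automatically, and the .id argument of bind_rows().

```r
library(dplyr)
library(purrr)
library(tibble)

# Reproducible data from the question: five identical data frames in a named list
df <- tibble(one = rep("hey", 10), two = 1:10, etc = "etc")
list_df <- list(first = df, second = df, third = df, fourth = df, fifth = df)

# map2_df(): .x is each data frame, .y is the matching element name
dfs <- map2_df(list_df, names(list_df), ~ mutate(.x, id = .y))

# imap(): like map2() over the list and its names, then row-bind the results
dfs_imap <- imap(list_df, ~ mutate(.x, id = .y)) %>% bind_rows()

# If the name column is all that is needed, bind_rows() can add it directly
dfs_bind <- bind_rows(list_df, .id = "id")
```

Note that writing mutate(.x, id = names(list_df)) inside the lambda would still fail even with .x supplied, because names(list_df) is the full vector of five names rather than the single name of the current element.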
