Add new variable to list of data frames with purrr and mutate() from dplyr

Question

I know that there are many related questions here on SO, but I am looking for a purrr solution, please, not one from the apply family of functions or cbind/rbind (I want to take this opportunity to get to know purrr better).

I have a list of dataframes and I would like to add a new column to each dataframe in the list. The value of the column will be the name of the dataframe, i.e. the name of each element in the list.

There is something similar here, but it involves the use of a function and mutate_each(), whereas I need just mutate().

To give you an idea of the list (called comentarios), here is the first line of str() on the first element:

> str(comentarios[1])
List of 1
 $ 166860353356903_661400323902901:'data.frame':    13 obs. of  7 variables:

So I would like my new variable to contain 166860353356903_661400323902901 in all 13 rows of the result, as an ID for each dataframe.

What I am trying is:

dff <- map_df(comentarios, 
              ~ mutate(ID = names(comentarios)),
              .id = "Group"
              )

However, mutate() needs the name of the dataframe in order to work:

Error in mutate_(.data, .dots = lazyeval::lazy_dots(...)) : 
  argument ".data" is missing, with no default

It doesn't make sense to put in each name, I'd be straying into loop territory and losing the advantages of purrr (and R, more generally). If the list was smaller, I'd use reshape::merge_all(), but it has over 2000 elements. Thanks in advance for any help.

EDIT: following alistaire's comment, some data to reproduce the problem:

# install.packages("tidyverse")
library(tidyverse)
df <- data_frame(one = rep("hey", 10), two = seq(1:10), etc = "etc")

list_df <- list(df, df, df, df, df)
names(list_df) <- c("first", "second", "third", "fourth", "fifth")
dfs <- map_df(list_df, 
              ~ mutate(id = names(list_df)),
              .id = "Group"
              )


Recommended answer

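Your issue is that you have to explicitly provide a reference to the data when you're not using mutate() in a pipe: inside the formula, the data frame must be passed as mutate()'s first argument. To do this, I'd suggest using map2_df, which iterates over the data frames and their names in parallel: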

dff <- map2_df(comentarios, names(comentarios), ~ mutate(.x, ID = .y)) 
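
For illustration, here is a minimal sketch of the same idea on toy data modelled on the reproducible example in the question. The three-element `list_df` and the base `data.frame()` call are assumptions made to keep the example small; `imap()` and `bind_rows(.id = ...)` are alternatives that were not part of the original answer and are shown only for comparison.

```r
library(dplyr)
library(purrr)

# Toy data modelled on the question's example (assumed, shortened to 3 elements)
df <- data.frame(one = rep("hey", 10), two = 1:10, etc = "etc",
                 stringsAsFactors = FALSE)
list_df <- list(first = df, second = df, third = df)

# map2_df() iterates over the data frames and their names in parallel:
# .x is the data frame, .y is its name; the results are row-bound
dfs <- map2_df(list_df, names(list_df), ~ mutate(.x, id = .y))

# imap() supplies each element's name as .y, so names() is not needed;
# the result stays a list of data frames
with_id <- imap(list_df, ~ mutate(.x, id = .y))

# bind_rows() with .id builds the name column while stacking the list
stacked <- bind_rows(list_df, .id = "id")
```

If the list should remain a list, the `imap()` form may be the closest fit; if a single combined data frame is the goal, `bind_rows()` with `.id` avoids calling `mutate()` at all.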
