Unnest a list column after modeling after group_by in R
Question
I want to fit a linear regression for every group after group_by, save the model coefficients in a list column, and then expand that list column using unnest. Here I use the mtcars dataset as an example.
Note: I want to use do() here, because broom::tidy does not work for all models.
mtcars %>%
  group_by(cyl) %>%
  do(model = lm(mpg ~ wt + hp, data = .)) %>%
  mutate(coefs = list(summary(model)$coefficients)) %>%
  unnest()
I would like something like this:
cyl term Estimate Std. Error t value Pr(>|t|)
4 (Intercept) 36.9083305 2.19079864 16.846975 1.620660e-16
4 wt -2.2646936 0.57588924 -3.932516 4.803752e-04
4 hp -0.0191217 0.01500073 -1.274718 2.125285e-01
6.......
6......
........
I get the following error:
Error: All nested columns must have the same number of elements.
Can anyone help solve this issue? I could not figure it out after trying so many times.
Recommended answer
One option would be to extract the 'coefs' column (.$coefs), set the names of that list column from the 'cyl' column, loop through the list with map, convert each element to a data.frame, create a new column from the row names, and use .id to create the 'cyl' column from the names of the list.
library(tidyverse)
mtcars %>%
  group_by(cyl) %>%
  do(model = lm(mpg ~ wt + hp, data = .)) %>%
  mutate(coefs = list(summary(model)$coefficients)) %>%
  select(-model) %>%
  {set_names(.$coefs, .$cyl)} %>%
  map_df(~ .x %>%
           as.data.frame %>%
           rownames_to_column('term'), .id = 'cyl')
# cyl term Estimate Std. Error t value Pr(>|t|)
#1 4 (Intercept) 45.83607319 4.78693568 9.575243 1.172558e-05
#2 4 wt -5.11506233 1.60247105 -3.191984 1.276524e-02
#3 4 hp -0.09052672 0.04359827 -2.076383 7.151610e-02
#4 6 (Intercept) 32.56630096 5.57482132 5.841676 4.281411e-03
#5 6 wt -3.24294031 1.37365306 -2.360815 7.759393e-02
#6 6 hp -0.02219994 0.02017664 -1.100279 3.329754e-01
#7 8 (Intercept) 26.66393686 3.66217797 7.280896 1.580743e-05
#8 8 wt -2.17626765 0.72094143 -3.018647 1.168393e-02
#9 8 hp -0.01367295 0.01073989 -1.273099 2.292303e-01
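The same coefficient table can also be built without going through a list column at all. The following is only a sketch, not part of the original answer: it assumes dplyr 0.8.1 or later (for group_modify), and coef_table is a hypothetical helper name introduced here for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical helper (not from the original answer): fit the model on one
# group's rows and return its coefficient matrix as a data frame, with the
# row names moved into a 'term' column.
coef_table <- function(df) {
  fit <- lm(mpg ~ wt + hp, data = df)
  summary(fit)$coefficients %>%
    as.data.frame() %>%
    rownames_to_column("term")
}

# group_modify() calls the function once per group and row-binds the
# results, adding the grouping column 'cyl' back automatically.
result <- mtcars %>%
  group_by(cyl) %>%
  group_modify(~ coef_table(.x))

print(result)
```

Because this does not rely on broom::tidy, it should also apply to model classes that have no tidier, as long as summary(fit)$coefficients exists for them.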
If we wanted to use tidy, then change the contents of map_df to:
... %>%
map_df(~ .x %>%
broom::tidy(.), .id = 'cyl')
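As a variation on the snippet above, broom::tidy can be applied to the fitted model objects instead of to the coefficient matrices. This is a sketch under assumptions rather than the answer's own code: it assumes the broom package is installed, and the names fits and tidy_coefs are introduced here only for illustration.

```r
library(dplyr)
library(purrr)

# Fit one model per cylinder group; do() keeps each lm object in a list column.
fits <- mtcars %>%
  group_by(cyl) %>%
  do(model = lm(mpg ~ wt + hp, data = .))

# Name each model by its group, tidy it, and row-bind the results.
# .id turns the list names into a 'cyl' column (as character).
tidy_coefs <- set_names(fits$model, as.character(fits$cyl)) %>%
  map_df(broom::tidy, .id = "cyl")

print(tidy_coefs)
```

Note that broom::tidy returns lower-case column names (term, estimate, std.error, statistic, p.value), so the result differs in naming from the summary()-based table.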
Also, another option is to nest after group_by, apply broom::tidy on the model object, and then unnest.
mtcars %>%
  group_by(cyl) %>%
  nest %>%
  mutate(data = map(data, ~ .x %>%
    summarise(model = list(broom::tidy(lm(mpg ~ wt + hp)))))) %>%
  unnest %>%
  unnest
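With tidyr 1.0.0 or later, the nest/unnest approach can be written with explicit column arguments, which avoids calling unnest twice. This is a sketch that assumes those package versions and that broom is installed; the name tidied is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)

tidied <- mtcars %>%
  # One row per cyl value, with the remaining columns in a 'data' list column.
  nest(data = -cyl) %>%
  # Fit the model on each nested data frame and tidy it right away.
  mutate(coefs = map(data, ~ broom::tidy(lm(mpg ~ wt + hp, data = .x)))) %>%
  select(-data) %>%
  # Name the list column explicitly so only 'coefs' is expanded.
  unnest(coefs)

print(tidied)
```

Naming the column in unnest() also sidesteps the situation in the question, where a bare unnest() tried to expand the model column together with the coefficients.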