Print data frame dimensions at each step of filtering

Problem description

I am using the tidyverse to filter a data frame and would like to print the dimensions (or number of rows) of the intermediate object at each step. I thought I could simply use the tee pipe operator (%T>%) from magrittr, but it doesn't work. I think I understand the concept behind the tee pipe but can't figure out what is wrong. I searched extensively but didn't find many resources about the tee pipe.

I built a simple example with the mtcars dataset. Printing the intermediate objects works, but not if I replace print() with dim() or nrow().

library(tidyverse)
library(magrittr)

mtcars %>% 
    filter(cyl > 4) %T>% dim() %>%
    filter(am == 0) %T>% dim() %>%
    filter(disp >= 200) %>% dim()

I can of course write this in base R, but I would like to stick to the tidyverse spirit. I have probably overlooked something about the tee pipe concept, and any comments or solutions would be greatly appreciated.

Following the nice and quick answers from @hrbrmstr and @akrun, I tried again to stick with the tee pipe operator without writing a function. I don't know why I didn't find the answer earlier myself, but here is the syntax I was looking for:

mtcars %>%
    filter(cyl > 4) %T>% {print(dim(.))} %>%
    filter(am == 0) %T>% {print(dim(.))} %>%
    filter(disp >= 200) %>% {print(dim(.))}
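A minimal sketch of why the first attempt stayed silent, assuming current magrittr behaviour: the tee pipe runs its right-hand side only for side effects and passes the left-hand side along, and dim() has no side effect to show. The variable names below are illustrative only.

```r
library(magrittr)

# %T>% evaluates the right-hand side, throws its value away,
# and returns the left-hand side unchanged.
unchanged <- mtcars %T>% dim()

# dim() only returns a value, so nothing reaches the console.
silent <- capture.output(invisible(mtcars %T>% dim()))

# Wrapping the call in print() supplies the side effect
# that survives the tee pipe.
shown <- capture.output(invisible(mtcars %T>% {print(dim(.))}))
```

Here `silent` captures no output at all, while `shown` captures the printed dimensions.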

Although it requires a function, @hrbrmstr's solution is indeed easier to "clean up".

Recommended answer

@akrun's idea works, but it's not idiomatic tidyverse. Other functions used in tidyverse pipelines, like print() and glimpse(), return their data argument invisibly, so they can be piped without resorting to {}. Those {} make it difficult to clean up pipes after you're done exploring what's going on.
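The "returns invisibly" behaviour can be checked directly with base R's withVisible(). This is only a small illustration on a two-row slice of mtcars; `res` is an illustrative name.

```r
# print() shows its argument on the console, then hands the same
# object back flagged as invisible, so a pipeline can carry on with it.
res <- withVisible(print(head(mtcars, 2)))

res$visible                            # was the return value visible?
identical(res$value, head(mtcars, 2))  # is it the same data?
```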

Try:

library(tidyverse)

tidydim <- function(x) {
  print(dim(x))
  invisible(x)
}

mtcars %>%
  filter(cyl > 4) %>%
  tidydim() %>% 
  filter(., am == 0) %>%
  tidydim() %>% 
  filter(., disp >= 200) %>%
  tidydim()

That way your "cleanup" (i.e. no longer producing interim console output) can be done quickly and easily: either remove the tidydim() lines or remove the print(…) call from the function.
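One possible extension of the same idea, sketched here as an assumption rather than as part of the original answer: a variant that labels each step and can be silenced through an option, so the pipeline itself never needs editing. The function name `tidydim_labelled`, its `label` argument, and the `tidydim.verbose` option are all hypothetical.

```r
library(dplyr)

# Hypothetical variant of tidydim(): the name, the `label` argument and the
# "tidydim.verbose" option are illustrative, not part of any package.
tidydim_labelled <- function(x, label = "",
                             verbose = getOption("tidydim.verbose", TRUE)) {
  if (verbose) {
    # message() writes to stderr, keeping diagnostics apart from real output.
    message(label, ": ", paste(dim(x), collapse = " x "))
  }
  invisible(x)
}

result <- mtcars %>%
  filter(cyl > 4) %>%
  tidydim_labelled("cyl > 4") %>%
  filter(am == 0) %>%
  tidydim_labelled("am == 0") %>%
  filter(disp >= 200) %>%
  tidydim_labelled("disp >= 200")

# Silence every interim message without touching the pipeline.
options(tidydim.verbose = FALSE)
```

Using message() rather than print() is a design choice: the diagnostics can then be suppressed with suppressMessages() as well.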
