Are there any disadvantages to using tidyverse?


Question

For anything related to processing data in R, I've recently been seeing tidyverse recommended as almost essential. This raises a question: if it is all that it's hyped up to be, is there any reason not to use it? For example, are the frameworks in tidyverse restrictive in any way that is worth mentioning?

Recommended answer

First disadvantage: stability

One drawback is that the tidyverse functions change more rapidly than, say, base R. So if you want stability over a long time, I would go for base R. That said, the tidyverse developers are open about their different approach. See e.g. the Welcome to the Tidyverse vignette:

the biggest difference [between base R and tidyverse] is in priorities: base R is highly focussed on stability, whereas the tidyverse will make breaking changes in the search for better interfaces.

...and Hadley's answer to the question of whether he expects the tidyverse to become part of the core R packages someday:

It’s extremely unlikely because the core packages are extremely conservative so that base R code is stable, and backward compatible. I prefer to have a more utopian approach where I can be quite aggressive about making backward incompatible changes while trying to figure out a better API.

Second disadvantage: flexibility

The tidy data concept is great, but the limitation that a transformation must return the same number of rows as it received (see mutate) cannot always be satisfied. See for example

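library(tidyverse)
# Each column holds 100 values, but density(i)$x returns a longer vector,
# so mutate_all() cannot fit the result back into the data frame.
data.frame(matrix(rnorm(1000), ncol = 10)) %>%
  mutate_all(function(i) density(i)$x)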

which gives an error because the number of rows changes. Sometimes I run into situations like this, where mutate complains that the number of rows is not the same. It is similar with summarise, which expects a result of length one per column, and that is not the case for range, for instance. There are workarounds, for sure, but I prefer base R, which here would simply be

apply(data.frame(matrix(rnorm(1000), ncol= 10)), 2, function(i) density(i)$x)
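
To make the shape mismatch concrete, here is a minimal base R sketch of what the call above returns. The seed, the toy data, and the variable names dens_x and col_ranges are assumptions chosen for illustration; the only facts relied on are that density() evaluates on 512 grid points by default and that range() returns two values.

```r
# Toy data: 100 rows and 10 columns of standard normal draws (arbitrary seed).
set.seed(1)
df <- data.frame(matrix(rnorm(1000), ncol = 10))

# density() evaluates its estimate on n = 512 grid points by default,
# so every column of 100 values is mapped to a vector of 512 values.
dens_x <- apply(df, 2, function(i) density(i)$x)
dim(dens_x)      # 512 rows, 10 columns

# range() returns two values (min and max) per column;
# sapply() collects them into a matrix with 2 rows and 10 columns.
col_ranges <- sapply(df, range)
dim(col_ranges)
```

Because apply and sapply return a plain matrix, they impose no constraint tying the number of rows in the result to the number of rows in the input.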

Third disadvantage: complexity

There are situations where the tidyverse works but is much more cumbersome. Some time ago I asked how to write this code

df[df$age > 90, ] <- NA

within the tidyverse, and the two answers suggested using

df %>% select(x, y, age) %>% mutate_all(~replace(.x, age> 90, NA))
# or
df %>% mutate_all(function(i) replace(i, .$age> 90, NA))

Both answers work but are obviously not as quick to write as the base R version.
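
For reference, the base R assignment can be tried on a small data frame. This is only a sketch: the columns x, y and age and their values are made up for illustration, and the which() wrapper is an addition that is not in the original one-liner. It is there because, as far as I know, a logical row index containing NA is rejected in data frame assignment, which would happen here if age had missing values.

```r
# Hypothetical toy data; one age is missing on purpose.
df <- data.frame(x = 1:4,
                 y = letters[1:4],
                 age = c(25, 95, NA, 91))

# which() keeps only the positions where the comparison is TRUE,
# dropping the NA that results from comparing NA > 90.
rows <- which(df$age > 90)

# Blank out every column of the selected rows.
df[rows, ] <- NA
df
```

Rows 2 and 4 end up entirely NA, while row 3 keeps its x and y values and only has the missing age it started with.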

Why this question should not be closed

The question was closed as a duplicate and linked to another question about tidyverse vs. data.table. In my opinion, if someone asks about the disadvantages of tidyverse (or any other package), this does not mean the person is asking for a comparison with the data.table package. Instead, the disadvantages of tidyverse are more naturally shown by comparing it with base R, which the linked question does not do, i.e. this question is not a duplicate.
