dplyr: vectorisation of substr

Views: 78 | Published: 2020/10/26 3:27:09 | Tags: r, dplyr


Question

Referring to the question "substr in dplyr %>% mutate" and to @akrun's answer there: why do the two columns created below give the same result?

library(dplyr)
# tibble() replaces the deprecated data_frame(); the length-1 t is recycled to 5 rows
df <- tibble(t = '1234567890ABCDEFG', a = 1:5, b = 6:10)
df %>% mutate(u = substr(t, a, a + b), v = substring(t, a, a + b))

I can't grasp how this differs from the situation in the original question. Thank you!

Recommended answer

The difference is in the vectorization.

substr("1234567890ABCDEFG", df$a, df$a+df$b)
#[1] "1234567"
substring("1234567890ABCDEFG", df$a, df$a+df$b)
#[1] "1234567"     "23456789"    "34567890A"   "4567890ABC"  "567890ABCDE"

substr returns a result of the same length as x, so with a single string it returns only a single value and uses just the first elements of start and stop. substring returns a vector as long as its longest argument, here the number of rows in the dataset 'df'. Because substr outputs only one value, that value gets recycled in the mutate. However, if x is itself a vector of that length, i.e.

substr(rep("1234567890ABCDEFG", nrow(df)), df$a, df$a+df$b)
#[1] "1234567"     "23456789"    "34567890A"   "4567890ABC"  "567890ABCDE"
substring(rep("1234567890ABCDEFG", nrow(df)), df$a, df$a+df$b)
#[1] "1234567"     "23456789"    "34567890A"   "4567890ABC"  "567890ABCDE"

Then the output is the same. In the OP's example the two columns match because the x passed to substr is the column t, which has the same length as start and stop. We can replicate the first output (a single value recycled down the column) with

 df %>%
     mutate(u = substr("1234567890ABCDEFG", a, a+b),
            v = substring("1234567890ABCDEFG", a, a+b)) 
#                 t     a     b       u           v
#              (chr) (int) (int)   (chr)       (chr)
#1 1234567890ABCDEFG     1     6 1234567     1234567
#2 1234567890ABCDEFG     2     7 1234567    23456789
#3 1234567890ABCDEFG     3     8 1234567   34567890A
#4 1234567890ABCDEFG     4     9 1234567  4567890ABC
#5 1234567890ABCDEFG     5    10 1234567 567890ABCDE
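
The length rule behind this can be checked directly in base R. The following is a minimal sketch, not part of the original answer; the variable names x, from and to are illustrative only, with from and to mirroring df$a and df$a + df$b.

```r
# Illustrative names (not from the original post): x, from, to
x    <- "1234567890ABCDEFG"
from <- 1:5
to   <- from + 6:10

# substr(): the result has the same length as x, so a single string
# gives a single value and only the first from/to pair is used.
u_scalar <- substr(x, from, to)
length(u_scalar)   # 1

# substring(): the result is as long as the longest argument.
v_scalar <- substring(x, from, to)
length(v_scalar)   # 5

# Repeating x to the length of from makes substr() match substring().
u_vec <- substr(rep(x, length(from)), from, to)
identical(u_vec, v_scalar)   # TRUE
```

Inside mutate, a column such as t already has one element per row, which is why substr(t, a, a + b) behaves like the rep() call here.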
