Why does the digest function return the same value every time when used with dplyr's mutate?

Views: 98 · Published: 2017/7/13 20:41:08 · Tags: r, dplyr


Question

Here's a data frame containing a column of user ids:

> head(df)
       uid
1 14070210
2 14080815
3 14091420

For the sake of argument, I want to create a new column containing the square root of the user id, and another new column containing a hash of the user id. So I do this:

library(dplyr)
library(digest)
df_mutated <- df %>%
              mutate(sqrt_uid = sqrt(uid), hashed_uid = digest(uid))

... where digest() comes from the digest package.

While the square root appears to work, the digest function returns the same value for each user id.

> head(df_mutated)
       uid sqrt_uid                       hashed_uid
1 14070210 3751.028 f8c4b39403e57d85cd1698d2353954d0
2 14080815 3752.441 f8c4b39403e57d85cd1698d2353954d0
3 14091420 3753.854 f8c4b39403e57d85cd1698d2353954d0

This is weird to me. Without dplyr, the digest() function returns different values for different inputs. What am I not understanding about dplyr?

Thanks.

Recommended answer

The digest() function isn't vectorized. If you pass in a vector, it hashes the whole vector as a single object, so you get one value for the entire vector rather than a digest for each element. Because it returns a single value, mutate() recycles that value for every row of your data.frame. You can create your own vectorized version:

vdigest <- Vectorize(digest)
df %>% mutate(sqrt_uid = sqrt(uid), hashed_uid = vdigest(uid))
#        uid sqrt_uid                       hashed_uid
# 1 14070210 3751.028 cc90019421220a24f75b5ed5daec36ff
# 2 14080815 3752.441 9f7f643940b692dd9c7effad439547e8
# 3 14091420 3753.854 89e6666fdfdbfb532b2d7940def9d47d

These values match what you get when you pass in each vector element individually:

digest(df$uid[1])
# [1] "cc90019421220a24f75b5ed5daec36ff"
digest(df$uid[3])
# [1] "89e6666fdfdbfb532b2d7940def9d47d"
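To see the difference directly, it may help to compare result lengths outside of dplyr. The following is a minimal sketch, assuming the digest package is installed; the standalone `uid` vector is a stand-in for `df$uid` (stored here as doubles, which may not match the original column type), and `vapply()` is shown as one possible alternative to `Vectorize()`, not as the only way to do this.

```r
library(digest)

# Stand-in for df$uid (assumed to be a plain numeric vector).
uid <- c(14070210, 14080815, 14091420)

# sqrt() is vectorized: one result per element.
length(sqrt(uid))    # 3

# digest() hashes the whole vector as one object: a single result in total.
length(digest(uid))  # 1

# vapply() calls digest() once per element and checks that each call
# returns exactly one character string.
hashed_uid <- vapply(uid, digest, character(1), USE.NAMES = FALSE)
length(hashed_uid)   # 3

# Each per-element hash equals the hash of that element passed on its own.
identical(hashed_uid[1], digest(uid[1]))  # TRUE
```

The same `vapply(uid, digest, character(1))` call can be used inside mutate() in place of `vdigest(uid)`. Compared with `Vectorize()`, it fails with an error if any call returns something other than a single string, which can make mistakes easier to spot.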
