Adding multiple columns in a dplyr mutate call

Views: 130 | Published: 2017/7/13 20:16:33 | Tags: r, dplyr


Problem description

I have a data frame with a dot-separated character column:
> set.seed(310366)
> tst = data.frame(x=1:10,y=paste(sample(c("FOO","BAR","BAZ"),10,TRUE),".",sample(c("foo","bar","baz"),10,TRUE),sep=""))
> tst
    x       y
1   1 BAR.baz
2   2 FOO.foo
3   3 BAZ.baz
4   4 BAZ.foo
5   5 BAZ.bar
6   6 FOO.baz
7   7 BAR.bar
8   8 BAZ.baz
and I want to split that column into two new columns containing the parts on either side of the dot. str_split_fixed from package stringr can do the job quite nicely. All my values are definitely two parts separated by a dot so I can do:
> require(stringr)
> str_split_fixed(tst$y,"\\.",2)
      [,1]  [,2] 
 [1,] "BAR" "baz"
 [2,] "FOO" "foo"
 [3,] "BAZ" "baz"
 [4,] "BAZ" "foo"
 [5,] "BAZ" "bar"
 [6,] "FOO" "baz"
 [7,] "BAR" "bar"
Now I could just cbind that to my data frame but I thought I'd figure out how to do that in a dplyr pipeline. First I thought mutate could do it in one:
> tst %.% mutate(parts=str_split_fixed(y,"\\.",2))
Error: wrong result size (20), expected 10 or 1
I can get mutate to do it in two:
> tst %.% mutate(part1=str_split_fixed(y,"\\.",2)[,1], part2=str_split_fixed(y,"\\.",2)[,2])
    x       y part1 part2
1   1 BAR.baz   BAR   baz
2   2 FOO.foo   FOO   foo
3   3 BAZ.baz   BAZ   baz
4   4 BAZ.foo   BAZ   foo
5   5 BAZ.bar   BAZ   bar
6   6 FOO.baz   FOO   baz
but that's running the string split twice.

"Best" I can do so far in a dplyr way is this (which I only discovered while writing this question...):
> tst %.% do(cbind(.,data.frame(parts=str_split_fixed(.$y,"\\.",2))))
    x       y parts.1 parts.2
1   1 BAR.baz     BAR     baz
2   2 FOO.foo     FOO     foo
3   3 BAZ.baz     BAZ     baz
4   4 BAZ.foo     BAZ     foo
5   5 BAZ.bar     BAZ     bar
which isn't bad, but loses a lot of the readability of piped things in R. Is there a simple approach using mutate that I've missed?
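The repeated split in the two-column mutate call can be avoided by computing the split matrix once and reusing its columns. The following is only a sketch of that idea, not the asker's code: `tst_small` is a hypothetical three-row stand-in for `tst` (so the result does not depend on the random seed), and it uses the `%>%` pipe that appears in the answer rather than `%.%`.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for tst with fixed values
tst_small <- data.frame(x = 1:3,
                        y = c("BAR.baz", "FOO.foo", "BAZ.bar"),
                        stringsAsFactors = FALSE)

# Split once; str_split_fixed returns a character matrix with one column per part
parts <- str_split_fixed(tst_small$y, "\\.", 2)

# Reuse the matrix columns inside mutate, so the split is not run a second time
result <- tst_small %>%
  mutate(part1 = parts[, 1], part2 = parts[, 2])

print(result)
```

This keeps the mutate call readable, at the cost of one intermediate object created outside the pipeline.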
Solution
You can use separate() from tidyr in combination with dplyr:
library(dplyr)
library(tidyr)
tst %>% separate(y, c("y1", "y2"), sep = "\\.", remove = FALSE)

    x       y  y1  y2
1   1 BAR.baz BAR baz
2   2 FOO.foo FOO foo
3   3 BAZ.baz BAZ baz
4   4 BAZ.foo BAZ foo
5   5 BAZ.bar BAZ bar
6   6 FOO.baz FOO baz
7   7 BAR.bar BAR bar
8   8 BAZ.baz BAZ baz
9   9 FOO.bar FOO bar
10 10 BAR.foo BAR foo
Setting remove = TRUE (the default) drops the original column y.
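The effect of the `remove` argument can be checked on a small example. The sketch below assumes a hypothetical three-row data frame, `tst_small`, in place of `tst`, and compares the column names produced by the two settings.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for tst with fixed values
tst_small <- data.frame(x = 1:3,
                        y = c("BAR.baz", "FOO.foo", "BAZ.bar"),
                        stringsAsFactors = FALSE)

# sep is treated as a regular expression, so the literal dot is escaped.
# remove = FALSE keeps the original column y next to the new columns.
kept <- tst_small %>%
  separate(y, c("y1", "y2"), sep = "\\.", remove = FALSE)

# remove = TRUE drops y, leaving only x, y1 and y2.
dropped <- tst_small %>%
  separate(y, c("y1", "y2"), sep = "\\.", remove = TRUE)

print(names(kept))
print(names(dropped))
```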
