Separate by pattern (word) in tidyr and dplyr


Question

I have a very simple need: split a column into two new columns inside a chain of dplyr pipes. The trick here is to use a specific word as the separator instead of a single character.

Data:

id    elements
1     banana and apple
2     orange and lemon
3     house and flat

Expected result:

id    element1    element2
1      banana      apple
2      orange      lemon
3      house       flat

Obviously, my tidyr::separate attempt is not working as expected (my bad). The separation is done at the first letter of the word "and".

df %>% tidyr::separate(elements, into = c("element1","element2"), sep = "and")

I know this can perhaps be achieved with other verbs, but my main goal is to do it using dplyr and tidyr if possible.

Recommended answer

The sep argument of separate is treated as a regular expression, so we can include the whitespace before and after "and" in the pattern. That whitespace is then removed together with the word:

library(dplyr)
library(tidyr)
df %>%
   separate(elements, into = c('element1', 'element2'),
          sep = '\\s*and\\s*')

Output:

#  id element1 element2
#1  1   banana    apple
#2  2   orange    lemon
#3  3    house     flat
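
One caveat about the pattern above: because `\\s*` also matches zero whitespace characters, the letters "and" inside a longer word (for example "sand") can be taken as a separator too. A minimal sketch of a stricter variant, assuming the separator word is always surrounded by whitespace; the data frame `df2` and its values are made up for illustration and are not part of the original question:

```r
library(dplyr)
library(tidyr)

# Hypothetical data: "sand" and "candy" both contain the letters "and"
df2 <- data.frame(
  id = 1:2,
  elements = c("sand and stone", "candy and cake")
)

# Require at least one whitespace character on each side of "and",
# so only the standalone word acts as the separator
res <- df2 %>%
  separate(elements, into = c("element1", "element2"),
           sep = "\\s+and\\s+")

res$element1  # "sand"  "candy"
res$element2  # "stone" "cake"
```

For the data in the question both patterns give the same result; the difference only shows up when an element itself contains "and".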

Data:

df <- structure(list(id = 1:3, elements = c("banana and apple", 
"orange and lemon", 
"house and flat")), class = "data.frame", row.names = c(NA, -3L
))
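
As a side note, newer tidyr releases (1.3.0 and later, as far as I know) mark separate as superseded and provide separate_wider_delim for the same job. A minimal sketch under that version assumption; here the delimiter " and " is matched as a fixed string rather than a regular expression, so no escaping is needed:

```r
library(tidyr)

# Same data as in the question
df <- data.frame(
  id = 1:3,
  elements = c("banana and apple", "orange and lemon", "house and flat")
)

# Split on the literal string " and " (spaces included) into two named columns
res <- separate_wider_delim(
  df, elements,
  delim = " and ",
  names = c("element1", "element2")
)
```

If the installed tidyr is older than that, the separate call from the answer remains the way to go.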
