Split character string multiple times every two characters
Question
I have a character column in my data frame that looks like this:
df <- data.frame(a = c("AaBbCC", "AABBCC", "AAbbCC"))

# df
       a
1 AaBbCC
2 AABBCC
3 AAbbCC
I would like to split this column every two characters. So in this case I would like to obtain three columns named VA, VB, VC.
I tried:
library(tidyr)
library(dplyr)
df <- data.frame(a = c("AaBbCC", "AABBCC", "AAbbCC")) %>%
  separate(a, c(paste("V", LETTERS[1:3], sep = "")), sep = c(2, 2))

  VA VB   VC
1 Aa    BbCC
2 AA    BBCC
3 AA    bbCC
But this is not the desired result. I would like the content that now ends up in VC to be split into VB (all the letter B) and VC (all the letter C). How do I get R to split every two characters? The length of the string in the column is always the same for every row (6 in this example).
I will have strings of length > 10.
Recommended answer
You were actually quite close. A numeric sep is interpreted as character positions to split at, so you need to specify the separator positions as sep = c(2,4) instead of sep = c(2,2):
df <- separate(df, a, c(paste0("V",LETTERS[1:3])),sep = c(2,4))
which gives:
> df
  VA VB VC
1 Aa Bb CC
2 AA BB CC
3 AA bb CC
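For longer strings, the cut positions do not have to be typed out by hand. A minimal sketch, assuming every row has the same even length of at least 4 and at most 52 characters (so that LETTERS has enough names); the ten-character sample data and the names n, cuts, new_names and res are made up for illustration:

```r
library(tidyr)

# Hypothetical sample data with 10 characters per row
df <- data.frame(a = c("AaBbCCddEE", "AABBCCDDEE", "AAbbCCddEE"))

n <- nchar(df$a[1])                                # assumed identical for all rows
cuts <- seq(2, n - 2, by = 2)                      # positions 2, 4, 6, 8 when n is 10
new_names <- paste0("V", LETTERS[seq_len(n / 2)])  # VA, VB, ..., one per piece

# One more column name than cut positions, as in sep = c(2,4) above
res <- separate(df, a, into = new_names, sep = cuts)
```

The same three lines that compute n, cuts and new_names should carry over to any fixed, even string length within those limits.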
In base R you could do the following (borrowing from @rawr's comment), starting from the original df that still has column a:
l <- ave(as.character(df$a), FUN = function(x) strsplit(x, '(?<=..)', perl = TRUE))
df <- data.frame(do.call('rbind', l))
which gives:
> df
  X1 X2 X3
1 Aa Bb CC
2 AA BB CC
3 AA bb CC
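An alternative base R sketch that avoids the regular expression and cuts by position with substr instead. The helper name split_every and its width argument are hypothetical, and it assumes all strings share the length of the first one and that there are at most 26 pieces:

```r
# Hypothetical helper: cut every string in x into pieces of `width` characters
split_every <- function(x, width = 2) {
  x <- as.character(x)
  # Start position of each piece, taken from the first string's length
  starts <- seq(1, nchar(x[1]), by = width)
  # substr is vectorised over x, so each call returns one column of pieces
  pieces <- lapply(starts, function(s) substr(x, s, s + width - 1))
  out <- as.data.frame(do.call(cbind, pieces), stringsAsFactors = FALSE)
  names(out) <- paste0("V", LETTERS[seq_along(starts)])
  out
}

df <- data.frame(a = c("AaBbCC", "AABBCC", "AAbbCC"))
res <- split_every(df$a)
```

Here res should have the columns VA, VB and VC, matching the names asked for in the question.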