将数据框中的每个x字符拆分为字符串 [英] split string each x characters in dataframe

查看：73 发布时间：2020/9/30 23:58:12 r string split character

本文介绍了将数据框中的每个x字符拆分为字符串的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我知道这里有一些关于每个 nth 个字符拆分字符串的答案，例如这一个和这一个，但是这些都是针对特定问题的，并且大多与单个字符串相关，而不与多个字符串的数据帧相关。

I know there are some answers here about splitting a string every nth character, such as this one and this one, However these are pretty question specific and mostly related to a single string and not to a data frame of multiple strings.

示例数据

df <- data.frame(id = 1:2, seq = c('ABCDEFGHI', 'ZABCDJHIA'))

看起来像这样：

  id       seq
1  1 ABCDEFGHI
2  2 ZABCDJHIA

分割第三个字符

我想分割每行一个字符串，每个第三个字符，这样生成的数据帧如下所示：

I want to split the string in each row every thrid character, such that the resulting data frame looks like this:

id  1   2   3
1   ABC DEF GHI
2   ZAB CDJ HIA

我尝试过的事情

在将字符串拆分为单个字符之前，我使用了 splitstackshape ，例如： df％>％cSplit（'seq'，sep =``，stripWhite = FALSE，type.convert = FALSE）我希望拥有类似的功能（或者cSplit可能）

I used the splitstackshape before to split a string on a single character, like so: df %>% cSplit('seq', sep = '', stripWhite = FALSE, type.convert = FALSE) I would love to have a similar function (or perhaps it is possbile with cSplit) to split on every third character.

推荐答案

一个选项是分开

library(tidyverse)
df %>%
    separate(seq, into = paste0("x", 1:3), sep = c(3, 6))
# id  x1  x2  x3
#1  1 ABC DEF GHI
#2  2 ZAB CDJ HIA

如果我们想创建更通用的

If we want to create it more generic

n1 <- nchar(as.character(df$seq[1])) - 3
s1 <- seq(3, n1, by = 3)
nm1 <- paste0("x", seq_len(length(s1) +1))
df %>% 
    separate(seq, into = nm1, sep = s1)

或者使用 base R ，使用 strsplit ，通过将正则表达式环视传递到列表然后是 rbind 列表元素






Or using base R, using strsplit, split the 'seq' column for each instance of 3 characters by passing a regex lookaround into a list and then rbind the list elements
df[paste0("x", 1:3)] <- do.call(rbind, 
           strsplit(as.character(df$seq), "(?<=.{3})", perl = TRUE))

注意：最好避免以非标准标签开头的列名，例如数字。因此，请在名称的开头加上 x 
NOTE: It is better to avoid column names that start with non-standard labels such as numbers.  For that reason, appended 'x' at the beginning of the names

                        这篇关于将数据框中的每个x字符拆分为字符串的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！


                    
                        查看全文

将数据框中的每个x字符拆分为字符串 [英] split string each x characters in dataframe

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

将数据框中的每个x字符拆分为字符串 [英] split string each x characters in dataframe

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭