Find repeated pattern in a string of characters using R

Views: 33 Published: 2021/7/6 20:13:18 Tags: regex r string


Problem description

I have a large text that contains expressions such as "aaaahahahahaha that was a good joke". After processing, I want the "aaaahahahahaha" to disappear or, at least, to be reduced to simply "ha".

At the moment, I am using this:

gsub('(.+?)\\1', '', str)

This works when the word containing the repeated pattern is at the beginning of the sentence, but not when it is located anywhere else. So:

str <- "aaaahahahahaha that was a good joke"
gsub('(.+?)\\1', '', str)
#[1] "ha that was a good joke"

But:

str <- "that was aaaahahahahaha a good joke"
gsub('(.+?)\\1', '', str)
#[1] "that was aaaahahahahaha a good joke"

This question might be related to "find repeated pattern in python", but I can't find the equivalent in R.

I assume this is very simple and that I am missing something trivial, but regular expressions are not my strength and I have already tried a bunch of things that did not work, so I was wondering if someone could help me. The question is: how do I find, and substitute, repeated patterns in a string of characters in R?

Thanks in advance for your time.
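One possible approach, offered as a minimal sketch rather than as the accepted answer: switch gsub to PCRE with perl = TRUE and match whole words that contain a unit repeated several times in a row. The helper name remove_repeated_words, its replacement argument, and the threshold of three consecutive repetitions are all assumptions made for illustration. The threshold is there so that ordinary words with a doubled letter, such as "good", are left alone.

```r
# Hypothetical helper (not from the original post): replace every word that
# contains a unit repeated three or more times in a row, e.g. "aaaa" or "hahaha".
remove_repeated_words <- function(x, replacement = "") {
  # \\b ... \\b    : operate on whole words only
  # \\w*?          : optional prefix before the repeated unit
  # (\\w+?)\\1{2,} : a unit followed by at least two more copies of itself
  # \\w*           : the rest of the word
  pattern <- "\\b\\w*?(\\w+?)\\1{2,}\\w*\\b"
  cleaned <- gsub(pattern, replacement, x, perl = TRUE)
  # Collapse the extra whitespace left behind by removed words.
  trimws(gsub("\\s+", " ", cleaned))
}

remove_repeated_words("aaaahahahahaha that was a good joke")
#[1] "that was a good joke"
remove_repeated_words("that was aaaahahahahaha a good joke")
#[1] "that was a good joke"
remove_repeated_words("that was aaaahahahahaha a good joke", replacement = "ha")
#[1] "that was ha a good joke"
```

Note that this rule is deliberately crude: it would also remove tokens such as "1000000", so the repetition threshold may need tuning for a given text.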
