Regex to replace a repeating string pattern
Problem description
I need to replace a repeated pattern within a word with its basic unit. For example, I have the string "TATATATA" and I want to replace it with "TA". I would probably also only replace patterns that repeat more than twice, to avoid altering normal words.
I am trying to do this in Java with the replaceAll method.
Recommended answer
I think you want this (works for a repeated string of any length):
String result = source.replaceAll("(.+)\\1+", "$1");
Or alternatively, to prioritize shorter matches:
String result = source.replaceAll("(.+?)\\1+", "$1");
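The pattern first captures a group of one or more characters (any characters, not only letters), then matches that same group again one or more times using the back-reference \1 (written "\\1" in a Java string literal) inside the pattern itself. The whole run is then replaced with the captured group, $1. I tried it and it seems to do the trick.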
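The difference between the two variants shows up on the string from the question. The following is a minimal sketch; the class and method names (GreedyVsLazy, collapseGreedy, collapseLazy) are made up for illustration and are not part of the original answer.

```java
public class GreedyVsLazy {
    // Greedy group: (.+) tries the longest candidate unit first and keeps
    // the first one that is followed by at least one copy of itself.
    static String collapseGreedy(String s) {
        return s.replaceAll("(.+)\\1+", "$1");
    }

    // Lazy group: (.+?) tries the shortest candidate unit first, so the
    // smallest repeating unit wins.
    static String collapseLazy(String s) {
        return s.replaceAll("(.+?)\\1+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseGreedy("TATATATA")); // TATA
        System.out.println(collapseLazy("TATATATA"));   // TA
    }
}
```

For the "TATATATA" to "TA" case in the question, the lazy form is the one that produces the requested result, because the greedy form settles on "TATA" as the unit.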
Example
String source = "HEY HEY duuuuuuude what'''s up? Trololololo yeye .0.0.0";
System.out.println(source.replaceAll("(.+?)\\1+", "$1"));
// HEY dude what's up? Trolo ye .0
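The patterns above collapse any unit that appears twice or more, which also changes ordinary words such as "banana". One possible way to meet the "more than 2 repetitions" requirement from the question is a bounded quantifier on the back-reference. This is a sketch that assumes the requirement means the unit must appear at least three times in a row; the class and method names (MinRepeats, collapseTriples) are hypothetical.

```java
public class MinRepeats {
    // \\1{2,} requires at least two further copies after the captured unit,
    // so only runs of three or more occurrences are collapsed.
    static String collapseTriples(String s) {
        return s.replaceAll("(.+?)\\1{2,}", "$1");
    }

    public static void main(String[] args) {
        String source = "banana TATATATA";
        // Original pattern: "anan" inside "banana" is collapsed as well.
        System.out.println(source.replaceAll("(.+?)\\1+", "$1")); // bana TA
        // Stricter pattern: "banana" is left alone, "TATATATA" still collapses.
        System.out.println(collapseTriples(source));              // banana TA
    }
}
```

Raising the lower bound in {2,} makes the match stricter still, at the cost of leaving shorter genuine repeats untouched.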