How to create article spinner regex in Java?
Problem description
Say for example I want to take this phrase:
{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}
and randomly make it into one of the following:
Hello world
Goodbye people
What's Up world
What's Up planet
Later citizens
etc.
The basic idea is that enclosed within every pair of braces will be an unlimited number of choices separated by "|". The program needs to go through and randomly choose one choice for each set of braces. Keep in mind that braces can be nested endlessly within each other. I found a thread about this and tried to convert it to Java, but it did not work. Here is the python code that supposedly worked:
import re
from random import randint

def select(m):
    choices = m.group(1).split('|')
    return choices[randint(0, len(choices)-1)]

def spinner(s):
    r = re.compile('{([^{}]*)}')
    while True:
        s, n = r.subn(select, s)
        if n == 0: break
    return s.strip()
Here is my attempt to convert that Python code to Java.
public String generateSpun(String text){
    String spun = new String(text);
    Pattern reg = Pattern.compile("{([^{}]*)}");
    Matcher matcher = reg.matcher(spun);
    while (matcher.find()){
        spun = matcher.replaceFirst(select(matcher.group()));
    }
    return spun;
}

private String select(String m){
    String[] choices = m.split("|");
    Random random = new Random();
    int index = random.nextInt(choices.length - 1);
    return choices[index];
}
Unfortunately, when I try to test this by calling
generateAd("{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}");
In the main of my program, it gives me an error in the line in generateSpun where Pattern reg is declared, giving me a PatternSyntaxException.
java.util.regex.PatternSyntaxException: Illegal repetition
{([^{}]*)}
Can someone try to create a Java method that will do what I am trying to do?
Recommended answer
Here are some of the problems with your current code:
- You should reuse your compiled Pattern, instead of calling Pattern.compile every time.
- You should reuse your Random, instead of calling new Random every time.
- Be aware that String.split is regex-based, so you must use split("\\|").
- Be aware that curly braces in a Java regex must be escaped to match literally, so use Pattern.compile("\\{([^{}]*)\\}");
- You should query group(1), not group(), which defaults to group 0.
- You're using replaceFirst wrong; look up Matcher.appendReplacement/appendTail instead.
- Random.nextInt(int n) has an exclusive upper bound (like many such methods in Java).
- The algorithm itself actually does not handle arbitrarily nested braces properly.
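The points above can be put together in a minimal sketch. This is one possible arrangement, not the original answer's code. The class name ArticleSpinner, the method name spin, and the constants INNERMOST and RANDOM are made up for illustration; select is taken from the question. It keeps the innermost-first replacement loop from the Python version rather than changing the algorithm, so the last point in the list is not addressed here.

```java
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; a sketch applying the fixes listed above.
class ArticleSpinner {
    // Compiled once and reused. The outer braces are escaped so they match literally.
    private static final Pattern INNERMOST = Pattern.compile("\\{([^{}]*)\\}");
    // One shared Random instead of a new instance per call.
    private static final Random RANDOM = new Random();

    // Picks one of the '|'-separated alternatives at random.
    static String select(String group) {
        // split is regex-based, so the pipe is escaped; -1 keeps empty alternatives.
        String[] choices = group.split("\\|", -1);
        // nextInt(n) returns 0..n-1, so no "- 1" is needed.
        return choices[RANDOM.nextInt(choices.length)];
    }

    // Repeatedly replaces every innermost {...} group until none are left.
    static String spin(String text) {
        String spun = text;
        while (true) {
            Matcher m = INNERMOST.matcher(spun);
            StringBuffer sb = new StringBuffer();
            boolean replaced = false;
            while (m.find()) {
                // group(1) is the text between the braces; quoteReplacement
                // protects any '$' or '\' in the chosen alternative.
                m.appendReplacement(sb, Matcher.quoteReplacement(select(m.group(1))));
                replaced = true;
            }
            if (!replaced) {
                break;
            }
            m.appendTail(sb);
            spun = sb.toString();
        }
        return spun.trim();
    }

    public static void main(String[] args) {
        String phrase = "{{Hello|What's Up|Howdy} {world|planet} | "
                + "{Goodbye|Later} {people|citizens|inhabitants}}";
        // Prints one randomly chosen combination, e.g. a greeting plus a noun.
        System.out.println(spin(phrase));
    }
}
```

StringBuffer is used because the StringBuilder overloads of appendReplacement and appendTail were only added in Java 9.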
Note that escaping is done by preceding with \, and as a Java string literal it needs to be doubled (i.e. "\\" contains a single character, the backslash).
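To see the escaping rule in isolation, here is a small sketch. The class name EscapingDemo and the helper names compiles and splitOnPipe are invented for this example. It checks the unescaped pattern from the question against the escaped one, and shows the escaped pipe in String.split.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; only standard java.util.regex calls are used.
class EscapingDemo {
    // Returns true if the regex compiles, false if it is rejected.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Splits on a literal pipe; the regex is \| and the Java literal doubles the backslash.
    static String[] splitOnPipe(String s) {
        return s.split("\\|");
    }

    public static void main(String[] args) {
        // The literal "\\" is a single backslash character.
        System.out.println("\\".length());                              // prints 1
        // Unescaped braces: rejected with "Illegal repetition", as in the question.
        System.out.println(compiles("{([^{}]*)}"));
        // Escaped braces match literally.
        System.out.println(compiles("\\{([^{}]*)\\}"));                 // prints true
        System.out.println(Arrays.toString(splitOnPipe("Hello|Howdy"))); // prints [Hello, Howdy]
    }
}
```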
- Source code and output with above fix but no major change to algorithm