How to create article spinner regex in Java?

Question

Say for example I want to take this phrase:


{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}

and randomly make it into one of the following:

Hello world
Goodbye people
What's Up world
What's Up planet
Later citizens
etc.

The basic idea is that enclosed within every pair of braces will be an unlimited number of choices separated by "|". The program needs to go through and randomly choose one choice for each set of braces. Keep in mind that braces can be nested endlessly within each other. I found a thread about this and tried to convert it to Java, but it did not work. Here is the python code that supposedly worked:

import re
from random import randint

def select(m):
    choices = m.group(1).split('|')
    return choices[randint(0, len(choices)-1)]

def spinner(s):
    r = re.compile('{([^{}]*)}')
    while True:
        s, n = r.subn(select, s)
        if n == 0: break
    return s.strip()

Here is my attempt to convert that Python code to Java.

public String generateSpun(String text){
    String spun = new String(text);
    Pattern reg = Pattern.compile("{([^{}]*)}");
    Matcher matcher = reg.matcher(spun);
    while (matcher.find()){
       spun = matcher.replaceFirst(select(matcher.group()));
    }
    return spun;
}

private String select(String m){
    String[] choices = m.split("|");
    Random random = new Random();
    int index = random.nextInt(choices.length - 1);
    return choices[index];
}

Unfortunately, when I try to test this by calling

generateAd("{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}");

in the main method of my program, I get a PatternSyntaxException on the line in generateSpun where Pattern reg is declared:

java.util.regex.PatternSyntaxException: Illegal repetition
{([^{}]*)}

Can someone try to create a Java method that will do what I am trying to do?

Recommended answer

Here are some of the problems with your current code:


  • You should reuse your compiled Pattern, instead of Pattern.compile every time
  • You should reuse your Random, instead of new Random every time
  • Be aware that String.split is regex-based, so you must split("\\|")
  • Be aware that curly braces in Java regex must be escaped to match literally, so Pattern.compile("\\{([^{}]*)\\}");
  • You should query group(1), not group() which defaults to group 0
  • You're using replaceFirst wrong, look up Matcher.appendReplacement/Tail instead
  • Random.nextInt(int n) has exclusive upper bound (like many such methods in Java)
  • The algorithm itself actually does not handle arbitrarily nested braces properly
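
Two of the points above, the regex-based String.split and the exclusive bound of Random.nextInt, are easy to check in isolation. The following is only a small sketch; the class name SpinnerPitfalls and the helper names choices and pick are made up for illustration and are not part of the original code.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo class; the names here are illustrative only.
class SpinnerPitfalls {
    // One shared generator instead of a new Random on every call.
    private static final Random RANDOM = new Random();

    // Escape the pipe so that split treats it as a literal character.
    static String[] choices(String group) {
        return group.split("\\|");
    }

    // nextInt(n) returns a value in 0..n-1, so pass the length itself, not length - 1.
    static String pick(String[] options) {
        return options[RANDOM.nextInt(options.length)];
    }

    public static void main(String[] args) {
        // Unescaped "|" is an empty alternation: it splits between characters, not at '|'.
        System.out.println(Arrays.toString("Hello|Howdy".split("|")));
        System.out.println(Arrays.toString(choices("Hello|Howdy"))); // [Hello, Howdy]
        System.out.println(pick(choices("world|planet")));
    }
}
```

With nextInt(choices.length - 1), as in the question, the last choice could never be selected, and a single-choice group would throw because the bound must be positive.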

Note that escaping is done by preceding with \, and as a Java string literal it needs to be doubled (i.e. "\\" contains a single character, the backslash).
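
As a rough illustration of the escaping note, the sketch below compiles both spellings of the pattern and extracts the first innermost group. The class name BraceEscaping and the helpers firstGroup and compiles are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; the names here are illustrative only.
class BraceEscaping {
    // The Java literal "\\{([^{}]*)\\}" reaches the regex engine as \{([^{}]*)\}
    static final Pattern GROUP = Pattern.compile("\\{([^{}]*)\\}");

    // Returns the content of the first brace group that contains no braces, or null.
    static String firstGroup(String text) {
        Matcher m = GROUP.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Reports whether a regex source string compiles.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The unescaped form is the one the question reports as "Illegal repetition".
        System.out.println(compiles("{([^{}]*)}"));
        System.out.println(compiles("\\{([^{}]*)\\}"));
        System.out.println(firstGroup("{Hello|Howdy} world")); // Hello|Howdy
    }
}
```

Inside the character class [^{}] the braces need no escaping; only the outer two do.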

  • Source code and output with above fix but no major change to algorithm
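
The linked source is not reproduced here. What follows is only a sketch of how the fixes listed above might fit together, not the answerer's actual code. It keeps the method names generateSpun and select from the question, while the class name Spinner is an assumption. It also assumes, as the Python version does, that the text is re-scanned until no brace group remains, so innermost groups are resolved first and outer groups on later passes.

```java
import java.util.Random;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: the class name Spinner is an assumption, not from the original answer.
class Spinner {
    // Compiled once; matches a brace group that contains no other braces.
    private static final Pattern GROUP = Pattern.compile("\\{([^{}]*)\\}");
    private static final Random RANDOM = new Random();

    static String select(String group) {
        // Limit -1 keeps trailing empty choices, so "a|" still yields two options.
        String[] choices = group.split("\\|", -1);
        return choices[RANDOM.nextInt(choices.length)];
    }

    static String generateSpun(String text) {
        String spun = text;
        while (true) {
            Matcher matcher = GROUP.matcher(spun);
            StringBuffer sb = new StringBuffer();
            boolean replaced = false;
            while (matcher.find()) {
                // quoteReplacement stops '$' and '\' in a choice from being read as group references.
                String choice = select(matcher.group(1));
                matcher.appendReplacement(sb, Matcher.quoteReplacement(choice));
                replaced = true;
            }
            if (!replaced) {
                return spun.trim(); // nothing left to expand
            }
            matcher.appendTail(sb);
            spun = sb.toString(); // rescan: outer groups are now innermost
        }
    }

    public static void main(String[] args) {
        System.out.println(generateSpun(
            "{{Hello|What's Up|Howdy} {world|planet} | {Goodbye|Later} {people|citizens|inhabitants}}"));
    }
}
```

One consequence of this approach is that every branch is expanded before the outer choice is made, so work is spent on alternatives that are then discarded. Unbalanced braces are left in the output untouched.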
