Using Regex to generate Strings rather than match them
Problem description
I am writing a Java utility which helps me to generate loads of data for performance testing. It would be really cool to be able to specify a regex for Strings so that my generator spits out things which match this. Is there something out there already baked which I can use to do this? Or is there a library which gets me most of the way there?
Thanks
Edit:
Complete list of suggested libraries on this question:
* - Depends on dk.brics.automaton
Edit: As mentioned in the comments, there is a library available at Google Code to achieve this: https://code.google.com/archive/p/xeger/
See also https://github.com/mifmif/Generex as suggested by Mifmif
Original message:
Firstly, with a complex enough regexp, I believe this can be impossible. But you should be able to put something together for simple regexps.
If you take a look at the source code of the class java.util.regex.Pattern, you'll see that it uses an internal representation of Node instances. Each of the different pattern components has its own implementation as a Node subclass. These Nodes are organised into a tree.
By producing a visitor that traverses this tree, you should be able to call an overloaded generator method or some kind of Builder that cobbles something together.
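To make the tree-walking idea concrete, here is a rough sketch for a small regex subset (literals, `.`, simple character classes, groups, `|`, and the quantifiers `?`, `*`, `+`, `{n}`, `{n,m}`, `{n,}`). Pattern's own Node classes are internal to java.util.regex, so this sketch does not touch them; it parses the expression itself into its own nodes instead. The names `RegexSketch`, `Node`, and `MAX_REPEAT` are made up for illustration, and anything outside the subset (anchors, `\d`-style escapes, negated classes) is rejected rather than guessed at.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class RegexSketch {
    /** Hypothetical node type; NOT java.util.regex.Pattern's internal Node. */
    interface Node { void generate(Random rnd, StringBuilder out); }

    static final int MAX_REPEAT = 5;   // assumed cap for unbounded quantifiers
    private final String src;
    private int pos = 0;

    private RegexSketch(String src) { this.src = src; }

    static Node parse(String regex) {
        RegexSketch p = new RegexSketch(regex);
        Node n = p.alternation();
        if (p.pos != regex.length()) throw new IllegalArgumentException("Unexpected input at " + p.pos);
        return n;
    }

    static String generate(String regex, Random rnd) {
        StringBuilder out = new StringBuilder();
        parse(regex).generate(rnd, out);
        return out.toString();
    }

    // alternation := sequence ('|' sequence)* ; picks one branch at random
    private Node alternation() {
        List<Node> options = new ArrayList<>();
        options.add(sequence());
        while (pos < src.length() && src.charAt(pos) == '|') { pos++; options.add(sequence()); }
        return (rnd, out) -> options.get(rnd.nextInt(options.size())).generate(rnd, out);
    }

    // sequence := (atom quantifier?)* ; emits each part in order
    private Node sequence() {
        List<Node> parts = new ArrayList<>();
        while (pos < src.length() && src.charAt(pos) != '|' && src.charAt(pos) != ')') {
            parts.add(quantified(atom()));
        }
        return (rnd, out) -> { for (Node n : parts) n.generate(rnd, out); };
    }

    private Node atom() {
        char c = src.charAt(pos++);
        if (c == '(') { Node inner = alternation(); expect(')'); return inner; }
        if (c == '[') return charClass();
        if (c == '.') return (rnd, out) -> out.append((char) (' ' + rnd.nextInt(95))); // printable ASCII
        if ("^$*+?{".indexOf(c) >= 0) throw new IllegalArgumentException("Unsupported '" + c + "'");
        if (c == '\\') {   // only escaped punctuation is supported, e.g. \. or \(
            if (pos >= src.length() || Character.isLetterOrDigit(src.charAt(pos)))
                throw new IllegalArgumentException("Unsupported escape at " + pos);
            c = src.charAt(pos++);
        }
        final char literal = c;
        return (rnd, out) -> out.append(literal);
    }

    // Collects the members of a class like [a-z0-9_] and picks one at random.
    private Node charClass() {
        StringBuilder chars = new StringBuilder();
        while (pos < src.length() && src.charAt(pos) != ']') {
            char lo = src.charAt(pos++);
            if (lo == '^' || lo == '\\') throw new IllegalArgumentException("Unsupported class syntax");
            if (pos + 1 < src.length() && src.charAt(pos) == '-' && src.charAt(pos + 1) != ']') {
                char hi = src.charAt(pos + 1);
                pos += 2;
                for (int x = lo; x <= hi; x++) chars.append((char) x);
            } else {
                chars.append(lo);
            }
        }
        expect(']');
        if (chars.length() == 0) throw new IllegalArgumentException("Empty character class");
        String set = chars.toString();
        return (rnd, out) -> out.append(set.charAt(rnd.nextInt(set.length())));
    }

    // Wraps an atom in a repeat node if a quantifier follows it.
    private Node quantified(Node atom) {
        if (pos >= src.length()) return atom;
        int min, max;
        char c = src.charAt(pos);
        if (c == '?') { min = 0; max = 1; pos++; }
        else if (c == '*') { min = 0; max = MAX_REPEAT; pos++; }
        else if (c == '+') { min = 1; max = MAX_REPEAT; pos++; }
        else if (c == '{') {
            int close = src.indexOf('}', pos);
            if (close < 0) throw new IllegalArgumentException("Unclosed '{' at " + pos);
            String[] b = src.substring(pos + 1, close).split(",", -1);
            min = Integer.parseInt(b[0]);
            max = b.length == 1 ? min : (b[1].isEmpty() ? min + MAX_REPEAT : Integer.parseInt(b[1]));
            pos = close + 1;
        } else return atom;
        final int lo = min, span = max - min + 1;
        return (rnd, out) -> {
            int n = lo + rnd.nextInt(span);
            for (int i = 0; i < n; i++) atom.generate(rnd, out);
        };
    }

    private void expect(char c) {
        if (pos >= src.length() || src.charAt(pos) != c)
            throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
        pos++;
    }
}
```

Something like `RegexSketch.generate("[a-c]{2,4}(foo|bar)?x+", new Random())` should then give a string that the same expression matches. Passing a seeded Random keeps the generated test data reproducible between runs. For anything beyond this subset, the libraries mentioned above are probably the better route.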