How to check that a string is a palindrome using regular expressions?


Question

This was an interview question that I was unable to answer:

How can you check that a string is a palindrome using regular expressions?

P.S. There is already a question, "How to check if the given string is palindrome?", and it has many answers in different languages, but none that uses regular expressions.

Recommended answer

The answer to this question is that "it is impossible". More specifically, the interviewer wants to know whether you paid attention in your theory of computation class.

In your theory of computation class you learned about finite state machines. A finite state machine is composed of nodes and edges. Each edge is annotated with a letter from a finite alphabet. One or more nodes are special "accepting" nodes, and one node is the "start" node. As each letter is read from a given word, we traverse the matching edge in the machine. If we end up in an accepting node once the whole word has been read, we say that the machine "accepts" that word.
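To make the description above concrete, here is a minimal sketch of a finite state machine simulator. The function name `accepts` and the two-node example machine `EVEN_A` (which accepts words containing an even number of the letter a) are hypothetical illustrations, not part of the original answer.

```python
def accepts(transitions, start, accepting, word):
    """Walk the machine one letter at a time and report whether it ends in an accepting node."""
    state = start
    for letter in word:
        # Follow the edge labelled with the current letter.
        state = transitions[(state, letter)]
    return state in accepting


# Hypothetical example machine over the alphabet {a, b}:
# the node records whether an even or odd number of 'a' has been read so far.
EVEN_A = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

if __name__ == "__main__":
    for word in ["", "ab", "abba", "aab"]:
        print(repr(word), accepts(EVEN_A, "even", {"even"}, word))
```

Note that the machine's only memory is the node it is currently in, which is what the impossibility argument later relies on.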

A regular expression can always be translated into an equivalent finite state machine, that is, one that accepts and rejects the same words as the regular expression. (In the real world, some regexp languages allow arbitrary functions; these don't count.)

It is impossible to build a finite state machine that accepts all palindromes and rejects everything else. The proof relies on the fact that we can easily build strings that require an arbitrarily large number of nodes, namely strings of the form

a^x b a^x (e.g., aba, aabaa, aaabaaa, aaaabaaaa, ...)

where a^x is the letter a repeated x times. Recognizing such a string requires at least x nodes because, after seeing the 'b', the machine has to count back x times to make sure the word is a palindrome. Since x can be arbitrarily large, no fixed, finite number of nodes is enough.
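The counting argument can be checked by brute force for small cases. The sketch below is an illustration under my own framing, not the original author's proof: for any two different counts i and j, the suffix a^i completes the prefix a^i b to a palindrome but does not complete a^j b to one, so a machine must land in different nodes after reading those two prefixes. The helper names are hypothetical.

```python
def is_palindrome(word):
    """A word is a palindrome if it reads the same backwards."""
    return word == word[::-1]


def prefixes_pairwise_distinguishable(limit):
    """Check that the prefixes a^i b, for i below limit, all need different nodes.

    Two prefixes need different nodes if some suffix turns one of them
    into a palindrome and the other into a non-palindrome.
    """
    for i in range(limit):
        for j in range(limit):
            if i == j:
                continue
            suffix = "a" * i
            completes_i = is_palindrome("a" * i + "b" + suffix)
            completes_j = is_palindrome("a" * j + "b" + suffix)
            if not completes_i or completes_j:
                return False
    return True


if __name__ == "__main__":
    # If this holds for every limit, no finite number of nodes can suffice.
    print(prefixes_pairwise_distinguishable(20))
```

The check only covers counts below `limit`, but the same reasoning applies to any pair of distinct counts, which is why the number of required nodes grows without bound.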

Finally, getting back to the original question, you could tell the interviewer that you can write a regular expression that accepts all palindromes shorter than some fixed length. If there is ever a real-world application that requires identifying palindromes, it will almost certainly not involve arbitrarily long ones, so this answer would show that you can tell theoretical impossibilities apart from real-world applications. Still, the actual regexp would be quite long, much longer than the equivalent 4-line program (easy exercise for the reader: write a program that identifies palindromes).
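As a rough sketch of both halves of that point, the code below shows a short palindrome check next to one possible way of generating a bounded-length regular expression. The generator simply lists every palindrome up to `max_len` over a given alphabet as an alternation; this brute-force construction and the names `is_palindrome` and `bounded_palindrome_regex` are my assumptions for illustration, and the answer does not say how the regexp should be built.

```python
import re
from itertools import product


def is_palindrome(word):
    """The short program: compare the word with its reverse."""
    return word == word[::-1]


def bounded_palindrome_regex(alphabet, max_len):
    """Build a plain alternation (no backreferences) of all palindromes up to max_len."""
    alternatives = []
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if is_palindrome(word):
                alternatives.append(re.escape(word))
    return re.compile("(?:" + "|".join(alternatives) + ")")


if __name__ == "__main__":
    pattern = bounded_palindrome_regex("ab", 4)
    print(pattern.pattern)
    print(bool(pattern.fullmatch("abba")), bool(pattern.fullmatch("abab")))
```

Even for a two-letter alphabet the pattern grows quickly with `max_len`, which matches the remark that the regexp ends up far longer than the direct check.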
