How to remove duplicate character at the end of a string in Regex

Question

Can anyone help me with the following regex

<script type="text/javascript">
        function quoteWords() {
            var search = document.getElementById("search_box");
            search.value = search.value.replace(/^\s*|\s*$/g, ""); //trim string of ending and beginning whitespace
            if(search.value.indexOf(" ") != -1){ //if more than one word
                search.value = search.value.replace(/^"*|"*$/g, "\"");
            }
        }
  </script>

<input type="text" name="keywords" value="" id="search_box" size="17">
<input onClick="quoteWords()" type="submit" value="Go">

Issue: when the user has already typed double quotes and presses submit, one extra double quote is added at the end. If the double quotes already exist, the regex should not add anything.

So it turns "long enough" into "long enough"" (an extra double quote is added at the end).

Can anyone check the regex code to see how to solve this issue?

I only want the double quotes to be inserted once.

Recommended answer

The error is definitely happening in this line:

search.value = search.value.replace(/^"*|"*$/g, "\"");

And it is due to the fact that "* matches 0 or more quotes. However, you presumably wouldn't want to just replace it with "+ since that wouldn't do the job you wanted of double-quoting strings with spaces in them.

You probably just want to do something like this, in two statements:

search.value = search.value.replace(/^"*|"*$/g, '')
search.value = '"' + search.value + '"'
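As a minimal sketch of the two-statement approach, the logic can be pulled out of the DOM handler into a pure function so it is easy to check. The name `quoteIfMultiWord` is hypothetical and not from the original code; the sketch assumes the question's rule that only input containing a space gets quoted.

```javascript
// Hypothetical helper (not in the original code): trims the input and,
// if it contains a space, wraps it in exactly one pair of double quotes.
function quoteIfMultiWord(input) {
    // Trim leading and trailing whitespace, as the original handler does.
    var value = input.replace(/^\s*|\s*$/g, "");
    if (value.indexOf(" ") !== -1) {
        // Step 1: strip any quotes already present at either end.
        value = value.replace(/^"*|"*$/g, "");
        // Step 2: add one quote on each side.
        value = '"' + value + '"';
    }
    return value;
}

console.log(quoteIfMultiWord('long enough'));
console.log(quoteIfMultiWord('"long enough"'));
console.log(quoteIfMultiWord('single'));
```

Because the quotes are always stripped before being added back, running the function again on its own output leaves the value unchanged.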

Part of the key is that there is no 'end of string' character to consume - the regex engine 'just knows' when it is at the end of the string. So after matching a quote at the end of the string, the cursor just moves to the end of the string, and it finds the empty string one more time before falling off the string. Thus, the quote at the end of the string is replaced by a quote, and the 'nothing' at the end of the string is also replaced by a quote.
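To see those matches directly, here is a small sketch that records every match the global regex finds along with its offset. The helper name `listMatches` is an assumption made for illustration; it relies only on the standard callback form of `String.prototype.replace`.

```javascript
// Hypothetical helper: collect every match of the regex with its position.
function listMatches(text) {
    var found = [];
    text.replace(/^"*|"*$/g, function (match, offset) {
        found.push({ offset: offset, match: match });
        return match; // leave the text unchanged; we only want the matches
    });
    return found;
}

var quoted = '"long enough"';
var matches = listMatches(quoted);
console.log(matches);
// Three matches: the quote at offset 0, the quote at offset 12,
// and an empty string at offset 13 (the very end of the input).

var broken = quoted.replace(/^"*|"*$/g, '"');
console.log(broken); // "long enough""
```

Each of the three matches is replaced by a quote, which is where the extra trailing quote comes from.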

I recommend taking a look at the ECMAScript spec at http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf sections 15.5.4.10 and 15.5.4.11 yourself. However, I've also provided an intuitive illustration of how this works at this gist.

Edit

Since people seem confused as to why this would happen, here's something that might help:

http://www.grymoire.com/Unix/Sed.html#uh-6

That's from the documentation for sed, but it explains why combining * and /g is a bad idea. The fact that JS doesn't just explode when you do that is a mark in its favor. Note that there are an infinite number of '0 characters' at every position in the string.
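A short sketch of the same effect in JavaScript, using an illustrative pattern (`x*`) that is not from the question: with `*` and the `g` flag, the regex matches the empty string at every position, so every position gets a replacement.

```javascript
// x* can match zero characters, so with /g it matches at every position,
// including the very end of the string.
var withStar = "abc".replace(/x*/g, "-");
console.log(withStar); // -a-b-c-

// x+ needs at least one x, so nothing matches and the string is unchanged.
var withPlus = "abc".replace(/x+/g, "-");
console.log(withPlus); // abc
```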
