Remove and replace characters by regex


Question

I'm trying to write a regex that does the following:


  1. _ -> replace it with a space
  2. + -> remove it unless it is part of a ++ pair (e.g. c++ -> c++, c+ -> c)
  3. ' -> remove it if it is at the start or end of a word (e.g. Alin's -> Alin's, 'Alin's -> Alin's)
  4. &, -, ., ! -> don't remove
  5. Any other special character -> remove

I want to do this in a single pass over the string.

For example:

Input: "abc's, test_s! & c++ c+ 'Dirty's'. and beautiful'..."
Output: "abc's test s! & c++ c Dirty's. and beautiful..."

Explanation:

char `'` in `abc's,` stays because `3`
char `,` in `abc's,` was removed because `5` 
char `_` in `test_s!` was replaced by space because `1`
char `!` in `test_s!` is not removed because `4`
char `&` is not removed because `4`
char `+` in `c++` is not removed because `2`
char `+` in `c+` was removed because `2`
word: `'Dirty's'.` was replaced by `Dirty's.` because `3` and `4`
char `'` in `beautiful'...` was removed because `3`
char `.` is not removed because of `4`

This is my JavaScript code:

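var str = "abc's test_s c++ c+ 'Dirty's'. and beautiful";
console.log(str);
// Rule 1: replace underscores with spaces.
str = str.replace(/[_]/g, " ");
// Remove everything outside the allowed set. Note that &-. inside a character
// class is a range (from & to .), so it also keeps ' ( ) * + and ,
str = str.replace(/[^a-zA-Z0-9 &-.!]/g, "");
console.log(str);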

This is my jsfiddle: http://jsfiddle.net/alonshmiel/LKjYd/4/

I don't like my code because I'm sure it can be done in a single pass over the string.

Any help appreciated!

Recommended answer

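function sanitize(str){
  // Alternatives are tried left to right; exactly one group (p1..p6) is set per match.
  return str.replace(/(_)|(\'\W|\'$)|(^\'|\W\')|(\+\+)|([a-zA-Z0-9\ \&\-\.\!\'])|(.)/g,function(car,p1,p2,p3,p4,p5,p6){
   if(p1) return " ";                      // rule 1: _ becomes a space
   if(p2) return sanitize(p2.slice(1));    // rule 3: drop the ' and re-sanitize the character after it
   if(p3) return sanitize(p3.slice(0,-1)); // rule 3: drop the ' and re-sanitize the character before it
   if(p4) return p4;                       // rule 2: a ++ pair is kept
   if(p5) return car;                      // rule 4: allowed character, kept as is
   return "";                              // rule 5: anything else, including a lone +, is removed
 });
}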
document.querySelector('#sanitize').addEventListener('click',function(){
  // textContent keeps the sanitized string from being interpreted as HTML
  document.querySelector('#output').textContent =
    sanitize(document.querySelector('#inputString').value);
});

#inputString{
  width:290px
}
#sanitize{
  background: #009afd;
  border: 1px solid #1777b7;
  border:none;
  color:#fff;
  cursor:pointer;
  height: 1.55em;
}

#output{
  background:#ddd;
  margin-top:5px;
  width:295px;
}

<input id="inputString" type="text" value="abc's test_s! & c++ c+ 'Dirty's'. and beau)'(tiful'..."/>
<input id="sanitize" type="button" value="Sanitize it!" />
<div id="output" ></div>
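The answer relies on two behaviours of String.prototype.replace with a function argument: regex alternatives are tried from left to right at each position, and the callback receives one argument per capture group, with the groups that did not take part in the match left undefined. The following minimal sketch only illustrates that mechanism. The helper name labelMatches and the reduced pattern are made up for this illustration and are not part of the answer.

```javascript
// Illustrative helper (hypothetical, not from the answer): records which
// alternative of the pattern matched at each step of the scan.
function labelMatches(input) {
  const labels = [];
  input.replace(/(_)|(\+\+)|([a-z])|(.)/g, (match, underscore, doublePlus, letter, other) => {
    // Only the group belonging to the alternative that matched is defined.
    if (underscore !== undefined) labels.push("underscore");
    else if (doublePlus !== undefined) labels.push("doublePlus");
    else if (letter !== undefined) labels.push("letter");
    else labels.push("other");
    return match; // leave the text unchanged, we only want the labels
  });
  return labels;
}

console.log(labelMatches("c++_c+"));
// [ 'letter', 'doublePlus', 'underscore', 'letter', 'other' ]
```

The trailing lone + falls through to the catch-all group because the ++ alternative needs two characters, which is the same reason the answer's function removes it.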

Some notes:


  • The one-pass constraint is not fully respected, because the character captured by \W has to be sanitized as well (hence the recursive call). I did not find another way.
  • About the ++ rule: any odd-length sequence of + is shortened by one +.
  • Apostrophes are only removed when there is a non-alphanumeric character next to them. What should happen with, for example, "abc'&": "abc&" or "abc'&"? And the same question for "ab_'s".
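One possible way around the recursion mentioned in the first note is to match the apostrophe on its own with lookbehind and lookahead assertions, so the neighbouring character is never consumed and is handled by the same scan. The sketch below is an alternative, not the answer author's method. The name sanitizeSinglePass is hypothetical, and two choices in it are assumptions: anything other than a letter or digit (including _) counts as a word boundary, so "abc'&" becomes "abc&" and "ab_'s" becomes "ab s". Lookbehind also needs an engine with ES2018 regex support.

```javascript
// Hypothetical single-pass variant; rule choices for the edge cases are assumptions.
function sanitizeSinglePass(str) {
  const pattern = new RegExp(
    "(_)" +                                          // rule 1: underscore
    "|(\\+\\+)" +                                    // rule 2: a ++ pair
    "|((?<![a-zA-Z0-9])'|'(?![a-zA-Z0-9]))" +        // rule 3: ' at a word edge
    "|([a-zA-Z0-9 &\\-.!'])" +                       // rule 4: allowed characters
    "|([\\s\\S])",                                   // rule 5: everything else
    "g"
  );
  return str.replace(pattern, (match, underscore, doublePlus, edgeQuote, kept) => {
    if (underscore) return " ";
    if (doublePlus || kept) return match;
    return ""; // edge apostrophe, lone +, or any other special character
  });
}

console.log(sanitizeSinglePass("abc's, test_s! & c++ c+ 'Dirty's'. and beautiful'..."));
// abc's test s! & c++ c Dirty's. and beautiful...
```

Because the assertions do not consume characters, no second call is needed, at the cost of requiring lookbehind support.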

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#Specifying_a_function_as_a_parameter
