JavaScript - regex - how to remove words with a specified length

Question

In my case the word length is 2, and I am using this regex:

text = text.replace(/\b[a-zA-ZΆ-ώἀ-ῼ]{2}\b/g, '');

However, I cannot make it work with Greek characters. For your convenience, here is a demo:

text = 'English: the on in to of \n Greek: πως θα το πω';
text = text.replace(/\b[0-9a-zA-ZΆ-ώἀ-ῼ]{2}\b/g, '');
console.log(text);

As far as the Greek characters are concerned, I am trying to use a range that covers two blocks: "Greek and Coptic" and "Greek Extended" (as seen on unicode-table.com).

Recommended answer

The problem with Greek characters is caused by \b. Take a look at "Javascript - regex - word boundary (\b) issue", where @Casimir et Hippolyte proposes the following solution:

Since Javascript doesn't have the lookbehind feature and since word boundaries work only with members of the \w character class, the only way is to use groups (and capturing groups if you want to make a replacement):

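// Example: remove 2-letter words.
// '$1' puts back the captured leading delimiter (JavaScript uses $1, not \1, in replacement strings).
txt = txt.replace(/(^|[^a-zA-ZΆΈ-ώἀ-ῼ\n])([a-zA-ZΆΈ-ώἀ-ῼ]{2})(?![a-zA-ZΆΈ-ώἀ-ῼ])/gm, '$1');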
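To see why \b is the culprit, here is a minimal sketch comparing the two approaches on the Greek part of the demo string. Without the u flag, \w in JavaScript covers only [A-Za-z0-9_], so \b never finds a boundary between Greek letters and spaces. The variable names are illustrative only and are not part of the original question or answer.

```javascript
// Greek-only sample taken from the question's demo string.
const greek = 'πως θα το πω';

// \b depends on \w, which is ASCII-only here, so no boundary is ever found
// in this string and nothing is replaced.
const withWordBoundary = greek.replace(/\b[a-zA-ZΆΈ-ώἀ-ῼ]{2}\b/g, '');

// Explicit boundaries: a captured leading delimiter (or line start)
// and a negative lookahead for the trailing side.
const withGroups = greek.replace(
  /(^|[^a-zA-ZΆΈ-ώἀ-ῼ\n])([a-zA-ZΆΈ-ώἀ-ῼ]{2})(?![a-zA-ZΆΈ-ώἀ-ῼ])/gm,
  '$1'
);

console.log(JSON.stringify(withWordBoundary)); // → "πως θα το πω" (unchanged)
console.log(JSON.stringify(withGroups));       // → "πως   " (2-letter words removed, spaces kept)
```

The spaces that separated the removed words stay in place, so a follow-up whitespace cleanup may be wanted depending on the use case.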

To the quoted regex I also added 0-9 in the first and third groups (the leading delimiter class and the lookahead), because without it the regex was matching inside tokens like "2TB" or "mp3".
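The digit-aware variant can be sketched as a small helper that takes the word length as a parameter. The names `removeWordsOfLength`, `removeWordsOfLengthUnicode`, `LETTERS`, and `WORD_CHARS` are hypothetical and do not come from the original answer. The second function is an alternative that assumes an engine with ES2018 support (lookbehind and Unicode property escapes with the u flag). Note that it matches letters of any script, not only Latin and Greek, so it is not a drop-in equivalent.

```javascript
// Hypothetical helpers, sketched under the assumptions stated above.
const LETTERS = 'a-zA-ZΆΈ-ώἀ-ῼ';        // characters a removable word may consist of
const WORD_CHARS = '0-9' + LETTERS;      // characters that count as "inside a token"

function checkLength(length) {
  // Guard against building a malformed pattern from a bad length.
  if (!Number.isInteger(length) || length < 1) {
    throw new RangeError('length must be a positive integer');
  }
}

// Group-based version: digits are added to the leading delimiter class
// and to the lookahead, so tokens like "2TB" or "mp3" are left alone.
function removeWordsOfLength(text, length) {
  checkLength(length);
  const pattern = new RegExp(
    `(^|[^${WORD_CHARS}\\n])([${LETTERS}]{${length}})(?![${WORD_CHARS}])`,
    'gm'
  );
  return text.replace(pattern, '$1');
}

// ES2018 alternative: lookbehind plus \p{L} (any letter) and \p{N} (any number).
function removeWordsOfLengthUnicode(text, length) {
  checkLength(length);
  const pattern = new RegExp(
    `(?<![\\p{L}\\p{N}])\\p{L}{${length}}(?![\\p{L}\\p{N}])`,
    'gu'
  );
  return text.replace(pattern, '');
}

console.log(JSON.stringify(removeWordsOfLength('2TB mp3 πως θα το πω', 2)));
console.log(JSON.stringify(removeWordsOfLengthUnicode('2TB mp3 πως θα το πω', 2)));
```

Both versions leave the separating spaces in place. If the result should be tidy, collapsing runs of spaces afterwards is one option.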
