How to remove more than one word from a string with regex?


Problem description

I have a .txt file with a bunch of sentences and a .txt file with words I want to remove from the first file. I just started playing around with regex and I'm not quite sure how to do it.

What I have tried:

I have tried the following method which did not work for me.

using System.IO;
using System.Text;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        Program p = new Program();
        const string CFd = "..\\..\\Duomenys.txt";
        const string CFr = "..\\..\\Rezultatai.txt";
        const string Remove = "..\\..\\NorimiZodziai.txt";
        string lines = ReadingText(CFd);
        string removeW = ReadingText(Remove);                    // Words I want to remove from the CFd file
        string replacedLines = Replaceing(lines, removeW);
        p.Printing(replacedLines,CFr);
    }

    static string ReadingText(string CFd)
    {
        string lines = File.ReadAllText(CFd, Encoding.GetEncoding(1257));
        return lines;
    }
    static string Replaceing(string lines, string removeW)
    {
        string pattern = @"\b"+removeW+"\b";
        string output = Regex.Replace(lines, pattern, "");
        return output;
    }
    void Printing (string replacedLines, string file)
    {
        using (StreamWriter writer = new StreamWriter(file))
        {
            writer.WriteLine(replacedLines);
        }
    }
}

Recommended answer

You don't need a regex for that: string.Replace will do the job just as well (and quicker: regex is a general-purpose string processor, so it isn't as fast as a dedicated method).
String.Replace Method (System) | Microsoft Docs
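
As a rough illustration of the string.Replace suggestion, here is a minimal sketch. The class and method names (PlainReplaceSketch, RemoveAll) are made up for this example, and it assumes the words to remove have already been split into an array. Note that string.Replace matches substrings rather than whole words, so it may remove more than intended.

```csharp
using System;

public static class PlainReplaceSketch
{
    // Removes every occurrence of each entry in wordsToRemove from text.
    // Caveat: string.Replace matches substrings, so removing "cat"
    // also turns "concatenate" into "conenate".
    public static string RemoveAll(string text, string[] wordsToRemove)
    {
        foreach (string word in wordsToRemove)
        {
            // string.Replace throws ArgumentException on an empty search string.
            if (word.Length > 0)
            {
                text = text.Replace(word, "");
            }
        }
        return text;
    }
}
```

If whole-word matching matters, a word-boundary regex is probably the better fit despite the speed difference.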


Quote:

I have tried the following method which did not work for me.



It is a good idea to show examples of how it doesn't work.

Quote:

I have a .txt file with a bunch of sentences and a .txt file with words I want to remove from the first file.



Removing words is a little more complicated than what you did.

string pattern = @"\b"+removeW+"\b";



In a sentence, a word is not necessarily embedded between spaces: it can be next to , . ? ! or next to nothing at all if it is the first or last word in the string.
Another problem is that when you remove a word, you don't want to remove both spaces around it; you need to keep one space.
So replacing with a space is already an improvement:

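// Both literals need the @ prefix: without it, "\b" is a backspace character, not a word boundary.
string pattern = @"\b" + removeW + @"\b";
string output = Regex.Replace(lines, pattern, " ");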
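
The snippet above still treats the whole contents of the word file as a single word. A minimal sketch of one way to handle several words follows, assuming the word file contains words separated by whitespace (the question does not say how the file is laid out). The names WordListRegexSketch and RemoveWords are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordListRegexSketch
{
    // Replaces each whole-word occurrence of any listed word with one space.
    // wordList is assumed to hold the words separated by whitespace.
    public static string RemoveWords(string text, string wordList)
    {
        // A null separator array makes Split break on any whitespace.
        string[] words = wordList.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length == 0)
        {
            return text; // nothing to remove
        }

        // Regex.Escape keeps characters such as '.' or '+' in a word literal.
        // The result has the shape \b(?:word1|word2|...)\b
        string pattern = @"\b(?:" + string.Join("|", words.Select(Regex.Escape)) + @")\b";
        return Regex.Replace(text, pattern, " ");
    }
}
```

Matching here is case-sensitive; passing RegexOptions.IgnoreCase as a fourth argument to Regex.Replace would relax that. Words that begin or end with a non-word character may not match as expected, because \b only marks a boundary between a word character and a non-word character.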



You need to define what to do when a word is not between spaces, then deduce how that translates into code.
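
As one possible policy, purely for illustration: collapse the runs of spaces left behind by a removal, drop a space stranded in front of , . ? or !, and trim the ends of the text. The sketch below assumes that policy; the names CleanupSketch and Tidy are made up, and other policies are equally valid.

```csharp
using System.Text.RegularExpressions;

public static class CleanupSketch
{
    // Tidies the whitespace that word removal leaves behind.
    public static string Tidy(string text)
    {
        // Collapse runs of two or more spaces or tabs into a single space.
        string result = Regex.Replace(text, @"[ \t]{2,}", " ");

        // Drop spaces stranded in front of , . ? or !
        result = Regex.Replace(result, @" +([,.?!])", "$1");

        // Trim whitespace at the very start and end of the text.
        return result.Trim();
    }
}
```

This would be run on the output of the word removal step, not on the pattern itself.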
-----
Just a few interesting links to help with building and debugging RegEx.
Here is a link to RegEx documentation:
perlre - perldoc.perl.org
Here are links to tools that help build RegEx and debug them:
.NET Regex Tester - Regex Storm
Expresso Regular Expression Tool
RegExr: Learn, Build, & Test RegEx
Online regex tester and debugger: PHP, PCRE, Python, Golang and JavaScript
This one shows you the RegEx as a nice graph, which is really helpful for understanding what a RegEx is doing:
Debuggex: Online visual regex tester. JavaScript, Python, and PCRE.
This site also shows the RegEx as a nice graph, but can't test what matches the RegEx:
Regexper


Thank you guys, I fixed it. However, after the words are removed, there are blank spaces and separators left. Is there any way I can remove them?

