Search and insert a string in a file using regex


Problem description

Hello all,

I want to verify a file so that it contains no nested braces. If nested braces are present, it is assumed that the user forgot a closing brace (i.e. the file is not correct), and my code should insert a closing brace after the last occurrence of a semicolon. For example, if the file contains

abc{def;ghe;ijk{lmn}

then after correction it should be: abc{def;ghe;}ijk{lmn}

Problem 1: My code for verifying the file looks like this:

Regex regex = new Regex(@"\{(.*?[^{][^}])\{");
Match match = regex.Match(streamReader.ReadToEnd());
if (match.Success)
{
    // report error and correct the file
}

This works for a single-line statement (i.e. "{def;ghe;ijk{" on one line) and when the second brace is on the next line (i.e. "{def;ghe;ijk" on one line and "{" on the next). But if I put two line breaks between ghe;ijk and {, the pattern does not match.

Help me find the corrected regex.

Problem 2: How can I insert a closing brace after the last occurrence of a semicolon in the matched pattern and write the result back to the same file? I want to do it using regex. The steps would be: get the content between two opening braces that does not itself contain any opening or closing brace, find the last semicolon in that content, and add a closing brace after it.

Please help me out!

Recommended answer

Regex is the wrong tool for the job.

The simplest approach is also the easiest one to get right.

Scan through the file, reading one character at a time into a buffer.

Set a flag if you see an open-brace.

If you see a semicolon, write out the buffer to a temp file.

If you see a close-brace, write out the buffer to the temp file and unset your flag.

If you see an open-brace while the flag is set, write a close-brace to the temp file, then write the buffer to the temp file, and set a second flag to remember that you've inserted something in the file.

When you get to the end of the file, write out the buffer. If you have an unmatched open-brace, write out a close-brace.

If you inserted anything into the file, delete the original file and rename the temp file to the original. If you didn't insert anything, delete the temp file.

Don't try to do it with regex. It might be possible, but it's not pretty, and it's not easy to debug, fix, or extend. (And it's not as efficient if you actually have to modify the file.)
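
The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions, not code from the answer: the names BraceFixer, Fix and FixFile are made up, the temp file is assumed to be the original path plus ".tmp", and the buffer is flushed at every open-brace so that it only ever holds the text since the last '{', ';' or '}'. With that choice, a nested block that contains no semicolon gets its close-brace right after its open-brace.

```csharp
using System.IO;
using System.Text;

public static class BraceFixer
{
    // Copies reader to writer, inserting '}' whenever a '{' is seen while
    // an earlier '{' is still open. Returns true if anything was inserted.
    public static bool Fix(TextReader reader, TextWriter writer)
    {
        var buffer = new StringBuilder(); // text since the last '{', ';' or '}'
        bool open = false;                // inside an unclosed '{'
        bool inserted = false;            // did we change anything?
        int next;
        while ((next = reader.Read()) != -1)
        {
            char c = (char)next;
            switch (c)
            {
                case '{':
                    if (open)
                    {
                        // Nested open-brace: close the previous block first,
                        // i.e. right after the last flushed semicolon.
                        writer.Write('}');
                        inserted = true;
                    }
                    writer.Write(buffer.ToString());
                    buffer.Clear();
                    writer.Write('{');
                    open = true;
                    break;
                case ';':
                case '}':
                    writer.Write(buffer.ToString());
                    buffer.Clear();
                    writer.Write(c);
                    if (c == '}') open = false;
                    break;
                default:
                    buffer.Append(c);
                    break;
            }
        }
        writer.Write(buffer.ToString());
        if (open)
        {
            // Unmatched open-brace at end of input.
            writer.Write('}');
            inserted = true;
        }
        return inserted;
    }

    // Hypothetical file wrapper: writes to a temp file, then swaps it in
    // only if something was inserted. Uses default UTF-8 handling.
    public static void FixFile(string path)
    {
        string temp = path + ".tmp"; // assumed temp-file name
        bool inserted;
        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(temp))
        {
            inserted = Fix(reader, writer);
        }
        if (inserted)
        {
            File.Delete(path);
            File.Move(temp, path);
        }
        else
        {
            File.Delete(temp);
        }
    }
}
```

Fix takes a TextReader and a TextWriter so it can be tried on strings with StringReader and StringWriter before pointing FixFile at a real file. Because it never looks at line structure, any number of line breaks between "ghe;ijk" and "{" is handled the same way.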


Now, this should solve your problems #1 and #2:

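using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

static class Program
{
    static void Main(string[] args)
    {
        Verify("abc{def;ghe;ijk{lmn}");
        Verify(@"{def;ghe;ijk{
{def;ghe;ijk");
        Verify("");
    }

    public static string Verify(string data)
    {
        StringBuilder sb = new StringBuilder();
        // Alternatives, tried in this order at each position:
        //   1: a complete block {...} with no nested braces
        //   2: an unclosed block, up to and including its last semicolon
        //   3: an unclosed block that contains no semicolon
        //   4: a run of text containing no braces
        //   5: a stray closing brace
        string p = @"(\{[^}{]*?\})|(\{[^}{]*;)|(\{[^}{;]*)|([^}{]+)|(\})";
        Regex rex = new Regex(p, RegexOptions.Compiled);
        foreach (Match m in rex.Matches(data).Cast<Match>())
        {
            if (m.Groups[2].Success
                || m.Groups[3].Success
                || m.Groups[5].Success)
                Console.WriteLine("fixing error");

            if (m.Groups[2].Success)
                // Missing close-brace: add it after the last semicolon.
                sb.AppendFormat("{0}{1}", m.Groups[2].Value, "}");
            else if (m.Groups[3].Success)
                // Missing close-brace and no semicolon: close the block as is.
                sb.AppendFormat("{0}{1}", m.Groups[3].Value, "}");
            else if (m.Groups[4].Success)
                sb.Append(m.Groups[4].Value);
            else if (m.Groups[5].Success)
                // Stray close-brace: give it a matching open-brace.
                sb.Append("{}");
            else
                sb.Append(m.Groups[1].Value);
        }
        string s = sb.ToString();

        Console.WriteLine("from {0}", data);
        Console.WriteLine("to   {0}", s);

        return s;
    }
}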



Cheers

Andi


Regex cannot handle nested structures (by definition).

You need to tokenize the stream and handle the nesting by parsing the tokens. The tokenizing can be done by Regex, though.

In the simplest situation, you have a simple stream of characters. Use the approach suggested in Solution 1.

If the file is written in a programming language, you must tokenize according to that language in order to get reasonable results.

For your case of pair-wise elements, you could tokenize as follows:
1. comments
2. string literals
3. character literals
4. { and }
5. rest (individual characters that are irrelevant for the given problem)

Then you write a simple parser that consumes the tokens one after the other, incrementing a counter at each opening { and decrementing it at each closing }.

Postcondition: counter == 0 or error.

And here comes the code for C#:

string file = @"your-full-path-to-thissource-file.cs";
string data = File.ReadAllText(file);

string cmt = @"//.*?
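
Independently of the listing above, the tokenize-and-count idea might look something like the following. This is a rough sketch rather than the answer's own code: BraceCounter, IsBalanced and the token patterns are assumptions chosen for illustration, and the patterns cover only a simplified subset of C# lexical rules (line and block comments, regular and verbatim strings, character literals). Characters that match no token are simply skipped by Regex.Matches, which plays the role of the "rest" category.

```csharp
using System.Text.RegularExpressions;

public static class BraceCounter
{
    // Comments and literals are matched first so that any braces inside
    // them are consumed as part of that token and never counted.
    private static readonly Regex Token = new Regex(
        @"(?<comment>//[^\n]*|/\*.*?\*/)" +
        @"|(?<str>@""(?:""""|[^""])*""|""(?:\\.|[^""\\\n])*"")" +
        @"|(?<chr>'(?:\\.|[^'\\\n])+')" +
        @"|(?<open>\{)|(?<close>\})",
        RegexOptions.Singleline);

    // Returns true when every '{' outside comments and literals has a
    // matching '}' and no '}' appears before its '{'.
    public static bool IsBalanced(string source)
    {
        int depth = 0;
        foreach (Match m in Token.Matches(source))
        {
            if (m.Groups["open"].Success)
            {
                depth++;
            }
            else if (m.Groups["close"].Success)
            {
                depth--;
                if (depth < 0) return false; // close-brace without an opener
            }
        }
        return depth == 0; // postcondition: counter == 0 or error
    }
}
```

For the original question's rule (no nesting at all), the same loop could instead report an error as soon as depth would exceed 1.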

