std::regex_search to match only the current line

Question

I use various regexes to parse a C source file, line by line. First I read the entire content of the file into a string:

ifstream file_stream("commented.cpp",ifstream::binary);

std::string txt((std::istreambuf_iterator<char>(file_stream)),
std::istreambuf_iterator<char>());

Then I use a set of regexes, which should be applied one after another until a match is found. Here I give only one as an example:

vector<regex> rules = { regex("^//[^\n]*$") };

// c_str() already returns a const char*, so no cast is needed
const char* search = txt.c_str();

int position = 0, length = 0;

for (size_t i = 0; i < rules.size(); i++) {
  cmatch match;

  if (regex_search(search + position, match, rules[i],
                   regex_constants::match_not_bol | regex_constants::match_not_eol))
  {
     position += ( match.position() + match.length() );
  }

}

But it doesn't work. Instead of matching only within the current line, it searches the whole string for the first match. I expected regex_constants::match_not_bol and regex_constants::match_not_eol to make regex_search treat ^ and $ as the start/end of a line only, not the start/end of the whole block. Here is my file:

commented.cpp:

#include <stdio.h>
//comment

The match should fail. My reasoning is that, with those options passed to regex_search, it should look for the pattern only in the first line:

#include <stdio.h>

Instead it searches the whole string and immediately finds //comment. I need help making regex_search match only in the current line. The options match_not_bol and match_not_eol do not help. Of course I could read the file line by line into a vector and then try every rule on each string in the vector, but I have done that and it is very slow: parsing a big file that way takes too long. That is why I want the regex to deal with newlines while I keep a position counter.

Recommended answer

If this is not what you want, please comment and I will delete the answer.

What you are doing is not a correct way of using a regex library.
So here is my suggestion for anyone who wants to use the std::regex library.

  1. Its default and richest grammar is ECMAScript, which is somewhat poorer than what modern regex libraries offer.
  2. It has as many bugs as you like (these are just the ones I found):

  1. Same regex but different results on Linux and Windows, C++ only
  2. std::regex and ignoring flags
  3. std::regex_match and lazy quantifier with strange behavior

  • In some cases (I tested specifically with std::match_results) it is 200 times slower than std.regex in the D language.

    Conclusion: do not use it at all.

    But if anyone still insists on using C++ anyway, then you can:

    1. Use boost::regex from the Boost library, because:

    1. It supports PCRE-style (Perl) syntax
    2. It has fewer bugs (I have not seen any)
    3. It produces a smaller binary (I mean the executable file after compiling)
    4. It is faster than std::regex

  • Use gcc version 7.1.0 and NOT below. The last bug I found is in version 6.3.0.


    If you can be persuaded not to use C++, then you can:

    1. Use the D regular expression library, std.regex, for large tasks. Why:

    1. Fast (see "Faster Command Line Tools in D")
    2. Easy
    3. Flexible (drn)

  • Use native pcre or pcre2 in C

    • Extremely fast, but a bit complicated
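
    For anyone who stays with std::regex and only needs each rule confined to the current line, here is a minimal sketch. It is not taken from the question's code: the helper name matching_lines is hypothetical, and it assumes lines are separated by '\n' (a file read in binary mode on Windows may also contain '\r', which would need stripping first). Rather than relying on ^, $ or the match_not_bol / match_not_eol flags, it hands std::regex_match an iterator range covering only one line of the single in-memory string, so nothing is copied into a vector of lines.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the 0-based indices of the lines that
// at least one rule matches in full. Each rule only ever sees the
// characters of the current line, never the rest of the buffer.
std::vector<std::size_t> matching_lines(const std::string& txt,
                                        const std::vector<std::regex>& rules) {
    std::vector<std::size_t> hits;
    std::size_t line_no = 0;
    auto line_begin = txt.cbegin();

    while (line_begin != txt.cend()) {
        // The current line is the range [line_begin, line_end), without '\n'.
        auto line_end = std::find(line_begin, txt.cend(), '\n');

        for (const auto& rule : rules) {
            std::smatch match;
            // regex_match requires the whole range (the whole line) to match.
            if (std::regex_match(line_begin, line_end, match, rule)) {
                hits.push_back(line_no);
                break;  // first rule that matches wins for this line
            }
        }

        if (line_end == txt.cend()) break;  // last line had no trailing '\n'
        line_begin = line_end + 1;          // step over the '\n'
        ++line_no;
    }
    return hits;
}
```

    With the two-line commented.cpp from the question and a single rule such as std::regex("//.*"), the first line is rejected on its own and only the second line is reported. The rule no longer needs ^ or $, because the iterator range itself marks where the line starts and ends.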
