Regex character class subtraction in C++

Question

I'm writing a C++ program that needs to take regular expressions defined in an XML Schema file and use them to validate XML data. The problem is that the flavor of regular expressions used by XML Schema does not seem to be directly supported in C++.

For example, there are a couple of special character classes, \i and \c, that are not defined by default. The XML Schema regex language also supports something called "character class subtraction", which does not seem to be supported in C++ either.

Allowing the use of the \i and \c special character classes is pretty simple: I can just look for "\i" or "\c" in the regular expression and replace them with their expanded versions. Getting character class subtraction to work is a much more daunting problem...

For example, this regular expression, which is valid in an XML Schema definition, throws an exception in C++ saying it has unbalanced square brackets.

#include <iostream>
#include <regex>

int main()
{
    try
    {
        // Match any lowercase letter that is not a vowel
        std::regex rx("[a-z-[aeiuo]]");
    }
    catch (const std::regex_error& ex)
    {
        std::cout << ex.what() << std::endl;
    }
}

How can I get C++ to recognize character class subtraction within a regex? Or even better, is there a way to just use the XML Schema flavor of regular expressions directly within C++?

Recommended answer

Okay, after going through the other answers I tried out a few different things and ended up using the xmlRegexp functionality from libxml2.

The xmlRegexp-related functions are very poorly documented, so I figured I would post an example here because others may find it useful:

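#include <iostream>
#include <libxml/xmlmemory.h>
#include <libxml/xmlregexp.h>
#include <libxml/xmlstring.h>

int main()
{
    LIBXML_TEST_VERSION;

    xmlChar* str = xmlCharStrdup("bcdfg");
    xmlChar* pattern = xmlCharStrdup("[a-z-[aeiou]]+");

    // xmlRegexpCompile returns NULL if the pattern cannot be compiled
    xmlRegexp* regex = xmlRegexpCompile(pattern);

    // xmlRegexpExec returns 1 on a match, 0 on no match, negative on error
    if (regex != nullptr && xmlRegexpExec(regex, str) == 1)
    {
        std::cout << "Match!" << std::endl;
    }

    // Release libxml2 objects with libxml2's own deallocators, not free()
    xmlRegFreeRegexp(regex);
    xmlFree(pattern);
    xmlFree(str);
}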

Output:

Match!

I also attempted to use XMLString::patternMatch from the Xerces-C++ library, but it didn't seem to use an XML Schema compliant regex engine underneath. (Honestly, I have no clue what regex engine it uses underneath. The documentation for it was pretty abysmal and I couldn't find any examples online, so I just gave up on it.)
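If pulling in libxml2 is not an option, one workaround for simple cases is to rewrite the subtraction by hand as a negative lookahead, which the default ECMAScript grammar of std::regex does support. The sketch below is not part of the answer above and is not a general translator: it hand-converts only the pattern from the question, `is_lowercase_consonants` is a hypothetical helper name, and nested or more complex subtractions would need a real parser.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `text` is one or more lowercase ASCII
// letters, none of which is a vowel. This is the set that XML Schema
// writes as [a-z-[aeiou]]+.
bool is_lowercase_consonants(const std::string& text)
{
    // (?![aeiou]) rejects a vowel at the current position,
    // then [a-z] consumes exactly one lowercase letter.
    static const std::regex rx("(?:(?![aeiou])[a-z])+");

    // regex_match requires the whole string to match, as XML Schema patterns do
    return std::regex_match(text, rx);
}
```

Using std::regex_match rather than std::regex_search mirrors the implicit anchoring of XML Schema patterns, which always apply to the entire value.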
