Most efficient way to escape XML/HTML in C++ string?

Views: 204 | Posted: 2015/11/30 14:58:55 | Tags: c++ algorithm string stl


Question

I can't believe this question hasn't been asked before. I have a string that needs to be inserted into an HTML file but it may contain special HTML characters. I want to replace these with the appropriate HTML representation.

The code below works but is pretty verbose and ugly. Performance is not critical for my application but I guess there are scalability problems here also. How can I improve this? I guess this is a job for STL algorithms or some esoteric Boost function, but the code below is the best I can come up with myself.

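#include <string>

// Escapes the HTML special characters ", &, < and > in place.
void escape(std::string *data)
{
    std::string::size_type pos = 0;
    for (;;)
    {
        // Find the next character that needs escaping.
        pos = data->find_first_of("\"&<>", pos);
        if (pos == std::string::npos) break;
        std::string replacement;
        switch ((*data)[pos])
        {
        case '\"': replacement = "&quot;"; break;
        case '&':  replacement = "&amp;";  break;
        case '<':  replacement = "&lt;";   break;
        case '>':  replacement = "&gt;";   break;
        default: ;
        }
        // replace() shifts every character after pos to make room.
        data->replace(pos, 1, replacement);
        // Skip past the inserted entity so its '&' is not escaped again.
        pos += replacement.size();
    }
}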

Solution

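Instead of replacing in place in the original string, you can copy into a new buffer and do the replacement on the fly, which avoids having to move characters in the string. Every in-place replace shifts all the characters after the match, so the cost grows with both the string length and the number of replacements, while the copy is a single linear pass. This will have much better complexity and cache behavior, so I'd expect a huge improvement. Or you can use boost::spirit::xml encode or http://code.google.com/p/pugixml/.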

#include <cstddef>
#include <string>

// Builds the escaped text in a separate buffer, then swaps it into data.
void encode(std::string& data) {
    std::string buffer;
    buffer.reserve(data.size());  // at least as large as the input
    for(std::size_t pos = 0; pos != data.size(); ++pos) {
        switch(data[pos]) {
            case '&':  buffer.append("&amp;");       break;
            case '\"': buffer.append("&quot;");      break;
            case '\'': buffer.append("&apos;");      break;
            case '<':  buffer.append("&lt;");        break;
            case '>':  buffer.append("&gt;");        break;
            default:   buffer.append(&data[pos], 1); break;  // copy unchanged
        }
    }
    data.swap(buffer);  // constant-time exchange, no extra copy
}
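
The copying idea can also be combined with the find_first_of search from the question, so that runs of ordinary characters are appended in one call instead of one character at a time. The following is only a sketch of that variation, not part of the original answer: the name escape_copy and the choice to return a new string rather than modify the argument are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical helper (not from the original answer): returns an escaped
// copy of `in` and leaves the input untouched.
std::string escape_copy(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::string::size_type start = 0;
    for (;;) {
        // Locate the next character that needs escaping.
        const std::string::size_type pos = in.find_first_of("&\"'<>", start);
        if (pos == std::string::npos) {
            // No more special characters: copy the remaining tail and stop.
            out.append(in, start, std::string::npos);
            break;
        }
        // Copy the run of ordinary characters before the match in one call.
        out.append(in, start, pos - start);
        switch (in[pos]) {
            case '&':  out.append("&amp;");  break;
            case '"':  out.append("&quot;"); break;
            case '\'': out.append("&apos;"); break;
            case '<':  out.append("&lt;");   break;
            case '>':  out.append("&gt;");   break;
        }
        start = pos + 1;
    }
    return out;
}
```

For example, escape_copy("it's") yields "it&apos;s". Whether this beats the per-character loop depends on how densely the special characters occur, so it is worth measuring on representative input.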

EDIT: A small improvement can be achieved by using a heuristic to determine the size of the buffer. In encode, change the argument of the buffer.reserve call from data.size() to data.size()*1.1 (10% extra) or something similar, depending on how many replacements are expected.
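
As a rough sketch of that heuristic, the reserve call can take a growth factor. The function name encode_with_headroom and the growth parameter below are hypothetical and only illustrate the idea; 1.1 is the 10% figure from the note above, not a measured optimum.

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant of encode(): reserves extra room up front so that a
// moderate number of replacements does not force a reallocation.
void encode_with_headroom(std::string& data, double growth = 1.1) {
    std::string buffer;
    // Assumption: `growth` is a guess at output size / input size.
    buffer.reserve(static_cast<std::size_t>(data.size() * growth));
    for (std::size_t pos = 0; pos != data.size(); ++pos) {
        switch (data[pos]) {
            case '&':  buffer.append("&amp;");     break;
            case '"':  buffer.append("&quot;");    break;
            case '\'': buffer.append("&apos;");    break;
            case '<':  buffer.append("&lt;");      break;
            case '>':  buffer.append("&gt;");      break;
            default:   buffer.push_back(data[pos]); break;
        }
    }
    data.swap(buffer);
}
```

If the estimate is too low, the string still grows on demand, so the result is correct either way. The factor only affects how often the buffer reallocates.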
