Parse a substring as JSON using QJsonDocument

Views: 483 · Posted: 2016/10/14 20:32:55 · Tags: c++ json qt parsing qt5

Question

I have a string which contains (not is) JSON-encoded data, like in this example:

foo([1, 2, 3], "some more stuff")
    |        |
  start     end   (of JSON-encoded data)

The complete language we use in our application nests JSON-encoded data, while the rest of the language is trivial (just recursive stuff). When parsing strings like this from left to right in a recursive parser, I know when I encounter a JSON-encoded value, like here the [1, 2, 3] starting at index 4. After parsing this substring, I need to know the end position to continue parsing the rest of the string.

I'd like to pass this substring to a well-tested JSON parser like QJsonDocument in Qt5. But from reading the documentation, there seems to be no way to parse only a substring as JSON, meaning that control returns as soon as the parsed data ends (after consuming the ] here) without reporting a parse error. Also, I need to know the end position to continue parsing my own stuff (here the remaining string is , "some more stuff")).

To do this, I used to use a custom JSON parser which takes the current position by reference and updates it after finishing parsing. But since it's a security-critical part of a business application, we don't want to stick to my self-crafted parser anymore. I mean there is QJsonDocument, so why not use it. (We already use Qt5.)

As a work-around, I'm thinking of this approach:

1. Let QJsonDocument parse the substring starting from the current position (which is not valid JSON).
2. The error reports an unexpected character; this is some position beyond the JSON.
3. Let QJsonDocument parse again, but this time the substring with the correct end position.

A second idea is to write a "JSON end scanner" which takes the whole string, a start position and returns the end position of the JSON-encoded data. This also requires parsing, as unmatched brackets / parentheses can appear in string values, but it should be much easier (and safer) to write (and use) such a class in comparison to a fully hand-crafted JSON-parser.
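
A minimal sketch of such a "JSON end scanner" in standard C++17 might look like the following. The function name findJsonEnd and its signature are assumptions made for illustration, not part of Qt or of the original post. It only tracks bracket nesting and string literals (including backslash escapes); it does not validate the JSON, which would still be left to a real parser such as QJsonDocument.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Qt): returns the index one past the end of
// the JSON array or object that starts at `start`, or std::nullopt if the
// value does not start with '[' or '{' or its brackets never balance.
// It tracks nesting and string literals only; it does not validate the JSON.
std::optional<std::size_t> findJsonEnd(std::string_view text, std::size_t start)
{
    if (start >= text.size() || (text[start] != '[' && text[start] != '{'))
        return std::nullopt;

    std::string closers;      // stack of closing brackets still expected
    bool inString = false;    // true while inside a "..." literal

    for (std::size_t i = start; i < text.size(); ++i) {
        const char c = text[i];
        if (inString) {
            if (c == '\\')
                ++i;                  // skip the escaped character
            else if (c == '"')
                inString = false;     // end of the string literal
            continue;
        }
        switch (c) {
        case '"': inString = true; break;
        case '[': closers.push_back(']'); break;
        case '{': closers.push_back('}'); break;
        case ']':
        case '}':
            if (closers.empty() || closers.back() != c)
                return std::nullopt;  // mismatched closing bracket
            closers.pop_back();
            if (closers.empty())
                return i + 1;         // one past the final closing bracket
            break;
        default: break;
        }
    }
    return std::nullopt;              // input ended before the value closed
}
```

For the example string, calling findJsonEnd(input, 4) yields 13, so the bytes in the range [4, 13) could be handed to the JSON parser and custom parsing would resume at index 13.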

Does anyone have a better idea?

Recommended answer

I rolled a quick parser[*] based on http://www.ietf.org/rfc/rfc4627.txt using Spirit Qi.

It doesn't actually parse into an AST, but it parses all of the JSON payload, which is actually a bit more than required here.

The sample here (http://liveworkspace.org/code/3k4Yor$2) outputs:

Non-JSON part of input starts after valid JSON: ', "some more stuff")'

Based on the test given by the OP:

const std::string input("foo([1, 2, 3], \"some more stuff\")");

// set to start of JSON
auto f(begin(input)), l(end(input));
std::advance(f, 4);

bool ok = tryParseAsJson(f, l); // updates f to point after the end of valid JSON

if (ok)
    std::cout << "Non-JSON part of input starts after valid JSON: '" << std::string(f, l) << "'\n";

I have tested with several other more involved JSON documents (including multiline).

A few remarks:

I made the parser Iterator-based so it will likely easily work with Qt strings(?)
If you want to disallow multi-line fragments, change the skipper from qi::space to qi::blank
There is a conformance shortcut regarding number parsing (see TODO) that doesn't affect validity for this answer (see comment).

[*] technically, this is more of a parser stub since it doesn't translate into something else. It is basically a lexer taking on too much work :)

// #define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <cstdint>
#include <iostream>
#include <iterator>
#include <string>

namespace qi = boost::spirit::qi;

template <typename It, typename Skipper = qi::space_type>
struct parser : qi::grammar<It, Skipper>
{
    parser() : parser::base_type(json)
    {
        // 2.1 values
        value = qi::lit("false") | "null" | "true" | object | array | number | string;

        // 2.2 objects
        object = '{' >> -(member % ',') >> '}';
        member = string >> ':' >> value;

        // 2.3 Arrays
        array = '[' >> -(value % ',') >> ']';

        // 2.4. Numbers
        // Note: our Spirit grammar takes a shortcut, as the RFC specification
        // (quoted below) is more restrictive.
        //
        // However, none of these differences affect any structural characters
        // (:,{}[] and double quotes), so it doesn't matter for the current
        // purpose. For full compliance, this remains TODO:
        //
        //    Numeric values that cannot be represented as sequences of digits
        //    (such as Infinity and NaN) are not permitted.
        //     number = [ minus ] int [ frac ] [ exp ]
        //     decimal-point = %x2E       ; .
        //     digit1-9 = %x31-39         ; 1-9
        //     e = %x65 / %x45            ; e E
        //     exp = e [ minus / plus ] 1*DIGIT
        //     frac = decimal-point 1*DIGIT
        //     int = zero / ( digit1-9 *DIGIT )
        //     minus = %x2D               ; -
        //     plus = %x2B                ; +
        //     zero = %x30                ; 0
        number = qi::double_; // shortcut :)

        // 2.5 Strings
        string = qi::lexeme [ '"' >> *char_ >> '"' ];

        static const qi::uint_parser<uint32_t, 16, 4, 4> _4HEXDIG;

        char_ = ~qi::char_("\"\\") |
                qi::char_("\x5C") >> (       // \ (reverse solidus)
                   qi::char_("\x22") |       // "    quotation mark  U+0022
                   qi::char_("\x5C") |       // \    reverse solidus U+005C
                   qi::char_("\x2F") |       // /    solidus         U+002F
                   qi::char_("\x62") |       // b    backspace       U+0008
                   qi::char_("\x66") |       // f    form feed       U+000C
                   qi::char_("\x6E") |       // n    line feed       U+000A
                   qi::char_("\x72") |       // r    carriage return U+000D
                   qi::char_("\x74") |       // t    tab             U+0009
                   qi::char_("\x75") >> _4HEXDIG ) // uXXXX          U+XXXX
                ;

        // entry point
        json = value;

        BOOST_SPIRIT_DEBUG_NODES(
                (json)(value)(object)(member)(array)(number)(string)(char_));
    }

  private:
    qi::rule<It, Skipper> json, value, object, member, array, number, string;
    qi::rule<It> char_;
};

template <typename It>
bool tryParseAsJson(It& f, It l) // note: first iterator gets updated
{
    static const parser<It, qi::space_type> p;

    try
    {
        return qi::phrase_parse(f, l, p, qi::space);
    }
    catch (const qi::expectation_failure<It>& e)
    {
        // expectation points not currently used, but we could tidy up the
        // grammar to bail on unexpected tokens
        std::string frag(e.first, e.last);
        std::cerr << e.what() << "'" << frag << "'\n";
        return false;
    }
}

int main()
{
#if 0
    // read full stdin
    std::cin.unsetf(std::ios::skipws);
    std::istream_iterator<char> it(std::cin), pte;
    const std::string input(it, pte);

    // set up parse iterators
    auto f(begin(input)), l(end(input));
#else
    const std::string input("foo([1, 2, 3], \"some more stuff\")");

    // set to start of JSON
    auto f(begin(input)), l(end(input));
    std::advance(f, 4);
#endif

    bool ok = tryParseAsJson(f, l); // updates f to point after the end of valid JSON

    if (ok)
        std::cout << "Non-JSON part of input starts after valid JSON: '" << std::string(f, l) << "'\n";

    return ok ? 0 : 255;
}
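
The number rule in the listing is marked TODO because qi::double_ is more permissive than the RFC. As a rough sketch of what a stricter check could look like, the following standard-library-only function follows the RFC 4627 number ABNF quoted in the comments. The name matchJsonNumber and its "length of matched prefix" return convention are assumptions for illustration; this is not the answer's code and it is not wired into the Spirit grammar.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: returns the length of the longest prefix of `s` that
// matches the RFC 4627 grammar
//     number = [ minus ] int [ frac ] [ exp ]
// or 0 if `s` does not start with a JSON number.
std::size_t matchJsonNumber(std::string_view s)
{
    std::size_t i = 0;
    auto isDigit = [&](std::size_t k) {
        return k < s.size() && std::isdigit(static_cast<unsigned char>(s[k])) != 0;
    };

    if (i < s.size() && s[i] == '-')
        ++i;                                   // [ minus ]

    if (!isDigit(i))
        return 0;                              // int is mandatory
    if (s[i] == '0')
        ++i;                                   // zero (no further leading digits)
    else
        while (isDigit(i)) ++i;                // digit1-9 *DIGIT

    if (i < s.size() && s[i] == '.' && isDigit(i + 1)) {
        i += 2;                                // [ frac ] = "." 1*DIGIT
        while (isDigit(i)) ++i;
    }

    if (i < s.size() && (s[i] == 'e' || s[i] == 'E')) {
        std::size_t j = i + 1;                 // [ exp ] = e [ - / + ] 1*DIGIT
        if (j < s.size() && (s[j] == '+' || s[j] == '-'))
            ++j;
        if (isDigit(j)) {
            while (isDigit(j)) ++j;
            i = j;                             // accept exponent only if digits follow
        }
    }
    return i;
}
```

With this convention, inputs such as ".5" or "NaN" return 0, while "-12.5E+3," matches the first 8 characters and leaves the comma for the caller.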
