Parsing specific JSON-like data (NextSTEP PList) from Ruby

Question

I'm writing a client for a third-party API, and they provide data in a weird format. At first it might look like JSON, but it's not, and I'm a bit confused about how I should handle it.

It's a key-value based format (much like JSON).

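  • Keys are separated from their values by '='.
  • Keys and values are enclosed in double quotes.
  • Dictionaries start with '{' and end with '}'.
  • Arrays start with '(' and end with ')'.
  • Lines end with ';' (except for array contents), followed by an end-of-line character (\r, I think).
  • Sometimes there seems to be Unicode in strings (like \U2623 for the BioHazard sign).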

What could this format be? Should I use a premade gem to parse it, or should I build my own parser?

{ "anArray" = (
  "100",
  "200",
  "300"
  );
  "aDictionary" = {
    "aString" = "Something";
  };
}

EDIT This format seems to be Apple's property list, but it's neither XML nor binary. This makes sense, as the API comes from a WebObjects web service. I will try the CFPropertyList gem to parse it; if there is a better solution, please let me know.

EDIT 2 This is a NextSTEP Property List.

Recommended answer

Here's a robust answer using a custom StringScanner-based parser. It treats whitespace as optional, allows a trailing comma after the last item in a list, and allows omitting the semicolon after the last dictionary key/value pair. It allows the outermost item to be a dictionary, array, or string. It also allows any sort of legal string content, including parens, curly braces, and escaped text like \n.

Seen in action:

p parse('{ "array" = ( "1", "2", ( "3", "4" ) ); "hash"={ "key"={ "more"="oh}]yes;!"; }; }; }')
#=> {"array"=>["1", "2", ["3", "4"]], "hash"=>{"key"=>{"more"=>"oh}]yes;!"}}}

puts parse('("Escaped \"Quotes\" Allowed", "And Unicode \u2623 OK")')
#=> Escaped "Quotes" Allowed
#=> And Unicode ☣ OK

The code:

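require 'strscan'

def parse(str)
  # Declare the lambda variables up front so each closure can see the others.
  ss, getstr, getary, getdct = StringScanner.new(str)

  # Dispatch on the next token: dictionary, array, or string. A closing
  # delimiter yields nil, which ends the enclosing array loop.
  getvalue = ->{
    if    ss.scan(/\s*\{\s*/)   then getdct[]
    elsif ss.scan(/\s*\(\s*/)   then getary[]
    elsif str = getstr[]        then str
    elsif ss.scan(/\s*[)}]\s*/) then nil end
  }

  # Scan one double-quoted string and let Ruby decode its escapes. The gsub
  # escapes "#{", "#@" and "#$" first so that eval cannot interpolate anything.
  getstr = ->{
    if str=ss.scan(/\s*"(?:[^"\\]|\\u\d+|\\.)*"\s*/i)
      eval str.gsub(/([^\\](?:\\\\)*)#(?=[{@$])/,'\1\#')
    end
  }

  # Collect values until getvalue consumes the closing paren and returns nil.
  getary = ->{
    [].tap do |a|
      while v=getvalue[]
        a << v
        ss.scan(/\s*,\s*/)
      end
    end
  }

  # Collect key/value pairs until no key follows, then consume the closing
  # brace so that parsing can continue after a nested dictionary.
  getdct = ->{
    {}.tap do |h|
      while key = getstr[]
        ss.scan(/\s*=\s*/)
        if value=getvalue[] then h[key]=value; ss.scan(/\s*;\s*/) end
      end
      ss.scan(/\s*\}\s*/)
    end
  }

  getvalue[]
end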
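
The first line of parse assigns a single value to four variables, which leaves getstr, getary and getdct set to nil but declares them as locals, so the lambdas defined afterwards can call each other. A minimal sketch of that forward-declaration trick in isolation (parity_checker, even and odd are hypothetical names used only for illustration):

```ruby
# Hypothetical example; it illustrates only the forward-declaration trick.
def parity_checker
  # Declare both locals before either lambda is created, so each closure
  # captures the other's variable (both start out as nil).
  even, odd = nil
  even = ->(n) { n.zero? ? true  : odd[n - 1] }
  odd  = ->(n) { n.zero? ? false : even[n - 1] }
  even
end

check = parity_checker
p check[10]  # an even input
p check[7]   # an odd input
```

Without the declaration line, odd inside the first lambda would be parsed as a method call and fail with a NameError when the lambda runs.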

As an alternative to rolling your own parser from scratch in the future, you might also want to look into the Treetop Ruby library.

Edit: I've replaced the implementation of getstr above with one that should prevent running arbitrary Ruby code inside the eval. For more details, see "Eval a string without interpolation". Seen in action:

@secret = "OH NO!"
$secret = "OH NO!"
@@secret = "OH NO!"  # Ruby 3.0+ raises RuntimeError here (class variable at top level); omit this line there
puts parse('"\"#{:NOT&&:very}\" bad. \u262E\n#@secret \\#$secret \\\\#@@secret"')
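
If you would rather avoid eval altogether, the escapes can also be decoded by hand. The following is only a sketch under assumptions: unescape_plist_string and SIMPLE_ESCAPES are hypothetical names that are not part of the parser above, and the set of supported escapes (\n, \t, \r, \", \\, and four-digit \u or \U) is a guess at what such an API sends.

```ruby
# Hypothetical helper, not part of the answer's parser: decodes the escapes
# of one double-quoted string literal without calling eval.
SIMPLE_ESCAPES = { 'n' => "\n", 't' => "\t", 'r' => "\r", '"' => '"', '\\' => '\\' }.freeze

def unescape_plist_string(literal)
  body = literal.strip[1..-2]            # drop the surrounding double quotes
  body.gsub(/\\(?:[uU](\h{4})|(.))/m) do
    if $1
      [$1.hex].pack('U')                 # \uXXXX or \UXXXX becomes that code point
    else
      SIMPLE_ESCAPES.fetch($2, $2)       # unknown escapes keep the escaped character
    end
  end
end

# Interpolation-like text is never evaluated because nothing is eval'ed.
puts unescape_plist_string('"\"#{:NOT&&:very}\" bad. \U2623 #@secret"')
```

To try it, getstr could return unescape_plist_string(str) in place of the eval call; whether that covers every escape the web service emits would need checking against real responses.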
