What kind of data format is this (colon and semicolon separated entries)?

Problem description

I'm integrating two poorly documented systems, and in the process I've come across a strange data format I haven't seen before. It's stored as plain text in the db with no indication as to what the format is and how to deal with it.

a:17:{s:2:"id";s:27:"145219921F990B11C39E7220000";s:16:"purchase_country";s:2:"no";s:17:"purchase_currency";s:3:"nok";s:6:"locale";s:5:"nb-no";s:6:"status";s:17:"checkout_complete";s:9:"reference";s:27:"145212221F990B11C39E7221000";s:11:"reservation";s:10:"2348226550";s:10:"started_at";s:25:"2014-04-04T10:40:55+02:00";s:12:"completed_at";s:25:"2014-04-02T10:41:11+02:00";s:16:"last_modified_at";s:25:"2014-04-02T10:41:11+02:00";s:10:"expires_at";s:25:"2014-04-16T10:41:11+02:00";s:4:"cart";a:4:{s:25:"total_price_excluding_tax";i:489500;s:16:"total_tax_amount";i:0;s:25:"total_price_including_tax";i:489500;s:5:"items";a:2:{i:0;a:10:{s:9:"reference";s:2:"68";s:4:"name";s:21:"1.OSO SUPER S 200LIT.";s:8:"quantity";i:1;s:10:"unit_price";i:695000;s:8:"tax_rate";i:0;s:13:"discount_rate";i:0;s:4:"type";s:8:"physical";s:25:"total_price_including_tax";i:695500;s:25:"total_price_excluding_tax";i:694000;s:16:"total_tax_amount";i:0;}i:1;a:10:{s:9:"reference";s:2:"68";s:4:"name";s:32:"1.OSO SUPER S 200LIT. 
(discount)";s:8:"quantity";i:1;s:10:"unit_price";i:-205100;s:8:"tax_rate";i:0;s:13:"discount_rate";i:0;s:4:"type";s:8:"physical";s:25:"total_price_including_tax";i:-205100;s:25:"total_price_excluding_tax";i:-205100;s:16:"total_tax_amount";i:0;}}}s:8:"customer";a:1:{s:4:"type";s:6:"person";}s:16:"shipping_address";a:8:{s:10:"given_name";s:13:"Testperson-no";s:11:"family_name";s:8:"Approved";s:14:"street_address";s:18:"Sæffleberggate 56";s:11:"postal_code";s:4:"0563";s:4:"city";s:4:"OSLO";s:7:"country";s:2:"no";s:5:"email";s:32:"omitted@testdrive.klarna.com";s:5:"phone";s:11:"40 12 34 56";}s:15:"billing_address";a:8:{s:10:"given_name";s:13:"Testperson-no";s:11:"family_name";s:8:"Approved";s:14:"street_address";s:18:"Sæffleberggate 56";s:11:"postal_code";s:4:"0563";s:4:"city";s:4:"OSLO";s:7:"country";s:2:"no";s:5:"email";s:32:"checkout-no@testdrive.klarna.com";s:5:"phone";s:11:"40 12 34 56";}s:7:"options";a:1:{s:31:"allow_separate_shipping_address";b:0;}s:8:"merchant";a:5:{s:2:"id";s:4:"1601";s:9:"terms_uri";s:95:"omitted";s:12:"checkout_uri";s:59:"omitted";s:16:"confirmation_uri";s:220:"omitted";s:8:"push_uri";s:229:"omitted";}} 

An entry consists of colon-separated segments:

  • A single-character type tag (array, object, int, decimal, bool, string)
  • A number that says how long the value is in characters, bytes, elements (in case of arrays) or key-value pairs (in case of objs), which seems completely useless given that this is a textual format that requires me to parse the length segment anyway. This isn't present for integers and decimals.
  • Value of the field
  • Key-value pairs seem to be a flat list of an even number of elements. Arrays also seem to double as objects (see example).
  • A ; terminator, which seems not to be necessary for objects and arrays, just to make parsing more tedious.
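The segments listed above map fairly directly onto a small recursive reader. What follows is only a minimal sketch in Python, assuming the grammar is exactly as described (type tag, optional length or count, value, `;` terminator) and that the text is UTF-8. The names `parse_entry`, `_parse` and `_expect` are made up for illustration, and only the tags `a`, `s`, `i`, `d` and `b` are handled. It also assumes the string length counts bytes rather than characters, which is what the sample suggests: `s:18:"Sæffleberggate 56"` holds 17 characters but 18 UTF-8 bytes.

```python
def _expect(data: bytes, pos: int, token: bytes) -> None:
    """Raise if `token` does not start at byte offset `pos`."""
    if data[pos:pos + len(token)] != token:
        raise ValueError(f"expected {token!r} at byte {pos}")


def _parse(data: bytes, pos: int):
    """Parse one entry starting at `pos`; return (value, next position)."""
    tag = data[pos:pos + 1]
    if tag not in (b"a", b"s", b"i", b"d", b"b"):
        raise ValueError(f"unsupported type tag {tag!r} at byte {pos}")
    _expect(data, pos + 1, b":")

    if tag in (b"i", b"d", b"b"):
        # Scalars carry no length segment: tag, value, ';'
        end = data.index(b";", pos + 2)
        raw = data[pos + 2:end].decode("ascii")
        if tag == b"i":
            return int(raw), end + 1
        if tag == b"d":
            return float(raw), end + 1
        return raw == "1", end + 1

    # Strings and arrays: tag, size, then the payload
    colon = data.index(b":", pos + 2)
    size = int(data[pos + 2:colon].decode("ascii"))

    if tag == b"s":
        # Take exactly `size` bytes instead of scanning for a closing quote
        _expect(data, colon + 1, b'"')
        start = colon + 2
        end = start + size
        _expect(data, end, b'";')
        return data[start:end].decode("utf-8"), end + 2

    # Array: `size` key/value pairs inside braces, no ';' after '}'
    _expect(data, colon + 1, b"{")
    pos = colon + 2
    result = {}
    for _ in range(size):
        key, pos = _parse(data, pos)
        value, pos = _parse(data, pos)
        result[key] = value
    _expect(data, pos, b"}")
    return result, pos + 1


def parse_entry(data: bytes):
    """Parse a complete document and reject trailing bytes."""
    value, pos = _parse(data, 0)
    if pos != len(data):
        raise ValueError(f"unexpected data at byte {pos}")
    return value


fragment = 'a:1:{s:14:"street_address";s:18:"Sæffleberggate 56";}'
print(parse_entry(fragment.encode("utf-8")))
```

Reading strings by their declared length is a deliberate choice in this sketch: if the values are not escaped, a string that itself contains `";` can only be read correctly by counting bytes, so the length segment may be less redundant than it first appears.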

Now, parsing this thing is reasonably easy, but I'm constantly being surprised by new data types and their weird syntax and I'm not sure that I've covered all the edge cases with the few data samples I've analyzed. Is anyone familiar with this format?

Solution

Looks like PHP serialization. See: http://www.phpinternalsbook.com/classes_objects/serialization.html
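To see how values end up in this shape, here is a rough Python sketch that writes a few types the way PHP's `serialize()` does. It is an illustration, not a port of PHP's implementation: `php_serialize` is a hypothetical helper, floats, objects and references are left out, and strings are assumed to be UTF-8. Python lists are written as arrays with integer keys starting at 0.

```python
def php_serialize(value) -> bytes:
    """Serialize None, bool, int, str, list/tuple and dict in PHP's format."""
    if value is None:
        return b"N;"
    # bool must be tested before int, because bool is a subclass of int
    if isinstance(value, bool):
        return b"b:1;" if value else b"b:0;"
    if isinstance(value, int):
        return b"i:%d;" % value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        # The length segment is the byte count, not the character count
        return b's:%d:"%s";' % (len(raw), raw)
    if isinstance(value, (list, tuple)):
        value = dict(enumerate(value))
    if isinstance(value, dict):
        parts = []
        for key, item in value.items():
            # PHP array keys are integers or strings
            if isinstance(key, bool) or not isinstance(key, (int, str)):
                raise TypeError(f"unsupported array key: {key!r}")
            parts.append(php_serialize(key) + php_serialize(item))
        # Arrays close with '}' and take no trailing ';'
        return b"a:%d:{%s}" % (len(value), b"".join(parts))
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(php_serialize({"type": "person"}).decode("utf-8"))
print(php_serialize("Sæffleberggate 56").decode("utf-8"))
```

Beyond the tags in the sample, PHP's format also has `N;` for null, `d:` for floats, `O:` for objects and `R:`/`r:` for references, so a hand-written reader is likely to meet those eventually. If PHP is available on either system, its own `unserialize()` is the safer way to read the data.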
