What kind of data format is this (colon and semicolon separated entries)?

Problem description

I'm integrating two poorly documented systems, and in the process I've come across a strange data format I haven't seen before. It's stored as plain text in the db with no indication as to what the format is and how to deal with it.

a:17:{s:2:"id";s:27:"145219921F990B11C39E7220000";s:16:"purchase_country";s:2:"no";s:17:"purchase_currency";s:3:"nok";s:6:"locale";s:5:"nb-no";s:6:"status";s:17:"checkout_complete";s:9:"reference";s:27:"145212221F990B11C39E7221000";s:11:"reservation";s:10:"2348226550";s:10:"started_at";s:25:"2014-04-04T10:40:55+02:00";s:12:"completed_at";s:25:"2014-04-02T10:41:11+02:00";s:16:"last_modified_at";s:25:"2014-04-02T10:41:11+02:00";s:10:"expires_at";s:25:"2014-04-16T10:41:11+02:00";s:4:"cart";a:4:{s:25:"total_price_excluding_tax";i:489500;s:16:"total_tax_amount";i:0;s:25:"total_price_including_tax";i:489500;s:5:"items";a:2:{i:0;a:10:{s:9:"reference";s:2:"68";s:4:"name";s:21:"1.OSO SUPER S 200LIT.";s:8:"quantity";i:1;s:10:"unit_price";i:695000;s:8:"tax_rate";i:0;s:13:"discount_rate";i:0;s:4:"type";s:8:"physical";s:25:"total_price_including_tax";i:695500;s:25:"total_price_excluding_tax";i:694000;s:16:"total_tax_amount";i:0;}i:1;a:10:{s:9:"reference";s:2:"68";s:4:"name";s:32:"1.OSO SUPER S 200LIT. (discount)";s:8:"quantity";i:1;s:10:"unit_price";i:-205100;s:8:"tax_rate";i:0;s:13:"discount_rate";i:0;s:4:"type";s:8:"physical";s:25:"total_price_including_tax";i:-205100;s:25:"total_price_excluding_tax";i:-205100;s:16:"total_tax_amount";i:0;}}}s:8:"customer";a:1:{s:4:"type";s:6:"person";}s:16:"shipping_address";a:8:{s:10:"given_name";s:13:"Testperson-no";s:11:"family_name";s:8:"Approved";s:14:"street_address";s:18:"Sæffleberggate 56";s:11:"postal_code";s:4:"0563";s:4:"city";s:4:"OSLO";s:7:"country";s:2:"no";s:5:"email";s:32:"omitted@testdrive.klarna.com";s:5:"phone";s:11:"40 12 34 56";}s:15:"billing_address";a:8:{s:10:"given_name";s:13:"Testperson-no";s:11:"family_name";s:8:"Approved";s:14:"street_address";s:18:"Sæffleberggate 56";s:11:"postal_code";s:4:"0563";s:4:"city";s:4:"OSLO";s:7:"country";s:2:"no";s:5:"email";s:32:"checkout-no@testdrive.klarna.com";s:5:"phone";s:11:"40 12 34 56";}s:7:"options";a:1:{s:31:"allow_separate_shipping_address";b:0;}s:8:"merchant";a:5:{s:2:"id";s:4:"1601";s:9:"terms_uri";s:95:"omitted";s:12:"checkout_uri";s:59:"omitted";s:16:"confirmation_uri";s:220:"omitted";s:8:"push_uri";s:229:"omitted";}}

An entry consists of colon-separated segments:

  • A single char type tag (array, object, int, decimal, bool, string)
  • A number that says how long the value is in characters, bytes, elements (in case of arrays) or key-value pairs (in case of objs), which seems completely useless given that this is a textual format that requires me to parse the length segment anyway. This isn't present for integers and decimals.
  • Value of the field
  • Key-value pairs seem to be a flat list of an even number of elements. They also seem to be using arrays as objects as well (see example).
  • A ; terminator, which seems not to be necessary for objects and arrays, just to make parsing more tedious.
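The length segment may be less redundant than it looks. A small Python sketch, assuming the count for strings is in bytes (the sample suggests this: "Sæffleberggate 56" is 17 characters but is declared as s:18) and that string contents are not escaped. Under those assumptions the length is what lets a reader step over quotes and semicolons inside a value. The helper name read_string is made up for illustration.

```python
def read_string(data: bytes, pos: int = 0):
    """Read one s:<len>:"<bytes>"; entry starting at pos; return (text, next_pos)."""
    if data[pos:pos + 2] != b"s:":
        raise ValueError("not a string entry")
    colon = data.index(b":", pos + 2)          # end of the length segment
    length = int(data[pos + 2:colon])          # assumed to be a byte count
    start = colon + 2                          # skip ':' and the opening quote
    raw = data[start:start + length]           # take exactly `length` bytes
    if data[start + length:start + length + 2] != b'";':
        raise ValueError("declared length does not line up with the closing quote")
    return raw.decode("utf-8"), start + length + 2


street = "Sæffleberggate 56"
print(len(street), len(street.encode("utf-8")))   # 17 18: characters vs. bytes

# The value here is a";s:1:"b and it is still read as one 9-byte string,
# because the reader trusts the length instead of searching for the next quote.
tricky = 's:9:"a";s:1:"b";'.encode("utf-8")
print(read_string(tricky))
```

Splitting on ";" or scanning for the next quote would break on the second example, which is presumably why the format carries the length at all.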

Now, parsing this thing is reasonably easy, but I'm constantly being surprised by new data types and their weird syntax and I'm not sure that I've covered all the edge cases with the few data samples I've analyzed. Is anyone familiar with this format?

Solution

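This looks like PHP serialization, i.e. the output of PHP's serialize() function, which unserialize() turns back into PHP values. The type tags are a for array, s for string, i for integer, d for double, b for boolean and N for null; objects use O. PHP arrays are ordered maps, which is why an "array" here carries key-value pairs. See: http://www.phpinternalsbook.com/classes_objects/serialization.html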
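If the data has to be read outside PHP, the types that appear in the sample can be handled in a few dozen lines. The following Python sketch is an illustration under assumptions, not a replacement for PHP's own unserialize(): it handles only N, b, i, d, s and a, ignores objects and references, treats string lengths as UTF-8 byte counts, and maps every PHP array to a dict. The function names parse and loads are made up.

```python
def parse(data: bytes, pos: int = 0):
    """Parse one serialized value starting at pos; return (value, next_pos)."""
    tag = data[pos:pos + 1]
    if tag == b"N":                                   # N;
        return None, pos + 2
    if tag in (b"b", b"i", b"d"):                     # b:0;  i:42;  d:1.5;
        end = data.index(b";", pos)
        text = data[pos + 2:end].decode("ascii")
        if tag == b"b":
            value = text == "1"
        elif tag == b"i":
            value = int(text)
        else:
            value = float(text)
        return value, end + 1
    if tag == b"s":                                   # s:<bytes>:"...";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                             # skip ':' and '"'
        return data[start:start + length].decode("utf-8"), start + length + 2
    if tag == b"a":                                   # a:<pairs>:{key value ...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                               # skip ':' and '{'
        result = {}
        for _ in range(count):
            key, pos = parse(data, pos)
            result[key], pos = parse(data, pos)
        return result, pos + 1                        # skip '}' (no ';' follows)
    raise ValueError(f"unsupported type tag {tag!r} at offset {pos}")


def loads(text: str):
    """Convenience wrapper: encode to UTF-8 and parse a single value."""
    value, _ = parse(text.encode("utf-8"))
    return value


sample = 'a:2:{s:4:"type";s:6:"person";s:5:"items";a:2:{i:0;i:695000;i:1;b:0;}}'
print(loads(sample))   # {'type': 'person', 'items': {0: 695000, 1: False}}
```

The sample from the question will not parse with this as posted, because the redacted values (for example s:95:"omitted") no longer match their declared lengths; the original, unredacted rows should.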
