Convert XML to nested JSON object in specific format
Problem description
This is a follow-up to the question I previously asked about deriving a totally flat structure out of an XML node: Converting an xml doc into a specific dot-expanded json structure.
Suppose I have the same XML to start with:
<Item ID="288917">
  <Main>
    <Platform>iTunes</Platform>
    <PlatformID>353736518</PlatformID>
  </Main>
  <Genres>
    <Genre FacebookID="6003161475030">Comedy</Genre>
    <Genre FacebookID="6003172932634">TV-Show</Genre>
  </Genres>
  <Products>
    <Product Country="CA">
      <URL>https://itunes.apple.com/ca/tv-season/id353187108?i=353736518</URL>
      <Offers>
        <Offer Type="HDBUY">
          <Price>3.49</Price>
          <Currency>CAD</Currency>
        </Offer>
        <Offer Type="SDBUY">
          <Price>2.49</Price>
          <Currency>CAD</Currency>
        </Offer>
      </Offers>
    </Product>
    <Product Country="FR">
      <URL>https://itunes.apple.com/fr/tv-season/id353187108?i=353736518</URL>
      <Rating>Tout public</Rating>
      <Offers>
        <Offer Type="HDBUY">
          <Price>2.49</Price>
          <Currency>EUR</Currency>
        </Offer>
        <Offer Type="SDBUY">
          <Price>1.99</Price>
          <Currency>EUR</Currency>
        </Offer>
      </Offers>
    </Product>
  </Products>
</Item>
Now I would like to convert it into a nested JSON object in a specific format (slightly different from what the xmltodict library produces). Here is the structure I'd like to derive:
{
    "Item[@ID]": 288917,
    "Item.Main.Platform": "iTunes",
    "Item.Main.PlatformID": "353736518",
    "Item.Genres": [
        {
            "[@FacebookID]": "6003161475030",
            "Value": "Comedy"
        },
        {
            "[@FacebookID]": "6003172932634",
            "Value": "TV-Show"
        }
    ],
    "Item.Products": [
        {
            "[@Country]": "CA",
            "URL": "https://itunes.apple.com/ca/tv-season/id353187108?i=353736518",
            "Offers.Offer": [
                {
                    "[@Type]": "HDBUY",
                    "Price": "3.49",
                    "Currency": "CAD"
                },
                {
                    "[@Type]": "SDBUY",
                    "Price": "2.49",
                    "Currency": "CAD"
                }
            ]
        },
        {
            "[@Country]": "FR",
            "URL": "https://itunes.apple.com/fr/tv-season/id353187108?i=353736518",
            "Offers.Offer": [
                {
                    "[@Type]": "HDBUY",
                    "Price": "2.49",
                    "Currency": "EUR"
                },
                {
                    "[@Type]": "SDBUY",
                    "Price": "1.99",
                    "Currency": "EUR"
                }
            ]
        }
    ]
}
The main difference is that, instead of collapsing everything into a flat list of values, this format allows lists of dicts. How could this be done?
While doing the above from scratch might be a nice challenge, xmltodict already does a great job with this and can do the job with slight alterations.
Here are the changes to make in xmltodict:
- Change var cdata_key from #text to Value.
- Change var attr_prefix from @ to [@.
- Add new var attr_suffix=']' to the init method.
- Change attr_key to key = self.attr_prefix+self._build_name(key)+self.attr_suffix.
That should give the result you're looking for. Here is a test run with the modified module:
>>> from lxml import etree
>>> import json
>>> from utils import xmltodict  # presumably the locally modified copy of xmltodict
>>> node = etree.fromstring(s)   # s holds the XML document shown above
>>> d = xmltodict.parse(etree.tostring(node))
>>> print(json.dumps(d, indent=4))
{
    "Item": {
        "[@ID]": "288917",
        "Main": {
            "Platform": "iTunes",
            "PlatformID": "353736518"
        },
        "Genres": {
            "Genre": [
                {
                    "[@FacebookID]": "6003161475030",
                    "Value": "Comedy"
                },
                {
                    "[@FacebookID]": "6003172932634",
                    "Value": "TV-Show"
                }
            ]
        },
        "Products": {
            "Product": [
                {
                    "[@Country]": "CA",
                    "URL": "https://itunes.apple.com/ca/tv-season/id353187108?i=353736518",
                    "Offers": {
                        "Offer": [
                            {
                                "[@Type]": "HDBUY",
                                "Price": "3.49",
                                "Currency": "CAD"
                            },
                            {
                                "[@Type]": "SDBUY",
                                "Price": "2.49",
                                "Currency": "CAD"
                            }
                        ]
                    }
                },
                {
                    "[@Country]": "FR",
                    "URL": "https://itunes.apple.com/fr/tv-season/id353187108?i=353736518",
                    "Rating": "Tout public",
                    "Offers": {
                        "Offer": [
                            {
                                "[@Type]": "HDBUY",
                                "Price": "2.49",
                                "Currency": "EUR"
                            },
                            {
                                "[@Type]": "SDBUY",
                                "Price": "1.99",
                                "Currency": "EUR"
                            }
                        ]
                    }
                }
            ]
        }
    }
}
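The output above is still fully nested, while the question asks for dot-joined keys. As a rough sketch of one way to close that gap without patching xmltodict, the snippet below uses only the standard library's xml.etree.ElementTree instead of xmltodict. It builds a nested dict with the same key conventions ([@name] for attributes, Value for text) and then collapses nested dicts into dotted keys while keeping lists as lists of dicts. The names element_to_dict, join_key, flatten, and SAMPLE are hypothetical and introduced only for illustration, and SAMPLE is a trimmed version of the XML from the question. Note that this yields keys such as Item.Genres.Genre, not the Item.Genres shown in the question.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Nested dict for one element: '[@name]' for attributes, 'Value' for text."""
    node = {f"[@{name}]": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: promote the existing entry to a list, then append.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        # Leaf without attributes: return the bare text (None if empty).
        return text or None
    if text:
        node["Value"] = text
    return node


def join_key(prefix, key):
    """Attach attribute keys directly (Item[@ID]); join other keys with a dot."""
    if not prefix:
        return key
    return prefix + key if key.startswith("[@") else f"{prefix}.{key}"


def flatten(node, prefix=""):
    """Collapse nested dicts into dotted keys; lists stay lists of flat dicts."""
    flat = {}
    for key, value in node.items():
        path = join_key(prefix, key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        elif isinstance(value, list):
            flat[path] = [flatten(v) if isinstance(v, dict) else v for v in value]
        else:
            flat[path] = value
    return flat


# Trimmed sample based on the XML in the question (assumption: same shape).
SAMPLE = """<Item ID="288917">
  <Main><Platform>iTunes</Platform></Main>
  <Genres>
    <Genre FacebookID="6003161475030">Comedy</Genre>
    <Genre FacebookID="6003172932634">TV-Show</Genre>
  </Genres>
</Item>"""

root = ET.fromstring(SAMPLE)
nested = {root.tag: element_to_dict(root)}
flat = flatten(nested)
print(json.dumps(flat, indent=4))
```

The same flatten function should also work on the dict returned by the modified xmltodict, since it only relies on the [@...] key prefix and on plain dicts and lists.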