Keeping attributes when converting XML to Ruby hash
Question
I have a large XML document I am looking to parse. In this document, many tags have different attributes within them. For example:
<album>
  <song-name type="published">Do Re Mi</song-name>
</album>
Currently, I am using Rails' hash-parsing library by requiring 'active_support/core_ext/hash'.
When I convert it to a hash, it drops the attributes. It returns:
{"album"=>{"song-name"=>"Do Re Mi"}}
How do I maintain those attributes, in this case, the type="published" attribute?
This seems to have been asked previously in "How can I use XML attributes when converting into a hash with from_xml?", which had no conclusive answer. That was in 2010, though, and I'm curious whether things have changed since then. Alternatively, do you know of another way of parsing this XML so that the attribute information is still included?
Recommended answer
Converting XML to a hash isn't a good solution. You're left with a hash that is more difficult to work with than the original XML. Plus, if the XML is too big, the resulting hash won't fit into memory and can't be processed, whereas the original XML could be parsed using a SAX parser.
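To illustrate the event-driven (SAX-style) approach mentioned above, here is a minimal sketch. It assumes REXML's stream parser rather than Nokogiri, only because REXML ships with Ruby; the SongListener class and the 'type' and 'name' hash keys are hypothetical names chosen for this example, not part of any library.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: builds one small hash per <song-name> element,
# keeping the "type" attribute alongside the element text.
class SongListener
  include REXML::StreamListener

  attr_reader :songs

  def initialize
    @songs = []
    @current = nil
  end

  # Called for every opening tag. to_h normalizes attrs whether the
  # parser hands over a Hash or an array of name/value pairs.
  def tag_start(name, attrs)
    return unless name == 'song-name'
    @current = { 'type' => attrs.to_h['type'], 'name' => String.new }
  end

  # Called for text nodes; only text inside <song-name> is collected.
  def text(data)
    @current['name'] << data if @current
  end

  # Called for every closing tag; stores the finished entry.
  def tag_end(name)
    return unless name == 'song-name'
    @songs << @current
    @current = nil
  end
end

xml = <<~EOT
  <xml>
    <album>
      <song-name type="published">Do Re Mi</song-name>
    </album>
    <album>
      <song-name type="unpublished">Blah blah blah</song-name>
    </album>
  </xml>
EOT

listener = SongListener.new
REXML::Document.parse_stream(xml, listener)
songs = listener.songs
```

Because the listener only holds the element it is currently reading, the whole document never has to be represented as one structure in memory. For a genuinely large file, an IO object would be passed to the parser instead of a string.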
Assuming the file isn't going to overwhelm your memory when loaded, I'd recommend using Nokogiri to parse it, doing something like:
require 'nokogiri'

class Album
  attr_reader :song_name, :song_type

  def initialize(song_name, song_type)
    @song_name = song_name
    @song_type = song_type
  end
end

xml = <<EOT
<xml>
  <album>
    <song-name type="published">Do Re Mi</song-name>
  </album>
  <album>
    <song-name type="unpublished">Blah blah blah</song-name>
  </album>
</xml>
EOT

albums = []
doc = Nokogiri::XML(xml)
doc.search('album').each do |album|
  song_name = album.at('song-name')
  albums << Album.new(
    song_name.text,
    song_name['type']
  )
end

puts albums.first.song_name
puts albums.last.song_type
Which outputs:
Do Re Mi
unpublished
The code starts by defining a suitable object to hold the data you want. Once the XML is parsed into a DOM, the code loops through all the <album> nodes, extracts the information, creates an instance of the class, and appends it to the albums array.
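If a hash is still what you need, the same walk-the-nodes idea can build one that keeps the attributes. The following is only a sketch under stated assumptions: it uses REXML (bundled with Ruby) instead of Nokogiri, the element_to_hash helper is hypothetical, and the "@attributes" and "#text" key names are arbitrary conventions chosen for this example.

```ruby
require 'rexml/document'

# Hypothetical helper: converts an element into a hash, storing its
# attributes under "@attributes" and its text under "#text".
def element_to_hash(element)
  hash = {}

  attrs = {}
  element.attributes.each { |name, value| attrs[name] = value }
  hash['@attributes'] = attrs unless attrs.empty?

  if element.has_elements?
    element.elements.each do |child|
      # Children are grouped by tag name into arrays, so repeated
      # tags are not overwritten.
      (hash[child.name] ||= []) << element_to_hash(child)
    end
  else
    hash['#text'] = element.text.to_s
  end

  hash
end

xml = '<album><song-name type="published">Do Re Mi</song-name></album>'
doc = REXML::Document.new(xml)
result = { doc.root.name => element_to_hash(doc.root) }
```

Here result["album"]["song-name"].first carries both the "published" type and the "Do Re Mi" text. Note how much deeper the structure becomes once attributes are kept, which is part of why dedicated objects tend to be easier to work with.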
After running, you'd have an array you could walk, processing each item by storing it into a database or manipulating it however you want. However, if your goal is to insert that information into a database, you'd be smarter to let the DBM read the XML and import it directly.