XML parsing of a CDATA element
Problem description
I want to parse XML that contains a CDATA element in the following format:
<showtimes><![CDATA[6:50 PM,https://www.movietickets.com/purchase.asp?afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=18:50&perfd=03012011,9:40 PM,https://www.movietickets.com/purchase.asp?afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=21:40&perfd=03012011]]> </showtimes>
Please help me find a solution.
Recommended answer
This shouldn't be a problem, e.g. with lxml:
from lxml import etree

# Named xml_text rather than input, to avoid shadowing the built-in input()
xml_text = '<showtimes><![CDATA[6:50 PM,https://www.movietickets.com/purchase.asp?afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=18:50&perfd=03012011,9:40 PM,https://www.movietickets.com/purchase.asp?afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=21:40&perfd=03012011]]> </showtimes>'
root = etree.fromstring(xml_text)
for s in root.xpath("//showtimes"):
    # The CDATA content is exposed as the element's ordinary text
    print(s.text)
...prints:
6:50 PM,https://www.movietickets.com/purchase.asp?afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=18:50&perfd=03012011,9:40 PM,https://www.movietickets.com/purchase.asp?afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=21:40&perfd=03012011
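If the goal is to get at the individual showtimes, the text can then be split on commas. The following is only a minimal sketch: it assumes the fields always alternate between a time and a URL and that neither contains a comma, which holds for the sample above but is not guaranteed by anything in the question. It uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra packages; the names `parse_showtimes` and `xml_text` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

xml_text = (
    "<showtimes><![CDATA[6:50 PM,https://www.movietickets.com/purchase.asp?"
    "afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=18:50&perfd=03012011,"
    "9:40 PM,https://www.movietickets.com/purchase.asp?"
    "afid=rgncom&house_id=6446&language=2&movie_id=87050&perft=21:40&perfd=03012011"
    "]]> </showtimes>"
)


def parse_showtimes(xml_text):
    """Return a list of (time, url) pairs from a <showtimes> element."""
    root = ET.fromstring(xml_text)
    # The parser folds the CDATA section into the element's text;
    # strip() removes the stray space that follows "]]>".
    fields = root.text.strip().split(",")
    # Assumption: fields alternate time, URL, time, URL, ...
    return list(zip(fields[0::2], fields[1::2]))


showtimes = parse_showtimes(xml_text)
for time, url in showtimes:
    print(time, url)
```

With lxml the same approach should work unchanged after swapping the import, since `etree.fromstring` and `.text` behave the same way for this input.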