Parsing Alexa XML with Python
Question
I have a question very similar to "python alexa result parsing with lxml.etree".
I'm wondering how to parse the second DataUrl. That is, I want the DataUrl element under TrafficData, not the one under ContentData (people.com, not google.com).
I'm also using lxml, with exactly the same data as described in that question.
Here is the XML:
<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:OperationRequest>
<aws:RequestId>ccf3f263-ab76-ab63-db99-244666044e85</aws:RequestId>
</aws:OperationRequest>
<aws:UrlInfoResult>
<aws:Alexa>
<aws:ContentData>
<aws:DataUrl type="canonical">google.com/</aws:DataUrl>
<aws:SiteData>
<aws:Title>Google</aws:Title>
<aws:Description>Enables users to search the world's information, including webpages, images, and videos. Offers unique features and search technology.</aws:Description>
<aws:OnlineSince>15-Sep-1997</aws:OnlineSince>
</aws:SiteData>
<aws:LinksInCount>3453627</aws:LinksInCount>
</aws:ContentData>
<aws:TrafficData>
<aws:DataUrl type="canonical">people.com/</aws:DataUrl>
<aws:Rank>1</aws:Rank>
</aws:TrafficData>
</aws:Alexa>
</aws:UrlInfoResult>
<aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:StatusCode>Success</aws:StatusCode>
</aws:ResponseStatus>
</aws:Response>
</aws:UrlInfoResponse>
Recommended answer
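I needed to do the following. The aws:Response element rebinds the aws prefix to http://awis.amazonaws.com/doc/2005-07-11, so TrafficData lives in that namespace and the XPath query has to map aws to that URI: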
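# doc is assumed to be the already parsed response (for example, the result of lxml.etree.fromstring)
namespaces = {"aws": "http://awis.amazonaws.com/doc/2005-07-11"}
# Select only the DataUrl that is a direct child of TrafficData
texts = doc.xpath("//aws:TrafficData/aws:DataUrl/text()", namespaces=namespaces)
print(texts[0])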
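For anyone who wants to try the lookup without installing lxml, here is a minimal, self-contained sketch of the same idea using the standard library's xml.etree.ElementTree instead. It is not the lxml call from the answer: it swaps in ElementTree's findtext, which also accepts a namespaces mapping. The XML is a trimmed copy of the response above, and the helper name data_url and the constant AWIS_NS are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the question (same namespaces, fewer fields).
XML = """<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
  <aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
    <aws:UrlInfoResult>
      <aws:Alexa>
        <aws:ContentData>
          <aws:DataUrl type="canonical">google.com/</aws:DataUrl>
        </aws:ContentData>
        <aws:TrafficData>
          <aws:DataUrl type="canonical">people.com/</aws:DataUrl>
          <aws:Rank>1</aws:Rank>
        </aws:TrafficData>
      </aws:Alexa>
    </aws:UrlInfoResult>
  </aws:Response>
</aws:UrlInfoResponse>"""

# The prefix used in the query is local to this mapping; what must match the
# document is the URI. Everything below aws:Response uses the awis URI.
AWIS_NS = {"aws": "http://awis.amazonaws.com/doc/2005-07-11"}


def data_url(root, section):
    """Return the DataUrl text under the given section, or None if absent.

    Hypothetical helper: 'section' is an element name such as 'TrafficData'.
    """
    return root.findtext(f".//aws:{section}/aws:DataUrl", namespaces=AWIS_NS)


root = ET.fromstring(XML)
traffic_url = data_url(root, "TrafficData")
content_url = data_url(root, "ContentData")
print(traffic_url)
print(content_url)
```

Naming the parent element in the path is what separates the two DataUrl values. Mapping aws to the outer http://alexa.amazonaws.com/doc/2005-10-05/ URI would not match TrafficData, because that element sits inside aws:Response, where the prefix is bound to the awis URI.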