lxml attributes require full namespace

Views: 72 | Published: 2020/5/4 8:34:24 | Tags: python xml excel lxml

Problem description

The code below reads a table from an Excel 2003 XML workbook using lxml (Python 3.3). The code works fine; however, in order to access the Type attribute of the Data element via the get() method, I need to use the key '{urn:schemas-microsoft-com:office:spreadsheet}Type'. Why is this, given that I have specified this namespace with the ss prefix?

All I can think of is that this namespace is declared twice in the document, once with a namespace prefix and once without one (as the default namespace), i.e.:

<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">

In the file, the elements and the attribute appear as shown below: the Type attribute carries the ss: prefix, while the Cell and Data elements have no prefix. However, the declarations say that both belong to the same namespace, 'urn:schemas-microsoft-com:office:spreadsheet', so surely the parser should treat them equivalently?

<Cell><Data ss:Type="String">QB11128020</Data></Cell>
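One way to check how the parser actually treats these names is to inspect what lxml stores after parsing. The sketch below is only an illustration under stated assumptions: the document is a cut-down stand-in for the workbook, and the unprefixed `Plain` attribute is hypothetical, added to contrast with `ss:Type`. It relies on two facts: lxml keeps every namespaced name in `{uri}local` form, and under the XML namespaces rules a default `xmlns="..."` declaration applies to unprefixed elements but not to unprefixed attributes.

```python
from lxml import etree

SS = "urn:schemas-microsoft-com:office:spreadsheet"

# Cut-down stand-in for the workbook: the same URI is declared both as the
# default namespace and under the ss prefix. "Plain" is a hypothetical
# unprefixed attribute used only for comparison.
xml = (
    b'<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"'
    b' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">'
    b'<Cell><Data ss:Type="String" Plain="x">QB11128020</Data></Cell>'
    b'</Workbook>'
)
root = etree.fromstring(xml)
data = root[0][0]  # Workbook -> Cell -> Data

print(data.tag)     # element name, qualified through the default namespace
print(data.keys())  # attribute names exactly as lxml stores them

# get() looks up the stored name literally; prefixes are not involved.
print(data.get("Type"))            # None: there is no un-namespaced Type
print(data.get("{%s}Type" % SS))   # String
print(data.get("Plain"))           # x: unprefixed attributes have no namespace
```

Seen this way, the element and the attribute do resolve to the same namespace URI. The full key is needed because `get()` matches the stored `{uri}local` name, and the prefix map passed to `xpath()` is never consulted by `get()`.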

My code:

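from lxml import etree

# filename: path to the Excel 2003 XML workbook (set elsewhere).
# Binary mode lets the parser take the encoding from the XML declaration.
with open(filename, 'rb') as f:
    doc = etree.parse(f)

# Prefix-to-URI map; it is only consulted by the xpath() calls below.
namespaces = {'o': 'urn:schemas-microsoft-com:office:office',
              'x': 'urn:schemas-microsoft-com:office:excel',
              'ss': 'urn:schemas-microsoft-com:office:spreadsheet'}

ws = doc.xpath('/ss:Workbook/ss:Worksheet', namespaces=namespaces)
if ws:
    # Only the first table of the first worksheet is read.
    tables = ws[0].xpath('./ss:Table', namespaces=namespaces)
    if tables:
        rows = tables[0].xpath('./ss:Row', namespaces=namespaces)
        for row in rows:
            cells = row.xpath('./ss:Cell/ss:Data', namespaces=namespaces)
            for cell in cells:
                print(cell.text)
                print(cell.keys())
                # get() needs the fully qualified attribute name.
                print(cell.get('{urn:schemas-microsoft-com:office:spreadsheet}Type'))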
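If the goal is only to avoid typing the full URI at each call, two options seem workable. This is a minimal sketch, not a definitive fix: `ss_attr` is a hypothetical helper name, and the inline document is a cut-down stand-in for the real workbook. The first option builds the `{uri}local` key from the existing prefix map with `etree.QName`; the second reads the attribute through XPath, where the `ss` prefix from the map does apply.

```python
from lxml import etree

namespaces = {"ss": "urn:schemas-microsoft-com:office:spreadsheet"}


def ss_attr(name):
    """Hypothetical helper: return the '{uri}name' key for an ss attribute."""
    return etree.QName(namespaces["ss"], name).text


# Cut-down stand-in for the workbook, used only to make the sketch runnable.
xml = (
    b'<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"'
    b' xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">'
    b'<Cell><Data ss:Type="String">QB11128020</Data></Cell>'
    b'</Workbook>'
)
root = etree.fromstring(xml)
cell = root.xpath("./ss:Cell/ss:Data", namespaces=namespaces)[0]

# Option 1: build the qualified key once and pass it to get().
type_via_get = cell.get(ss_attr("Type"))

# Option 2: let XPath resolve the prefix; string() yields the attribute value.
type_via_xpath = cell.xpath("string(@ss:Type)", namespaces=namespaces)

print(type_via_get, type_via_xpath)
```

Both lookups return the same value. The XPath form keeps prefixes in one place, while the helper form keeps the plain `get()` call used in the original loop.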
