Writing a custom XML file for the Wordpress Importer using lxml


Question

Okay, so here is my current situation:

My knowledge of XML and lxml isn't very good yet, since I have rarely used XML files until now. So please tell me if something in my approach is really stupid. ;-)

I want to feed my Wordpress installation a custom XML file, using the Wordpress importer. The default format can be seen here: XML File

Some of the tags in it look like this:

<wp:author>

I am not a hundred percent sure, but as far as I learned today, the wp: part of the tag is the namespace prefix.

When I tried to use lxml to create those tags, I did this:

author = etree.Element("wp:author")

This caused an error, because I am not allowed to write wp:author, only author. I searched Google, looked at the lxml website, and came up with this:

from lxml.builder import ElementMaker

WP = ElementMaker(namespace="http://wordpress.org/export/1.2/",
                  nsmap={'wp': "http://wordpress.org/export/1.2/"})
author = WP("author")

Output:

<wp:author xmlns:wp="http://wordpress.org/export/1.2/"/>

Well, better. The xmlns:wp attribute belongs to the namespace stuff, as I learned today. But I don't want xmlns:wp to appear on the tag, because it doesn't appear there in their XML file. I looked up how Wordpress itself exports its content, and it does it like this:

echo '<wp:author_id>' . $author->ID . '</wp:author_id>';

Now my question: is it better to do it the same way they do, or should I stick with lxml, provided there is a way to get a tag without the xmlns:wp stuff? Using lxml to create XML files seems to be the better approach, because it is (normally) pretty easy and the code is easier to read.

I already tried objectify.deannotate, cleanup_namespaces and similar suggestions, but none of them work. I hope some of you have an answer, either a solution to my problem using lxml or advice that it is better to do it the way the Wordpress people do!

If I have overlooked a similar question that has already been answered, I am really sorry; please tell me so.

Thank you, Vaelor

Solution

Here is my advice: take a step back from lxml and consider Python's built-in support for XML processing, a module called xml.etree.ElementTree. Import it in a REPL like this:

import xml.etree.ElementTree as ET

and play with it for a while. The Python documentation for the module is good: http://goo.gl/8FVto

Building an element is as simple as this:

a = ET.Element('wp:author')
ET.dump(a)

Then add some sub-elements. It's all in the docs.
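
For the sub-elements step, here is a minimal sketch using only the standard library. It departs slightly from the literal 'wp:author' tag above: it assumes you register the wp prefix with ET.register_namespace and use namespace-qualified tag names, so that xmlns:wp is declared once on the root rather than on every tag. The rss/channel layout, the helper names wp and build_export, and the author id value are assumptions made for illustration, not taken from the question.

```python
import xml.etree.ElementTree as ET

WP_NS = "http://wordpress.org/export/1.2/"  # namespace URI from the question

# Map the URI to the "wp" prefix for serialization (a process-wide setting).
ET.register_namespace("wp", WP_NS)


def wp(tag):
    """Return the qualified name ElementTree expects, e.g. {uri}author."""
    return "{%s}%s" % (WP_NS, tag)


def build_export(author_id):
    # Assumed layout: an <rss> root with a <channel> child.
    root = ET.Element("rss", version="2.0")
    channel = ET.SubElement(root, "channel")
    author = ET.SubElement(channel, wp("author"))
    # Sub-elements are added the same way and may carry text.
    ET.SubElement(author, wp("author_id")).text = str(author_id)
    return root


xml_text = ET.tostring(build_export(1), encoding="unicode")
print(xml_text)
```

When the whole tree is serialized from the root, the namespace declaration should appear only on the rss element, and the nested elements come out as plain wp:author and wp:author_id tags. Serializing a wp: element on its own would put the declaration back on that element, since a prefix has to be declared somewhere in the output.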
