os.walk() in Python: XML representation of a directory structure, recursion
Problem description
I am trying to use os.walk() to generate an XML representation of a directory structure, but I am getting a lot of duplicate entries. The first portion of the XML file is correct: directories are nested properly and files appear in the right place. After that, however, the traversal continues and emits the same directories again at the wrong level. I am not sure why.
Here is my code:
def dirToXML(self,directory):
    curdir = os.getcwd()
    os.chdir(directory)
    xmlOutput=""
    tree = os.walk(directory)
    for root, dirs, files in tree:
        pathName = string.split(directory, os.sep)
        xmlOutput+="<dir><name><![CDATA["+pathName.pop()+"]]></name>"
        if len(files)>0:
            xmlOutput+=self.fileToXML(files)
        for subdir in dirs:
            xmlOutput+=self.dirToXML(os.path.join(root,subdir))
        xmlOutput+="</dir>"
    os.chdir(curdir)
    return xmlOutput
fileToXML simply turns the list of files into XML, so there is no need to worry about that part.
The directory structure is simple:
images/
images/testing.xml
images/structure.xml
images/Hellos
images/Goodbyes
images/Goodbyes/foo
images/Goodbyes/bar
images/Goodbyes/square
and the resulting XML file became:
<structure>
  <dir>
    <name>images</name>
    <files>
      <file>
        <name>structure.xml</name>
      </file>
      <file>
        <name>testing.xml</name>
      </file>
    </files>
    <dir>
      <name>Hellos</name>
    </dir>
    <dir>
      <name>Goodbyes</name>
      <dir>
        <name>foo</name>
      </dir>
      <dir>
        <name>bar</name>
      </dir>
      <dir>
        <name>square</name>
      </dir>
    </dir>
    <dir>
      <name>foo</name>
    </dir>
    <dir>
      <name>bar</name>
    </dir>
    <dir>
      <name>square</name>
    </dir>
  </dir>
  <dir>
    <name>Hellos</name>
  </dir>
  <dir>
    <name>Goodbyes</name>
    <dir>
      <name>foo</name>
    </dir>
    <dir>
      <name>bar</name>
    </dir>
    <dir>
      <name>square</name>
    </dir>
  </dir>
  <dir>
    <name>foo</name>
  </dir>
  <dir>
    <name>bar</name>
  </dir>
  <dir>
    <name>square</name>
  </dir>
</structure>
Any help would be greatly appreciated!
Recommended answer
I'd recommend against using os.walk(), since you have to do so much to massage its output. Instead, just use a recursive function that uses os.listdir(), os.path.join(), os.path.isdir(), etc.
import os
from xml.sax.saxutils import escape as xml_escape

def DirAsXML(path):
    # Emit one <dir> element for path, then recurse into its subdirectories.
    result = '<dir>\n  <name>%s</name>\n' % xml_escape(os.path.basename(path))
    dirs = []
    files = []
    for item in os.listdir(path):
        itempath = os.path.join(path, item)
        if os.path.isdir(itempath):
            dirs.append(item)
        elif os.path.isfile(itempath):
            files.append(item)
    if files:
        result += '  <files>\n' \
            + '\n'.join('    <file>\n      <name>%s</name>\n    </file>'
                        % xml_escape(f) for f in files) + '\n  </files>\n'
    if dirs:
        for d in dirs:
            x = DirAsXML(os.path.join(path, d))
            # Indent the child's lines; the trailing newline keeps the next tag on its own line.
            result += '\n'.join('  ' + line for line in x.split('\n')) + '\n'
    result += '</dir>'
    return result

if __name__ == '__main__':
    # Parenthesized print works in both Python 2 and Python 3.
    print('<structure>\n' + DirAsXML(os.getcwd()) + '\n</structure>')
Personally, I'd recommend a much less verbose XML schema, putting names in attributes and getting rid of the <files> group:
import os
from xml.sax.saxutils import quoteattr as xml_quoteattr

def DirAsLessXML(path):
    # quoteattr() escapes the name and adds the surrounding quotes.
    result = '<dir name=%s>\n' % xml_quoteattr(os.path.basename(path))
    for item in os.listdir(path):
        itempath = os.path.join(path, item)
        if os.path.isdir(itempath):
            # Indent the child's lines; the trailing newline keeps the next tag on its own line.
            result += '\n'.join('  ' + line for line in
                                DirAsLessXML(os.path.join(path, item)).split('\n')) + '\n'
        elif os.path.isfile(itempath):
            result += '  <file name=%s />\n' % xml_quoteattr(item)
    result += '</dir>'
    return result

if __name__ == '__main__':
    # Parenthesized print works in both Python 2 and Python 3.
    print('<structure>\n' + DirAsLessXML(os.getcwd()) + '\n</structure>')
The output looks like this:
<structure>
<dir name="local">
  <dir name=".hg">
    <file name="00changelog.i" />
    <file name="branch" />
    <file name="branch.cache" />
    <file name="dirstate" />
    <file name="hgrc" />
    <file name="requires" />
    <dir name="store">
      <file name="00changelog.i" />
etc.
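As a hedged alternative, the same attribute-based layout could also be built with xml.etree.ElementTree from the standard library, which escapes attribute values for you. This is only a sketch assuming Python 3; the name dir_to_element and the sorted ordering are illustrative choices, not part of the answer above.

```python
import os
import xml.etree.ElementTree as ET


def dir_to_element(path):
    """Build a <dir> element for path, recursing into subdirectories.

    Hypothetical helper: the name and the sorted ordering are assumptions.
    """
    # normpath drops a trailing separator so basename() is not empty.
    elem = ET.Element("dir", name=os.path.basename(os.path.normpath(path)))
    for item in sorted(os.listdir(path)):
        itempath = os.path.join(path, item)
        if os.path.isdir(itempath):
            # A subdirectory becomes a nested <dir> element.
            elem.append(dir_to_element(itempath))
        elif os.path.isfile(itempath):
            # A regular file becomes an empty <file name="..."/> element.
            ET.SubElement(elem, "file", name=item)
    return elem
```

To get a string, wrap the result in a <structure> element and pass it to ET.tostring(..., encoding="unicode").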
If os.walk() worked more like expat's callbacks, you'd have an easier time of it.
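For what it is worth, os.walk() already descends into every subdirectory by itself and yields each one exactly once as a flat (root, dirs, files) tuple. Calling the function recursively from inside that loop, as the code in the question does, therefore appears to be what produces the repeated entries. The sketch below shows one way to massage that flat output into a tree with a single pass and no recursion. It assumes Python 3 and xml.etree.ElementTree; walk_to_element and the elements dictionary are hypothetical names.

```python
import os
import xml.etree.ElementTree as ET


def walk_to_element(top):
    """Build nested <dir> elements from one os.walk() pass (no recursion)."""
    top = os.path.normpath(top)
    root_elem = ET.Element("dir", name=os.path.basename(top))
    # Map each directory path to its element so later tuples can find it.
    elements = {top: root_elem}
    for root, dirs, files in os.walk(top):
        parent = elements[root]
        for f in sorted(files):
            ET.SubElement(parent, "file", name=f)
        # Sorting in place also fixes the order in which os.walk() descends.
        dirs.sort()
        for d in dirs:
            # Register the child now; os.walk() yields its contents later.
            elements[os.path.join(root, d)] = ET.SubElement(parent, "dir", name=d)
    return root_elem
```

The dictionary lookup is the extra bookkeeping the answer alludes to: os.walk() reports directories in a flat sequence, so the code has to remember which element belongs to which path.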