OSError: [Errno 36] File name too long:
I need to convert a web page to XML (using Python 3.4.3). If I write the contents of the URL to a file, then I can read and parse it perfectly, but if I try to read directly from the web page I get the following error in my terminal:
  File "./AnimeXML.py", line 22, in <module>
    xml = ElementTree.parse (xmlData)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/xml/etree/ElementTree.py", line 1187, in parse
    tree.parse(source, parser)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/xml/etree/ElementTree.py", line 587, in parse
    source = open(source, "rb")
OSError: [Errno 36] File name too long:
My Python code:
# AnimeXML.py
#! /usr/bin/Python
# Import xml parser.
import xml.etree.ElementTree as ElementTree
# Import urlopen to fetch the URL.
from urllib.request import urlopen
# XML to parse.
sampleUrl = "http://cdn.animenewsnetwork.com/encyclopedia/api.xml?anime=16989"
# Read the xml as a file.
content = urlopen (sampleUrl)
# XML content is stored here to start working on it.
xmlData = content.readall().decode('utf-8')
# Close the file.
content.close()
# Start parsing XML.
xml = ElementTree.parse (xmlData)
# Get root of the XML file.
root = xml.getroot()
for info in root.iter("info"):
    print (info.attrib)
Is there any way I can fix my code so that I can read the web page directly into Python without getting this error?
As explained in the Parsing XML section of the ElementTree docs:
We can import this data by reading from a file:
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
Or directly from a string:
root = ET.fromstring(country_data_as_string)
You're passing the whole XML contents as a giant pathname. Your XML file is probably bigger than 2K, or whatever the maximum pathname size is for your platform, hence the error. If it weren't, you'd just get a different error about there being no directory named [everything up to the first / in your XML file].
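A small sketch of that failure mode, using a made-up XML string rather than the real API response. The helper name try_parse_as_path is hypothetical. ElementTree.parse treats any str argument as a path and tries to open it; which errno comes back depends on the platform and on the length of the string, so the sketch only reports it instead of assuming a value.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the downloaded XML text (invented data).
xml_text = "<ann><anime id='1'><info type='Main title'>Example</info></anime></ann>"


def try_parse_as_path(text):
    """Pass XML text to ET.parse() and return the OSError it raises, if any."""
    try:
        ET.parse(text)  # a str argument is interpreted as a filename
    except OSError as exc:
        return exc
    return None


# Short text: the "path" simply does not exist.
short_error = try_parse_as_path(xml_text)
# Much longer text: likely to exceed the platform's path length limit.
long_error = try_parse_as_path(xml_text * 200)

print(type(short_error).__name__, short_error.errno)
print(type(long_error).__name__, long_error.errno)
```

Either way the XML is never parsed; the failure happens in open() before the parser sees any data.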
Just use fromstring instead of parse.
Or, notice that parse can take a file object, not just a filename. And the thing returned by urlopen is a file object.
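To illustrate the file-object route, here is a minimal sketch in which an io.BytesIO buffer holding invented XML stands in for the response object, so it runs without network access. The helper names root_from_file_object and root_from_url are hypothetical, and root_from_url is defined but not called here.

```python
import io
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def root_from_file_object(fileobj):
    """Parse XML from any object with a read() method and return the root Element."""
    return ET.parse(fileobj).getroot()


def root_from_url(url):
    """Same idea for a URL: hand the open response straight to parse()."""
    with urlopen(url) as response:  # the response object is file-like
        return root_from_file_object(response)


# Offline stand-in for a response body (invented sample data).
fake_response = io.BytesIO(
    b"<ann><anime id='1'><info type='Genres'>drama</info></anime></ann>"
)
root = root_from_file_object(fake_response)
for info in root.iter("info"):
    print(info.attrib)
```

With this approach there is no intermediate string at all, so the decode step from the question is not needed.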
Also notice the very next line in that section:
fromstring() parses XML from a string directly into an Element, which is the root element of the parsed tree. Other parsing functions may create an ElementTree.
So, you don't want that root = tree.getroot() either.
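The difference in return types can be checked directly. This is a small sketch on an invented sample document, not on the question's data:

```python
import io
import xml.etree.ElementTree as ET

sample = "<data><country name='Liechtenstein'/></data>"  # invented sample

# fromstring() hands back the root Element itself.
root_from_string = ET.fromstring(sample)

# parse() hands back an ElementTree wrapper; the root needs getroot().
tree_from_file = ET.parse(io.BytesIO(sample.encode("utf-8")))

print(isinstance(root_from_string, ET.Element))
print(isinstance(tree_from_file, ET.ElementTree))
print(tree_from_file.getroot().tag == root_from_string.tag)
```

Calling getroot() on the result of fromstring() would fail, because an Element has no such method.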
So:
# ...
content.close()
root = ElementTree.fromstring(xmlData)
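Putting the pieces together, one possible shape for the corrected script is sketched below. The function names info_attributes and fetch_info_attributes are hypothetical. The sketch uses read() on the response, which is the documented method, and it assumes the response is UTF-8, as the original code does. The network call is left in a comment so the block can be run offline.

```python
import xml.etree.ElementTree as ElementTree
from urllib.request import urlopen


def info_attributes(xml_text):
    """Return the attribute dict of every <info> element in the XML text."""
    root = ElementTree.fromstring(xml_text)  # root Element, no getroot() needed
    return [info.attrib for info in root.iter("info")]


def fetch_info_attributes(url):
    """Download the XML at url and return its <info> attributes."""
    with urlopen(url) as content:
        xml_data = content.read().decode("utf-8")  # assumes a UTF-8 response
    return info_attributes(xml_data)


# Usage with the URL from the question (requires network access):
# for attrib in fetch_info_attributes(
#         "http://cdn.animenewsnetwork.com/encyclopedia/api.xml?anime=16989"):
#     print(attrib)
```

Separating the parsing from the download keeps the parsing step easy to check against a local string.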