'ascii' codec can't decode byte 0xcb while doing bs

Views: 166 | Posted: 2020/9/20 8:20:34 | Tags: python xml file encoding beautifulsoup


Problem description

I save an XML page locally from a Merriam-Webster API. Here is the URL: http://www.dictionaryapi.com/api/v1/references/collegiate/xml/apple?key=bf534d02-bf4e-49bc-b43f-37f68a0bf4fd

That was an example. I fetch it from the URL with urlretrieve and save it as an XML file.

Now I want to open it, but a UnicodeDecodeError occurs.

This is what I did:

page = open('test.xml')
bs = BeautifulSoup(page)

Then the following error occurred:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xcb

I tried passing the filename as a unicode string, u'test.xml', but it didn't work.

>>> sys.getdefaultencoding()
'utf-8'

The default encoding is already utf-8, but that doesn't solve the problem. Thanks for any advice anyway.

Recommended answer

You need to specify the encoding as utf-8, which is how the data is encoded. The filename has nothing to do with the file's contents, so prefixing it with u to make a unicode string is not going to help:

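import io

from bs4 import BeautifulSoup  # assumption: Beautiful Soup 4 (the bs4 package)

# io.open decodes the file as UTF-8 instead of relying on a default codec
with io.open('test.xml', encoding="utf-8") as page:
    bs = BeautifulSoup(page)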
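
To see why the explicit encoding matters, here is a minimal, self-contained sketch that reproduces the failure without Beautiful Soup. The sample string is hypothetical, not taken from the actual API response: it uses the stress mark U+02C8, which UTF-8 encodes as the bytes 0xCB 0x88. That is one plausible source of the 0xcb byte in dictionary pronunciation data, though the real file may contain other characters.

```python
import io
import os
import tempfile

# Hypothetical sample: U+02C8 (a pronunciation stress mark) becomes
# the two bytes 0xCB 0x88 when encoded as UTF-8.
sample = u'<pr>\u02c8a-p\u0259l</pr>'
raw = sample.encode('utf-8')

# Decoding those bytes as ASCII raises the same kind of error as in the question.
try:
    raw.decode('ascii')
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Write the bytes to a temporary file, then read them back as UTF-8 text.
fd, path = tempfile.mkstemp(suffix='.xml')
try:
    with os.fdopen(fd, 'wb') as out:
        out.write(raw)
    with io.open(path, encoding='utf-8') as page:
        text = page.read()
finally:
    os.remove(path)

print(ascii_failed)    # True: ASCII cannot represent byte 0xcb
print(text == sample)  # True: the UTF-8 round trip preserves the text
```

The point of the sketch is that the codec used to read the file is chosen when the file is opened. It is separate from sys.getdefaultencoding(), which is why that setting already being 'utf-8' did not help.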
