How to handle response encoding from urllib.request.urlopen() to avoid TypeError: can't use a string pattern on a bytes-like object

Views: 77 | Published: 2021/5/4 19:14:45 | Tags: python, regex, encoding, urllib


Problem description

I'm trying to open a webpage using urllib.request.urlopen() and then search it with regular expressions, but that gives the following error:

TypeError: can't use a string pattern on a bytes-like object
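
A minimal reproduction of the error might look like the sketch below. To keep it runnable without a network connection, it assumes a hard-coded bytes value (the name `body` is hypothetical) standing in for what `urlopen(...).read()` would return.

```python
import re

# Stand-in for the bytes that urlopen(url).read() would return (assumption).
body = b"<html><title>Example</title></html>"

try:
    # A str pattern applied to a bytes object raises TypeError.
    re.search(r"<title>(.*?)</title>", body)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

The exact wording of the message varies slightly between Python versions, but it always names the mismatch between a string pattern and a bytes-like object.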

I understand why: urllib.request.urlopen() returns a byte stream, so re doesn't know which encoding to use. What am I supposed to do in this situation? Is there a way to specify the encoding in the request, or do I need to decode the bytes myself? If so, I assume I should read the encoding from the response headers, or from the encoding declared in the HTML if there is one, and then decode using that?

Recommended answer

You just need to decode the response bytes, using the charset named in the Content-Type header (typically the last value in that header). There is an example given in the tutorial too.

output = response.read().decode('utf-8')  # read() returns bytes; decode() turns them into str
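
Hard-coding 'utf-8' works only when the server actually sends UTF-8. A slightly more general sketch is shown below: it reads the charset from the Content-Type header and falls back to UTF-8 when none is given. The helper names `decode_body` and `fetch_text` and the UTF-8 fallback are assumptions made for illustration, not part of the original answer; the parsing relies on the standard library's `email.message.Message.get_content_charset()`.

```python
import re
from email.message import Message
from urllib.request import urlopen


def decode_body(raw, content_type, default="utf-8"):
    """Decode raw bytes using the charset named in a Content-Type value.

    Falls back to `default` (an assumption) when no charset is declared.
    """
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or default
    # errors="replace" avoids UnicodeDecodeError on stray bytes.
    return raw.decode(charset, errors="replace")


def fetch_text(url):
    """Hypothetical helper: fetch a URL and return its body as str."""
    with urlopen(url) as response:
        raw = response.read()
        content_type = response.headers.get("Content-Type")
    return decode_body(raw, content_type)


# Offline demonstration with bytes standing in for a real response body.
raw = "<title>Caf\u00e9</title>".encode("iso-8859-1")
text = decode_body(raw, "text/html; charset=ISO-8859-1")
match = re.search(r"<title>(.*?)</title>", text)
print(match.group(1))
```

Once the body is a str, ordinary string patterns work. If decoding is not needed at all, a bytes pattern such as rb"<title>(.*?)</title>" can be applied to the raw bytes directly instead.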
