python - How to handle response encoding from urllib.request.urlopen() to avoid TypeError: can't use a string pattern on a bytes-like object

coder 2023-05-22 Original post

I'm trying to open a web page with urllib.request.urlopen() and then search it with a regular expression, but that produces the following error:

TypeError: can't use a string pattern on a bytes-like object

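I understand why: urllib.request.urlopen() returns a byte stream, so re doesn't know which encoding to use. What should I do in this situation? Is there a way to specify the encoding in the request, or do I need to decode the bytes myself? If so, what exactly should I do? I assume I should read the encoding from the header information, or from the encoding declared in the HTML if there is one, and then decode using that?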
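To make the mismatch concrete, here is a minimal offline sketch. The `raw` value is a made-up stand-in for what `resource.read()` would return, and the `<title>` pattern is only an illustration. It shows the failing call and two ways around it: a bytes pattern, or decoding first (UTF-8 is assumed here).

```python
import re

# Hypothetical stand-in for the bytes returned by resource.read()
raw = b"<title>Example</title>"

# A str pattern applied to bytes raises the TypeError from the question.
try:
    re.search(r"<title>(.*?)</title>", raw)
    raised = False
except TypeError:
    raised = True

# Option 1: use a bytes pattern, so no decoding is needed.
bytes_match = re.search(rb"<title>(.*?)</title>", raw)

# Option 2: decode first (assuming UTF-8), then use an ordinary str pattern.
text_match = re.search(r"<title>(.*?)</title>", raw.decode("utf-8"))
```

A bytes pattern is enough when you only need ASCII markers; decoding is the better fit when you want to work with the text itself.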

Best answer

For me, the solution was the following (Python 3):

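import urllib.request
resource = urllib.request.urlopen(an_url)
# get_content_charset() returns None when the header declares no charset; falling back to UTF-8 is an assumption
content = resource.read().decode(resource.headers.get_content_charset() or "utf-8")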
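The same idea can be wrapped in a small helper. This is only a sketch: `decode_body` and `fetch_text` are hypothetical names, the UTF-8 default and `errors="replace"` are assumptions, and the demo builds the headers by hand with `email.message.Message` (the base class of the headers object that `urlopen()` responses expose) so it runs without a network connection.

```python
import re
import urllib.request
from email.message import Message


def decode_body(raw, headers, default="utf-8"):
    """Decode raw bytes using the charset from the Content-Type header.

    Falls back to `default` (an assumption) when no charset is declared.
    """
    charset = headers.get_content_charset() or default
    return raw.decode(charset, errors="replace")


def fetch_text(url):
    """Fetch a URL and return its body as str (performs a network request)."""
    with urllib.request.urlopen(url) as resource:
        return decode_body(resource.read(), resource.headers)


# Offline demo with hand-built headers instead of a real response.
headers = Message()
headers["Content-Type"] = "text/html; charset=iso-8859-1"
body = "<title>café</title>".encode("iso-8859-1")

text = decode_body(body, headers)
title = re.search(r"<title>(.*?)</title>", text).group(1)
```

Once the body is a str, ordinary string patterns work. Note that this only looks at the HTTP header; a charset declared only inside the HTML itself is not detected by this sketch.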

A similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4981977/
