How to change the encoding for a Python list?

Views: 275 | Published: 2020/9/20 8:08:02 | Tags: python, python-2.7, web-scraping, character-encoding, beautifulsoup

Question

I use the following code to scrape a table from a Chinese website. It works fine. But it seems that the contents I stored in the list are not shown properly.

import requests
from bs4 import BeautifulSoup
import pandas as pd

x = requests.get('http://www.sohu.com/a/79780904_126549')
bs = BeautifulSoup(x.text,'lxml')

clg_list = []

# Collect the text of every table cell on the page
for tr in bs.find_all('tr'):
    for td in tr.find_all('td'):
        clg_list.append(td.text)
        print(td.text)

When I print each cell's text, it shows Chinese characters. But when I print the list, it shows escape sequences such as \u4e00\u671f\uff0834\u6240\uff09 instead. I am not sure whether I should change the encoding or whether something else is wrong.

Recommended answer

Nothing is wrong in this case.

When you print a Python list, Python calls repr on each of the list's elements. In Python 2, the repr of a unicode string renders every non-ASCII character as a \uXXXX escape of its code point, so the list shows escapes rather than the characters themselves.

>>> c = clg_list[0]
>>> c # Ask the interpreter to display the repr of c
u'\u201c985\u201d\u5de5\u7a0b\u5927\u5b66\u540d\u5355\uff08\u622a\u6b62\u52302011\u5e743\u670831\u65e5\uff09'
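
To confirm that the escaped form and the readable form are the same data, a minimal sketch like the following can help. The sample value is copied from the output quoted in the question; the variable names are illustrative and not part of the original code.

```python
# -*- coding: utf-8 -*-
# Sample value taken from the question's printed list (illustrative only).
escaped = u'\u4e00\u671f\uff0834\u6240\uff09'
literal = u'一期（34所）'

# Both literals describe the same sequence of code points,
# so the stored string is intact; only its display differs.
same = (escaped == literal)
length = len(escaped)

print(same)
print(length)
```

If the comparison holds, the list contents are stored correctly and no re-encoding is needed.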

However, if you print the string itself, Python encodes the unicode string with the output's text encoding (for example, UTF-8) and your terminal displays the characters that those bytes represent.

>>> print c
“985”工程大学名单（截止到2011年3月31日）

Note that in Python 3, printing the list shows the Chinese characters as you expect, because the repr of a Python 3 str leaves printable non-ASCII characters unescaped.
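
The difference can be sketched in Python 3 as follows. The list contents are hypothetical sample values, not the actual scraped data, and ascii() is used here only to approximate the escaped display that Python 2 produces.

```python
# Hypothetical sample values standing in for the scraped cells.
clg_list = ['一期（34所）', '二期']

# Python 3: repr() of a str keeps printable non-ASCII characters,
# so printing the list shows the characters themselves.
py3_display = repr(clg_list)
print(py3_display)

# ascii() escapes every non-ASCII character, which resembles what
# Python 2 shows when a list of unicode strings is printed.
escaped_display = ascii(clg_list)
print(escaped_display)

# Joining the elements prints the text itself instead of a repr.
joined = ', '.join(clg_list)
print(joined)
```

Joining the elements before printing is one way to get readable output without relying on how the list's repr is rendered.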
