Python and string accents

Question

I am making a web scraper.
I access Google search, get the link to a web page, and then get the contents of its <title> tag.
The problem is that, for example, the string "P\xe1gina N\xe3o Encontrada!" should be "Página Não Encontrada!". I tried to decode it as latin-1 and then encode it as utf-8, but that did not work.

    r2 = requests.get(item_str)
    texto_pagina = r2.text
    soup_item = BeautifulSoup(texto_pagina,"html.parser")
    empresa = soup_item.find_all("title")
    print(empresa_str.decode('latin1').encode('utf8'))

Can you help me? Thanks!

Recommended answer

You can change the retrieved text variable to something like:

    string = u'P\xe1gina N\xe3o Encontrada!'.encode('utf-8')

After printing string it seemed to work just fine for me.

Edit

Instead of adding .encode('utf8'), have you tried just using empresa_str.decode('latin1')?

For example:

    string = empresa_str.decode('latin_1')
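
To make the decode step concrete, here is a minimal sketch, assuming Python 3 and assuming the title arrives as Latin-1 encoded bytes. The sample bytes and the helper name decode_title are illustrative only and are not part of the original question; the real page may use a different encoding.

```python
# Assumed sample: the title as raw bytes from a Latin-1 encoded page.
raw_title = b"P\xe1gina N\xe3o Encontrada!"


def decode_title(raw, encoding="latin-1"):
    """Turn raw bytes into text; `encoding` is a guess the caller must supply."""
    return raw.decode(encoding)


title = decode_title(raw_title)

# In Python 3 source, "\xe1" inside a str literal is simply the character U+00E1,
# so the escaped spelling and the accented spelling are the same string.
same_text = title == "P\xe1gina N\xe3o Encontrada!" == "Página Não Encontrada!"

# Encode only at the boundary (file, socket) where bytes are actually needed.
utf8_bytes = title.encode("utf-8")
```

Under these assumptions, print(title) shows the accented text, while the \xe1 form tends to appear only when the string's repr is displayed, for instance when a list of results is printed instead of the string itself.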
