Python encoding Chinese to special characters
Problem description
I have a scrape/curl request that fetches HTML from another site. The site is in Chinese, but some of the resulting text looks wrong. It shows up like this:
ÀÔÆËËËಲ²²²²²²ÆpÖÁ´Ê×ÊÁ±í´øøÅÅÏ¢£¬Çë·ÃÊ°¢Àï°Í°ÍÅú·¢Íø££
°¢Àï°Í°ÍΪÄúÌṩÁË×ÔÁµÕß¹¤³§Ö±ÏúÆ·ÅƵç×Ó±í ÖÇÄÜʱÉг±Á÷ŮʿÊÖ»·ÊÖÁ´Ê×Êαí´øµÈ²úÆ·£¬ÕâÀïÔƼ¯ÁËÖÚ¶àµÄ¹©Ó¦ÉÌ£¬²É¹ºÉÌ£¬ÖÆÔìÉÌ¡£ÓûÁ˽â¸ü¶à×ÔÁµÕß¹¤³§Ö±ÏúÆ·ÅƵç×Ó±í ÖÇÄÜʱÉг±Á÷ŮʿÊÖ»·ÊÖÁ´Ê×Êαí´øÐÅÏ¢£¬Çë·ÃÎÊ°¢Àï°Í°ÍÅú·¢Íø£¡
That should be Chinese. This is my code:
str(result.decode('ISO-8859-1'))
Without decoding as 'ISO-8859-1' (returning the result variable as-is), it displays question marks like this:
Ͱ Ϊ ʱ г Ůʿ ֻ˽ ֱֱֱֱƵƵгггϢ>
Could you help me figure out which encoding/decoding I should use?
Thanks
Recommended answer
The solution was really simple. As @Thu Yein tun mentioned, I looked at the response headers of the HTTP request to check the content type, which showed text/html;charset=GBK. The page bytes are GBK-encoded, so decoding them as ISO-8859-1 turns every byte into an unrelated Latin character. I changed my code to decode with the declared charset, like this:
result.decode('gbk')
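
To avoid hard-coding the charset, the value can be read from the Content-Type header before decoding. The following is a minimal sketch, not the original poster's code: the helper names `charset_from_content_type` and `decode_body` are hypothetical, the UTF-8 fallback is an assumption, and the response bytes are simulated by encoding a short Chinese string as GBK instead of making a real HTTP request.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type header value.

    Falls back to `default` (an assumption, not part of the original
    answer) when the header declares no charset.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def decode_body(body, content_type):
    """Decode raw response bytes using the charset the server declared."""
    charset = charset_from_content_type(content_type)
    return body.decode(charset, errors="replace")


# Simulated response: the bytes a GBK-encoded page would send.
raw = "阿里巴巴批发网".encode("gbk")
header = "text/html;charset=GBK"

# Decoding with the declared charset gives back the Chinese text.
text = decode_body(raw, header)

# Decoding the same bytes as ISO-8859-1 produces the kind of
# mojibake shown in the question.
mojibake = raw.decode("ISO-8859-1")

# Text that was already mis-decoded as ISO-8859-1 can be repaired,
# because that codec maps every byte to exactly one character.
repaired = mojibake.encode("ISO-8859-1").decode("gbk")
```

Here `mojibake` begins with the same `°¢Àï°Í°Í` sequence that appears in the question's output, and `repaired` equals `text`. If the header has no charset parameter, the page's `<meta>` tag or a detection library is another place to look; this sketch does not cover that case.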