Convert HTML entities to Unicode and vice versa

Views: 301 Published: 2018/6/13 10:51:39 Tags: python html html-entities

Question

Possible duplicates:

Convert XML/HTML Entities into Unicode String in Python

HTML Entity Codes to Text

How do you convert HTML entities to Unicode and vice versa in Python?
Solution
You need to have BeautifulSoup.
    # Python 2, using BeautifulSoup 3 (the `BeautifulSoup` module)
    from BeautifulSoup import BeautifulStoneSoup
    import cgi

    def HTMLEntitiesToUnicode(text):
        """Converts HTML entities to unicode.  For example '&amp;' becomes '&'."""
        text = unicode(BeautifulStoneSoup(text, convertEntities=BeautifulStoneSoup.ALL_ENTITIES))
        return text

    def unicodeToHTMLEntities(text):
        """Converts unicode to HTML entities.  For example '&' becomes '&amp;'."""
        text = cgi.escape(text).encode('ascii', 'xmlcharrefreplace')
        return text

    text = "&amp;, &reg;, &lt;, &gt;, &cent;, &pound;, &yen;, &euro;, &sect;, &copy;"

    uni = HTMLEntitiesToUnicode(text)
    htmlent = unicodeToHTMLEntities(uni)

    print uni
    print htmlent
    # &, ®, <, >, ¢, £, ¥, €, §, ©
    # &amp;, &#174;, &lt;, &gt;, &#162;, &#163;, &#165;, &#8364;, &#167;, &#169;
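
The answer above targets Python 2 and BeautifulSoup 3. On Python 3 the same round trip can probably be done with the standard library alone. The following is a minimal sketch, not part of the original answer: the function names `html_entities_to_unicode` and `unicode_to_html_entities` are illustrative, and it assumes that `html.unescape` and `html.escape` (with `quote=False`, to mirror the old `cgi.escape` default) are acceptable substitutes for the calls used above.

```python
import html


def html_entities_to_unicode(text: str) -> str:
    """Decode named and numeric HTML entities, e.g. '&amp;' becomes '&'."""
    return html.unescape(text)


def unicode_to_html_entities(text: str) -> str:
    """Escape &, < and >, then replace non-ASCII characters with numeric references."""
    # quote=False leaves " and ' untouched, as cgi.escape did by default.
    escaped = html.escape(text, quote=False)
    # 'xmlcharrefreplace' turns each non-ASCII character into &#NNN;.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


text = "&amp;, &reg;, &lt;, &gt;, &cent;, &pound;, &yen;, &euro;, &sect;, &copy;"

uni = html_entities_to_unicode(text)
htmlent = unicode_to_html_entities(uni)

print(uni)
print(htmlent)
```

As in the original answer, the reverse direction emits numeric character references (such as `&#174;`) rather than named entities (such as `&reg;`), so the round trip preserves the characters but not the exact entity spelling.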
