'BeautifulSoup' has no attribute 'HTML_ENTITIES'

Problem description

I recently upgraded BeautifulSoup from version 3.0 to version 4.1 on a Windows machine.

I am now getting a strange error:

File "C:\path\to\myscript.py", line 230, in soupify
    return BeautifulSoup(html, convertEntities=BeautifulSoup.HTML_ENTITIES)
AttributeError: type object 'BeautifulSoup' has no attribute 'HTML_ENTITIES'

Here is the snippet of code that causes the exception to be thrown:

def soupify(html):
    return BeautifulSoup(html, convertEntities=BeautifulSoup.HTML_ENTITIES)

The docs for BS do not mention how the constructor signature has changed from v3 to v4. How can I fix this?

Solution

An incoming HTML or XML entity is always converted into the corresponding Unicode character. Beautiful Soup 3 had a number of overlapping ways of dealing with entities, which have been removed. The BeautifulSoup constructor no longer recognizes the smartQuotesTo or convertEntities arguments. (Unicode, Dammit still has smart_quotes_to, but its default is now to turn smart quotes into Unicode.)
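For the smart-quotes remark, here is a minimal sketch of how `smart_quotes_to` on Unicode, Dammit might be used. The sample bytes and the choice of `windows-1252` as the candidate encoding are assumptions made for illustration, not part of the original question.

```python
from bs4 import UnicodeDammit

# Sample bytes containing Windows-1252 "smart quotes" (illustrative input).
markup = b"<p>I just \x93love\x94 smart quotes</p>"

# Default behaviour in Beautiful Soup 4: smart quotes become Unicode characters.
as_unicode = UnicodeDammit(markup, ["windows-1252"]).unicode_markup

# Opt in to HTML entities instead, via smart_quotes_to.
as_entities = UnicodeDammit(
    markup, ["windows-1252"], smart_quotes_to="html"
).unicode_markup

print(as_unicode)
print(as_entities)
```

This only affects how Unicode, Dammit decodes incoming bytes; the `BeautifulSoup` constructor itself takes no such argument in version 4.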

If you want to turn those Unicode characters back into HTML entities on output, rather than turning them into UTF-8 characters, you need to use an output formatter.

Source: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#entities
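Applied to the `soupify` function from the question, the fix could look like the sketch below: drop the `convertEntities` argument, and pick an output formatter only if entities are wanted in the output. The explicit `"html.parser"` argument and the sample markup are assumptions added for illustration.

```python
from bs4 import BeautifulSoup


def soupify(html):
    # Beautiful Soup 4 always converts entities to Unicode while parsing,
    # so the convertEntities argument is dropped entirely.
    # Naming "html.parser" (the standard-library parser) is an assumption here.
    return BeautifulSoup(html, "html.parser")


soup = soupify("<p>caf&eacute; &amp; cr&egrave;me</p>")

# In the parsed tree, entities are already Unicode characters.
text = soup.p.get_text()

# The default "minimal" formatter escapes only what valid markup requires.
minimal_output = soup.p.decode(formatter="minimal")

# The "html" formatter turns Unicode characters back into named entities.
html_output = soup.p.decode(formatter="html")

print(text)
print(minimal_output)
print(html_output)
```

The same `formatter` argument is accepted by `prettify()` and `encode()`, so the choice between Unicode output and entity output is made when serialising rather than when parsing.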
