bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

Problem description

...
soup = BeautifulSoup(html, "lxml")
File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__
% ",".join(features))
bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

The above is the output on my Terminal. I am on Mac OS 10.7.x. I have Python 2.7.1, and followed this tutorial to get Beautiful Soup and lxml, which both installed successfully and work with a separate test file located here. In the Python script that causes this error, I have included this line:
    from pageCrawler import comparePages
And in the pageCrawler file I have included the following two lines:
    from bs4 import BeautifulSoup
    from urllib2 import urlopen

Any help in figuring out what the problem is and how it can be solved would be much appreciated.

Recommended answer

I suspect this is related to the parser that BS uses to read the HTML. They document it here, but if you're like me (on OSX) you might be stuck with something that requires a bit of work:

You'll notice that in the BS4 documentation page above, they point out that by default BS4 will use the Python built-in HTML parser. Assuming you are on OSX, the Apple-bundled version of Python is 2.7.2, which is not lenient about character formatting. I hit this same problem, so I upgraded my version of Python to work around it. Doing this in a virtualenv will minimize disruption to other projects.

If doing that sounds like a pain, you can switch over to the lxml parser:

pip install lxml

Then try again:

soup = BeautifulSoup(html, "lxml")

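Depending on your situation, that might be good enough. I found this annoying enough to warrant upgrading my version of Python. With virtualenv, you can migrate your packages fairly easily.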
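
If the error still appears after installing lxml, one possible cause (an assumption, not something the question confirms) is that lxml was installed for a different Python than the one running the script. A minimal sketch for checking this follows; `parser_available` and `pick_parser` are hypothetical helper names, not part of bs4:

```python
import sys


def parser_available(module_name):
    """Return True if this interpreter can import module_name."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


def pick_parser(preferred="lxml", fallback="html.parser"):
    # Hypothetical helper: "lxml" needs the third-party lxml package,
    # while "html.parser" is the parser built into Python.
    if parser_available(preferred):
        return preferred
    return fallback


# Show which interpreter is running and which parser name would be used.
print(sys.executable)
print(pick_parser())
```

The chosen name can then be passed along, as in soup = BeautifulSoup(html, pick_parser()). If the interpreter path printed is not the one pip installed into, that mismatch would explain why lxml appears installed but cannot be found.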
