Trouble importing boilerpipe in Python

Question

I'm building an application in Python that fetches news articles from RSS feeds. As part of the project, I decided to use boilerpipe to extract just the article content from the HTML page on which each article appears.

Although boilerpipe was originally written for Java, it has also been ported to Python. Its GitHub page is here: https://github.com/misja/python-boilerpipe

The problem is that I get an exception when I try to import it with:

from boilerpipe.extract import Extractor

The error I get is:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "build\bdist.win32\egg\boilerpipe\extract\__init__.py", line 12, in <module>
  File "C:\Python26\lib\site-packages\jpype\_jclass.py", line 54, in JClass
    raise _RUNTIMEEXCEPTION.PYEXC("Class %s not found" % name)
jpype._jexception.ExceptionPyRaisable: java.lang.Exception: Class de.l3s.boilerpipe.sax.HTMLHighlighter not found

What might be causing this problem, and how can I fix it?

Recommended answer

This worked for me on Mac OS X 10.8.5 with Python 2.7.9:

pip install JPype1    # to install https://pypi.python.org/pypi/JPype1
pip install charade
git clone https://github.com/misja/python-boilerpipe.git
cd python-boilerpipe
sudo python setup.py install

After that, you should be able to run the following in the Python console:

>>> from boilerpipe.extract import Extractor
>>> extractor = Extractor(extractor='ArticleExtractor', url="http://en.wikipedia.org/wiki/Main_Page")
>>> print extractor.getText()
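
If the import still fails with "Class ... not found", a quick check like the one below may help narrow things down. It is only a sketch in Python 3 syntax, not part of the original answer. It rests on an assumption: that python-boilerpipe keeps the boilerpipe JAR files somewhere under its installed package directory and builds the JVM classpath from them. The helper names find_jars and try_import_extractor are hypothetical, introduced here for illustration.

```python
import importlib
import importlib.util
import os


def find_jars(package_name):
    """Return the paths of all .jar files under an installed package's directories.

    The package is located without being imported, so this works even when
    importing it would raise (as in the traceback from the question).
    """
    try:
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        return []
    if spec is None or not spec.submodule_search_locations:
        return []  # not installed, or a plain module rather than a package
    jars = []
    for location in spec.submodule_search_locations:
        for top, _dirs, files in os.walk(location):
            jars.extend(os.path.join(top, name)
                        for name in files if name.endswith(".jar"))
    return sorted(jars)


def try_import_extractor():
    """Attempt the import from the question.

    Returns (Extractor class or None, error text or None).
    """
    try:
        module = importlib.import_module("boilerpipe.extract")
    except Exception as exc:  # JPype raises its own exception type for a missing class
        return None, "%s: %s" % (type(exc).__name__, exc)
    return getattr(module, "Extractor", None), None


if __name__ == "__main__":
    # Assumption: an empty list here suggests the JARs were never installed
    # alongside the package, which would explain a "Class ... not found" error.
    print("JAR files found:", find_jars("boilerpipe"))
    extractor_class, error = try_import_extractor()
    print("Import OK" if error is None else "Import failed: " + error)
```

If no JAR files are listed, reinstalling from the cloned repository as shown above is the first thing to try.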
