What's the most forgiving HTML parser in Python?

Views: 133 | Posted: 2020/5/4 8:34:41 | Tags: python html-parsing beautifulsoup lxml pyquery


Problem description

I have some random HTML and I used BeautifulSoup to parse it, but in most cases (>70%) it chokes. I tried BeautifulSoup 3.0.8 and 3.2.0 (there were some problems with 3.1.0 upwards), but the results are almost the same.

Off the top of my head, I can recall several HTML parser options available in Python:

BeautifulSoup
lxml
pyquery

I intend to test all of these, but I wanted to know which one, in your tests, comes out as the most forgiving and will even attempt to parse bad HTML.
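
One way to run that comparison is a small harness that feeds the same broken markup to each parser and records how often it gets through without raising. The sketch below is only an illustration under several assumptions: the sample strings are invented, the function names (parse_with_stdlib, available_parsers, survival_rate) are hypothetical, and it imports the newer bs4 package rather than the BeautifulSoup 3.x module mentioned above. pyquery is left out because it builds on lxml. The standard-library html.parser serves as a baseline so the script runs even when no third-party parser is installed.

```python
from html.parser import HTMLParser


class _TagCounter(HTMLParser):
    """Counts start tags; used only to prove the document was walked."""

    def __init__(self):
        super().__init__()
        self.tags = 0

    def handle_starttag(self, tag, attrs):
        self.tags += 1


def parse_with_stdlib(markup):
    """Parse with the standard library and return the number of start tags."""
    counter = _TagCounter()
    counter.feed(markup)
    counter.close()
    return counter.tags


def available_parsers():
    """Return {name: callable} for every parser that can be imported here."""
    parsers = {"html.parser (stdlib)": parse_with_stdlib}
    try:
        import lxml.html  # optional third-party dependency
        parsers["lxml.html"] = lambda m: len(list(lxml.html.fromstring(m).iter()))
    except ImportError:
        pass
    try:
        from bs4 import BeautifulSoup  # assumption: bs4, not BeautifulSoup 3.x
        parsers["bs4 + html.parser"] = (
            lambda m: len(BeautifulSoup(m, "html.parser").find_all(True))
        )
    except ImportError:
        pass
    return parsers


def survival_rate(parsers, samples):
    """Fraction of samples each parser handles without raising an exception."""
    results = {}
    for name, parse in parsers.items():
        ok = 0
        for markup in samples:
            try:
                parse(markup)
                ok += 1
            except Exception:
                # Any exception counts as the parser choking on this sample.
                pass
        results[name] = ok / len(samples)
    return results


# Invented examples of malformed markup; replace with real failing pages.
SAMPLES = [
    "<p>unclosed <b>bold<p>next",
    "<table><tr><td>cell",
    "<a href=foo>link</a></div>",
]

if __name__ == "__main__":
    for name, rate in survival_rate(available_parsers(), SAMPLES).items():
        print(f"{name}: {rate:.0%} parsed without error")
```

Not raising is a weak measure of forgiveness, since a parser can survive and still build a tree that drops content. A fuller test would also compare the extracted text or tag counts against what a browser shows for the same page.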
