Beautifulsoup functionality not working properly in specific scenario
Question
I am trying to read in the following URL using urllib2: http://frcwest.com/, and then search the data for the meta redirect.

It reads in the following data:
<!--?xml version="1.0" encoding="UTF-8"?--><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title></title><meta content="0;url= Home.html" http-equiv="refresh"/></head><body></body></html>
Reading it into BeautifulSoup works fine. However, for some reason none of the functionality works in this specific scenario, and I don't understand why. BeautifulSoup has worked great for me in all other scenarios. But when simply trying:
soup.findAll('meta')
there are no results.
My eventual goal is to run:
soup.find("meta",attrs={"http-equiv":"refresh"})
but if:
soup.findAll('meta')
isn't even working, then I'm stuck. Any insight into this mystery would be appreciated. Thanks!
Answer
It's the comment and doctype that throw off the parser here, and subsequently BeautifulSoup.
Even the html tag seems 'gone':
>>> soup.find('html') is None
True
Yet it is still there in the .contents iterable. You can find things again with:
for elem in soup:
    if getattr(elem, 'name', None) == u'html':
        soup = elem
        break

soup.find_all('meta')
Demo:
>>> for elem in soup:
...     if getattr(elem, 'name', None) == u'html':
...         soup = elem
...         break
...
>>> soup.find_all('meta')
[<meta content="0;url= Home.html" http-equiv="refresh"/>]
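Once the meta refresh tag has been located, the redirect target still has to be pulled out of its content attribute, which has the form "<delay>;url=<target>" and, as in this page, may contain stray whitespace. Below is a minimal sketch of such a helper; the function name parse_refresh is my own, not part of BeautifulSoup:

```python
def parse_refresh(content):
    """Split a meta-refresh content value into (delay, url).

    Handles values like "0;url= Home.html": the part before the
    semicolon is the delay in seconds, the rest is a "url=" prefix
    (matched case-insensitively) followed by the target.
    """
    delay, _, target = content.partition(';')
    target = target.strip()
    if target.lower().startswith('url='):
        # Drop the "url=" prefix and any whitespace around the target.
        target = target[4:].strip()
    return int(delay.strip()), target

print(parse_refresh("0;url= Home.html"))  # (0, 'Home.html')
```

From there the target can be joined against the page URL (e.g. with urlparse.urljoin) to follow the redirect.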