Beautiful Soup: finding elements that have a hidden style

Question

My need is simple. How do I find elements that are currently not visible on the webpage? I am guessing style="visibility:hidden" or style="display:none" are simple ways to hide an element, but BeautifulSoup doesn't know whether an element is hidden or not.

For example, the HTML is:

Textbox_Invisible1: <input id="tbi1" type="text" style="visibility:hidden">
Textbox_Invisible2: <input id="tbi2" type="text" class="hidden_elements">
Textbox1: <input id="tb1" type="text">

So my first concern is that BeautifulSoup cannot tell whether any of the above textboxes are hidden:

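# Python 2.7
# Import BeautifulSoup (assuming the bs4 package; the question does not say which version was used)
# Note: .hidden is an internal Beautiful Soup flag, not a CSS visibility check,
# so it is False for every ordinary tag.
>>> from bs4 import BeautifulSoup
>>> source = """Textbox_Invisible1: <input id="tbi1" type="text" style="visibility:hidden">
...  Textbox_Invisible2: <input id="tbi2" type="text" class="hidden_elements">
...  Textbox1: <input id="tb1" type="text">"""
>>> soup1 = BeautifulSoup(source, "html.parser")
>>> soup1.find(id='tb1').hidden
False
>>> soup1.find(id='tbi1').hidden
False
>>> soup1.find(id='tbi2').hidden
False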

My only question is: is there a way to find out which elements are hidden? (We also have to consider complex HTML, where the containing elements might be hidden.)

Recommended answer

BeautifulSoup is an HTML parser, not a browser. It doesn't know anything about how the page is supposed to be rendered, computed DOM attributes, and so on; it only checks where the angle brackets begin and end.
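
That said, if an approximation is enough, the markup itself can be inspected. The following is only a sketch of a heuristic, not something Beautiful Soup provides: it assumes Python 3 with the bs4 package, checks inline style attributes on a tag and its ancestors, and relies on a hand-maintained set of class names (HIDDEN_CLASSES, a hypothetical name) that you already know a stylesheet hides. The helper looks_hidden is also an illustrative name. It will miss anything hidden by external CSS or JavaScript, and it treats a hidden ancestor as hiding its descendants, which is not strictly true for visibility:hidden when a child overrides it.

```python
import re

from bs4 import BeautifulSoup

# Matches inline declarations such as "display:none" or "visibility: hidden".
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)

# Assumption: class names that a stylesheet is known to hide.
# Beautiful Soup cannot discover these on its own.
HIDDEN_CLASSES = {"hidden_elements"}


def looks_hidden(tag):
    """Heuristic: True if the tag or any ancestor is hidden by an
    inline style or by one of the known hidden class names."""
    for node in [tag, *tag.parents]:
        attrs = getattr(node, "attrs", None) or {}
        if HIDDEN_STYLE.search(attrs.get("style", "")):
            return True
        # With the "html.parser" builder, class is parsed into a list of names.
        if HIDDEN_CLASSES & set(attrs.get("class", [])):
            return True
    return False


source = """Textbox_Invisible1: <input id="tbi1" type="text" style="visibility:hidden">
 Textbox_Invisible2: <input id="tbi2" type="text" class="hidden_elements">
 Textbox1: <input id="tb1" type="text">"""

soup = BeautifulSoup(source, "html.parser")
hidden_ids = [tag["id"] for tag in soup.find_all("input") if looks_hidden(tag)]
print(hidden_ids)  # ['tbi1', 'tbi2']
```

Because the check walks tag.parents, an input nested inside a div with style="display:none" is also reported, which covers the simple cases of hidden containers mentioned in the question.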

If you need to work with the DOM at runtime, you'd be better off with a browser automation package, i.e. something that will start the browser, let the browser consume the page, and then expose browser controls and the computed DOM. Depending on the platform, you have different options. Have a look at this page on the Python Wiki for ideas; check the section Python Wrappers around Web "Libraries" and Browser Technology.
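
As one possible illustration of the browser automation route, here is a minimal sketch using Selenium, which the answer does not name and is only one of several such packages. It assumes Selenium 4, Firefox and geckodriver are installed; the URL is hypothetical, and ids_not_displayed and main are illustrative names. WebElement.is_displayed() asks the browser whether the element is actually shown, so it reflects stylesheets and hidden ancestors as well as inline styles.

```python
def ids_not_displayed(elements):
    """Return the id attribute of every element the browser reports as not displayed."""
    return [el.get_attribute("id") for el in elements if not el.is_displayed()]


def main():
    # Imported here so the helper above can be reused without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()  # assumption: Firefox and geckodriver are available
    try:
        driver.get("http://example.com/form.html")  # hypothetical URL
        inputs = driver.find_elements(By.TAG_NAME, "input")
        print(ids_not_displayed(inputs))
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling main() launches the browser, loads the page and prints the ids of the input elements that are not displayed.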
