Selenium: Iterating through groups of elements


Problem description

I've done this with BeautifulSoup but it's a bit cumbersome, and I'm trying to figure out if I can do it directly with Selenium.

Let's say I have the following HTML, which repeats multiple times in the page source with identical elements but different contents:

<div class="person">
    <div class="title">
        <a href="http://www.url.com/johnsmith/">John Smith</a>
    </div>
    <div class="company">
        <a href="http://www.url.com/company/">SalesForce</a>
    </div>
</div>

I need to build a dictionary where the entry for each person looks like:

person = {'name': 'John Smith', 'company': 'SalesForce'}

I can easily get Selenium to produce a list of the contents of each top level element by doing:

driver.find_elements_by_class_name('person')

But then I can't iterate through the list because the above method doesn't narrow the scope/source to just the contents of that element.

If I try to do something like this:

people = driver.find_elements_by_class_name('person')
for person in people:
    print(person.find_element_by_xpath('//div[@class="title"]//a').text)

I just get the same name over and over again.

I need to do this group by group because in my case, iterating through the whole page and appending each tag individually won't work (there's infinite scrolling, so it would be really inefficient).

Does anyone know whether it's possible to do this directly in Selenium, and if so how?

Recommended answer

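Use find_elements_by_class_name() (http://selenium-python.readthedocs.org/api.html#selenium.webdriver.remote.webdriver.WebDriver.find_elements_by_class_name) to get all the blocks, and find_element_by_xpath() (http://selenium-python.readthedocs.org/api.html#selenium.webdriver.remote.webdriver.WebDriver.find_element_by_xpath) to get the title and company for each person. Note the leading dot in the XPath: an expression that starts with // searches the whole document even when it is called on an element, which is why the loop in the question prints the same name every time. Starting with .// restricts the search to the descendants of the current person element: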

persons = []
for person in driver.find_elements_by_class_name('person'):
    title = person.find_element_by_xpath('.//div[@class="title"]/a').text
    company = person.find_element_by_xpath('.//div[@class="company"]/a').text

    persons.append({'title': title, 'company': company})
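
The scoping behaviour can be checked without a browser. The following is only a minimal sketch: it uses Python's standard xml.etree.ElementTree as a stand-in for the Selenium calls (ElementTree supports just a subset of XPath), and the PAGE fragment, the second person in it, and the collect_people helper are all made up for illustration. It also uses the 'name' key the question asked for instead of 'title'.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment: two "person" blocks shaped like the HTML in the
# question. The second block (Jane Doe / Acme) is invented for this example.
PAGE = """
<body>
  <div class="person">
    <div class="title"><a href="http://www.url.com/johnsmith/">John Smith</a></div>
    <div class="company"><a href="http://www.url.com/company/">SalesForce</a></div>
  </div>
  <div class="person">
    <div class="title"><a href="http://www.url.com/janedoe/">Jane Doe</a></div>
    <div class="company"><a href="http://www.url.com/acme/">Acme</a></div>
  </div>
</body>
"""


def collect_people(root):
    """Build one dict per person block, searching only inside that block."""
    people = []
    for person in root.findall('.//div[@class="person"]'):
        # The leading "." makes each lookup relative to this person element.
        name = person.find('.//div[@class="title"]/a').text
        company = person.find('.//div[@class="company"]/a').text
        people.append({'name': name, 'company': company})
    return people


root = ET.fromstring(PAGE.strip())
people = collect_people(root)
print(people)
```

Each dict comes from its own block because every inner lookup starts at the current person element. Also note that Selenium 4 removed the find_element_by_* helpers; there the equivalent scoped call is person.find_element(By.XPATH, './/div[@class="title"]/a'), with By imported from selenium.webdriver.common.by.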
