For loop doesn't work for web scraping Google search in Python

Problem description

I'm scraping Google search results for a list of keywords. The nested for loop that scrapes a single results page works well. However, the outer for loop over the keyword list doesn't do what I intended, which is to scrape the data for each search. I get no results for the first two keywords, only the result for the last one.

Here is the code:

browser = webdriver.Chrome(r"C:\...\chromedriver.exe")

df = pd.DataFrame(columns = ['ceo', 'value'])

baseUrl = 'https://www.google.com/search?q='
ceo_list = ["Bill Gates", "Elon Musk", "Warren Buffet"]
values =[]


for ceo in ceo_list:
    browser.get(baseUrl + ceo)
    table = browser.find_elements_by_css_selector('div.ifM9O') 

    for row in table:
        ceo = str(([c.text for c in row.find_elements_by_css_selector('div.kno-ecr-pt.PZPZlf.gsmt.i8lZMc')])).strip('[]').strip("''")
        value = str(([c.text for c in row.find_elements_by_css_selector('div.Z1hOCe')])).strip('[]').strip("''")

    ceo = pd.Series(ceo) 
    value = pd.Series(value)

    df = df.assign(**{'ceo': ceo, 'value': value}) 


print(df)

browser.close()

This is the output:

              ceo                                              value
0  Warren Buffett  Born: August 30, 1930 (age 89 years), Omaha, N...

This is what I expected:

              ceo                                              value
0  Bill Gates      Born:..........
1  Elon Musk       Born:...........
2  Warren Buffett  Born: August 30, 1930 (age 89 years), Omaha, N...

I'm not sure which part I'm missing.

Recommended answer

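You need to create ceo (and value) as lists before the outer for loop and append to them inside it, so you don't keep overwriting them. As written, ceo and value are reassigned on every iteration, and df.assign replaces both columns each time, so only the result for the last keyword survives.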
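The accumulate-then-build pattern the answer describes can be sketched as follows. This is a minimal illustration, not the asker's final code: collect_rows and fake_scrape are hypothetical names, and fake_scrape is a stand-in that returns placeholder text so the loop logic can run without a browser.

```python
from typing import Callable, Dict, List, Tuple


def collect_rows(
    keywords: List[str],
    scrape_one: Callable[[str], Tuple[str, str]],
) -> List[Dict[str, str]]:
    """Call scrape_one for every keyword and keep one row per keyword."""
    rows = []  # created once, before the loop, so earlier results survive
    for keyword in keywords:
        ceo, value = scrape_one(keyword)
        # Append a new row instead of reassigning the same variables.
        rows.append({"ceo": ceo, "value": value})
    return rows


def fake_scrape(keyword: str) -> Tuple[str, str]:
    # Hypothetical stand-in for the Selenium lookups in the question;
    # it returns placeholder text, not real scraped data.
    return keyword, "Born: (placeholder for " + keyword + ")"


rows = collect_rows(["Bill Gates", "Elon Musk", "Warren Buffett"], fake_scrape)
for row in rows:
    print(row["ceo"], "|", row["value"])
```

In the question's script, scrape_one would presumably wrap browser.get(baseUrl + keyword) and the two CSS selector lookups. The DataFrame would then be built once, after the loop, for example with pd.DataFrame(rows, columns=['ceo', 'value']), rather than reassigned on every iteration.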
