How to scrape the description of the first link from Google?
Question
I followed "How to get Google search results" and it works, but I would like to scrape the description of the first link that Google returns. For the keyword CRAN it is:
<span class="st"><em>CRAN</em> is a network of ftp and web servers around the world that store identical, up-to-date, versions of code and documentation for R. Please use the <em>CRAN</em> ...</span>
but I don't know what the span section is here. Please provide a solution that does not use RSelenium.
Recommended answer
Use rvest:
library(rvest)

baseUrl <- 'https://www.google.it/search?q='
query <- 'cran'
url <- paste0(baseUrl, query)

read_html(url) %>%
  html_nodes('.st') %>%
  # This selects only the first result; change the number to select another result,
  # or comment this line out to get all first-page results
  `[`(2) %>%
  html_text()
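As for what the span section is: `<span class="st">` is an ordinary HTML element carrying the CSS class `st`, and the selector `.st` passed to `html_nodes()` matches every element with that class. The sketch below shows this offline by parsing the snippet from the question as a string, so it does not depend on what Google currently returns (class names on the live results page may differ from `st`). The helpers `nth_text()` and `build_search_url()` are hypothetical names introduced here for illustration; the second one assumes that multi-word queries should be percent-encoded before being appended to the base URL.

```r
library(rvest)

# The snippet from the question, embedded as a string so no network call is needed.
snippet <- paste0(
  '<span class="st"><em>CRAN</em> is a network of ftp and web servers ',
  'around the world that store identical, up-to-date, versions of code ',
  'and documentation for R. Please use the <em>CRAN</em> ...</span>'
)

# Hypothetical helper: text of the n-th node matching a CSS selector,
# or NA if fewer than n nodes match.
nth_text <- function(html, selector, n = 1) {
  nodes <- html_nodes(read_html(html), selector)
  if (length(nodes) < n) return(NA_character_)
  html_text(nodes[[n]])
}

# html_text() drops the inner <em> tags and keeps only their text.
description <- nth_text(snippet, ".st")
print(description)

# Hypothetical helper: percent-encode the query before building the URL
# (assumption: queries may contain spaces or other reserved characters).
build_search_url <- function(query,
                             base_url = "https://www.google.it/search?q=") {
  paste0(base_url, utils::URLencode(query, reserved = TRUE))
}
print(build_search_url("cran task views"))
```

Returning `NA` when the selector matches too few nodes avoids a subscript error if the page markup changes and `.st` no longer matches anything.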