Using submit_form() from the rvest package returns a form which is not updated
Question
I am trying to scrape data from a website after entering information into a form using the rvest
package (version 0.3.1) in R (version 3.3.0). Below is my code:
# Load Packages
library(rvest)
# Specify URL
url <- "http://www.cocorahs.org/ViewData/ListDailyPrecipReports.aspx"
cocorahs <- html_session(url)
# Grab Initial Form
# Form is filled in stages. Here, only do country and date
form.unfilled <- cocorahs %>% html_node("form") %>% html_form()
form.filled <- form.unfilled %>%
  set_values("frmPrecipReportSearch:ucStateCountyFilter:ddlCountry" = "840",
             "frmPrecipReportSearch_ucDateRangeFilter_dcStartDate" = "6/15/2016",
             "frmPrecipReportSearch_ucDateRangeFilter_dcEndDate" = "6/15/2016")
submit_form(cocorahs, form.filled,
            submit = "frmPrecipReportSearch:btnSearch") %>%
  html_node("form") %>% html_form()
I was expecting the result to display the updated form; while the Country updated to the USA, the date range reverts back to the default (date of access). What am I missing to ensure the form updates that particular field?
Recommended answer
I think the mistake is in this line:
"frmPrecipReportSearch:ucStateCountyFilter:ddlCountry" = "840"
You entered a numeric value where a country name was required.
See the code below:
# Load Packages
library(rvest)
# Specify URL
url <- "http://www.cocorahs.org/ViewData/ListDailyPrecipReports.aspx"
cocorahs <- html_session(url)
# Grab Initial Form
# Form is filled in stages. Here, only do country and date
form.unfilled <- cocorahs %>% html_node("form") %>% html_form()
form.filled <- form.unfilled %>%
  set_values("frmPrecipReportSearch:ucStationTextFieldsFilter:tbTextFieldValue" = "840",
             "frmPrecipReportSearch_ucDateRangeFilter_dcStartDate" = "6/15/2016",
             "frmPrecipReportSearch_ucDateRangeFilter_dcEndDate" = "6/15/2016")
# submit the form and save as a new session
session <- submit_form(cocorahs, form.filled)
# look for a table in the nodes
table <- session %>% html_nodes("table")
# The table you want
table[[7]] %>% html_table()
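Before calling set_values(), it can help to check which field names the form exposes and which values a drop-down accepts. The snippet below is only a rough sketch of that check. It runs offline against a small made-up HTML form, so the field names demo:ddlCountry and demo_dcStartDate and the option list are assumptions for illustration, not the real CoCoRaHS markup.

```r
library(rvest)

# Hypothetical stand-in for the real page: one drop-down and one text input.
page <- read_html('<form>
  <select name="demo:ddlCountry">
    <option value="840" selected>USA</option>
    <option value="124">Canada</option>
  </select>
  <input type="text" name="demo_dcStartDate" value="6/23/2016">
</form>')

# Names of the fields as they appear in the markup
field_names <- page %>%
  html_nodes("form input, form select") %>%
  html_attr("name")

# For the drop-down, pair each submitted value with its visible label
opts <- page %>% html_nodes("select option")
country_options <- data.frame(
  value = html_attr(opts, "value"),
  label = html_text(opts),
  stringsAsFactors = FALSE
)

print(field_names)
print(country_options)
```

On the real site, the same two steps could be run on the session object (cocorahs in the code above) instead of the inline HTML, to see whether a given field expects a code such as "840" or a name.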