Using submit_form() from rvest package returns a form which is not updated


Question

I am trying to scrape data from a website after entering information into a form, using the rvest package (version 0.3.1) in R (version 3.3.0). Below is my code:

# Load Packages
library(rvest)

# Specify URL
url <- "http://www.cocorahs.org/ViewData/ListDailyPrecipReports.aspx"
cocorahs <- html_session(url)

# Grab Initial Form
#  Form is filled in stages. Here, only do country and date
form.unfilled <- cocorahs %>% html_node("form") %>% html_form()
form.filled <- form.unfilled %>% 
  set_values("frmPrecipReportSearch:ucStateCountyFilter:ddlCountry" = "840",
             "frmPrecipReportSearch_ucDateRangeFilter_dcStartDate" = "6/15/2016",
             "frmPrecipReportSearch_ucDateRangeFilter_dcEndDate" = "6/15/2016")

submit_form(cocorahs, form.filled,
            submit="frmPrecipReportSearch:btnSearch") %>%
  html_node("form") %>% html_form()

I was expecting the result to display the updated form. The Country field does update to the USA, but the date range reverts to the default (the date of access). What am I missing to ensure the form updates the date fields as well?

Recommended answer

I think the problem is in this line:

"frmPrecipReportSearch:ucStateCountyFilter:ddlCountry" = "840"

You entered a numeric value where a country name was required.

See the code below:

# Load Packages
library(rvest)

# Specify URL
url <- "http://www.cocorahs.org/ViewData/ListDailyPrecipReports.aspx"
cocorahs <- html_session(url)

# Grab Initial Form
#  Form is filled in stages. Here, only do country and date
form.unfilled <- cocorahs %>% html_node("form") %>% html_form()
form.filled <- form.unfilled %>%
  set_values("frmPrecipReportSearch:ucStationTextFieldsFilter:tbTextFieldValue" = "840",
             "frmPrecipReportSearch_ucDateRangeFilter_dcStartDate" = "6/15/2016",
             "frmPrecipReportSearch_ucDateRangeFilter_dcEndDate" = "6/15/2016")

# submit the form and save as a new session
session <- submit_form(cocorahs, form.filled) 

# look for a table in the nodes
table <- session %>% html_nodes("table")

# The table you want
table[[7]] %>% html_table()
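
Before calling set_values(), it can help to check which field names and option values the parsed form actually exposes. The following is only a minimal sketch on a made-up inline form, so it runs without touching the real site: the field names ("demo:ddlCountry", "demo:dcStartDate", "demo:btnSearch") and the option values are hypothetical, and reading `form$fields` directly assumes that the parsed form stores its fields as a list named by each field's `name` attribute.

# Load Packages
library(rvest)

# Hypothetical miniature form (not the cocorahs.org page)
page <- read_html('
<form action="/search" method="post">
  <select name="demo:ddlCountry">
    <option value="840">USA</option>
    <option value="124">Canada</option>
  </select>
  <input type="text" name="demo:dcStartDate" id="demo_dcStartDate" value="1/1/2016">
  <input type="submit" name="demo:btnSearch" value="Search">
</form>')

form <- page %>% html_node("form") %>% html_form()

# Printing the form lists every field it contains
print(form)

# Field names as parsed (assumption: fields are named by their name attribute)
names(form$fields)

# Choices offered by the drop-down (assumption: stored in the field's options element)
form$fields[["demo:ddlCountry"]]$options

Note that in this made-up form the text input's name uses a colon while its id uses an underscore. Comparing the printed names with the strings passed to set_values() is one way to spot that kind of mismatch.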
