Add a new field to a form with rvest


Problem description

I'm trying to download the full, dynamically expanded holdings table using rvest, but I'm getting an "Unknown field names" error.

library(rvest)
library(xml2)  # provides xml_find_all()

s <- html_session("http://innovatoretfs.com/etf/?ticker=ffty")
f <- html_form(s)[[1]]
# The following line fails with "Unknown field names":
f.new <- set_values(f, `__EVENTTARGET` = "ctl00$BodyPlaceHolder$ViewHoldingsLinkButton")

## Subsequent lines are not tested ##
doc <- submit_form(s, f.new)
tabs <- xml_find_all(doc, "//table")
holdings <- html_table(tabs, fill = TRUE, trim = TRUE)[[5]]

I'm not great with HTML/HTTP, but from what I can trace, expanding the table seems to require a postback of the form with this new field value set.
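For context, here is a rough sketch of what such a postback body might look like. ASP.NET WebForms pages identify the control that triggered a postback through the `__EVENTTARGET` and `__EVENTARGUMENT` form fields, sent alongside the page's existing hidden fields. The `__VIEWSTATE` value below is a made-up placeholder, and `encode_form` is a hypothetical helper written for illustration; it is not part of rvest.

```r
# Fields a postback would carry. The __VIEWSTATE value is a placeholder (assumption);
# a real page supplies its own, usually much longer, value.
fields <- list(
    `__VIEWSTATE` = "abc123",
    `__EVENTTARGET` = "ctl00$BodyPlaceHolder$ViewHoldingsLinkButton",
    `__EVENTARGUMENT` = ""
)

# Hypothetical helper: URL-encode each name/value pair and join them with "&",
# which is the shape of an application/x-www-form-urlencoded body.
encode_form <- function(fields) {
    pairs <- vapply(names(fields), function(nm) {
        paste0(
            utils::URLencode(nm, reserved = TRUE),
            "=",
            utils::URLencode(as.character(fields[[nm]]), reserved = TRUE)
        )
    }, character(1))
    paste(pairs, collapse = "&")
}

body <- encode_form(fields)
print(body)
```

The point is only that `__EVENTTARGET` has to travel as an ordinary form field, which is why the form object needs to contain it before submission.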

After inspecting the set_values function, it seems that it only allows values to be assigned to existing fields.
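The behaviour can be reproduced offline with a small stand-in. This is a simplified sketch of the kind of check that appears to produce the error, not rvest's actual source; `mock_form` and `set_existing_only` are hypothetical names, and the form is modelled as a plain list with a named `fields` element.

```r
# Hypothetical stand-in for a parsed form: a list with a named `fields` element.
mock_form <- list(fields = list(
    `__VIEWSTATE` = list(name = "__VIEWSTATE", type = "hidden", value = "abc123")
))

# Simplified setter (assumption): refuses any name that is not already a field.
set_existing_only <- function(form, ...) {
    new_values <- list(...)
    unknown <- setdiff(names(new_values), names(form$fields))
    if (length(unknown) > 0) {
        stop("Unknown field names: ", paste(unknown, collapse = ", "))
    }
    for (nm in names(new_values)) {
        form$fields[[nm]]$value <- new_values[[nm]]
    }
    form
}

# Setting a field the form does not contain raises the error; capture its message.
result <- tryCatch(
    set_existing_only(mock_form, `__EVENTTARGET` = "ctl00$BodyPlaceHolder$ViewHoldingsLinkButton"),
    error = function(e) conditionMessage(e)
)
print(result)
# prints "Unknown field names: __EVENTTARGET"
```

Updating an existing field such as `__VIEWSTATE` goes through without complaint, so the restriction applies only to names missing from the form.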

Is there any way to add a new field to a form under rvest? If not, is anyone aware of another package I could use to get this functionality?

[Edited] to make explicit that I need the full version of the dynamically expanded table, and to add the expected subsequent table-extraction code.

Recommended answer

Disgusting, but it works. It could probably be cleaned up, but I will submit an issue to the project for a proper fix that provides add_values-type functionality.

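library(rvest)
library(xml2)  # provides xml_find_all()

getInnovatorHoldings <- function() {
    s <- html_session("http://innovatoretfs.com/etf/?ticker=ffty")
    f <- html_form(s)[[1]]
    # Add the postback fields the original form lacks, plus a placeholder submit button
    f.new <- add_values(f,
                        `__EVENTTARGET` = "ctl00$BodyPlaceHolder$ViewHoldingsLinkButton",
                        `__EVENTARGUMENT` = "",
                        `submit` = NULL)

    # Post the form back, then parse the response
    s <- submit_form(s, f.new, "submit")
    doc <- read_html(s)
    tabs <- xml_find_all(doc, "//table")
    # The holdings table is the fifth table on the page
    holdings <- html_table(tabs, fill = TRUE, trim = TRUE)[[5]]
    return(holdings)
}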

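# Add fields that do not yet exist in the form; names already present are left untouched.
add_values <- function(form, ...) {
    new_values <- list(...)
    # Indices of the supplied names that are not already fields of the form
    no_match <- which(!names(new_values) %in% names(form$fields))
    for (n in no_match) {
        field_name <- names(new_values)[n]
        if (field_name == "submit") {
            # Placeholder submit button so that submit_form() has a button to select
            form$fields[[field_name]] <- new_input(name = field_name, type = "submit", value = NULL)
        } else {
            # Everything else is added as a hidden input carrying the supplied value
            form$fields[[field_name]] <- new_input(name = field_name, type = "hidden", value = new_values[[n]])
        }
    }
    return(form)
}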

# Construct an object of class "input" to store in form$fields
new_input <- function(name, type, value, checked = NULL, disabled = NULL, readonly = NULL, required = FALSE) {
    return(
        structure(
            list(name = name,
                type = type,
                value = value,
                checked = checked,
                disabled = disabled,
                readonly = readonly,
                required = required
                ),
            class = "input"
        )
    )
}
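To sanity-check the idea without hitting the network, something along these lines may help. It is a condensed variant of the helper above, run against a mock form; `mock_form` and `add_hidden_field` are hypothetical names, and the mock only assumes that a form keeps its inputs in a named `fields` list, as the code above does.

```r
# Hypothetical stand-in for the form object; only the `fields` element matters here.
mock_form <- list(fields = list(
    `__VIEWSTATE` = structure(
        list(name = "__VIEWSTATE", type = "hidden", value = "abc123"),
        class = "input"
    )
))

# Condensed variant: add one hidden field if the form does not already have it.
add_hidden_field <- function(form, name, value) {
    if (!name %in% names(form$fields)) {
        form$fields[[name]] <- structure(
            list(name = name, type = "hidden", value = value),
            class = "input"
        )
    }
    form
}

patched <- add_hidden_field(
    mock_form,
    "__EVENTTARGET",
    "ctl00$BodyPlaceHolder$ViewHoldingsLinkButton"
)

# The new field is appended after the existing one.
print(names(patched$fields))
```

Calling it again with a name that already exists returns the form unchanged, mirroring how add_values skips fields that are already present.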
