rvest: how to submit a form when the input doesn't have a name?

Question

I have a simple problem and I don't know how to solve it. I want to fill in a form using rvest where the input has no name:

library(rvest)
session <- html_session("https://www.tripadvisor.com/")
pgform <- html_form(session)[[1]]

> pgform
<form> 'global_nav_search_form' (GET /Search)
  <input search> '': 
  <input text> '': 
  <button submit> 'sub-search
  <input hidden> 'geo': 1
  <input hidden> 'latitude': 
  <input hidden> 'longitude': 
  <input hidden> 'searchNearby': 
  <input hidden> 'pid': 3826
  <input hidden> 'redirect': 
  <input hidden> 'startTime': 
  <input hidden> 'uiOrigin': 
  <input hidden> 'q': 
  <input hidden> 'supportedSearchTypes': find_near_stand_alone_query
  <input hidden> 'enableNearPage': true
  <input hidden> 'returnTo': __2F__
  <input hidden> 'searchSessionId': C9C09F9043AE6FE69CE679DF8A44546D1547136702473ssid
  <input hidden> 'social_typeahead_2018_feature': true

Here I would like to run a search by setting the text input, so that I get the link to the resulting page. Of course, if I do

filledform <- set_values(pgform, '' = "Paris")

I get an error:

Error: attempt to use zero-length variable name

I am sure there is a simple workaround, but I don't know it. Any ideas?

Recommended answer

Modifying the empty field

You can access and modify a field with an empty name directly by using the field's index, for example:

pgform$fields[[2]]$value <- 'Paris'

If you want to find the index of the field dynamically by its type, you could do it like this:

for (i in seq_along(pgform$fields)) {
    field <- pgform$fields[[i]]
    # set the value of every unnamed input of type 'text'
    if (is.null(field$name) && identical(field$type, 'text'))
        pgform$fields[[i]]$value <- 'Paris'
}
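
The same lookup can be tried offline on a small inline form. The sketch below is an illustration only, not the TripAdvisor page: the HTML string is a hypothetical stand-in, and it assumes that rvest stores each field as a list with name, type and value entries, as the code above relies on. The name check also accepts an empty string, in case a version of rvest reports unnamed inputs that way.

```r
library(rvest)

# Hypothetical stand-in for the real page: two unnamed inputs and one named field
html <- '<form action="/Search" method="get">
  <input type="search">
  <input type="text">
  <input type="hidden" name="q" value="">
</form>'

form <- html_form(xml2::read_html(html))[[1]]

# Flag every field that has no name and is of type "text"
is_target <- vapply(
  form$fields,
  function(f) (is.null(f$name) || !nzchar(f$name)) && identical(f$type, "text"),
  logical(1)
)
idx <- unname(which(is_target))

# Set the value on each matching field by position
for (i in idx) form$fields[[i]]$value <- "Paris"
```

Here idx contains only position 2, so the unnamed search input at position 1 is left untouched.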

Your specific problem

For your specific website, however, the above will not give you the expected result. The field you need to set in order to submit a query is q, so you would want to do something like this:

session <- html_session('https://www.tripadvisor.com/')
pgform <- html_form(session)[[1]]
pgform <- set_values(pgform, q = 'Paris')
result <- submit_form(session, pgform)
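
Newer rvest releases (1.0.0 and later) renamed these functions: html_session() became session(), set_values() became html_form_set(), and submit_form() became session_submit(). The following is a minimal offline sketch of the renamed setter, assuming rvest >= 1.0.0 is installed and using a hypothetical inline form rather than the live site:

```r
library(rvest)  # assumes rvest >= 1.0.0

# Hypothetical stand-in for the search form, with a hidden field named q
html <- '<form action="/Search" method="get">
  <input type="text">
  <input type="hidden" name="q" value="">
</form>'

form <- html_form(xml2::read_html(html))[[1]]

# html_form_set() is the renamed set_values(); rvest may warn that q is a hidden field
form <- suppressWarnings(html_form_set(form, q = "Paris"))

# Inspect the value that would be sent as the q parameter
form$fields[["q"]]$value
```

Against the live site, the corresponding calls would be session('https://www.tripadvisor.com/') followed by session_submit(session, form). Whether the page still exposes the same form is not something this sketch checks.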

This loads the desired page, but it will not contain the content you are probably looking for, because that content is only loaded dynamically by the browser using an XMLHttpRequest. To get the content as well, you would instead need to do something like this:

session <- html_session('https://www.tripadvisor.com/')
pgform <- html_form(session)[[1]]
pgform <- set_values(pgform, q = 'Paris')
result <- submit_form(session, pgform, submit = NULL, httr::add_headers('x-requested-with' = 'XMLHttpRequest'))

That will give you the content without the surrounding page structure.
