Jsoup: posting a modified Document

Question

I'm trying to create a web scraper for my upcoming Android app. For that I need to take a simple search form on a website, fill it out, and send my input back to the server.

As described in the Jsoup Cookbook, I fetched the page I need from the server and changed the values.

Now I just need to post my modified document back to the server and scrape the resulting page. As far as I can see in the Jsoup API, the only way to post anything back is the .data() method on the connection returned by Jsoup.connect(), which unfortunately cannot fill out text fields by their id.

Any ideas or workarounds for posting the modified document, or parts of it, back to the website?

Recommended answer

You seem to misunderstand how HTTP works in general. The entire HTML document with modified input values is not sent from the client to the server. Instead, the name=value pairs of the form's input elements are sent as request parameters. The server then returns the desired HTML response.

For example, suppose you want to simulate a submit of the following form in Jsoup (you can find the exact HTML form syntax by opening the page with the form in your browser, right-clicking, and choosing View Source):

<form method="post" action="http://example.com/somescript">
    <input type="text" name="text1" />
    <input type="text" name="text2" />
    <input type="hidden" name="hidden1" value="hidden1value" />
    <input type="submit" name="button1" value="Submit" />
    <input type="submit" name="button2" value="Other button" />
</form>

Then you need to construct the request as follows:

Document document = Jsoup.connect("http://example.com/somescript")
    .data("text1", "yourText1Value") // Fill the first input field.
    .data("text2", "yourText2Value") // Fill the second input field.
    .data("hidden1", "hidden1value") // You need to keep it unmodified!
    .data("button1", "Submit")       // This way the server knows which button was pressed.
    .post();

// ...
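To make the name=value idea concrete, here is a minimal sketch of the request body that a urlencoded form POST carries for the example form. It uses only the Java standard library (Java 10 or later for this URLEncoder overload); the class name FormBodyDemo and the sample values are made up for illustration, and this is not a description of how Jsoup builds the body internally.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class, for illustration only.
final class FormBodyDemo {

    /** Joins name=value pairs into an application/x-www-form-urlencoded body. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in the order they appear in the form.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("text1", "first value");     // sample input, not from the original
        fields.put("text2", "a&b");             // sample input, not from the original
        fields.put("hidden1", "hidden1value");  // hidden field sent back unchanged
        fields.put("button1", "Submit");        // only the pressed button is included

        // prints: text1=first+value&text2=a%26b&hidden1=hidden1value&button1=Submit
        System.out.println(encode(fields));
    }
}
```

The server only ever sees this string of parameters, keyed by each input's name attribute, which is why an element's id plays no role in the submission.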

In some cases you'd also need to send the session cookies back, but that's a separate subject (and a question which has already been asked several times here before). In general, it's easier to use a real HTTP client for this and pass its response through Jsoup#parse().
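As a rough sketch of the "real HTTP client" route, the following uses the JDK's java.net.http.HttpClient (Java 11 or later) with a CookieManager so that session cookies set by the first response are replayed on the POST. Several things here are assumptions: the form page URL http://example.com/form is hypothetical, the class name is invented, and java.net.http is, as far as I know, not part of the Android SDK, so an Android app would need a different client following the same pattern.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch class, for illustration only.
final class SessionPostSketch {

    /** Builds the urlencoded POST for the example form; nothing is sent here. */
    static HttpRequest buildPost(String actionUrl, String text1, String text2) {
        String body = "text1=" + URLEncoder.encode(text1, StandardCharsets.UTF_8)
                + "&text2=" + URLEncoder.encode(text2, StandardCharsets.UTF_8)
                + "&hidden1=hidden1value"
                + "&button1=Submit";
        return HttpRequest.newBuilder(URI.create(actionUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // A client with a CookieManager stores Set-Cookie values and sends them back.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();

        // First request: load the form page so the server can set its session cookie.
        // The URL is an assumption; use the page that actually contains the form.
        HttpRequest formPage = HttpRequest.newBuilder(URI.create("http://example.com/form"))
                .GET()
                .build();
        client.send(formPage, HttpResponse.BodyHandlers.ofString());

        // Second request: submit the form through the same client, cookies included.
        HttpResponse<String> result = client.send(
                buildPost("http://example.com/somescript", "yourText1Value", "yourText2Value"),
                HttpResponse.BodyHandlers.ofString());

        // With Jsoup on the classpath, the returned HTML could then be parsed with:
        // Document document = Jsoup.parse(result.body(), "http://example.com/somescript");
        System.out.println(result.statusCode());
    }
}
```

Keeping both requests on one client instance is the point of the sketch: the cookie store lives in the client, so the POST is recognized as part of the same session.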

  • HTTP tutorial
  • HTTP specification
