Use HtmlUnit to search Google
Question
The following code attempts to search Google and return the results as text or HTML. It was copied almost entirely from code snippets found online, and I see no reason why it should not return search results. How can I get Google search results by submitting the search query through HtmlUnit, without a browser?
import java.io.*;
import java.net.*;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlInput;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

public class GoogleSearch {
    public static void main(String[] args) throws IOException, MalformedURLException
    {
        final WebClient webClient = new WebClient();
        HtmlPage page1 = webClient.getPage("http://www.google.com");
        HtmlInput input1 = page1.getElementByName("q");
        input1.setValueAttribute("yarn");
        HtmlSubmitInput submit1 = page1.getElementByName("btnK");
        page1 = submit1.click();
        System.out.println(page1.asXml());
        webClient.closeAllWindows();
    }
}
Recommended answer
There must be some browser detection that changes the generated HTML: when I inspect the HTML with page1.getWebResponse().getContentAsString(), the submit button is named btnG, not btnK (which is not what I observe in Firefox). Change the name passed to getElementByName from btnK to btnG, and the result will be the expected one.
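To check which submit-button name a page actually serves, the raw HTML can be scanned for submit inputs. The following is only a rough sketch using the Java standard library: the class name SubmitButtonNames and the method find are hypothetical helpers, not part of HtmlUnit, and the regular expressions are a quick diagnostic rather than a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HtmlUnit): lists the name attribute
// of every <input type="submit"> found in a raw HTML string.
class SubmitButtonNames {
    // Matches a whole <input ...> tag.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches type=submit, with or without quotes.
    private static final Pattern TYPE_SUBMIT =
            Pattern.compile("\\btype\\s*=\\s*[\"']?submit[\"']?", Pattern.CASE_INSENSITIVE);
    // Captures the value of the name attribute.
    private static final Pattern NAME_ATTR =
            Pattern.compile("\\bname\\s*=\\s*[\"']?([^\"'\\s>]+)", Pattern.CASE_INSENSITIVE);

    public static List<String> find(String html) {
        List<String> names = new ArrayList<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            String inputTag = tag.group();
            if (!TYPE_SUBMIT.matcher(inputTag).find()) {
                continue; // not a submit button
            }
            Matcher name = NAME_ATTR.matcher(inputTag);
            if (name.find()) {
                names.add(name.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Sample markup made up for illustration, not Google's real page.
        String html = "<form><input name=\"q\" type=\"text\">"
                + "<input name=\"btnG\" type=\"submit\" value=\"Search\"></form>";
        System.out.println(find(html)); // prints [btnG]
    }
}
```

With HtmlUnit, the string passed to find would be the value of page1.getWebResponse().getContentAsString() from the answer. Whichever name it reports is the one to pass to getElementByName.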