Use HtmlUnit to search Google

Question

The following code attempts to search Google and return the results as text or HTML. It was copied almost entirely from code snippets online, and I see no reason why it should not return results from the search. How do you use HtmlUnit to submit a search query and get back the Google search results, without a browser?

    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlInput;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;
    import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

    import java.io.IOException;
    import java.net.MalformedURLException;

    public class GoogleSearch {

        public static void main(String[] args) throws IOException, MalformedURLException {
            final WebClient webClient = new WebClient();

            // Load the Google home page and fill in the search box.
            HtmlPage page1 = webClient.getPage("http://www.google.com");
            HtmlInput input1 = page1.getElementByName("q");
            input1.setValueAttribute("yarn");

            // Look up the submit button by name and click it.
            HtmlSubmitInput submit1 = page1.getElementByName("btnK");
            page1 = submit1.click();

            // Print the resulting page as XML.
            System.out.println(page1.asXml());

            webClient.closeAllWindows();
        }
    }

Recommended answer

There must be some browser detection that changes the generated HTML: when I inspect the raw response with page1.getWebResponse().getContentAsString(), the submit button is named btnG, not btnK (btnK is what I observe in Firefox). Change the name passed to page1.getElementByName from "btnK" to "btnG", and the search returns the expected result.
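Since the button name depends on the markup the server actually sends, it can help to list the submit button names found in the raw HTML before hard-coding one. The following is only a minimal sketch of that inspection step: the class SubmitNameFinder and the method findSubmitNames are hypothetical names, not part of HtmlUnit, and the regex-based scan is an assumption that works for simple markup but is not a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of HtmlUnit): lists the name attributes of
// <input type="submit"> tags found in a raw HTML string.
final class SubmitNameFinder {
    // Matches one whole <input ...> tag, case-insensitively.
    private static final Pattern INPUT_TAG =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Matches type=submit with or without quotes.
    private static final Pattern TYPE_SUBMIT =
            Pattern.compile("\\btype\\s*=\\s*[\"']?submit\\b", Pattern.CASE_INSENSITIVE);
    // Captures the value of the name attribute, with or without quotes.
    private static final Pattern NAME_ATTR =
            Pattern.compile("\\bname\\s*=\\s*[\"']?([^\"'\\s>]+)", Pattern.CASE_INSENSITIVE);

    static List<String> findSubmitNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher tag = INPUT_TAG.matcher(html);
        while (tag.find()) {
            String attrs = tag.group();
            if (!TYPE_SUBMIT.matcher(attrs).find()) {
                continue; // not a submit button
            }
            Matcher name = NAME_ATTR.matcher(attrs);
            if (name.find()) {
                names.add(name.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Sample markup made up for illustration; real responses will differ.
        String sample = "<form action=\"/search\">"
                + "<input name=\"q\" type=\"text\">"
                + "<input name=\"btnG\" type=\"submit\" value=\"Google Search\">"
                + "</form>";
        System.out.println(findSubmitNames(sample)); // prints [btnG]
    }
}
```

In the program from the question, the string to scan would be page1.getWebResponse().getContentAsString(), and one of the reported names would then be passed to page1.getElementByName. The names the server returns may differ from both btnK and btnG, so checking the actual response is safer than assuming either.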
