Java HTMLUnit getByFirstXPath not working

Question

I am trying to load a page and click a link using HTMLUnit 2.4 in Java. I am attempting to find the calendar on the page by XPath, but the lookup returns null. The XPath was copied directly out of Chrome DevTools. In the Chrome DevTools Network tab I can see that the very first resource loaded for the page already contains the data I want, so I don't think this is AJAX related.

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlDivision;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

public class App {
    public static void main( String[] args ) throws IOException {
        final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3);
        webClient.setJavaScriptEnabled(false);
        HtmlPage homePage = webClient.getPage("http://bay.realtaxdeed.com");
        webClient.setJavaScriptEnabled(true);
        HtmlDivision calendarButtonDiv = homePage.getFirstByXPath("//*[@id=\"splashMenuBottom\"]");
        HtmlPage currentMonthPage = calendarButtonDiv.click();

        HtmlElement element = currentMonthPage.getElementById("MAIN_TBL_CONTENT");
        HtmlDivision calendarDivision = currentMonthPage.getFirstByXPath("//*[@id=\"MAIN_TBL_CONTENT\"]/div[2]/div/div[11]");

        System.out.println( "Run complete." );
    }
}

I had to disable JavaScript to avoid getting an error on the landing page. I re-enable it afterwards to try to ensure the WebClient behaves like a normal browser. I successfully get calendarButtonDiv by XPath, and I click it to get the page with the calendar. My goal is to click the days that have hyperlinks in them.

Why can't I find the calendar (calendarDivision) by XPath? What is the proper way to find elements using HTMLUnit?

Recommended answer

The HtmlUnit version you are using is now 9 years old; please try this with the latest version. Problems of this kind are usually caused by differences between DOM trees, and many changes and fixes have been made since then, both in real browsers and in HtmlUnit.

The next step is to dump the page from HtmlUnit (page.asXml()) and check whether your XPath actually matches the dumped tree.
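Once the page has been dumped, the XPath can be checked against the dump offline. The following is only a minimal sketch of that idea, not HtmlUnit code: it uses the JDK's javax.xml.xpath API, and the markup in DUMP as well as the CALBOX class name are hypothetical stand-ins for whatever the real dump contains. It illustrates how a positional path copied from a browser can evaluate to nothing when the tree has a different shape, while a path anchored on an attribute still matches.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XPathCheck {
    // Hypothetical dump: stands in for the text produced by dumping the page.
    // The real page will look different; only the idea carries over.
    static final String DUMP =
          "<html><body><div id=\"MAIN_TBL_CONTENT\">"
        + "<div>header</div>"
        + "<div><div><div class=\"CALBOX\">day 1</div></div></div>"
        + "</div></body></html>";

    // Evaluates an XPath expression against DUMP and returns the first
    // matching node, or null when nothing matches.
    static Node find(String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(DUMP));
        return (Node) xpath.evaluate(expression, source, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // Positional path in the style of one copied from browser dev tools.
        String copied = "//*[@id='MAIN_TBL_CONTENT']/div[2]/div/div[11]";
        // Path anchored on an attribute instead of on element positions.
        String anchored = "//*[@id='MAIN_TBL_CONTENT']//div[@class='CALBOX']";

        System.out.println("copied path matched:   " + (find(copied) != null));
        System.out.println("anchored path matched: " + (find(anchored) != null));
    }
}
```

In this made-up tree the positional path finds nothing because there is no eleventh inner div, which is the same symptom as the null result in the question. Whether that is the actual cause on the real page can only be confirmed by inspecting the real dump.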

Finally, if you think the DOM tree generated by the latest version of HtmlUnit differs from the one built by real browsers, please open an issue and provide a simple HTML file that shows the problem. We can usually fix this kind of problem quickly.
