Selenium - driver.getPageSource() differs from the source viewed in the browser

Views: 742 | Published: 2020/11/8 3:58:47 | Tags: java, firefox, selenium, webdriver, selenium-webdriver


Problem description

I am trying to use Selenium to capture the source code of a given URL into an HTML file, but for some reason I am not getting exactly the same source code that the browser shows.

Below is my Java code for capturing the source into an HTML file:

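import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

private static void getHTMLSourceFromURL(String url, String fileName) {

    WebDriver driver = new FirefoxDriver();
    try {
        driver.get(url);
        Thread.sleep(5000);   // crude fixed wait to give the page time to finish loading

        // split the page source into lines so it can be written line by line
        List<String> pageSource = new ArrayList<String>(Arrays.asList(driver.getPageSource().split("\n")));

        // write to the file named by the fileName parameter
        writeTextToFile(pageSource, fileName);

    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();   // restore the interrupt flag
        e.printStackTrace();
    } finally {
        // always close the browser, even if something above throws
        System.out.println("quitting webdriver");
        driver.quit();
    }
}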

/**
 * Creates the file fileName under "./HTML Sources" and appends the content to it.
 *
 * @param content  lines to write
 * @param fileName name of the output file
 */
private static void writeTextToFile(List<String> content, String fileName) {
    String outputFolder = ".";
    File dir = new File(outputFolder + '/' + "HTML Sources");
    // create the output directory if it does not exist yet
    if (!dir.exists() && !dir.mkdirs()) {
        System.err.println(dir + " could not be created");
        return;
    }

    File output = new File(dir, fileName);
    // FileWriter creates the file if it is missing; 'true' appends to existing content.
    // try-with-resources closes the writer even if writing fails.
    try (PrintWriter pw = new PrintWriter(new FileWriter(output, true))) {
        for (String line : content) {
            pw.print(line);
            pw.print("\n");
        }
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
}
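As a side note on the file-writing step, the sketch below shows one simpler way to save a page source string with java.nio. It is only an illustration: the class name PageSourceFiles and the method save are hypothetical and not part of Selenium or of the code above. It writes the string in one call, as UTF-8, and replaces an existing file instead of appending to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of Selenium.
final class PageSourceFiles {
    private PageSourceFiles() {}

    /** Writes pageSource to dir/fileName as UTF-8, replacing any existing file. */
    static Path save(String pageSource, Path dir, String fileName) {
        try {
            Files.createDirectories(dir);   // does nothing if the directory already exists
            Path target = dir.resolve(fileName);
            // Default options are CREATE, TRUNCATE_EXISTING, WRITE, so old content is replaced.
            Files.write(target, pageSource.getBytes(StandardCharsets.UTF_8));
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call could look like PageSourceFiles.save(driver.getPageSource(), Paths.get(".", "HTML Sources"), fileName). Overwriting is a deliberate choice here: with append mode, running the capture twice against the same file name leaves two copies of the page in one file, which by itself makes the saved file differ from what the browser shows.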

Can someone shed some light on why this happens? How does WebDriver render the page, and how does the browser show the source?
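To narrow down where the two versions diverge, it can help to compare them line by line. The sketch below is a hypothetical helper (SourceDiff and firstDifferingLine are illustrative names, not Selenium API) that reports the first line at which two source strings differ. It does not explain the difference by itself. As far as I know, the WebDriver Javadoc describes the getPageSource() result as a representation of the underlying DOM that is not guaranteed to be formatted or escaped like the server response, so some differences are to be expected.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Selenium.
final class SourceDiff {
    private SourceDiff() {}

    /**
     * Returns the 1-based number of the first line that differs between a and b,
     * or -1 if both have the same lines. Line endings (\n, \r\n, ...) are ignored.
     */
    static int firstDifferingLine(String a, String b) {
        // "\\R" matches any line break; limit -1 keeps trailing empty lines.
        List<String> left = Arrays.asList(a.split("\\R", -1));
        List<String> right = Arrays.asList(b.split("\\R", -1));
        int common = Math.min(left.size(), right.size());
        for (int i = 0; i < common; i++) {
            if (!left.get(i).equals(right.get(i))) {
                return i + 1;
            }
        }
        // Same prefix: either identical, or one side has extra lines.
        return left.size() == right.size() ? -1 : common + 1;
    }
}
```

One way to use it would be to pass the string from driver.getPageSource() as the first argument and the text copied from the browser's View Source window as the second, then inspect both files around the reported line.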
