Page content is loaded with JavaScript and Jsoup doesn't see it

Question

One block on the page is filled with content by JavaScript, and after loading the page with Jsoup none of that information is present. Is there a way to also get the JavaScript-generated content when parsing the page with Jsoup?

UPD, specifically for Marcin:

I can't paste the page code here because it is too long: http://pastebin.com/qw4Rfqgw

Here's the element whose content I need: <div id='tags_list'></div>

I need to get this information in Java, preferably using Jsoup. The element is filled with the help of JavaScript:

<div id="tags_list">
    <a href="/tagsc0t20099.html" style="font-size:14;">разведчик</a>
    <a href="/tagsc0t1879.html" style="font-size:14;">Sr</a>
    <a href="/tagsc0t3140.html" style="font-size:14;">стратегический</a>
</div>

Java code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;

public class Test
{
    public static void main( String[] args )
    {
        try
        {
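            // Fetch the page and parse the HTML returned by the server.
            Document doc = Jsoup.connect( "http://www.bestreferat.ru/referat-32558.html" ).get();
            Elements tags = doc.select( "#tags_list a" );

            // Print the text of every link found inside #tags_list.
            for ( Element tag : tags )
            {
                System.out.println( tag.text() );
            }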
        }
        catch ( IOException e )
        {
            e.printStackTrace();
        }
    }
}


Recommended answer

Jsoup is an HTML parser, not some kind of embedded browser engine. This means that it is completely unaware of any content that is added to the DOM by JavaScript after the initial page load.
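
The behaviour described above can be reproduced without touching the network. The following is a minimal sketch, assuming Jsoup is on the classpath; the HTML string and the names `StaticParseDemo` and `countTagLinks` are made up for illustration, and the inline script merely stands in for whatever the real page runs.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class StaticParseDemo
{
    // Hypothetical page: the div is empty in the markup; a browser would run the script and fill it.
    public static final String HTML =
        "<html><body>"
        + "<div id='tags_list'></div>"
        + "<script>"
        + "document.getElementById('tags_list').innerHTML = \"<a href='/tagsc0t1879.html'>Sr</a>\";"
        + "</script>"
        + "</body></html>";

    // Counts the links Jsoup finds inside #tags_list in the given markup.
    public static int countTagLinks( String html )
    {
        Document doc = Jsoup.parse( html );
        return doc.select( "#tags_list a" ).size();
    }

    public static void main( String[] args )
    {
        // Jsoup keeps the script body as plain data and never executes it, so the div stays empty.
        System.out.println( countTagLinks( HTML ) ); // prints 0
    }
}
```

The same selector does match once the links are present in the markup itself, which is why the rendered DOM, rather than the raw server response, is what has to be handed to Jsoup.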

To get access to that type of content you will need an embedded browser component. There are a number of discussions on SO regarding that kind of component, e.g. "Is there a way to embed a browser in Java?"
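
One possible way to follow this advice is a headless browser library such as HtmlUnit. The answer does not name a specific library, so HtmlUnit is only an example here. The sketch below assumes HtmlUnit 3.x (package `org.htmlunit`; 2.x releases use `com.gargoylesoftware.htmlunit`) and Jsoup are on the classpath, and that a fixed 10-second wait is long enough for the page's scripts to finish. The class and method names are illustrative, and whether HtmlUnit's JavaScript engine copes with this particular site is not something the source confirms.

```java
import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlPage;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RenderedPageDemo
{
    // Loads the page in a headless browser, lets its scripts run, and returns the resulting DOM as markup.
    public static String fetchRenderedHtml( String url ) throws IOException
    {
        try ( WebClient webClient = new WebClient() )
        {
            // Assumption: styling is irrelevant here, and script errors on the page should not abort the load.
            webClient.getOptions().setCssEnabled( false );
            webClient.getOptions().setThrowExceptionOnScriptError( false );

            HtmlPage page = webClient.getPage( url );
            // Give background JavaScript up to 10 seconds to finish (an arbitrary choice).
            webClient.waitForBackgroundJavaScript( 10_000 );
            return page.asXml();
        }
    }

    // Parses the markup with Jsoup and collects the text of every link inside #tags_list.
    public static List<String> extractTagTexts( String html )
    {
        Document doc = Jsoup.parse( html );
        List<String> texts = new ArrayList<>();
        for ( Element tag : doc.select( "#tags_list a" ) )
        {
            texts.add( tag.text() );
        }
        return texts;
    }

    public static void main( String[] args ) throws IOException
    {
        String html = fetchRenderedHtml( "http://www.bestreferat.ru/referat-32558.html" );
        for ( String text : extractTagTexts( html ) )
        {
            System.out.println( text );
        }
    }
}
```

Keeping the rendering step separate from the Jsoup step means the selector logic from the question stays unchanged; only the source of the HTML differs.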
