Java HTML parser


Question

Hi all, I have this code, which I got help with, that parses an HTML web page and prints out all the <img> tag lines. However, I'm having a slight issue with the code.

import javax.swing.text.html.*;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import java.net.URL;
import java.io.InputStreamReader;
import java.io.Reader;

/**
 *  Extract all "img" tags from an HTML document.
 */
public class HTMLParser
{
  public static void main( String[] argv ) throws Exception
  {
    URL url = new URL( "http://java.sun.com" ); 
    HTMLEditorKit kit = new HTMLEditorKit(); 
    HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument(); 
    doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
    Reader HTMLReader = new InputStreamReader(url.openConnection().getInputStream()); 
    kit.read(HTMLReader, doc, 0); 

    //  Get an iterator for all HTML tags.
    ElementIterator it = new ElementIterator(doc); 
    Element elem; 
    
    while( elem = it.next() != null  )
    { 
      if( elem.getName().equals(  "img") )
      { 
        String s = (String) elem.getAttributes().getAttribute(HTML.Attribute.SRC);
        if( s != null ) 
          System.out.println (s );
      } 
    }
    System.exit(0);
  }
} 



It keeps showing an error on the line

while( elem = it.next() != null  )

saying incompatible types. Can anyone help with this? Thanks in advance.

Recommended answer

I get 2 errors:

Type mismatch: cannot convert from boolean to Element
Type mismatch: cannot convert from Element to boolean



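Both are on that line. So what to do?

The comparison operator != binds more tightly than the assignment operator =, so the condition is read as elem = (it.next() != null). That tries to store a boolean in an Element and then use the Element as the loop condition, which is exactly what the two errors say.

Besides, it's considered bad style to assign an object and compare its value in the same expression.

So let's change it: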

    while (true) // ever lasting...
    {
      elem = it.next();
      if (elem == null) break; //..until the end is near.
      if (elem.getName().equals("img"))
      {
        String s = (String) elem.getAttributes().getAttribute(HTML.Attribute.SRC);
        if (s != null)
          System.out.println(s);
      }
    }
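
If you would rather keep the assignment inside the loop condition, an extra pair of parentheses around the assignment also compiles, because the assignment is then evaluated before the comparison. Below is a minimal sketch of that variant, not the original poster's code: the class name ImgSrcExtractor, the method extractImageSources and the inline sample HTML are made up for illustration, and it reads from a StringReader instead of a URL so it runs without a network connection.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.Element;
import javax.swing.text.ElementIterator;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

// Hypothetical helper class, named here only for the example.
class ImgSrcExtractor
{
  // Collects the src attribute of every <img> element read from the given reader.
  static List<String> extractImageSources( Reader reader ) throws Exception
  {
    HTMLEditorKit kit = new HTMLEditorKit();
    HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
    doc.putProperty( "IgnoreCharsetDirective", Boolean.TRUE );
    kit.read( reader, doc, 0 );

    List<String> sources = new ArrayList<>();
    ElementIterator it = new ElementIterator( doc );
    Element elem;

    // The inner parentheses force the assignment to happen first;
    // the result (an Element) is then compared against null.
    while( (elem = it.next()) != null )
    {
      if( elem.getName().equals( "img" ) )
      {
        Object src = elem.getAttributes().getAttribute( HTML.Attribute.SRC );
        if( src != null )
          sources.add( src.toString() );
      }
    }
    return sources;
  }

  public static void main( String[] argv ) throws Exception
  {
    // Sample markup invented for this example.
    String html = "<html><body><img src=\"a.png\"><p>text</p><img src=\"b.gif\"></body></html>";
    for( String s : extractImageSources( new StringReader( html ) ) )
      System.out.println( s );
  }
}
```

Both forms do the same thing. The while(true) version with an explicit break is arguably easier to read, while the parenthesized version is the more compact idiom.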

