How to search in an HTML file for some tags?


Question

I'm having a small problem in Java. I want to search an HTML file for href and src (strictly speaking these are attributes rather than tags), and then get the URL associated with each of them.

What is the best way to do this?

Thanks for the help. Best regards.

Recommended answer

This is the approach I used to accomplish exactly what you'd like to do, but first let me give you a few tips.

If you're in a Java Swing environment, make sure to use the classes in the javax.swing.text.html and javax.swing.text.html.parser packages. Unfortunately, they're mostly intended for use with a JEditorPane, but I'd still strongly recommend that you take a look at them.

The Java 6 API has a class called HTML.Tag that identifies HTML start and end tags. You can use it to determine where the links that you'd like your program to follow are located: http://java.sun.com/javase/6/docs/api/javax/swing/text/html/HTML.Tag.html

When I wrote a program very similar to this, I used three main methods, which are callbacks defined by HTMLEditorKit.ParserCallback:

public void handleStartTag(HTML.Tag t, MutableAttributeSet atts, int pos)
public void handleEndTag(HTML.Tag t, int pos)
public void handleText(char[] text, int pos)

If you need more help on how to write these methods, you can message me. Basically, you look for a start tag and its matching end tag; from the start tag's attributes you can identify the URL, and then you can proceed to the next step, which is following the URL.
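As a rough illustration of the callback approach described above, here is a minimal sketch that feeds HTML to javax.swing.text.html.parser.ParserDelegator and collects href and src values. The class name LinkExtractor and the method extractUrls are hypothetical names chosen for this example, not the answerer's code. It also overrides handleSimpleTag, which the answer does not mention, on the assumption that empty elements such as img are reported through that callback rather than handleStartTag.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class for illustration only.
final class LinkExtractor {

    /** Returns every href and src attribute value, in document order. */
    static List<String> extractUrls(String html) {
        List<String> urls = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag t, MutableAttributeSet atts, int pos) {
                collect(atts); // e.g. <a href="...">
            }

            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet atts, int pos) {
                collect(atts); // e.g. <img src="..."> (no end tag)
            }

            // Copies the href and src values of one tag, if present.
            private void collect(MutableAttributeSet atts) {
                HTML.Attribute[] wanted = {HTML.Attribute.HREF, HTML.Attribute.SRC};
                for (HTML.Attribute name : wanted) {
                    Object value = atts.getAttribute(name);
                    if (value != null) {
                        urls.add(value.toString());
                    }
                }
            }
        };

        try (Reader reader = new StringReader(html)) {
            // true = ignore charset declarations inside the document
            new ParserDelegator().parse(reader, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }

    public static void main(String[] args) {
        String sample = "<html><body><a href=\"page.html\">link</a>"
                + "<img src=\"pic.png\"></body></html>";
        for (String url : extractUrls(sample)) {
            System.out.println(url);
        }
    }
}
```

To scan a file instead of a string, the StringReader could be swapped for a FileReader or any other Reader; the callback itself stays the same.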

To follow the URL, I advise you to use a JEditorPane object. The javax.swing.event.HyperlinkListener interface defines only one method, hyperlinkUpdate(HyperlinkEvent e). Inside it you can read the URL from the event and call .setPage(e.getURL()) on your JEditorPane object. This updates the pane with the new page and allows you to start the process again.
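The listener described above might look roughly like the following sketch. LinkFollower, install, and shouldFollow are hypothetical names introduced for this example. The sketch assumes you only want to react to ACTIVATED events (clicks) that carry a non-null URL, and it makes the pane non-editable because JEditorPane only generates hyperlink events in that state.

```java
import java.io.IOException;
import javax.swing.JEditorPane;
import javax.swing.event.HyperlinkEvent;
import javax.swing.event.HyperlinkListener;

// Hypothetical listener class for illustration only.
final class LinkFollower implements HyperlinkListener {
    private final JEditorPane pane;

    LinkFollower(JEditorPane pane) {
        this.pane = pane;
    }

    /** Registers a follower on the pane and returns it. */
    static LinkFollower install(JEditorPane pane) {
        pane.setEditable(false); // hyperlink events are only fired for non-editable panes
        LinkFollower follower = new LinkFollower(pane);
        pane.addHyperlinkListener(follower);
        return follower;
    }

    /** True only for a click on a link whose URL could be resolved. */
    static boolean shouldFollow(HyperlinkEvent e) {
        return e.getEventType() == HyperlinkEvent.EventType.ACTIVATED
                && e.getURL() != null;
    }

    @Override
    public void hyperlinkUpdate(HyperlinkEvent e) {
        if (!shouldFollow(e)) {
            return; // ignore ENTERED/EXITED events and unresolved links
        }
        try {
            pane.setPage(e.getURL()); // load the linked page into the pane
        } catch (IOException ex) {
            System.err.println("Could not load " + e.getURL() + ": " + ex.getMessage());
        }
    }
}
```

A caller would typically create the pane, call LinkFollower.install(pane), and then load the first page with pane.setPage(...).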

Message me if you have any problems, and please vote for this answer!
