SGML parser in Java?

Question

I'm looking for a parser in Java that can parse a document formatted in SGML.

For duplicate monitors: I'm aware of two other threads that discuss this topic:
Parsing Java String with SGML
Java SGML to XML conversion?
Neither has a resolution, hence the new topic.

For people who confuse XML with SGML: please read http://www.w3.org/TR/NOTE-sgml-xml-971215#null (in short, there are enough subtle differences to make an XML parser unusable for SGML, at least in its vanilla form).

For people who are fond of asking posters to Google it: I already did, and the closest I could come up with was the widely popular SAXParser: http://download.oracle.com/javase/1.4.2/docs/api/javax/xml/parsers/SAXParser.html
But that, of course, is meant to be an XML parser. I'm looking around to see if anyone has implemented a modification of the SAX parser to accommodate SGML.

Lastly, I cannot use SX, as I'm looking for a Java solution.

Thanks! :)

Recommended answer

I have a few approaches to this problem.

The first is what you did: check whether the SGML document is close enough to XML for the standard SAX parser to work.
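A quick way to run that check is to feed the text to the stock JAXP SAX parser and see whether it gets through without a fatal error. This is only a minimal sketch: the class and method names (`SgmlAsXmlProbe`, `parsesAsXml`) are made up for illustration, and it only tests well-formedness, not whether the result means what the SGML DTD intended.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
final class SgmlAsXmlProbe {

    /** Returns true if the standard SAX parser accepts the text as well-formed XML. */
    static boolean parsesAsXml(String text) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();
            // DefaultHandler ignores all content events and rethrows fatal errors.
            parser.parse(new InputSource(new StringReader(text)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Typical SGML-only features end up here: omitted end tags,
            // unquoted attribute values, and so on.
            return false;
        } catch (Exception e) {
            // ParserConfigurationException or IOException: an environment problem,
            // not a statement about the document.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsesAsXml("<doc><p>one</p></doc>"));
        System.out.println(parsesAsXml("<doc><p>one<p>two</doc>"));
    }
}
```

If the probe returns false for your real documents, the stock parser is out and one of the other approaches is needed.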

The second is to do the same with HTML parsers. The trick here is to find one that doesn't ignore non-HTML elements.

I did find some Java SGML parsers, mostly from academia, when searching for "sgml parser Java". I do not know how well they work.

The last approach is to take a standard (non-Java) SGML parser and transform the documents into something you can read in Java.
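As a rough sketch of that last idea, the Java side can launch the external converter as a child process and read its standard output as XML. Everything named here is an assumption for illustration: the class `ExternalSgmlConverter` is hypothetical, the converter command is supplied by the caller, and the code assumes the tool writes well-formed XML to stdout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical wrapper, not part of any library.
final class ExternalSgmlConverter {

    /** Parses XML (for example, a converter's output) into a DOM tree. */
    static Document parseXml(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder().parse(xml);
    }

    /**
     * Runs the given command and parses whatever it writes to stdout.
     * The command is an assumption: any SGML-to-XML tool that prints XML would do.
     */
    static Document convert(List<String> command) throws Exception {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT) // show the tool's diagnostics
                .start();
        Document document;
        try (InputStream out = process.getInputStream()) {
            document = parseXml(out);
        }
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("converter exited with status " + exit);
        }
        return document;
    }
}
```

A call might look like `ExternalSgmlConverter.convert(List.of("osx", "input.sgml"))`, assuming OpenSP's `osx` converter is installed; that is the same family of tool as SX, so it only helps if a non-Java step in the pipeline is acceptable. The exit-status check may be stricter than you want if the converter reports recoverable errors with a non-zero status.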

It looks like you were able to work with the first approach.
