How to ignore unclosed tags in XML or HTML?


Question

I'm writing a parser in Haskell for a website, using the packages Text.XML and Text.XML.Cursor.

The page contains unclosed tags, and I get this error:

Main.hs: Error parsing XML file dat.html: 29:1-29:8: Expected end element for: Name {nameLocalName = "br", nameNamespace = Nothing, namePrefix = Nothing}, but received: EventEndElement (Name {nameLocalName = "body", nameNamespace = Nothing, namePrefix = Nothing})

What should I do? How can I ignore such tags?

Solution

A text object with unclosed tags is not well-formed and is therefore not XML.

So, forget about using any XML libraries, parsers, or tools. A conforming XML parser must reject input that is not well-formed, so they are, by definition and design, not able to help you.

You have two options. Either,

  1. Repair the textual object to be well-formed by closing the unclosed tags. You might do this manually or try using TIDY, or
  2. Define a new data format that allows unclosed tags, and write a parser from the ground up for it.
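
As a rough illustration of option 1, here is a minimal sketch that rewrites tags such as <br> into the self-closing form <br/> before the text reaches an XML parser. It uses only base. The names voidElements, fixTag, and selfCloseVoids are hypothetical helpers made up for this example, and the list of element names is an assumption about which tags are left unclosed in the input. It is a crude text-level pass: it does not handle a '>' inside attribute values, nor other kinds of unclosed tags, so a dedicated tool such as TIDY remains the more robust choice.

```haskell
module Main where

import Data.Char (isAlpha, isAlphaNum, toLower)

-- Assumption: the element names that appear without a closing tag.
-- Adjust this list to match the actual input.
voidElements :: [String]
voidElements = ["br", "hr", "img", "input", "meta", "link"]

-- Given the text between '<' and '>', append '/' when the tag names a
-- void element and is not already self-closing.
fixTag :: String -> String
fixTag body
  | isVoid && not alreadyClosed = body ++ "/"
  | otherwise                   = body
  where
    name          = map toLower (takeWhile isAlphaNum body)
    isVoid        = name `elem` voidElements
    alreadyClosed = not (null body) && last body == '/'

-- Walk the input and rewrite every opening tag of a void element,
-- e.g. <br> becomes <br/>. Everything else is copied unchanged.
selfCloseVoids :: String -> String
selfCloseVoids [] = []
selfCloseVoids ('<' : rest)
  | (c : _) <- rest, isAlpha c
  , (body, '>' : after) <- break (== '>') rest
  = '<' : fixTag body ++ ">" ++ selfCloseVoids after
selfCloseVoids (c : rest) = c : selfCloseVoids rest

main :: IO ()
main = putStrLn (selfCloseVoids "<body>line one<br>line two<br/></body>")
-- prints: <body>line one<br/>line two<br/></body>
```

The repaired string is well-formed with respect to the listed elements, so it could then be handed to the XML parser. Closing tags and tags that are already self-closing pass through untouched.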
