How can I remove tags which have no attributes?


Problem description

I have a large HTML document. It has hundreds of <span>s which
have no attributes so these <span>s are redundant.

How can I remove these tags automatically?

The document also has <span>s with style attributes that I don't
want to remove.

Solution

Oberon <ob****@solstice.com> wrote:

> I have a large HTML document. It has hundreds of <span>s which
> have no attributes so these <span>s are redundant.

Non-semantic elements without attributes may still be used, since they
can be styled by targeting them via CSS selectors.

> How can I remove these tags automatically?
>
> The document also has <span>s with style attributes that I don't
> want to remove.

Do they contain content and/or other tags? If they contain other tags,
could that include other spans?

--
Spartanicus


On Sat, 28 May 2005 18:09:58 GMT, Spartanicus
<in*****@invalid.invalid> wrote:

> Oberon <ob****@solstice.com> wrote:
>
>> I have a large HTML document. It has hundreds of <span>s which
>> have no attributes so these <span>s are redundant.
>
> Non-semantic elements without attributes may still be used, since they
> can be styled by targeting them via CSS selectors.

The <span>s have no styles attached to them. They have no
attributes at all. I usually attach styles to elements such as
<td>, <p> and <body>, etc. and have only a few styles in a
document. I don't need to attach a style to <span> itself
because that would be redundant (I already have my CSS referring
to <td>, <p>, <body>, etc.).

I know that's not what I'm supposed to do but it's what I prefer
to do.

>> How can I remove these tags automatically?
>>
>> The document also has <span>s with style attributes that I don't
>> want to remove.
>
> Do they contain content and/or other tags? If they contain other tags,
> could that include other spans?

All of them have content. I can use Dreamweaver's Clean up HTML
command to remove empty tags.

Some of these <span> tags are nested, but I don't mind losing
some of the formatting (which has been applied to nested <span>
tags with style attributes) if it involves getting rid of the
worthless spans.

Is there any editor that has a command to just remove specific
attribute-less tags?

These huge documents, with at least 40 Kbyte of content, are
often produced by MS Word. I apply Dreamweaver's Clean Up Word
HTML command, then do a fair amount of editing to remove nearly
all embedded styles (replacing them with styles applied to
specific tags). After doing that, 95%+ of the <span> tags have no
attributes and are redundant. The <span> tags I want to keep
will either enclose symbols or special formatting.

I know I should really save the Word file as text, load it into
a blank HTML file and reformat it by hand, but some of that
reformatting can be tricky (replacing symbols with the
appropriate character entity). So it would be nice if there was
an alternative.


Spartanicus wrote:

> Oberon <ob****@solstice.com> wrote:
>
>> I have a large HTML document. It has hundreds of <span>s which
>> have no attributes so these <span>s are redundant.
>
> Non-semantic elements without attributes may still be used

Indeed. But wanting to remove them is not unreasonable (unless
taken to obsession).

> since they can be styled by targeting them via CSS selectors.

But that would be such bad practice as to merit summary removal.

>> How can I remove these tags automatically?
>>
>> The document also has <span>s with style attributes that I don't
>> want to remove.
>
> Do they contain content and/or other tags? If they contain other tags,
> could that include other spans?

If they do (or may do), then you'd need context information to remove
the unwanted ones. So build a DOM and strip them.
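
For the nested case, here is a minimal sketch in Python, which is an assumption since the thread names no language. Instead of building a full DOM it uses the standard library's event-driven `html.parser` and keeps a stack with one flag per open <span>, which is enough context to match each closing tag to its opening tag. The names `BareSpanStripper` and `strip_bare_spans` are made up for illustration. Unusual constructs, such as the conditional sections Word emits, may not round-trip exactly, so treat it as a starting point.

```python
from html.parser import HTMLParser


class BareSpanStripper(HTMLParser):
    """Re-emit HTML, dropping <span> tags that carry no attributes."""

    def __init__(self):
        # convert_charrefs=False so entities such as &amp; are passed through.
        super().__init__(convert_charrefs=False)
        self.out = []
        # One flag per currently open <span>: True = kept, False = dropped.
        self.span_kept = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.span_kept.append(bool(attrs))
            if not attrs:
                return  # drop the bare opening tag
        # Emit the tag exactly as it appeared in the source.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "span" and self.span_kept:
            if not self.span_kept.pop():
                return  # closing tag of a dropped span
        # Note: end tags are re-emitted in lower case.
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_bare_spans(html):
    """Return html with attribute-less <span> tags removed, contents kept."""
    parser = BareSpanStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = '<p><span>a <span style="color:red">b</span> c</span> &amp; <span>d</span></p>'
print(strip_bare_spans(sample))
```

The styled inner span survives even though it sits inside a bare one, because each `</span>` is matched against the flag pushed by its own opening tag.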

If they're never nested, then you can just strip them with a regexp,
or on the fly with a SAX parser such as mod_publisher.
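
A minimal sketch of the regexp route, again assuming Python, with the hypothetical helper name `strip_bare_spans_regex`. It pairs each attribute-less `<span>` with the next `</span>` and keeps only the text between them, so it is only safe when spans are never nested inside a bare span.

```python
import re

# A bare opening tag, the shortest possible content, then a closing tag.
BARE_SPAN = re.compile(r"<span\s*>(.*?)</span\s*>", re.IGNORECASE | re.DOTALL)


def strip_bare_spans_regex(html):
    """Remove attribute-less <span> wrappers; assumes spans are never nested."""
    return BARE_SPAN.sub(r"\1", html)


sample = '<p><span>one</span> and <span style="x">two</span> and <SPAN >three</SPAN></p>'
print(strip_bare_spans_regex(sample))
```

If a styled span sits inside a bare one, the pattern pairs the bare opening tag with the styled span's closing tag and the markup comes out mismatched. That is the situation where the context-aware approach is needed.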

--
Nick Kew

