Best way to strip HTML tags from a string in SQL Server?


Question

I've got data in SQL Server 2005 that contains HTML tags and I'd like to strip all of them out, leaving just the text between the tags. Ideally it would also replace entities like &lt; with <, etc.

Is there an easy way to do this, or has someone already got some sample T-SQL code?

I don't have the ability to add extended stored procs and the like, so I would prefer a pure T-SQL approach (preferably one backwards compatible with SQL Server 2000).

I just want to retrieve the data with the HTML stripped out, not update it, so ideally it would be written as a user-defined function, to make for easy reuse.

So for example converting this:

<B>Some useful text</B>&nbsp;
<A onclick="return openInfo(this)"
   href="http://there.com/3ce984e88d0531bac5349"
   target=globalhelp>
   <IMG title="Source Description" height=15 alt="Source Description" 
        src="/ri/new_info.gif" width=15 align=top border=0>
</A>&gt;&nbsp;<b>more text</b></TD></TR>

to this:

Some useful text > more text

Solution

There is a UDF that does this, described here:

User Defined Function to Strip HTML

CREATE FUNCTION [dbo].[udf_StripHTML] (@HTMLText VARCHAR(MAX))
RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @Start INT
    DECLARE @End INT
    DECLARE @Length INT
    -- Locate the first '<' and the first '>' at or after it
    SET @Start = CHARINDEX('<',@HTMLText)
    SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
    SET @Length = (@End - @Start) + 1
    -- The loop ends when there is no '<', or no '>' after it
    WHILE @Start > 0 AND @End > 0 AND @Length > 0
    BEGIN
        -- Delete the tag, then look for the next one
        SET @HTMLText = STUFF(@HTMLText,@Start,@Length,'')
        SET @Start = CHARINDEX('<',@HTMLText)
        SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
        SET @Length = (@End - @Start) + 1
    END
    -- Trim leading and trailing spaces from what is left
    RETURN LTRIM(RTRIM(@HTMLText))
END
GO

Edit: note this is for SQL Server 2005, but if you change the keyword MAX to something like 4000, it will work in SQL Server 2000 as well.
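
Note that the UDF only removes tags; entities such as &nbsp; and &gt; are left in the text, and the question asked about those too. Here is a minimal sketch of one way to handle them, assuming only a handful of common entities matter. The helper name udf_DecodeBasicEntities is hypothetical and not part of the original answer:

```sql
-- Hypothetical helper (not from the original answer): decodes a few
-- common HTML entities with plain REPLACE calls.
CREATE FUNCTION [dbo].[udf_DecodeBasicEntities] (@Text VARCHAR(MAX))
RETURNS VARCHAR(MAX) AS
BEGIN
    SET @Text = REPLACE(@Text, '&nbsp;', ' ')
    SET @Text = REPLACE(@Text, '&lt;', '<')
    SET @Text = REPLACE(@Text, '&gt;', '>')
    SET @Text = REPLACE(@Text, '&quot;', '"')
    -- Decode &amp; last so that '&amp;lt;' becomes '&lt;' rather than '<'
    SET @Text = REPLACE(@Text, '&amp;', '&')
    RETURN @Text
END
GO

-- Strip the tags first, then decode what is left
SELECT dbo.udf_DecodeBasicEntities(
           dbo.udf_StripHTML('<b>Some useful text</b>&nbsp;&gt;&nbsp;<b>more text</b>'))
-- Some useful text > more text
```

Stripping before decoding matters: if the entities were decoded first, a literal &lt; in the text would turn into < and could then be mistaken for the start of a tag. Numeric entities and the rest of the named ones are not covered by this sketch.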
