How to validate that a string doesn't contain HTML using C#

Question

Does anyone have a simple, efficient way of checking that a string doesn't contain HTML? Basically, I want to check that certain fields contain only plain text. I thought about looking for the < character, but that can legitimately appear in plain text. Another way might be to create a new System.Xml.Linq.XElement using:

XElement.Parse("<wrapper>" + MyString + "</wrapper>")

and check that the XElement contains no child elements, but this seems a little heavyweight for what I need.

Recommended answer

I just tried my XElement.Parse solution. I created an extension method on the string class so I can reuse the code easily:

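// Requires: using System.Linq; using System.Xml; using System.Xml.Linq;
// Like any extension method, this must be declared inside a static class.
public static bool ContainsXHTML(this string input)
{
    try
    {
        // Wrap the input so it parses as a single well-formed element.
        XElement x = XElement.Parse("<wrapper>" + input + "</wrapper>");

        // Plain text yields no nodes (empty input) or only text nodes.
        return x.DescendantNodes().Any(n => n.NodeType != XmlNodeType.Text);
    }
    catch (XmlException)
    {
        // Input that is not well-formed XML is treated as containing markup.
        return true;
    }
}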

One problem I found is that a bare ampersand or less-than character in plain text causes an XmlException, so the method wrongly reports that the field contains HTML. To fix this, the ampersands and less-than characters in the input string first need to be converted to their equivalent XHTML entities. I wrote another extension method to do that:

// Requires: using System.Text.RegularExpressions;
public static string ConvertXHTMLEntities(this string input)
{
    // Escape every ampersand that is not already the start of "&amp;".
    // The negative lookahead avoids a placeholder-token round trip, which
    // would corrupt input that happens to contain the placeholder text.
    string output = Regex.Replace(input, "&(?!amp;)", "&amp;");

    // Convert less than to the less than entity (without messing up tags).
    // Only "<" followed by a space is escaped; any other "<" is left as-is.
    output = output.Replace("< ", "&lt; ");
    return output;
}

Now I can take a user-submitted string and check that it doesn't contain HTML using the following code:

bool ContainsHTML = UserEnteredString.ConvertXHTMLEntities().ContainsXHTML();

I'm not sure if this is bulletproof, but I think it's good enough for my situation.
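
To make the behaviour of the combined check easier to judge, here is a minimal, self-contained sketch that runs it against a few sample strings. It restates the two extension methods in condensed form; the class names StringExtensions and Program and the sample inputs are assumptions made for illustration, not part of the original answer. The last sample shows one case where the check is not bulletproof: a less-than sign with no space after it is still reported as markup.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml;
using System.Xml.Linq;

// Hypothetical container class; extension methods must live in a static class.
public static class StringExtensions
{
    public static bool ContainsXHTML(this string input)
    {
        try
        {
            XElement x = XElement.Parse("<wrapper>" + input + "</wrapper>");
            // Any node other than plain text counts as markup.
            return x.DescendantNodes().Any(n => n.NodeType != XmlNodeType.Text);
        }
        catch (XmlException)
        {
            return true; // not well-formed XML, so treated as markup
        }
    }

    public static string ConvertXHTMLEntities(this string input)
    {
        // Escape bare ampersands, then "< " (less-than followed by a space).
        string output = Regex.Replace(input, "&(?!amp;)", "&amp;");
        return output.Replace("< ", "&lt; ");
    }
}

public static class Program
{
    public static void Main()
    {
        string[] samples = { "Fish & chips", "1 < 2", "<b>bold</b>", "a<b" };
        foreach (string s in samples)
        {
            Console.WriteLine($"{s} -> {s.ConvertXHTMLEntities().ContainsXHTML()}");
        }
        // Fish & chips -> False
        // 1 < 2 -> False
        // <b>bold</b> -> True
        // a<b -> True   (false positive: "<" is not followed by a space)
    }
}
```

If false positives such as the last one matter, the escaping step would need to be stricter about which less-than signs it treats as the start of a tag.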
