Remove HTML Tags from an NSString on the iPhone


Question

There are a couple of different ways to remove HTML tags from an NSString in Cocoa.

One way is to render the string into an NSAttributedString and then grab the rendered text.

Another way is to use NSXMLDocument's -objectByApplyingXSLTString:arguments:error: method to apply an XSLT transform that strips the tags.

Unfortunately, the iPhone doesn't support NSAttributedString or NSXMLDocument. There are too many edge cases and malformed HTML documents for me to feel comfortable using regex or NSScanner. Does anyone have a solution to this?

One suggestion has been to simply look for opening and closing tag characters, but this method won't work except in very trivial cases.

For example, these cases (from the Perl Cookbook chapter on the same subject) would break this method:

<IMG SRC = "foo.gif" ALT = "A > B">

<!-- <A comment> -->

<script>if (a<b && a>c)</script>

<![INCLUDE CDATA [ >>>>>>>>>>>> ]]>
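To make the failure concrete, here is a minimal sketch of the "look for opening and closing tag characters" idea using NSScanner. The helper name NaiveStripTags is hypothetical and is not taken from any suggestion in this thread. It only illustrates the naive approach and is not a recommended implementation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: drops everything from each '<' through the next '>'.
static NSString *NaiveStripTags(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil;  // keep whitespace in the output
    NSMutableString *result = [NSMutableString string];
    NSString *chunk = nil;

    while (!scanner.isAtEnd) {
        // Copy any text that precedes the next '<'.
        if ([scanner scanUpToString:@"<" intoString:&chunk]) {
            [result appendString:chunk];
        }
        // Skip the presumed tag: everything through the next '>'.
        if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        }
    }
    return result;
}
```

On well-behaved input such as <p>Hello <b>world</b></p> this yields "Hello world". On the IMG example above it stops at the '>' inside the ALT attribute and leaves the fragment  B"> behind, and on the script example it returns "if (ac)", silently eating part of the expression.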

Recommended answer

A quick and "dirty" solution (it removes everything between < and >) that works on iOS 3.2 and later:

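// Pre-ARC code: under ARC, drop the autorelease call and use [self copy].
-(NSString *) stringByStrippingHTML {
  NSRange r;
  NSString *s = [[self copy] autorelease];
  // Repeatedly find the first "<...>" run and delete it until none remain.
  while ((r = [s rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
    s = [s stringByReplacingCharactersInRange:r withString:@""];
  return s;
}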

I have this declared as a category on NSString.
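For readers building with ARC, the following sketch shows one way the category could be laid out. The category name HTMLStripSketch and the method name sketch_stringByStrippingHTML are hypothetical, since the answer names neither. It uses the same pattern, but applies it in a single call to stringByReplacingOccurrencesOfString:withString:options:range: rather than in a loop, which is assumed to give the same result on typical input.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category and method names, chosen for illustration only.
@interface NSString (HTMLStripSketch)
- (NSString *)sketch_stringByStrippingHTML;
@end

@implementation NSString (HTMLStripSketch)

- (NSString *)sketch_stringByStrippingHTML {
    // Same "<[^>]+>" pattern as the answer, applied across the whole string.
    // No copy/autorelease is needed; this compiles under ARC.
    return [self stringByReplacingOccurrencesOfString:@"<[^>]+>"
                                           withString:@""
                                              options:NSRegularExpressionSearch
                                                range:NSMakeRange(0, self.length)];
}

@end
```

Calling [@"<p>Hello <b>world</b></p>" sketch_stringByStrippingHTML] returns "Hello world". Because the pattern is unchanged, this variant is still "quick and dirty": it inherits every limitation raised in the question.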
