Remove HTML Tags from an NSString on the iPhone


Question

There are a couple of different ways to remove HTML tags from an NSString in Cocoa.

One way is to render the string into an NSAttributedString and then grab the rendered text.

Another way is to use NSXMLDocument's -objectByApplyingXSLTString method to apply an XSLT transform that does it.

Unfortunately, the iPhone doesn't support NSAttributedString or NSXMLDocument. There are too many edge cases and malformed HTML documents for me to feel comfortable using regex or NSScanner. Does anyone have a solution to this?

One suggestion has been to simply look for opening and closing tag characters, but this method won't work except in very trivial cases.

For example, these cases (from the Perl Cookbook chapter on the same subject) would break this method:

<IMG SRC = "foo.gif" ALT = "A > B">

<!-- <A comment> -->

<script>if (a<b && a>c)</script>

<![INCLUDE CDATA [ >>>>>>>>>>>> ]]>
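
To make the failure concrete, here is a minimal sketch that runs a naive "everything between < and >" pattern over three of these inputs. The helper name NaiveStripTags and the pattern <[^>]+> are assumptions chosen for illustration; they are not part of the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (illustration only): delete every run that matches
// the naive pattern <[^>]+>, i.e. a '<', one or more non-'>' characters, a '>'.
static NSString *NaiveStripTags(NSString *html) {
    return [html stringByReplacingOccurrencesOfString:@"<[^>]+>"
                                           withString:@""
                                              options:NSRegularExpressionSearch
                                                range:NSMakeRange(0, html.length)];
}

// Prints what the naive pattern leaves behind for three of the cases above.
static void DemonstrateNaiveStripFailures(void) {
    // The '>' inside the quoted ALT value ends the match early.
    NSLog(@"[%@]", NaiveStripTags(@"<IMG SRC = \"foo.gif\" ALT = \"A > B\">"));
    // leaves: [ B">]

    // The match ends at the '>' of the inner "<A comment>", not at "-->".
    NSLog(@"[%@]", NaiveStripTags(@"<!-- <A comment> -->"));
    // leaves: [ -->]

    // Comparison operators inside the script are mistaken for a tag.
    NSLog(@"[%@]", NaiveStripTags(@"<script>if (a<b && a>c)</script>"));
    // leaves: [if (ac)]
}
```

None of these results is the text a reader would expect. The fourth case fails the same way: the match stops at the first '>' inside the section and the rest is left behind.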

Solution

A quick and "dirty" solution (it removes everything between < and >) that works with iOS >= 3.2:

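- (NSString *)stringByStrippingHTML {
  NSRange r;
  // Work on an autoreleased copy (manual reference counting); self is never modified.
  NSString *s = [[self copy] autorelease];
  // Repeatedly find the next "<...>" run with a regex search and delete it.
  while ((r = [s rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
    s = [s stringByReplacingCharactersInRange:r withString:@""];
  return s;
}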

I have this declared as a category on NSString.
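
The method above calls autorelease, which does not compile under ARC. As a rough sketch of how the same idea might look in an ARC project, the category below does the replacement in a single call. The category name HTMLStrippingSketch and the method name sketch_stringByStrippingHTML are made up for this example; only the pattern comes from the answer.

```objc
#import <Foundation/Foundation.h>

// Illustrative category; the names are assumptions, not from the answer.
@interface NSString (HTMLStrippingSketch)
- (NSString *)sketch_stringByStrippingHTML;
@end

@implementation NSString (HTMLStrippingSketch)

- (NSString *)sketch_stringByStrippingHTML {
    // With NSRegularExpressionSearch the target string is treated as a
    // regular expression, so every "<...>" run is removed in one pass.
    return [self stringByReplacingOccurrencesOfString:@"<[^>]+>"
                                           withString:@""
                                              options:NSRegularExpressionSearch
                                                range:NSMakeRange(0, self.length)];
}

@end
```

For example, [@"<p>Hello, <b>world</b>!</p>" sketch_stringByStrippingHTML] returns "Hello, world!". This is still the quick and dirty approach, so it shares the limitations described in the question: it does not parse HTML and it does not decode entities such as &amp;.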
