Remove HTML Tags from an NSString on the iPhone
There are a couple of different ways to remove HTML tags from an NSString in Cocoa.
One way is to render the string into an NSAttributedString and then grab the rendered text.
Another way is to use NSXMLDocument's -objectByApplyingXSLTString method to apply an XSLT transform that does it.
Unfortunately, the iPhone doesn't support NSAttributedString or NSXMLDocument. There are too many edge cases and malformed HTML documents for me to feel comfortable using regex or NSScanner. Does anyone have a solution to this?
One suggestion has been to simply look for opening and closing tag characters, but this method won't work except in very trivial cases.
For example, these cases (from the Perl Cookbook chapter on the same subject) would break this method:
<IMG SRC = "foo.gif" ALT = "A > B">
<!-- <A comment> -->
<script>if (a<b && a>c)</script>
<![INCLUDE CDATA [ >>>>>>>>>>>> ]]>
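To make the failure concrete, here is a minimal sketch of the "look for opening and closing tag characters" approach written with NSScanner. The function name NaiveStripTags is made up for this illustration and does not come from any library. Fed the first case above, it stops at the > inside the quoted ALT value and leaves the fragment  B"> behind.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name invented for this sketch): drops everything
// between each '<' and the next '>', with no knowledge of quotes,
// comments, or CDATA sections.
static NSString *NaiveStripTags(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    // NSScanner skips whitespace by default; keep it so the text is unchanged.
    [scanner setCharactersToBeSkipped:nil];
    NSMutableString *result = [NSMutableString string];

    while (![scanner isAtEnd]) {
        NSString *text = nil;
        // Copy plain text up to the next '<' (or to the end if there is none).
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }
        if ([scanner isAtEnd]) {
            break;
        }
        // Skip the presumed tag: everything up to and including the next '>'.
        [scanner scanUpToString:@">" intoString:NULL];
        [scanner scanString:@">" intoString:NULL];
    }
    return result;
}
```

The same sketch mangles the script case too: the a<b && a>c comparison is treated as a tag and removed along with the real ones.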
A quick and "dirty" solution that removes everything between < and > (so it does not handle the edge cases listed above). It works with iOS >= 3.2:
    - (NSString *)stringByStrippingHTML {
        NSRange r;
        // Under ARC, drop the autorelease call and use [self copy] alone.
        NSString *s = [[self copy] autorelease];
        // Remove the first <...> match repeatedly until none remain.
        while ((r = [s rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
            s = [s stringByReplacingCharactersInRange:r withString:@""];
        return s;
    }
I have this declared as a category on NSString.
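For anyone unsure what the category declaration looks like, here is a rough sketch. The category name HTMLStripping and the method name stringByStrippingHTMLInOnePass are placeholders invented for this example. It also swaps in a different technique from the loop in the answer: a single call to stringByReplacingOccurrencesOfString:withString:options:range: with NSRegularExpressionSearch, which replaces every match of the same pattern in one pass. It has the same limitations on the edge cases from the question.

```objc
#import <Foundation/Foundation.h>

// "HTMLStripping" is a placeholder category name chosen for this sketch.
@interface NSString (HTMLStripping)
- (NSString *)stringByStrippingHTMLInOnePass;
@end

@implementation NSString (HTMLStripping)

// Assumption: same pattern as the answer, applied in one pass instead of
// a loop. Everything between '<' and the next '>' is removed.
- (NSString *)stringByStrippingHTMLInOnePass {
    return [self stringByReplacingOccurrencesOfString:@"<[^>]+>"
                                           withString:@""
                                              options:NSRegularExpressionSearch
                                                range:NSMakeRange(0, [self length])];
}

@end
```

With the category compiled into the project, any NSString responds to the new method, for example [@"<p>Hello <b>world</b></p>" stringByStrippingHTMLInOnePass] gives @"Hello world".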