HTML character decoding in Objective-C / Cocoa Touch


Question

First of all, I found this: Objective C HTML escape/unescape, but it doesn't work for me.

My encoded characters (they come from an RSS feed, btw) look like this: &#038;

I searched all over the net and found related discussions, but no fix for my particular encoding. I think they are called hexadecimal characters.

Recommended answer

Those are called character references. When they take the form &#<number>; they are called numeric character references (named forms such as &amp; are character entity references). Basically, a numeric reference is a string representation of the character that should be substituted: the number is that character's code point. In the case of &#038;, it represents the character with the decimal value 38, which is & (the value is the same in ASCII, ISO-8859-1, and Unicode). Note that this form is decimal, not hexadecimal; the hexadecimal form starts with &#x, as in &#x26;.

The reason the ampersand has to be encoded in RSS is that it's a reserved special character: RSS is XML, and in XML a & starts a reference.

What you need to do is parse the string and replace each reference with the character whose code point is the value between &# and ;. I don't know of any great ways to do this in Objective-C, but this Stack Overflow question might be of some help.
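For illustration, here is a minimal sketch of that parse-and-replace step. It is written in plain C, which compiles unchanged inside an Objective-C file, and it is not taken from any of the linked answers. The function names decode_numeric_refs, encode_utf8, and digit_value are made up for this example. It assumes the input is UTF-8 and handles only numeric references (decimal &#38; and hexadecimal &#x26;); named entities such as &amp; and malformed references are copied through unchanged.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Value of digit c in the given base (10 or 16), or -1 if c is not a digit. */
static int digit_value(char c, int base)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (base == 16 && c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (base == 16 && c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Write code point cp as UTF-8 into out. Returns the number of bytes
   written, or 0 if cp is NUL, a surrogate, or beyond U+10FFFF. */
static size_t encode_utf8(unsigned long cp, char *out)
{
    if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical helper: returns a newly allocated copy of `in` with numeric
   character references replaced by UTF-8 bytes. The caller frees the result.
   Returns NULL if allocation fails. */
char *decode_numeric_refs(const char *in)
{
    /* A reference is never shorter than the UTF-8 bytes it decodes to,
       so the output fits in a buffer the size of the input. */
    char *out = malloc(strlen(in) + 1);
    if (out == NULL) return NULL;

    const char *p = in;
    char *w = out;
    while (*p) {
        if (p[0] == '&' && p[1] == '#') {
            const char *q = p + 2;
            int base = 10;
            if (*q == 'x' || *q == 'X') {
                base = 16;
                q++;
            }
            unsigned long cp = 0;
            int ndigits = 0;
            for (int d; (d = digit_value(*q, base)) >= 0; q++) {
                /* Stop accumulating once out of range to avoid overflow. */
                if (cp <= 0x10FFFF) cp = cp * (unsigned long)base + (unsigned long)d;
                ndigits++;
            }
            if (ndigits > 0 && *q == ';') {
                size_t n = encode_utf8(cp, w);
                if (n > 0) {
                    w += n;
                    p = q + 1;  /* skip past the ';' */
                    continue;
                }
            }
        }
        *w++ = *p++;  /* ordinary character, or a malformed reference */
    }
    *w = '\0';
    return out;
}
```

From Objective-C, one way to use this would be to pass [string UTF8String] in, wrap the result with [NSString stringWithUTF8String:], and then free the C buffer. Feed text that also contains named entities would need a lookup table on top of this.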

Since I answered this some two years ago, some great solutions have appeared; see @Michael Waterfall's answer below.
