How to decode HTML entities in C?
Question
I'm interested in unescaping text in C. For example, &#x5c; should map to \. Does anyone know of a good library?
For reference, see the Wikipedia list of XML and HTML character entity references.
Recommended answer
I had some free time today and wrote a decoder from scratch: entities.c, entities.h.
The only function with external linkage is
size_t decode_html_entities_utf8(char *dest, const char *src);
If src is a null pointer, the string will be taken from dest, i.e. the entities will be decoded in place. Otherwise, the decoded string will be put in dest (which should point to a buffer big enough to hold strlen(src) + 1 characters) and src will be unchanged.
The function returns the length of the decoded string.
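To illustrate the calling convention described above, here is a minimal sketch. It is not the code from entities.c: the name sketch_decode_entities and both helpers are hypothetical, and it only handles numeric references such as &#x5c; or &#233;, leaving named entities like &amp; untouched. It relies on the fact that a numeric reference is never shorter than its UTF-8 encoding, which is why decoding in place and a destination buffer of strlen(src) + 1 bytes are both safe.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode code point cp as UTF-8 into out (up to 4 bytes).
 * Returns the number of bytes written, or 0 if cp is not a valid scalar value. */
static size_t put_utf8(unsigned long cp, char *out)
{
    if (cp == 0 || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical helper: p points just past "&#". On success stores the code
 * point in *cp and returns a pointer just past the ';'. Returns NULL if the
 * text is not a well-formed numeric reference. */
static const char *parse_numeric(const char *p, unsigned long *cp)
{
    int base = 10;
    if (*p == 'x' || *p == 'X') {
        base = 16;
        p++;
    }
    const char *start = p;
    unsigned long v = 0;
    for (;; p++) {
        int d;
        if (*p >= '0' && *p <= '9')
            d = *p - '0';
        else if (base == 16 && *p >= 'a' && *p <= 'f')
            d = *p - 'a' + 10;
        else if (base == 16 && *p >= 'A' && *p <= 'F')
            d = *p - 'A' + 10;
        else
            break;
        if (v <= 0x10FFFF)              /* stop growing once out of range */
            v = v * (unsigned long)base + (unsigned long)d;
    }
    if (p == start || *p != ';')
        return NULL;
    *cp = v;
    return p + 1;
}

/* Same convention as described above: if src is NULL, decode dest in place;
 * otherwise write into dest, which must hold at least strlen(src) + 1 bytes.
 * Returns the length of the decoded string. */
size_t sketch_decode_entities(char *dest, const char *src)
{
    if (src == NULL)
        src = dest;
    char *out = dest;
    while (*src) {
        if (src[0] == '&' && src[1] == '#') {
            unsigned long cp;
            const char *end = parse_numeric(src + 2, &cp);
            if (end != NULL) {
                char buf[4];
                size_t n = put_utf8(cp, buf);
                if (n > 0) {
                    memcpy(out, buf, n);
                    out += n;
                    src = end;
                    continue;
                }
            }
        }
        *out++ = *src++;                /* copy anything else unchanged */
    }
    *out = '\0';
    return (size_t)(out - dest);
}
```

Calling sketch_decode_entities(buf, "a&#x5c;b") with a sufficiently large buf leaves "a\b" in buf, while sketch_decode_entities(s, NULL) rewrites the writable string s in place. Malformed or out-of-range references are copied through unchanged, which is one possible policy among several.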
Please note that I haven't done any extensive testing, so there's a high probability of bugs...