How to identify encoding from hex values?


Problem description

I have text on a website that displays like that: instead of ö

I extracted the text from the CMS and analysed its hex values:


  • the ö's that are displayed correctly have c3 b6 (UTF-8)
  • the ö's that are displayed incorrectly have 6f cc 88

I couldn't find out what encoding this is. What's a good way to identify the encoding?

Recommended answer

6F is the UTF-8 (ASCII) encoding of "o", nothing spectacular.
CC 88 is the UTF-8 encoding of U+0308, COMBINING DIAERESIS.
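
One way to check a byte sequence like this yourself is to decode it as UTF-8 and look up the name of each resulting code point. The following is a minimal Python sketch, assuming the bytes really are valid UTF-8; the helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(hex_bytes: str) -> list[tuple[str, str]]:
    """Decode space-separated hex bytes as UTF-8 and name each code point."""
    # bytes.fromhex ignores the spaces between byte pairs.
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

# The two sequences from the question:
for sample in ("c3 b6", "6f cc 88"):
    print(sample, "->", describe(sample))
```

If decoding raises `UnicodeDecodeError`, the bytes are not UTF-8 and some other encoding has to be considered.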

You're simply looking at the decomposed form of the o-umlaut. A combining diaeresis character should be rendered visually combined with the preceding character. If your system doesn't do that, it doesn't handle Unicode correctly, and/or the font you have chosen is somewhat broken. You may have to normalise your strings into the composed Unicode form (NFC) for your system to handle them correctly.
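
As a rough illustration of that normalisation step, here is a minimal Python sketch using the standard library's `unicodedata.normalize`. The helper name `to_nfc` is hypothetical, and where the normalisation should happen in your CMS (on input, on storage, or on output) is not something the question specifies.

```python
import unicodedata

def to_nfc(text: str) -> str:
    """Return the text in composed (NFC) form."""
    return unicodedata.normalize("NFC", text)

# The incorrectly displayed sequence from the question: "o" + U+0308.
decomposed = bytes.fromhex("6f cc 88").decode("utf-8")
composed = to_nfc(decomposed)

print(len(decomposed), decomposed.encode("utf-8").hex())  # two code points
print(len(composed), composed.encode("utf-8").hex())      # one code point
```

Both strings represent the same character; only the composed form matches the `c3 b6` bytes of the ö's that display correctly.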
