Number of character cells used by string

Question

I have a program that outputs a textual table using UTF-8 strings, and I need to measure the number of monospaced character cells used by a string so I can align it properly. If possible, I'd like to do this with standard functions.

Recommended answer

From the UTF-8 and Unicode FAQ for Unix/Linux:

The number of characters can be counted in C in a portable way using mbstowcs(NULL,s,0). This works for UTF-8 like for any other supported encoding, as long as the appropriate locale has been selected. A hard-wired technique to count the number of characters in a UTF-8 string is to count all bytes except those in the range 0x80 – 0xBF, because these are just continuation bytes and not characters of their own. However, the need to count characters arises surprisingly rarely in applications.
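As a rough illustration, the two techniques from the quoted FAQ could be sketched in C as follows. The function names utf8_count_chars and locale_count_chars are hypothetical, chosen only for this example. The second function assumes the caller has already selected a matching locale (for instance with setlocale(LC_CTYPE, "")), and it relies on the POSIX behavior of mbstowcs when the destination is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hard-wired UTF-8 technique: count every byte that is NOT a
   continuation byte. Continuation bytes lie in 0x80..0xBF, which
   means their top two bits are 10. Assumes s is valid UTF-8. */
size_t utf8_count_chars(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Locale-based technique: let the C library do the decoding.
   Assumption: the current LC_CTYPE locale matches the encoding of s.
   Returns (size_t)-1 if s contains an invalid multibyte sequence. */
size_t locale_count_chars(const char *s)
{
    assert(s != NULL);
    return mbstowcs(NULL, s, 0);
}
```

Both functions count characters, not display cells. The two numbers agree only when every character occupies exactly one cell; wide characters such as CJK ideographs usually take two cells and combining characters take none. On POSIX systems, one option for the cell count itself is to convert the string with mbstowcs and pass the result to wcswidth from <wchar.h>.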
