What are all of the allowable characters for people's names?


Question

There are the standard A-Z, a-z characters, but also there are hyphens, em dashes, quotes, etc.

Plus, there are all of the international characters, like umlauts, etc.

So, for an English-based system, what's the complete set? What about sets for other languages? What about UTF-8, UTF-16, etc.?

Bonus question: How many name fields are needed, and what are their maximum lengths?

Edit: There are definitely two different types of characters involved in people's names: those that are there as part of the context, and those that are there for structural reasons. I don't want to limit or interfere with the context characters, but I do need to deal with the structural ones.

For example, I had a name come in that was separated by an em dash, but it was hard to distinguish that from the minus character. To make the system easier to search, I want to take all five different types of dashes and map them onto one unique character (minus), so that the searcher doesn't need to know specifically which symbol was initially entered.
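As a rough illustration of the dash mapping described above, here is a minimal Python sketch. It assumes the original name is stored untouched and that a separate, normalized key is used only for searching. The names `search_key`, `EXTRA_DASHES`, and `QUOTE_MAP` are hypothetical, and the quote table is a small illustrative sample, not a complete list. Rather than hard-coding five dashes, it relies on the Unicode general category Pd (Dash_Punctuation), plus the minus sign, which belongs to a different category.

```python
import unicodedata

# Assumption: characters treated as dashes even though they are not in
# the Unicode category Pd (Dash_Punctuation).
EXTRA_DASHES = {"\u2212"}  # MINUS SIGN (category Sm)

# Assumption: a small, incomplete sample of quote-like characters.
QUOTE_MAP = {
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u02bc": "'",  # MODIFIER LETTER APOSTROPHE
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK
}


def search_key(name: str) -> str:
    """Build a search key; the stored name itself is left unchanged."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) together.
    text = unicodedata.normalize("NFKC", name)
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Pd" or ch in EXTRA_DASHES:
            out.append("-")  # every dash variant becomes hyphen-minus
        else:
            out.append(QUOTE_MAP.get(ch, ch))
    # casefold() makes the key case-insensitive.
    return "".join(out).casefold()


# The same key results whichever dash was typed originally.
print(search_key("Smith\u2014Jones") == search_key("Smith-Jones"))
```

Applying the same function to the search term before comparing means the searcher can type a plain hyphen or apostrophe and still match the stored name.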

The problem exists for dashes and probably for quotes as well, but how many other symbols does it affect?

Recommended answer

There's a good article by the W3C called Personal names around the world that explains the problems (and possible solutions) pretty well. It was originally a two-part blog post by Richard Ishida (part 1 and part 2).

Personally I'd say: support every printable Unicode character and, to be safe, provide just a single field "name" that contains the full, formatted name. This way you can store pretty much every form of name. You might need a more structured storage, but then don't expect to be able to store every single combination in a structured form, as there are simply too many different ones.
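One way to read "every printable Unicode character" is sketched below in Python. This is an assumption about what "printable" should mean here, not something the answer specifies: the check rejects only control characters, surrogates, private-use and unassigned code points, and deliberately keeps format characters (category Cf), since some scripts use zero-width joiners inside names. The function name `is_acceptable_name` and the rejected-category set are hypothetical.

```python
import unicodedata

# Assumption: general categories considered "not printable" for names.
# Cc = control, Cs = surrogate, Co = private use, Cn = unassigned.
REJECTED_CATEGORIES = {"Cc", "Cs", "Co", "Cn"}


def is_acceptable_name(name: str) -> bool:
    """Accept any non-empty name made only of allowed characters."""
    # NFC gives one canonical form for accented letters before checking.
    text = unicodedata.normalize("NFC", name).strip()
    if not text:
        return False  # empty or whitespace-only input
    return all(
        unicodedata.category(ch) not in REJECTED_CATEGORIES for ch in text
    )


print(is_acceptable_name("Bj\u00f6rk Gu\u00f0mundsd\u00f3ttir"))
print(is_acceptable_name("Bob\x00"))
```

A single free-form field validated this way stays permissive by design; any stricter rule risks rejecting someone's real name.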
