How to model multilingual entities in relational databases

Question

If we are going to develop a multilingual application, shall we store translations in resource files or the database?

Suppose we choose to do it in the database. Is there a standard way to model multilingual entities in the Relational Model?

1. One Big Translation Table

We can store all the translations in one table and use language-neutral keys for the attribute values.

Person (SSN, FirstName, LastName, Birthday)

Translation (key, langid, translation)
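To make approach 1 concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The key format (`person.1.first`), the sample names, and the helper `person_name` are assumptions made for illustration; they do not come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE Person (
    SSN       TEXT PRIMARY KEY,
    FirstName TEXT NOT NULL,  -- stores a translation key, not the name itself
    LastName  TEXT NOT NULL,  -- stores a translation key, not the name itself
    Birthday  TEXT
);
CREATE TABLE Translation (
    "key"       TEXT NOT NULL,
    langid      TEXT NOT NULL,
    translation TEXT NOT NULL,
    PRIMARY KEY ("key", langid)
);
-- Hypothetical sample data.
INSERT INTO Person VALUES ('123-45-6789', 'person.1.first', 'person.1.last', '1980-01-01');
INSERT INTO Translation VALUES
    ('person.1.first', 'en', 'John'),
    ('person.1.first', 'fr', 'Jean'),
    ('person.1.last',  'en', 'Smith'),
    ('person.1.last',  'fr', 'Smith');
''')

def person_name(conn, ssn, langid):
    """Return (first, last) in the given language, or None if a translation is missing."""
    return conn.execute(
        '''SELECT fn.translation, ln.translation
           FROM Person p
           JOIN Translation fn ON fn."key" = p.FirstName AND fn.langid = ?
           JOIN Translation ln ON ln."key" = p.LastName  AND ln.langid = ?
           WHERE p.SSN = ?''',
        (langid, langid, ssn),
    ).fetchone()
```

Note that every localized column costs one extra join, and because `key` alone is not unique in `Translation`, `Person.FirstName` cannot be declared as a foreign key to it in this layout.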

2. One Translation Table For Each Entity

Person (SSN, Birthday)

PersonML (SSN, LangId, FirstName, LastName)

I prefer this approach. It is really a 1:N relationship.
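A rough sketch of approach 2, again with Python's `sqlite3`. The sample rows and the helpers `names_by_language` and `add_translation` are hypothetical; the table and column names follow the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript('''
CREATE TABLE Person (
    SSN      TEXT PRIMARY KEY,
    Birthday TEXT
);
CREATE TABLE PersonML (
    SSN       TEXT NOT NULL REFERENCES Person(SSN),
    LangId    TEXT NOT NULL,
    FirstName TEXT NOT NULL,
    LastName  TEXT NOT NULL,
    PRIMARY KEY (SSN, LangId)  -- one row per person per language (the 1:N side)
);
-- Hypothetical sample data.
INSERT INTO Person VALUES ('123-45-6789', '1980-01-01');
INSERT INTO PersonML VALUES
    ('123-45-6789', 'en', 'John', 'Smith'),
    ('123-45-6789', 'fr', 'Jean', 'Smith');
''')

def names_by_language(conn, ssn):
    """Map each LangId to the (FirstName, LastName) stored for that person."""
    rows = conn.execute(
        "SELECT LangId, FirstName, LastName FROM PersonML WHERE SSN = ?", (ssn,)
    )
    return {lang: (first, last) for lang, first, last in rows}

def add_translation(conn, ssn, langid, first, last):
    """Insert a translation; return False if a key or foreign-key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO PersonML VALUES (?, ?, ?, ?)", (ssn, langid, first, last)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Unlike the single translation table, the foreign key here lets the database refuse translations for a person that does not exist.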

Problem

It seems multilingual columns cannot be used to form a Primary Key.
Let's assume every person has a unique name, then (FirstName, LastName) can be used as the primary key.

Person (FirstName, LastName, Birthday)

However, once multiple languages are taken into account, (FirstName, LastName) no longer identifies a person.
Apparently we can't simply add LangId to form a primary key.

Person (LangId, FirstName, LastName, Birthday)

In this case, one person would be stored in multiple rows and the non-key columns would be duplicated.
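The duplication can be seen in a small `sqlite3` sketch. The sample rows are made up; the point is only that `Birthday` is stored once per language, so an update that touches one row leaves the other stale, and nothing in the schema says the two rows describe the same person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
CREATE TABLE Person (
    LangId    TEXT NOT NULL,
    FirstName TEXT NOT NULL,
    LastName  TEXT NOT NULL,
    Birthday  TEXT,
    PRIMARY KEY (LangId, FirstName, LastName)
);
-- The same (hypothetical) person, stored once per language.
INSERT INTO Person VALUES ('en', 'John', 'Smith', '1980-01-01');
INSERT INTO Person VALUES ('fr', 'Jean', 'Smith', '1980-01-01');
''')

# Correct the birthday through the English row only.
conn.execute(
    "UPDATE Person SET Birthday = '1980-02-02' "
    "WHERE LangId = 'en' AND FirstName = 'John' AND LastName = 'Smith'"
)

# The French row still carries the old value: an update anomaly.
birthdays = dict(conn.execute("SELECT LangId, Birthday FROM Person"))
```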

Do we have to use language-neutral columns for Primary Keys?
When there are no such columns, shall we use a surrogate?
I have been told that surrogates should not be used blindly and I strongly agree.


Update 1

In the example, I assume FirstName and LastName are subject to localization.

If there is always some attribute like SSN for every entity, the second approach makes more sense.
However, some valid Primary Keys may become invalid if they contain columns that are subject to localization.

Another Example

Every company has a unique name, so CompanyName can be used as the Primary Key.

Company (CompanyName, ...)

When it comes to localization, company name cannot be used as a primary key. We have to invent some code to represent the company.

Does it mean localization doesn't fit in the Relational Model?


Update 2

3. 1:N Relationship Between Default Language and Other Languages

Users may perceive the company table as:

Company (CompanyNameEnglish, CompanyNameFrench, CompanyNameSpanish, ...)

Of course there are repeating groups, so it breaks 1NF.

Improved:

Company (CompanyNameEnglish, ...)

CompanyNameML (CompanyNameEnglish, LangId, CompanyName)

The problem is that we have to provide a default (English) name even if the user does not need one.
Some users may provide English names, others may provide French names ONLY.
Is this requirement too contrived?
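The constraint described here can be reproduced with a short `sqlite3` sketch. The helper `register` and the company names are hypothetical. `NOT NULL` is spelled out on the key because SQLite, unlike most databases, would otherwise accept NULL in a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript('''
CREATE TABLE Company (
    CompanyNameEnglish TEXT NOT NULL PRIMARY KEY
);
CREATE TABLE CompanyNameML (
    CompanyNameEnglish TEXT NOT NULL REFERENCES Company(CompanyNameEnglish),
    LangId             TEXT NOT NULL,
    CompanyName        TEXT NOT NULL,
    PRIMARY KEY (CompanyNameEnglish, LangId)
);
''')

def register(conn, english_name, other_names):
    """Try to store a company; return False if a constraint rejects it."""
    try:
        with conn:  # roll back both inserts if either fails
            conn.execute("INSERT INTO Company VALUES (?)", (english_name,))
            conn.executemany(
                "INSERT INTO CompanyNameML VALUES (?, ?, ?)",
                [(english_name, lang, name) for lang, name in other_names.items()],
            )
        return True
    except sqlite3.IntegrityError:
        return False

with_english = register(conn, "Acme Ltd", {"fr": "Acme SARL"})
french_only = register(conn, None, {"fr": "Boulangerie Dupont"})  # no English name
```

A company known only by its French name cannot be stored unless someone invents an English name for it, which is the objection raised above.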

4. DBMS Localization Support

PerformanceDBA brought this up in his comment.
I will do more research on it.

Solution

"I have been told that surrogates should not be used blindly and I strongly agree."
I agree with that too; using anything blindly is never a smart choice.

However, using a surrogate key is not always done blindly. Keep in mind that a primary key is not the only way to ensure uniqueness. Most, if not all, relational databases offer unique constraints and unique indexes, and these should be used wisely. In fact, when storing multilingual data in translation tables, a surrogate key might be better than a natural one. Read this article for a good comparison between natural and surrogate key strategies.

To answer your question, I would go with a translation table for each entity, keeping only the entity's non-textual data in the main entity table (such as birth date and gender in your Person example) and keeping the textual data in the translation table, whose primary key is composed of the language id and the entity table's primary key.
Note that the primary key of the entity table in this case must be non-textual and not language-dependent.
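One way this could look for the Company example is sketched below with Python's `sqlite3`. The surrogate `CompanyId`, the `Founded` column, the helper functions, and the sample data are all assumptions made for illustration. The unique constraint on (LangId, CompanyName) is one possible way to keep the "every company has a unique name" rule without making the name part of the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript('''
CREATE TABLE Company (
    CompanyId INTEGER PRIMARY KEY,  -- surrogate key: non-textual, language-neutral
    Founded   TEXT                  -- example of non-textual entity data
);
CREATE TABLE CompanyML (
    CompanyId   INTEGER NOT NULL REFERENCES Company(CompanyId),
    LangId      TEXT NOT NULL,
    CompanyName TEXT NOT NULL,
    PRIMARY KEY (CompanyId, LangId),
    UNIQUE (LangId, CompanyName)    -- names stay unique within each language
);
''')

def add_company(conn, founded, names):
    """Insert a company with names given as {LangId: CompanyName}; return its id."""
    with conn:  # both inserts commit together or roll back together
        cur = conn.execute("INSERT INTO Company (Founded) VALUES (?)", (founded,))
        company_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO CompanyML VALUES (?, ?, ?)",
            [(company_id, lang, name) for lang, name in names.items()],
        )
    return company_id

def duplicate_name_rejected(conn):
    """Check that a second company cannot reuse an existing French name."""
    try:
        add_company(conn, "2010-01-01", {"fr": "Acme SARL"})
    except sqlite3.IntegrityError:
        return True
    return False

acme_id = add_company(conn, "1990-01-01", {"en": "Acme Ltd", "fr": "Acme SARL"})
bakery_id = add_company(conn, "2005-06-01", {"fr": "Boulangerie Dupont"})  # French only
```

With this layout a company that has only a French name can be stored, since no language is privileged as the key.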
