How to design a database for a translation dictionary?


Question

I have a database of words and phrases translated from, for example, English to 15 other languages, and likewise from every language in that list to the other 15. For now, each pair is stored in one table like this (en -> de):

  • id_pair
  • word_en
  • word_de

What is the best way to create a database for that huge list of words and phrases? I know that I must separate every primary language from the others, and was thinking of something like this:

ENGLISH
ID | WORD
1  | 'dictionary'

GERMAN
ID | WORD
1  | 'lexikon'
2  | 'wörterbuch'

TRANSLATION_EN_DE
ID_EN | ID_DE
1     | 1
1     | 2

Is this the best way to normalize the DB? And what about phrases? If someone enters the word "dictionary", I also need it to return "This dictionary is good" and the translation for that. (I know this can be found in the first table with an SQL query; is that the best way?)

I also need it in alphabetical order at all times, and I will have a lot of new entries daily, so that I can print a couple of words before and after the word/phrase someone is looking to translate.

I'm stuck and can't decide what the best way to optimize it is. Together these DBs hold more than 15 GB of purely text-based translations and get around 100k requests daily, so every millisecond counts. :) Any help will be appreciated, thanks!

Answer

With a separate table for each language, you'd need a large number of junction tables to cover all the possible translation combinations. On top of that, adding a new language would require adding more tables and rewriting the queries, client code, etc.

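It's better to do it in a more generalized way: keep the words of all languages in a single WORD table, with a LANGUAGE_ID on each row, and link translated pairs in a single TRANSLATION table through WORD_ID1 and WORD_ID2.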
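A minimal sketch of such a generalized schema, using Python's built-in sqlite3 module so that it can be run as-is. The names LANGUAGE_ID, WORD_TEXT, WORD_ID1 and WORD_ID2 come from this answer; the LANGUAGE table, its LANGUAGE_CODE column and the WORD_ID key are assumptions made for illustration, and a production DBMS would need its own types and tuning.

```python
import sqlite3

# Assumed layout: LANGUAGE, LANGUAGE_CODE and WORD_ID are illustrative names;
# LANGUAGE_ID, WORD_TEXT, WORD_ID1 and WORD_ID2 are the names used in the answer.
SCHEMA = """
CREATE TABLE LANGUAGE (
    LANGUAGE_ID   INTEGER PRIMARY KEY,
    LANGUAGE_CODE TEXT NOT NULL UNIQUE
);
CREATE TABLE WORD (
    WORD_ID     INTEGER PRIMARY KEY,
    LANGUAGE_ID INTEGER NOT NULL REFERENCES LANGUAGE (LANGUAGE_ID),
    WORD_TEXT   TEXT NOT NULL,
    UNIQUE (LANGUAGE_ID, WORD_TEXT)
);
CREATE TABLE TRANSLATION (
    WORD_ID1 INTEGER NOT NULL REFERENCES WORD (WORD_ID),
    WORD_ID2 INTEGER NOT NULL REFERENCES WORD (WORD_ID),
    PRIMARY KEY (WORD_ID1, WORD_ID2)
);
"""


def create_dictionary_db(path=":memory:"):
    """Open a database and create the three tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = create_dictionary_db()
conn.executemany("INSERT INTO LANGUAGE VALUES (?, ?)", [(1, "en"), (2, "de")])
conn.executemany(
    "INSERT INTO WORD VALUES (?, ?, ?)",
    [(1, 1, "dictionary"), (2, 2, "lexikon"), (3, 2, "wörterbuch")],
)
conn.executemany("INSERT INTO TRANSLATION VALUES (?, ?)", [(1, 2), (1, 3)])

# Adding a language is one more LANGUAGE row, not a new set of tables.
conn.execute("INSERT INTO LANGUAGE VALUES (3, 'fr')")
```

The sample rows mirror the question's en -> de example: 'dictionary' is linked to both 'lexikon' and 'wörterbuch' through two TRANSLATION rows.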

Regarding the TRANSLATION table, I propose to also create a CHECK (WORD_ID1 < WORD_ID2) and an index on {WORD_ID2, WORD_ID1} (the opposite "direction" from the PK), and to represent both directions of a translation with only one row.

Consider clustering the TRANSLATION table if your DBMS supports that.
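As a hedged illustration of the TRANSLATION advice above, the sketch below stores each pair once with the smaller ID first, adds the reverse index, and looks a word up from either side. It uses SQLite through Python's sqlite3 module, where WITHOUT ROWID is the way to cluster a table on its primary key; other DBMSs use different syntax. The helper names add_translation and translations_of and the index name TRANSLATION_REVERSE are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TRANSLATION (
    WORD_ID1 INTEGER NOT NULL,
    WORD_ID2 INTEGER NOT NULL,
    PRIMARY KEY (WORD_ID1, WORD_ID2),
    CHECK (WORD_ID1 < WORD_ID2)
) WITHOUT ROWID;  -- SQLite stores such a table clustered on its primary key
CREATE INDEX TRANSLATION_REVERSE ON TRANSLATION (WORD_ID2, WORD_ID1);
""")


def add_translation(conn, word_a, word_b):
    """Store one row per pair, smaller ID first, so the CHECK always passes."""
    if word_a == word_b:
        raise ValueError("a word cannot be its own translation")
    low, high = sorted((word_a, word_b))
    # OR IGNORE skips a pair that is already stored.
    conn.execute("INSERT OR IGNORE INTO TRANSLATION VALUES (?, ?)", (low, high))


def translations_of(conn, word_id):
    """Return the IDs linked to word_id, whichever side of the row it is on."""
    rows = conn.execute(
        """
        SELECT WORD_ID2 FROM TRANSLATION WHERE WORD_ID1 = :w  -- can use the PK
        UNION ALL
        SELECT WORD_ID1 FROM TRANSLATION WHERE WORD_ID2 = :w  -- can use the reverse index
        """,
        {"w": word_id},
    ).fetchall()
    return sorted(r[0] for r in rows)


add_translation(conn, 3, 1)   # stored as (1, 3)
add_translation(conn, 1, 2)   # stored as (1, 2)
add_translation(conn, 2, 1)   # same pair as above, ignored
print(translations_of(conn, 1))
print(translations_of(conn, 3))
```

Because the CHECK forbids the mirrored row, the table holds half as many rows as a design that stores both directions, at the cost of querying both columns on lookup.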

As for needing it alphabetically all the time, the query...

SELECT * FROM WORD WHERE LANGUAGE_ID = :lid ORDER BY WORD_TEXT

...can use the index underneath the UNIQUE constraint {LANGUAGE_ID, WORD_TEXT}.
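The same index can also serve the "couple of words before and after" requirement from the question. The sketch below is one possible way to do it, again with Python's sqlite3 module; the neighbours helper and the sample words are hypothetical, and the ordering is SQLite's default binary collation, so a real dictionary would likely need a language-aware collation instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WORD (
    WORD_ID     INTEGER PRIMARY KEY,
    LANGUAGE_ID INTEGER NOT NULL,
    WORD_TEXT   TEXT NOT NULL,
    UNIQUE (LANGUAGE_ID, WORD_TEXT)
);
""")
english = ["dial", "diary", "dice", "dictionary", "diet", "differ", "dig"]
conn.executemany(
    "INSERT INTO WORD (LANGUAGE_ID, WORD_TEXT) VALUES (?, ?)",
    [(1, w) for w in english] + [(2, "lexikon")],
)


def neighbours(conn, language_id, text, n=2):
    """Return up to n words before and n words after text, in one language."""
    before = conn.execute(
        "SELECT WORD_TEXT FROM WORD "
        "WHERE LANGUAGE_ID = ? AND WORD_TEXT < ? "
        "ORDER BY WORD_TEXT DESC LIMIT ?",
        (language_id, text, n),
    ).fetchall()
    after = conn.execute(
        "SELECT WORD_TEXT FROM WORD "
        "WHERE LANGUAGE_ID = ? AND WORD_TEXT > ? "
        "ORDER BY WORD_TEXT LIMIT ?",
        (language_id, text, n),
    ).fetchall()
    # The "before" rows come back nearest-first, so flip them into reading order.
    return [r[0] for r in reversed(before)], [r[0] for r in after]


print(neighbours(conn, 1, "dictionary"))
# (['diary', 'dice'], ['diet', 'differ'])
```

Each of the two queries is a bounded range scan on {LANGUAGE_ID, WORD_TEXT}, so new entries need no re-sorting: they simply land in index order.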
