One table or many?


Problem description

I'm trying to design an application to hold academic reference information. The problem is that each type of reference (e.g. journal articles, books, newspaper articles) requires different information. For example, a journal reference requires both a journal title and an article title, and also a page number, whereas a book requires a publisher and a publication date, which journal articles do not require.

Therefore, should I store all the references in one table in my database and just leave fields blank when they don't apply, or should I have various tables such as BookReferences, JournalReferences and NewspaperReferences, and put each reference in the appropriate one? The problem with separate tables is that searching through all the references would be rather more difficult, and editing would probably have to be handled separately for each type.

(I'm planning to use Ruby on Rails for this project, by the way, but I doubt that makes any difference to this design question.)

Update:

Any more views on this? I hoped to get a simple answer saying that a particular method was definitely considered 'the best' - but as usual things aren't quite as simple as this. The Single-Table Inheritance option looks quite interesting, but there isn't much information on it that I can find very easily - I may post another question on this site about that.

I'm split between Olvak's answer and Corey's answer. Corey's answer gives a good reason why Olvak's isn't the best, but Olvak's answer gives good reasons why Corey's isn't the best! I never realised this could be so difficult...

Any further advice much appreciated!

Solution

I'd go for a single table for all references, plus additional tables like BookReferences and so on for metadata that doesn't apply to all reference types.

Searching and querying would not be more difficult - after all, you could just create a view which aggregates all the information as in the single-table solution, and then query that view further.
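As a rough illustration of the layout described above, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`refs`, `book_refs`, `journal_refs`), the view name `all_refs`, the columns and the sample rows are all made up for this example; they are assumptions, not something taken from the question.

```python
import sqlite3

# Hypothetical schema: every table, view and column name here is illustrative.
SCHEMA = """
CREATE TABLE refs (
    id     INTEGER PRIMARY KEY,
    kind   TEXT NOT NULL CHECK (kind IN ('book', 'journal')),
    title  TEXT NOT NULL,
    author TEXT NOT NULL
);
CREATE TABLE book_refs (
    ref_id         INTEGER PRIMARY KEY REFERENCES refs(id),
    publisher      TEXT NOT NULL,
    published_year INTEGER NOT NULL
);
CREATE TABLE journal_refs (
    ref_id        INTEGER PRIMARY KEY REFERENCES refs(id),
    journal_title TEXT NOT NULL,
    pages         TEXT NOT NULL
);
-- The view presents one wide row per reference, like the single-table design.
CREATE VIEW all_refs AS
SELECT r.id, r.kind, r.title, r.author,
       b.publisher, b.published_year,
       j.journal_title, j.pages
FROM refs AS r
LEFT JOIN book_refs    AS b ON b.ref_id = r.id
LEFT JOIN journal_refs AS j ON j.ref_id = r.id;
"""


def build_demo_db():
    """Create an in-memory database with the schema and two made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO refs VALUES (1, 'book', 'SQL Basics', 'A. Author')")
    conn.execute("INSERT INTO book_refs VALUES (1, 'Example Press', 2001)")
    conn.execute("INSERT INTO refs VALUES (2, 'journal', 'On Joins', 'B. Writer')")
    conn.execute("INSERT INTO journal_refs VALUES (2, 'Journal of Examples', '10-20')")
    conn.commit()
    return conn


def search_titles(conn, term):
    """Search every reference type at once by querying the view."""
    rows = conn.execute(
        "SELECT id, kind, title FROM all_refs WHERE title LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

Calling `search_titles(build_demo_db(), "Joins")` searches books and journal articles in one query, without the caller needing to know that the data lives in three tables.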

Having everything in one table with lots of nulls might seem like the simpler solution, but actually it will lead to lots of trouble. For example: With separate tables you can define which fields are required for every BookReference, but if everything is in one table, every field has to be nullable and therefore optional. It would also be easier to insert invalid data, like a book reference which also erroneously contains a non-null journal name.
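The difference in what the database can enforce could be sketched like this, again with Python's built-in `sqlite3` module. The names `refs`, `book_refs` and `wide_refs` and their columns are hypothetical, chosen only for this example.

```python
import sqlite3


def demo_constraints():
    """Compare what a per-type table and a single wide table will accept."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Split design: book-specific columns can be declared NOT NULL.
        CREATE TABLE refs (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE book_refs (
            ref_id         INTEGER PRIMARY KEY REFERENCES refs(id),
            publisher      TEXT NOT NULL,
            published_year INTEGER NOT NULL
        );
        -- Single-table design: type-specific columns must allow NULL.
        CREATE TABLE wide_refs (
            id             INTEGER PRIMARY KEY,
            title          TEXT NOT NULL,
            publisher      TEXT,
            published_year INTEGER,
            journal_title  TEXT
        );
    """)
    conn.execute("INSERT INTO refs (id, title) VALUES (1, 'SQL Basics')")

    # A book without a publisher violates NOT NULL in the split design.
    try:
        conn.execute(
            "INSERT INTO book_refs (ref_id, publisher, published_year) "
            "VALUES (1, NULL, 2001)"
        )
        split_rejected = False
    except sqlite3.IntegrityError:
        split_rejected = True

    # The wide table accepts a "book" with no publisher and a journal title.
    conn.execute(
        "INSERT INTO wide_refs (id, title, publisher, published_year, journal_title) "
        "VALUES (1, 'SQL Basics', NULL, NULL, 'Journal of Examples')"
    )
    wide_rows = conn.execute("SELECT COUNT(*) FROM wide_refs").fetchone()[0]
    return split_rejected, wide_rows
```

Here the incomplete book row is rejected by `book_refs`, while the equivalent invalid row is stored in `wide_refs` without complaint.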

Edit: Some people seem to fear joins. Don't fear the join! If you use the exact same join in several queries, that would indeed be tedious, but in that case the join should be defined in a view, and your queries should query that view. Views are really the basic abstraction in relational databases, and you should use them for the same reasons you use functions in code: to avoid repetition, and to encapsulate and create abstractions.

Edit: There are some comments regarding performance. It's very hard to predict the performance of DB schemas in advance, because it is often non-intuitive. For example, a join between several tables can easily be faster than a full table scan of a single table - it all depends on the type of query, the nature of the data, the available indexes and so on. Additionally, in many database systems you can use features like materialized views to optimize performance for different queries without compromising the logical model. "Denormalization for performance" is mostly cargo cult these days IMHO, unless you are Google or Flickr.
