Storing structured data in a database column?


Problem description


I have been having a debate with a coworker about whether it would be a good idea to store structured data (such as XML or JSON) in a database column instead of creating subtables. For example, say we need to store information about questions. The two types of questions are multiple choice and rating (rate from 1 to 10, for example). I would typically create a structure like the one below:

Table                   |   Columns
------------------------------------------------------
Question                | ID, Title, QuestionTypeId
Question_MultipleChoice | QuestionId, Choice
Question_Rating         | QuestionId, Min, Max
QuestionTypes           | ID, TypeName
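
For concreteness, here is a minimal sketch of that layout using Python's built-in sqlite3 module. The column types, the sample rows, and the helper names build_normalized_db and load_question are assumptions made for illustration; only the table and column names come from the listing above.

```python
import sqlite3

# Assumed DDL: table and column names follow the listing, types are guesses.
SCHEMA = """
CREATE TABLE QuestionTypes (ID INTEGER PRIMARY KEY, TypeName TEXT NOT NULL);
CREATE TABLE Question (
    ID INTEGER PRIMARY KEY,
    Title TEXT NOT NULL,
    QuestionTypeId INTEGER NOT NULL REFERENCES QuestionTypes(ID)
);
CREATE TABLE Question_MultipleChoice (
    QuestionId INTEGER NOT NULL REFERENCES Question(ID),
    Choice TEXT NOT NULL
);
CREATE TABLE Question_Rating (
    QuestionId INTEGER PRIMARY KEY REFERENCES Question(ID),
    "Min" INTEGER NOT NULL,
    "Max" INTEGER NOT NULL
);
"""


def build_normalized_db():
    """Create an in-memory database with two made-up sample questions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO QuestionTypes VALUES (?, ?)",
                     [(1, "MultipleChoice"), (2, "Rating")])
    conn.executemany("INSERT INTO Question VALUES (?, ?, ?)",
                     [(1, "Favorite color?", 1), (2, "Rate our service", 2)])
    conn.executemany("INSERT INTO Question_MultipleChoice VALUES (?, ?)",
                     [(1, "Red"), (1, "Blue")])
    conn.execute("INSERT INTO Question_Rating VALUES (?, ?, ?)", (2, 1, 10))
    return conn


def load_question(conn, question_id):
    """Read one question: a JOIN for the type, then a lookup in the subtable."""
    title, type_name = conn.execute(
        "SELECT q.Title, t.TypeName FROM Question q "
        "JOIN QuestionTypes t ON t.ID = q.QuestionTypeId WHERE q.ID = ?",
        (question_id,),
    ).fetchone()
    question = {"title": title, "type": type_name}
    if type_name == "MultipleChoice":
        rows = conn.execute(
            "SELECT Choice FROM Question_MultipleChoice "
            "WHERE QuestionId = ? ORDER BY rowid",
            (question_id,),
        )
        question["choices"] = [row[0] for row in rows]
    else:
        low, high = conn.execute(
            'SELECT "Min", "Max" FROM Question_Rating WHERE QuestionId = ?',
            (question_id,),
        ).fetchone()
        question["min"], question["max"] = low, high
    return question
```

Every new question type in this layout means another subtable and another branch in load_question.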

My co-worker believes it would be better to store information in a single Question table with a column for subinfo. For example:

Question
----------------
ID
Title
SubInfo  <-- JSON

My co-worker's reasoning is that this would make queries simpler and possibly faster by avoiding JOINs. Are there reasons that this type of database structure should be avoided? It seems that if you need to query based on the data in the SubInfo column this would be a bad idea, but if that is not needed, is this a reasonable database structure?
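
A rough sketch of the single-table variant, again with Python's sqlite3 and json modules, might look like the following. The shape of the SubInfo JSON, the sample rows, and the helper names are all assumptions; the point is only to show where the trade-off lands.

```python
import json
import sqlite3


def build_json_db():
    """Single Question table; SubInfo holds type-specific details as JSON text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Question ("
        "ID INTEGER PRIMARY KEY, Title TEXT NOT NULL, SubInfo TEXT NOT NULL)"
    )
    # Made-up sample rows; the keys inside SubInfo are an assumed shape.
    rows = [
        (1, "Favorite color?", {"type": "MultipleChoice", "choices": ["Red", "Blue"]}),
        (2, "Rate our service", {"type": "Rating", "min": 1, "max": 10}),
    ]
    conn.executemany(
        "INSERT INTO Question VALUES (?, ?, ?)",
        [(qid, title, json.dumps(sub_info)) for qid, title, sub_info in rows],
    )
    return conn


def load_json_question(conn, question_id):
    """Reading one question needs no JOIN: one row, one json.loads."""
    title, sub_info = conn.execute(
        "SELECT Title, SubInfo FROM Question WHERE ID = ?", (question_id,)
    ).fetchone()
    return {"title": title, **json.loads(sub_info)}


def find_question_ids_by_type(conn, type_name):
    """Filtering on SubInfo: without database-side JSON support,
    every row has to be fetched and decoded in the application."""
    ids = []
    for question_id, sub_info in conn.execute(
        "SELECT ID, SubInfo FROM Question ORDER BY ID"
    ):
        if json.loads(sub_info).get("type") == type_name:
            ids.append(question_id)
    return ids
```

Reads by ID get simpler, while any query on the contents of SubInfo turns into a full scan unless the database itself can look inside the JSON.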

Solution

Speaking very personally, surveys are one case where I think normalizing nothing and storing the JSON pretty much as-is is the better option.

If you normalize instead, you're going to end up with all sorts of bizarre use cases that you'll have to manage down the road. In addition to tidy multiple-choice questions of all sorts, you'll also need to manage the "Other" answer in them, conditional questions, conditional groups of questions; the list goes on and on. What's more, surveys are, like other forms of data, subject to change, and things go from gawdawful to nuclear when they do.

The merit of JSON is that, since surveys are conceptually independent from one another, you've little to no need for referential integrity from one to the next, so you might as well store the entire tree of questions and options as one JSON blob, and worry about formatting it in your app.

The same goes for each submitted answer, for that matter: take the original blob, mark the relevant answer as selected and so forth within it, and store the resulting JSON as-is, rather than storing references to the original questions alongside whatever was answered. This will allow you to readily keep track of what users actually answered, as opposed to whatever the current version of the survey says, and to do so irrespective of how much the survey has diverged since it was originally answered.
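
To make the snapshot idea concrete, here is a small sketch in Python with sqlite3. The survey tree, the "selected" and "value" keys, the Response table, and the helper names are all hypothetical; any shape that keeps the question text and the answer together in one blob would do.

```python
import copy
import json
import sqlite3

# Hypothetical survey tree: the whole survey lives in one JSON-friendly structure.
SURVEY = {
    "title": "Customer survey",
    "questions": [
        {"id": "color", "type": "MultipleChoice", "text": "Favorite color?",
         "choices": [{"label": "Red"}, {"label": "Blue"}, {"label": "Other"}]},
        {"id": "service", "type": "Rating", "text": "Rate our service",
         "min": 1, "max": 10},
    ],
}


def mark_answers(survey, answers):
    """Return a copy of the survey tree with the user's answers marked in it."""
    snapshot = copy.deepcopy(survey)  # never mutate the live survey
    for question in snapshot["questions"]:
        answer = answers.get(question["id"])
        if answer is None:
            continue
        if question["type"] == "MultipleChoice":
            for choice in question["choices"]:
                choice["selected"] = choice["label"] == answer
        else:
            question["value"] = answer
    return snapshot


def store_response(conn, user_id, snapshot):
    """Store the answered snapshot as-is, with no references to question rows."""
    conn.execute("INSERT INTO Response (UserId, Answered) VALUES (?, ?)",
                 (user_id, json.dumps(snapshot)))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Response ("
             "ID INTEGER PRIMARY KEY, UserId INTEGER NOT NULL, Answered TEXT NOT NULL)")
store_response(conn, 42, mark_answers(SURVEY, {"color": "Blue", "service": 8}))

# The live survey changes later: the "Red" option is dropped.
SURVEY["questions"][0]["choices"].pop(0)

# The stored response still shows exactly what the user was asked and answered.
stored = json.loads(
    conn.execute("SELECT Answered FROM Response WHERE UserId = ?", (42,)).fetchone()[0]
)
```

Because each response carries its own copy of the questions, later edits to the survey cannot rewrite history; the cost is some duplicated text per response.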

If you need to mine the answers later, note that Postgres lets you index JSON using GIN indexes on the whole field (for jsonb columns), and BTREE indexes on expressions.
