MySQL: Store array of data in a single column

Question

Thanks in advance for your help.

Well, this is my situation. I have a web system that performs some noise-related calculations based on a sample recorded by a sonometer. Originally, the database only stored the results of these calculations. But now I have been asked to also store the samples themselves. Each sample is just a list of 300 or 600 numbers, each with one decimal place.

So, the simplest approach I have come up with is to add a column to the table that stores all the calculations for a given sample. This column would contain the list of numbers.

My question, then: what is the best way to store this list of numbers in a single column?

Considerations:

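  • It would be nice if the list could be read by both PHP and JavaScript without much extra trouble.
  • The list is only useful when retrieved as a whole, which is why I'd rather not normalize it. Also, the calculations made on the list are fairly complex and are already coded in PHP and JavaScript, so I won't be running any SQL queries on individual elements of a given list.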

Also, if there are better approaches than storing it, I would love to know about them.

Thanks a lot and have a good day/evening :)

Recommended answer

First off, you really don't want to do that. A column in an RDBMS is meant to be atomic, in that it contains one and only one piece of information. Trying to store more than one piece of data in a column is a violation of first normal form.

If you absolutely must do it, then you need to convert the data into a form that can be stored as a single item of data, typically a string. You could use PHP's serialize() mechanism, XML (if the data happens to be a document tree), json_encode(), etc.
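As a rough illustration of the "single string" idea, here is a minimal Python sketch of a JSON round trip. The sample values are made up, and Python merely stands in for the PHP and JavaScript code in the question; the same text can be read with PHP's json_decode() or JavaScript's JSON.parse().

```python
import json

# Hypothetical sample: 300 readings with one decimal place each.
sample = [round(40.0 + (i % 50) * 0.1, 1) for i in range(300)]

# Encode the whole list as one compact string, suitable for a single text column.
encoded = json.dumps(sample, separators=(",", ":"))

# Decode it again after reading the column back.
decoded = json.loads(encoded)

print(len(encoded), "characters for", len(decoded), "readings")
```

JSON is used here only because both languages mentioned in the question can read it natively; it does nothing to address the querying problem described next.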

But how do you query such data effectively? The answer is you can't.

Also, if someone else takes over your project at a later date, you're really going to annoy them, because serialized data in a database is horrid to work with. I know because I've inherited such projects.

Did I mention you really don't want to do that? You need to rethink your design so that the data can more easily be stored in terms of atomic rows. Use another table for this data, for example, and use foreign keys to relate it to the master record. They're called relational databases for a reason.
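The separate-table design can be sketched as follows. This uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the table names, column names, and values are all hypothetical. The position column is an added assumption: without it, the rows have no guaranteed order when the list is read back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
conn.executescript("""
    CREATE TABLE measurement (
        id     INTEGER PRIMARY KEY,
        result REAL NOT NULL              -- the calculated result already stored
    );
    CREATE TABLE sample_value (
        measurement_id INTEGER NOT NULL REFERENCES measurement(id),
        position       INTEGER NOT NULL,  -- keeps the original order of readings
        value          REAL NOT NULL,
        PRIMARY KEY (measurement_id, position)
    );
""")

readings = [61.2, 58.9, 63.4, 60.0]  # made-up sonometer readings

# Insert the master record, then one row per reading.
cur = conn.execute("INSERT INTO measurement (result) VALUES (?)", (60.9,))
measurement_id = cur.lastrowid
conn.executemany(
    "INSERT INTO sample_value (measurement_id, position, value) VALUES (?, ?, ?)",
    [(measurement_id, i, v) for i, v in enumerate(readings)],
)

# Retrieve the whole list again, in its original order.
rows = conn.execute(
    "SELECT value FROM sample_value WHERE measurement_id = ? ORDER BY position",
    (measurement_id,),
).fetchall()
restored = [row[0] for row in rows]
```

Reading the list back is still a single query, so retrieving the sample "as a whole" costs little extra application code.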

UPDATE: I've been asked about data storage requirements, as in whether a single row would be cheaper in terms of storage. The answer is: in typical cases, no, it isn't, and in cases where it is, the price you pay for it isn't worth paying.

If you use a two-column dependent table (one column for the foreign key of the record the sample belongs to, one for a single sample), then each row will require at worst 16 bytes (8 bytes for a long integer key column, 8 bytes for a double-precision floating-point number). For 100 samples that's 1600 bytes (ignoring database overhead).

For a serialized string, you store in the best case 1 byte per character in the string. You can't know how long the string is going to be, but if we assume 100 samples whose values all, by some contrived coincidence, fall between 10000.00 and 99999.99, with only ever 2 digits after the decimal point, then you're looking at 8 bytes per sample. In this case, all you've saved is the overhead of the foreign keys, so the amount of storage required comes out at 800 bytes.

That, of course, is based on a lot of assumptions, such as the character encoding always being 1 byte per character, the strings that make up the samples never being longer than 8 characters, etc.

But of course there's also the overhead of whatever mechanism you use to serialize the data. The absolute simplest method, CSV, means adding a comma between every sample. That adds n-1 bytes to the stored string. So the above example would now be 899 bytes, and that's with the simplest encoding scheme. JSON, XML, and even PHP serialization all add more overhead characters than this, and with the more verbose formats you'll soon have strings that are longer than 1600 bytes. And all this is with the assumption of 1-byte character encoding.
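The arithmetic above can be checked with a short Python sketch. The value 12345.67 is an arbitrary stand-in for "any value between 10000.00 and 99999.99", and the sizes ignore database overhead, as the answer does.

```python
import json
import struct

n = 100
samples = [12345.67] * n  # hypothetical values, each 8 characters as text

# Dependent-table row: 8-byte integer key plus 8-byte double.
row_bytes = struct.calcsize("<qd")
table_bytes = n * row_bytes

# Best case for strings: exactly 8 one-byte characters per sample.
digits_only = "".join(f"{s:.2f}" for s in samples)

# CSV adds n - 1 commas.
csv_text = ",".join(f"{s:.2f}" for s in samples)

# Compact JSON adds the same commas plus two brackets.
json_text = json.dumps(samples, separators=(",", ":"))

sizes = {
    "table": table_bytes,
    "digits only": len(digits_only),
    "csv": len(csv_text),
    "json": len(json_text),
}
print(sizes)
```

These figures hold only under the same contrived assumptions: fixed-width values and a 1-byte-per-character encoding. Formats that wrap every element in tags or type markers grow considerably faster than the compact JSON shown here.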

If you need to index the samples, the data requirements will grow even more disproportionately against strings, because a string index is a lot more expensive in terms of storage than a floating-point column index would be.

And of course, if your samples start adding more digits, the data storage goes up further. 39281.3392810 cannot be stored in 8 bytes as a string, even in the best case.

And if the data is serialized, the database can't manipulate it. You can't sort the samples or do any kind of mathematical operations on them; the database doesn't even know they're numbers!
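To make the contrast concrete, here is a small sketch, again using Python's sqlite3 as a stand-in for MySQL and entirely hypothetical table names and values. It stores the same readings once as rows and once as a serialized string in a plain text column.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_value (measurement_id INTEGER, value REAL)")
conn.execute("CREATE TABLE sample_blob (measurement_id INTEGER, payload TEXT)")

readings = [61.2, 58.9, 63.4, 60.0]  # made-up readings
conn.executemany("INSERT INTO sample_value VALUES (1, ?)", [(v,) for v in readings])
conn.execute("INSERT INTO sample_blob VALUES (1, ?)", (json.dumps(readings),))

# One row per reading: the database can aggregate and sort directly.
(peak,) = conn.execute(
    "SELECT MAX(value) FROM sample_value WHERE measurement_id = 1"
).fetchone()
ordered = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM sample_value WHERE measurement_id = 1 ORDER BY value"
    )
]

# Serialized: the column is just text to the database, so the application
# has to fetch the whole string and decode it before doing any arithmetic.
(payload,) = conn.execute(
    "SELECT payload FROM sample_blob WHERE measurement_id = 1"
).fetchone()
peak_from_blob = max(json.loads(payload))
```

Both routes reach the same peak value, but only the first lets the database do the work or use an index.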

To be honest, though, storage is ridiculously cheap these days; you can buy multi-terabyte drives for tiny sums. Is storage really that critical? Unless you have hundreds of millions of records, I doubt it is.

You might want to check out a book called SQL Antipatterns.
