Storing long binary (raw data) strings

Question

We are capturing a raw binary string that is variable in size (from 100k to 800k) and we would like to store these individual strings. They do not need to be indexed (duh) and there will be no queries on the contents of the field.

The quantity of these inserts will be very large (they are for archival purposes), let's say 10,000 per day. What is the best field type for large binary strings like these? Should it be text or blob or something else?
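
To put the numbers in the question in perspective, here is a rough estimate of how fast such an archive would grow. It is only a sketch: it assumes "k" means 1,000 bytes and that the rate stays flat at 10,000 inserts per day, neither of which the question states, and the function names are made up for illustration.

```python
# Back-of-the-envelope archive growth.
# Assumptions (not stated in the question): "k" means 1,000 bytes and the
# insert rate stays flat at 10,000 per day.
INSERTS_PER_DAY = 10_000
MIN_BLOB_BYTES = 100 * 1000
MAX_BLOB_BYTES = 800 * 1000


def daily_bytes(blob_bytes: int) -> int:
    """Raw payload volume added per day for a given blob size."""
    return INSERTS_PER_DAY * blob_bytes


def yearly_bytes(blob_bytes: int, days: int = 365) -> int:
    """Raw payload volume added per year for a given blob size."""
    return daily_bytes(blob_bytes) * days


if __name__ == "__main__":
    for label, size in (("smallest", MIN_BLOB_BYTES), ("largest", MAX_BLOB_BYTES)):
        print(label, daily_bytes(size), "bytes/day,", yearly_bytes(size), "bytes/year")
```

Under those assumptions the raw payload alone comes to somewhere between 1 GB and 8 GB per day, before any storage or encoding overhead.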

Recommended answer

As far as PostgreSQL is concerned, type text is out of the question. It is slower, uses more space and is more error-prone than bytea for the purpose.
There are basically 3 approaches:


  1. use type bytea (basically the pg equivalent of the SQL blob type)

  2. use "large objects"

  3. store blobs as files in the filesystem and only store the filename in the database.

Each has its own advantages and disadvantages.


  1. is rather simple to handle but needs the most disk space. Some decoding and encoding is required, which makes it also slow-ish. Backups grow rapidly in size!

  2. is slightly awkward in handling, but you have your own infrastructure to manipulate the blobs - if you should need that. And you can more easily make separate backups.

  3. is by far the fastest way and uses the least disk space. But it does not provide the referential integrity that you get when you store inside the database.
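
One plausible reading of the "decoding and encoding" cost of bytea is the text representation used when values travel as text: in PostgreSQL's hex format a bytea value is written as `\x` followed by two hex digits per byte. The sketch below only illustrates that representation; it says nothing about on-disk size, it assumes the client is not using binary transfer, and `to_bytea_hex` / `from_bytea_hex` are hypothetical helper names, not part of any driver.

```python
# Sketch of the text-format encoding cost for bytea.
# Assumption: values are exchanged in PostgreSQL's hex text format
# ("\x" followed by two hex digits per byte), not in binary transfer mode.
import os


def to_bytea_hex(raw: bytes) -> str:
    """Encode raw bytes as a hex-format bytea text value."""
    return "\\x" + raw.hex()


def from_bytea_hex(text: str) -> bytes:
    """Decode a hex-format bytea text value back to raw bytes."""
    if not text.startswith("\\x"):
        raise ValueError("not a hex-format bytea value")
    return bytes.fromhex(text[2:])


if __name__ == "__main__":
    payload = os.urandom(100_000)  # stand-in for one captured string
    encoded = to_bytea_hex(payload)
    assert from_bytea_hex(encoded) == payload  # round trip is lossless
    print(len(payload), "raw bytes ->", len(encoded), "characters as text")
```

Every byte becomes two characters, so the text form is roughly twice the size of the payload, and both directions need a pass over the data.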

I have a number of implementations like that for image files: store a small thumbnail in a bytea field for referential integrity and quick reference. Store the original image as a file in the filesystem. Of course, you need to put some thought into when and how to delete outdated files, how to back up the external files and such.
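
A minimal sketch of the filesystem approach might look like the following. Several things here are assumptions rather than anything from the answer: `sqlite3` in memory stands in for PostgreSQL so the example runs without a server, the `archive` table and the function names are invented, and naming files by their SHA-256 digest is just one possible scheme. The `find_orphans` helper hints at the cleanup problem mentioned above, since the database cannot enforce that files and rows stay in sync.

```python
# Sketch: keep blobs as files and store only the filename in the database.
# Assumptions: sqlite3 is a stand-in for PostgreSQL; the table layout and
# the SHA-256 file naming are illustrative choices, not from the answer.
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_archive() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical archive table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archive (filename TEXT PRIMARY KEY, size_bytes INTEGER NOT NULL)"
    )
    return conn


def store_blob(conn: sqlite3.Connection, blob_dir: Path, data: bytes) -> str:
    """Write data to a file named after its digest and record the name."""
    name = hashlib.sha256(data).hexdigest()
    path = blob_dir / name
    if not path.exists():
        tmp = path.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)  # rename into place once the file is fully written
    conn.execute(
        "INSERT OR IGNORE INTO archive (filename, size_bytes) VALUES (?, ?)",
        (name, len(data)),
    )
    conn.commit()
    return name


def load_blob(conn: sqlite3.Connection, blob_dir: Path, name: str) -> bytes:
    """Read a blob back, but only if the database knows about it."""
    row = conn.execute(
        "SELECT filename FROM archive WHERE filename = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return (blob_dir / row[0]).read_bytes()


def find_orphans(conn: sqlite3.Connection, blob_dir: Path) -> set:
    """Return names of files on disk that no database row references."""
    known = {r[0] for r in conn.execute("SELECT filename FROM archive")}
    on_disk = {p.name for p in blob_dir.iterdir() if p.is_file()}
    return on_disk - known


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        conn = open_archive()
        name = store_blob(conn, Path(d), b"\x00\x01raw capture")
        print(name, len(load_blob(conn, Path(d), name)))
```

Writing the file before inserting the row means a crash can leave an unreferenced file but never a row pointing at a missing file; the reverse order would trade one failure mode for the other.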
