Why does OpenURI treat files under 10 KB in size as StringIO?

Question

I fetch images from a remote website with open-uri and persist them on my local server within my Ruby on Rails application. Most of the images display without a problem, but some just don't show up.

After a very long debugging session I finally found out (thanks to this blog post) that the Buffer class in the open-uri library treats files under 10 KB in size as StringIO objects instead of tempfiles.

I managed to get around this problem by following Micah Winkelspecht's answer to this StackOverflow question, putting the following code in a file in my initializers:

require 'open-uri'
# Don't allow downloaded files to be created as StringIO. Force a tempfile to be created.
OpenURI::Buffer.send :remove_const, 'StringMax' if OpenURI::Buffer.const_defined?('StringMax')
OpenURI::Buffer.const_set 'StringMax', 0

This works as expected so far, but I keep wondering why this code was put into the library in the first place. Does anybody know a specific reason why files under 10 KB in size get treated as StringIO?

Since the above code changes this behaviour globally for my entire application, I just want to make sure that I am not breaking anything else.
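If the global override is the worry, one alternative is to leave StringMax alone and normalize each download where it is used. The following is only a sketch of that idea: `ensure_tempfile` is a hypothetical helper, not part of open-uri, and the StringIO below merely stands in for a small response body.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper (not part of open-uri): returns a Tempfile holding
# the same bytes as +io+, whether +io+ is a StringIO or already a Tempfile.
def ensure_tempfile(io, basename = 'download')
  return io if io.is_a?(Tempfile)

  file = Tempfile.new(basename)
  file.binmode          # image data must not be newline-translated
  file.write(io.read)   # copy the in-memory bytes to disk
  file.flush
  file.rewind           # leave the file ready for the next reader
  file
end

# Stand-in for a small body that open-uri would hand back as a StringIO.
small_body = StringIO.new("\x89PNG fake image bytes".b)
file = ensure_tempfile(small_body)
puts file.class
```

Because the conversion happens only at the call site, other code in the application that relies on the default open-uri behaviour is left untouched.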

Recommended answer

In network programming, you allocate a buffer of a reasonably large size and send and read units of data that fit in that buffer. When dealing with files (or things sometimes called BLOBs), however, you cannot assume that the data will fit into your buffer, so these large streams of data need special handling.

(The units of data that fit into the buffer are sometimes called packets. Strictly speaking, though, packets belong to layer 3, just as frames belong to layer 2. Since this is happening at layer 7, they are better called messages.)

For replies larger than 10 KB, the open-uri library takes on the extra overhead of writing to a stream object (a tempfile). When a reply is under the StringMax size, it simply keeps the string in memory as a StringIO, since it knows the data fits in the buffer.
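The threshold behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the open-uri source: `SketchBuffer` and `STRING_MAX` are made-up names, and the value 10,240 is assumed to correspond to the 10 KB limit mentioned in the question.

```ruby
require 'stringio'
require 'tempfile'

# Simplified sketch of a size-threshold buffer. SketchBuffer and STRING_MAX
# are illustrative names, not the real open-uri implementation.
class SketchBuffer
  STRING_MAX = 10_240 # bytes; assumed equivalent of the 10 KB limit

  attr_reader :io, :size

  def initialize
    @io = StringIO.new
    @size = 0
  end

  # Append a chunk; switch to a Tempfile once the total exceeds STRING_MAX.
  def <<(chunk)
    @io << chunk
    @size += chunk.bytesize
    spill_to_tempfile if @io.is_a?(StringIO) && @size > STRING_MAX
    self
  end

  private

  def spill_to_tempfile
    file = Tempfile.new('sketch-buffer')
    file.binmode
    file << @io.string # copy what was buffered in memory so far
    @io = file
  end
end

small = SketchBuffer.new
small << 'a' * 1_000
puts small.io.class

large = SketchBuffer.new
11.times { large << 'a' * 1_000 }
puts large.io.class
```

Small replies never touch the filesystem, which is the saving the answer refers to. Setting the threshold to 0, as the initializer in the question does, means even the first byte forces the switch to a tempfile.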
