What is the difference among idioms for serving a Matplotlib image with Flask?

Question

A Web search turns up several simple (undocumented) examples of (and good answers here about) how to dynamically serve Matplotlib figures using Flask; but there are features of these, and differences among them that puzzle me.

Some use low level IO and return tuples

io = StringIO.StringIO()
plt.savefig(io, format='png')
io.seek(0)
data = io.read()
return data, 200, {'Content-type': 'image/png'}

while several others use different IO APIs and return a Response

io = StringIO.StringIO()
canvas = FigureCanvas(fig)
canvas.print_png(io)
response = make_response(io.getvalue())
response.mimetype = 'image/png' # or response.headers['Content-Type'] = 'image/png'
return response

and yet others take a different approach to encoding and building the return value

io = StringIO.StringIO()
fig.savefig(io, format='png')
data = io.getvalue().encode('base64')
return html.format(data)

All of these seem to work; but I wonder if there are features of the approaches they share, or differences among them that have non-obvious consequences (e.g. for performance, or applicability to different scenarios).

First,

  • what is the role played by StringIO; is it the only way to prepare to serve an image (of any kind)?

In my sheltered Python life I've never seen it used before, and am unclear why it seems to be a required part of the process of serving a (binary?) file.

Second, I wonder about the different approaches these examples take to packaging their response; specifically

  • is there any significance to the use of seek plus read, vs. getvalue, or do these do essentially the same thing;
  • what governs the choice among approaches for what is returned: a tuple vs. html.format vs. a Response (with make_response); and, finally
  • why do some approaches set the Content-type explicitly, while others set the encoding (to 'base64')?

Is any one of these approaches considered the "best" or most current idiomatic (or at least Pythonic) approach?

Solution

what is the role played by StringIO; is it the only way to prepare to serve an image (of any kind)?

First of all, no, it is not the only way. The "classical" way would be to involve the file system:

  1. Let matplotlib create a plot.
  2. Persistently save the corresponding image data to a file in the file system (that involves context switches to the kernel which invokes system calls like write()).
  3. Read the contents of this file again (which lets the kernel read out the file system for you, via read()).
  4. Serve the contents to the client, in an HTTP response with well-defined data encoding as well as properly set headers.

Steps (2) and (3) involve file system interaction. That is, the kernel actually talks to hardware components. This takes time (with classical hard drives, writing just a few bytes to disk might take a couple of milliseconds because of the long access times). Now, the question is: do you need to have the image data persisted to disk? If the answer is "no", then you can skip the entire interaction with the file system and save some time by keeping the image data within the memory of your web application process. That is what StringIO is good for:

StringIO is a very generic tool in Python that provides file-like objects whose data is never handed to the kernel to be written to or read from the file system. The data is kept in memory. That is why StringIO objects are also called in-memory files.

The point is that plt.savefig() expects its first argument to be an object that behaves like a real file in the file system. StringIO provides such an object, but -- under the hood -- writes data to a buffer in the heap of the current process, and reads it from there again if requested.
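To make the "file-like object" idea concrete, here is a minimal sketch using only the standard library. The helper write_image is hypothetical and merely stands in for plt.savefig(); the placeholder bytes are not a valid image. It also assumes Python 3, where binary data such as PNG goes into io.BytesIO rather than the Python 2 StringIO.StringIO used in the snippets above.

```python
import io
import os
import tempfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def write_image(fileobj):
    """Hypothetical stand-in for plt.savefig(): writes bytes to any file-like object."""
    fileobj.write(PNG_SIGNATURE)
    fileobj.write(b"...image data would follow...")  # placeholder, not real pixels


# 1) In-memory file: the bytes never leave this process's memory.
memory_file = io.BytesIO()
write_image(memory_file)
in_memory_bytes = memory_file.getvalue()

# 2) Real file: the very same call, but the bytes pass through the file system.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "image.png")
    with open(path, "wb") as disk_file:
        write_image(disk_file)
    with open(path, "rb") as disk_file:
        on_disk_bytes = disk_file.read()

print(in_memory_bytes == on_disk_bytes)
```

The writer does not care which kind of object it receives; only the caller decides whether the data ends up on disk or stays in memory.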

Reading/writing small portions of data via StringIO takes nanoseconds or microseconds, whereas interaction with the file system is usually orders of magnitude slower.

Now, don't get me wrong: usually, the file system is fast enough, and an operating system has its own techniques to make file system interaction as fast as possible. The real question is, as stated before: do you need the image data persisted? If you don't care about accessing this image data at some point later on, then do not involve the file system. This is what the creators of the three snippets you show decided.

Replacing real file system interaction with StringIO for performance reasons can be a perfectly valid decision. However, your web application surely has other bottlenecks. For instance, using StringIO may reduce the request-response latency by, let's say, 5 ms. But does this actually matter given network latencies of 100 ms? Also, remember that a serious web application had better not be bothered with sending large file contents -- these are better served by a well-established web server, which can also make use of the sendfile() system call. In this case, it might again be better performance-wise to let matplotlib write the file to the file system and then tell your web server (via an X-Sendfile header) to do the rest. So performance is a complicated topic and might not be the strongest argument. But only you know your requirements!

is there any significance to the use of seek plus read, vs. getvalue, or do these do essentially the same thing

Essentially the same thing. Does not make a conceptual difference, does not make a (significant) performance difference.
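The one practical difference is that read() depends on the current position in the buffer, while getvalue() does not. A small sketch, assuming Python 3's io.BytesIO in place of the Python 2 StringIO.StringIO from the question:

```python
import io

buf = io.BytesIO()
buf.write(b"image-bytes")

# After writing, the position sits at the end, so read() finds nothing left.
at_end = buf.read()

# getvalue() ignores the position and returns the whole buffer.
via_getvalue = buf.getvalue()

# seek(0) rewinds to the start; read() then returns everything.
buf.seek(0)
via_read = buf.read()

print(at_end, via_getvalue, via_read)
```

This is why the first snippet in the question calls seek(0) before read(), while the snippets that use getvalue() can skip the seek.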

what governs the choice among approaches for what is returned: a tuple vs. html.format vs. a Response (with make_response); and, finally

No definite answer. There are many ways to get data to the client. There is no "correct" approach, just better or worse ones. Which approach is best depends strongly on the web framework. With Flask, make_response() is the canonical way to create a response object. html.format() might have some advantages I am not aware of -- you need to read about this yourself! But read on: I think there is a method built into Flask that fits your scenario perfectly.
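As a rough illustration of the first two options: Flask converts a (body, status, headers) tuple returned from a view into a response object, so the tuple form and the make_response() form end up sending the same thing. The sketch below assumes a recent Flask release; the route names and the placeholder PNG_BYTES are made up for the example and are not a real image.

```python
from flask import Flask, make_response

app = Flask(__name__)

# Placeholder payload (just the PNG signature), standing in for real image data.
PNG_BYTES = b"\x89PNG\r\n\x1a\n"


@app.route("/tuple.png")
def as_tuple():
    # Flask builds the response from this (body, status, headers) tuple.
    return PNG_BYTES, 200, {"Content-Type": "image/png"}


@app.route("/response.png")
def as_response():
    # Build the response object explicitly, then adjust it.
    response = make_response(PNG_BYTES)
    response.mimetype = "image/png"
    return response
```

The explicit form is mostly useful when the response needs further changes (extra headers, cookies) before it is returned.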

why do some approaches set the Content-type explicitly, while others set the encoding (to 'base64')?

There are proper and improper ways to send files to browsers via HTTP. Generally, an HTTP response should contain certain headers (also see What HTTP response headers are required). Just for your understanding, you might want to read about these details. Surely, binary data needs to be encoded with an encoding the client understands, and the encoding must be clarified in the response header. Also, a proper HTTP response should contain a MIME type (content type). The methods you have presented seem to not really take control of one or the other (no offense, quick & dirty examples often focus more on one thing than on the other).
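For context on the base64 variant: the third snippet in the question presumably does not send the image as its own HTTP response at all, but embeds it in an HTML page as a data URI. The template named html is not shown in the question, so HTML_TEMPLATE below is an assumption, as is the placeholder payload. On Python 3, the base64 module replaces the Python 2 idiom .encode('base64').

```python
import base64

# Placeholder payload (just the PNG signature), standing in for real image data.
PNG_BYTES = b"\x89PNG\r\n\x1a\n"

# Assumed shape of the question's `html` template: an <img> tag with a data URI.
HTML_TEMPLATE = '<img src="data:image/png;base64,{}">'

# base64 turns arbitrary bytes into ASCII text that can sit inside HTML.
encoded = base64.b64encode(PNG_BYTES).decode("ascii")
page = HTML_TEMPLATE.format(encoded)

print(page)
```

In that setup the response itself is an HTML document (Flask's default content type for a returned string is text/html), and the image type is declared inside the data URI instead of in a Content-Type header. The other two snippets return the raw image bytes, so they have to declare image/png in the header.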

I think you really should use Flask's send_file method, which takes care of some important things for you. This method takes a couple of arguments. I would explicitly define the MIME type via mimetype. The first argument can be a file-like object, so a StringIO object works fine. However, in this case you need to call seek(0) first:

Make sure that the file pointer is positioned at the start of data to send before calling send_file().

The following two approaches are semantically elegant (in my opinion) and should take proper care of encoding the file contents and setting HTTP response headers:

from flask import send_file 

1)

f = StringIO.StringIO()  # Python 2; on Python 3 use io.BytesIO()
plt.savefig(f, format='png', dpi=300)
f.seek(0)  # rewind so send_file() reads from the start
return send_file(f, mimetype='image/png')

2)

plt.savefig('image.png', dpi=300)
return send_file('image.png', mimetype='image/png')

In the second case your webserver (e.g. nginx) can, if properly configured, transmit the file for you.
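For readers on Python 3, the first approach might look roughly like the sketch below. It assumes recent Flask and Matplotlib releases, swaps io.BytesIO in for StringIO.StringIO (PNG data is binary), and builds the figure with matplotlib.figure.Figure instead of pyplot so that no global pyplot state is shared between requests. The route name and the plotted data are made up for the example.

```python
import io

from flask import Flask, send_file
from matplotlib.figure import Figure

app = Flask(__name__)


@app.route("/plot.png")
def plot_png():
    # Build the figure without pyplot; nothing global is left behind per request.
    fig = Figure()
    ax = fig.subplots()
    ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # illustrative data only

    # Render the PNG into an in-memory binary buffer.
    buf = io.BytesIO()
    fig.savefig(buf, format="png")

    # Rewind so send_file() starts reading at the first byte.
    buf.seek(0)
    return send_file(buf, mimetype="image/png")
```

With the development server running, requesting /plot.png should return the rendered image with a Content-Type of image/png.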
