Canvas tainted by CORS data and S3


Problem description

My application is displaying images stored in AWS S3 (in a private bucket for security reasons).

To allow users to see the images from their browser I generate signed URLs like https://s3.eu-central-1.amazonaws.com/my.bucket/stuff/images/image.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=...&X-Amz-Date=20170701T195504Z&X-Amz-Expires=900&X-Amz-Signature=bbe277...3358e8&X-Amz-SignedHeaders=host.
This is working perfectly with <img src="S3URL" />: the images are correctly displayed.
I can even directly view the images in another tab by copy/pasting their URL.

I'm also generating PDFs that embed these images, which first need to be transformed with a canvas (resized and watermarked).

But the library I use for resizing runs into trouble:

Failed to execute 'getImageData' on 'CanvasRenderingContext2D':
The canvas has been tainted by cross-origin data.

We are indeed in a cross-origin context, but I've set everything up so that the images can be displayed to the user, and that part works.
So I'm not sure I understand the reason for this error: is this another CORS security layer, where the browser fears that I might alter the image for a malicious purpose?

I've tried to set a permissive CORS configuration on the S3 bucket:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <CORSRule>
        <AllowedOrigin>*</AllowedOrigin>
        <AllowedMethod>GET</AllowedMethod>
        <AllowedMethod>POST</AllowedMethod>
        <AllowedMethod>PUT</AllowedMethod>
        <MaxAgeSeconds>3000</MaxAgeSeconds>
        <AllowedHeader>*</AllowedHeader>
    </CORSRule>
</CORSConfiguration>

I also set img.crossOrigin = "" or img.crossOrigin = "Anonymous" on the client side, but then I get:

Access to Image at 'https://s3.eu-central-1.amazonaws.com/...'
from origin 'http://localhost:5000' has been blocked by CORS policy:
No 'Access-Control-Allow-Origin' header is present on the requested resource.
Origin 'http://localhost:5000' is therefore not allowed access.

What AWS/S3 and/or client-side configuration might be missing?

Recommended answer

One workaround here is to prevent the browser from caching the downloaded object. This seems to stem from arguably incorrect behavior on the part of S3 that interacts with the way Chrome handles cached objects. I recently answered a similar question on Server Fault, and you can find additional details there.

The problem seems to arise when you fetch an object from S3 from simple HTML (like an <img> tag) and then fetch the same object again in a cross-origin context.

Chrome caches the result of the first request, and then uses that cached response instead of making a new request the second time. When it examines the cached object, there's no Access-Control-Allow-Origin header, because it was cached from a request that wasn't subject to CORS rules... so when that first request was made, the browser didn't send an Origin header. Because of that, S3 didn't respond with an Access-Control-Allow-Origin header (or any CORS-related headers).

The root of the problem seems related to the HTTP Vary: response header, which is related to caching.

A web server (S3 in this case) can use the Vary: response header to signal to the browser that the server is capable of producing more than one representation of the object being returned -- and if the browser would vary an attribute of the request, the response might differ. When the browser is considering using a cached object, it should check whether the object is valid in the new context, before concluding that the cached object is suited to the current need.

Indeed, when you send an Origin request header to S3, you get a response that includes Vary: Origin. This tells the browser that if the origin sent in the request had been a different value, the response might also have been different -- for example, because not all origins might be allowed.

The first part of the underlying problem is that S3 -- arguably -- should always return Vary: Origin whenever CORS is configured on the bucket, even if the browser didn't send an origin header, because Vary can be specified against a header you didn't actually include in the request, to tell you that if you had included it, the response might have differed. But, it doesn't do that, when Origin isn't present.

The second part of the problem is that Chrome -- when it consults its internal cache -- sees that it already has a copy of the object. The response that seeded the cache did not include Vary, so Chrome assumes this object is also perfectly valid for the CORS request. Clearly, it isn't, since when Chrome tries to use the object, it finds that the cross-origin response headers are missing. Presumably, had Chrome received a Vary: Origin response from S3 on the original request, it would have realized that its provisional request headers for the second request included Origin:, so it would correctly go and fetch a different copy of the object. If it did that, the problem would go away -- as we have illustrated by setting Cache-Control: no-cache on the object, preventing Chrome from caching it. But, it doesn't.
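
The interaction described above can be reproduced with a toy model. This is only a sketch: makeCache, s3Respond, simulate and the alwaysVary switch are hypothetical names invented for illustration, the URL is a placeholder, and neither Chrome's real cache nor S3 is implemented this way. It shows that a cache which honors Vary reuses the non-CORS response unless the first response carried Vary: Origin.

```javascript
// Toy HTTP cache: entries are keyed by URL, and Vary is honored on lookup.
function makeCache() {
  const entries = new Map();
  return {
    store(url, requestHeaders, response) {
      entries.set(url, { requestHeaders, response });
    },
    lookup(url, requestHeaders) {
      const entry = entries.get(url);
      if (!entry) return null;
      const vary = (entry.response.headers["vary"] || "")
        .split(",")
        .map((h) => h.trim().toLowerCase())
        .filter(Boolean);
      // Reusable only if every header named in Vary has the same value
      // now as it had in the request that produced the cached entry.
      const matches = vary.every(
        (h) => entry.requestHeaders[h] === requestHeaders[h]
      );
      return matches ? entry.response : null;
    },
  };
}

// Toy S3: CORS headers (and Vary: Origin) are sent only when the request
// carries Origin. alwaysVary models the behavior the answer argues for.
function s3Respond(requestHeaders, { alwaysVary = false } = {}) {
  const headers = {};
  if (requestHeaders.origin) {
    headers["access-control-allow-origin"] = "*";
    headers["vary"] = "Origin";
  } else if (alwaysVary) {
    headers["vary"] = "Origin";
  }
  return { status: 200, headers };
}

function simulate(options) {
  const cache = makeCache();
  const url = "https://bucket.example/image.png"; // placeholder URL

  // 1. Plain <img> request: no Origin header; the response is cached.
  const plainRequest = {};
  cache.store(url, plainRequest, s3Respond(plainRequest, options));

  // 2. crossOrigin request for the same URL: Origin header is present.
  const corsRequest = { origin: "http://localhost:5000" };
  const cached = cache.lookup(url, corsRequest);
  const response = cached || s3Respond(corsRequest, options);

  return {
    servedFromCache: cached !== null,
    corsAllowed: "access-control-allow-origin" in response.headers,
  };
}

console.log(simulate({ alwaysVary: false })); // { servedFromCache: true, corsAllowed: false }
console.log(simulate({ alwaysVary: true }));  // { servedFromCache: false, corsAllowed: true }
```

In the first run the cached response is reused and the CORS check fails; in the second, the Vary mismatch forces a fresh request that includes the CORS headers.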

So, we work around this by setting Cache-Control: no-cache on the object in S3, so that Chrome won't cache the first one, and will make the correct CORS request for the second one, instead of trying to use the cached copy, which will fail.

Note that if you want to avoid updating your objects in S3 to include the Cache-Control: no-cache response header, there are two other ways to solve this without adding the header to the objects at rest in S3:

The S3 API respects a response-cache-control=no-cache parameter passed in the query string. Adding this to the signed URL will direct S3 to add the header to the response, regardless of the Cache-Control metadata value stored with the object (or lack thereof). You can't simply append this to the query string -- you have to add it as part of the URL signing process. But once you add that to your code, your objects will be returned with Cache-Control: no-cache in the response headers.
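
To show why the parameter has to go through the signing step, here is a minimal Node.js sketch of Signature Version 4 query-string signing using only the built-in crypto module. The function names (presignGet, encodeRfc3986, hmac) and the extraQuery option are made up for this example, the credentials are AWS's documented placeholder values, and session tokens and other edge cases are not handled. In a real application you would more likely let the AWS SDK do the presigning (its GetObject parameters include ResponseCacheControl) than hand-roll this.

```javascript
const crypto = require("node:crypto");

// SigV4 wants RFC 3986 escaping; encodeURIComponent leaves !'()* unescaped.
function encodeRfc3986(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

function hmac(key, data) {
  return crypto.createHmac("sha256", key).update(data, "utf8").digest();
}

// Presign a GET for `path` on `host`. Everything in `extraQuery` becomes part
// of the canonical query string, so it is covered by the signature.
function presignGet({ accessKeyId, secretAccessKey, region, host, path,
                      expires, now, extraQuery = {} }) {
  // 2017-07-01T19:55:04.000Z -> 20170701T195504Z
  const amzDate = now.toISOString().replace(/[-:]/g, "").replace(/\.\d{3}Z$/, "Z");
  const day = amzDate.slice(0, 8);
  const scope = `${day}/${region}/s3/aws4_request`;
  const params = {
    ...extraQuery,
    "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
    "X-Amz-Credential": `${accessKeyId}/${scope}`,
    "X-Amz-Date": amzDate,
    "X-Amz-Expires": String(expires),
    "X-Amz-SignedHeaders": "host",
  };
  // Canonical query string: parameters sorted by name, each side escaped.
  const query = Object.keys(params).sort()
    .map((k) => `${encodeRfc3986(k)}=${encodeRfc3986(params[k])}`)
    .join("&");
  const canonicalPath = path.split("/").map(encodeRfc3986).join("/");
  const canonicalRequest = [
    "GET", canonicalPath, query, `host:${host}`, "", "host", "UNSIGNED-PAYLOAD",
  ].join("\n");
  const stringToSign = [
    "AWS4-HMAC-SHA256", amzDate, scope,
    crypto.createHash("sha256").update(canonicalRequest, "utf8").digest("hex"),
  ].join("\n");
  // Derive the signing key: date -> region -> service -> "aws4_request".
  let key = hmac("AWS4" + secretAccessKey, day);
  for (const part of [region, "s3", "aws4_request"]) key = hmac(key, part);
  const signature = hmac(key, stringToSign).toString("hex");
  return `https://${host}${canonicalPath}?${query}&X-Amz-Signature=${signature}`;
}

const common = {
  accessKeyId: "AKIAIOSFODNN7EXAMPLE",                         // placeholder
  secretAccessKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY", // placeholder
  region: "eu-central-1",
  host: "s3.eu-central-1.amazonaws.com",
  path: "/my.bucket/stuff/images/image.png",
  expires: 900,
  now: new Date("2017-07-01T19:55:04Z"),
};

const noCacheUrl = presignGet({
  ...common,
  extraQuery: { "response-cache-control": "no-cache" },
});
console.log(noCacheUrl);
```

Because response-cache-control is part of the canonical query string, the signature computed with it differs from the one computed without it, which is why appending the parameter to an already signed URL would be rejected.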

Or, if you can generate these two signed URLs for the same object separately when rendering the page, simply change the expiration time on one of the signed URLs relative to the other. Make it one minute longer, or something along those lines. Using different expiration times forces the two signed URLs to differ, and two URLs with different query strings should be treated by Chrome as two separate cache entries, which should also eliminate the incorrect use of the first cached object to serve the other request.
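
A small sketch of that second option follows. The sign callback is an assumption standing in for whatever already produces your signed URLs, with the assumed shape (key, expiresSeconds) => string; signedUrlPair and fakeSign are hypothetical names, and fakeSign does not produce a valid signature.

```javascript
// Produce two signed URLs for the same object that are guaranteed to differ:
// one for the plain <img> tag, one for the crossOrigin load used by the canvas.
function signedUrlPair(sign, key, baseExpires = 900, offset = 60) {
  return {
    displayUrl: sign(key, baseExpires),
    canvasUrl: sign(key, baseExpires + offset),
  };
}

// Stand-in signer for demonstration only.
const fakeSign = (key, expires) =>
  `https://s3.example/${key}?X-Amz-Expires=${expires}&X-Amz-Signature=placeholder`;

const pair = signedUrlPair(fakeSign, "stuff/images/image.png");
console.log(pair.displayUrl);
console.log(pair.canvasUrl);
console.log(pair.displayUrl !== pair.canvasUrl); // true
```

The page would use displayUrl for the visible image and canvasUrl for the image loaded with crossOrigin set, so the two requests never share a cache entry.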
