Lambda script to direct to fallback S3 domain subfolder when not found


Question

As per this question and this one, the following piece of code allows me to point a subfolder in an S3 bucket to my domain.

However, in instances where the subdomain is not found, I get the following error message:

<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>2CE9B7837081C817</RequestId>
<HostId>
T3p7mzSYztPhXetUu7GHPiCFN6l6mllZgry+qJWYs+GFOKMjScMmRNUpBQdeqtDcPMN3qSYU/Fk=
</HostId>
</Error>

I would rather not display this error message. Instead, in instances like this I would like to serve from another S3 bucket location (e.g. example-bucket.s3-website.us-east-2.amazonaws.com/error), where the user will be greeted with a fancy error message. So when an S3 bucket subfolder is not found, the request should fall back to there. How do I accomplish this by changing the Node function below?

'use strict';

// if the end of incoming Host header matches this string, 
// strip this part and prepend the remaining characters onto the request path,
// along with a new leading slash (otherwise, the request will be handled
// with an unmodified path, at the root of the bucket)

const remove_suffix = '.example.com';

// provide the correct origin hostname here so that we send the correct 
// Host header to the S3 website endpoint

const origin_hostname = 'example-bucket.s3-website.us-east-2.amazonaws.com'; // see comments, below

exports.handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const headers = request.headers;
    const host_header = headers.host[0].value;

    if(host_header.endsWith(remove_suffix))
    {
        // prepend '/' + the subdomain onto the existing request path ("uri")
        request.uri = '/' + host_header.substring(0,host_header.length - remove_suffix.length) + request.uri;
    }

    // fix the host header so that S3 understands the request
    headers.host[0].value = origin_hostname;

    // return control to CloudFront with the modified request
    return callback(null,request);
};

Recommended answer

The Lambda@Edge function is an origin request trigger: it runs after the CloudFront cache is checked and a cache miss has occurred, immediately before the request (as it stands after being modified by the trigger code) is sent to the origin server. By the time the response arrives from the origin, this code has finished and can't be used to modify the response.

There are several solutions, including some that are conceptually valid but extremely inefficient. Still, in the interest of thoroughness, I'll mention those as well as the cleaner, better solutions.

Lambda@Edge has 4 possible trigger points:

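  • viewer-request: when the request first arrives at CloudFront, before the cache is checked; fires for every request.
  • origin-request: after the request is confirmed to be a cache miss, but before the request is sent to the origin server; only fires on cache misses.
  • origin-response: after a response (whether success or error) is returned from the origin server, but before the response is potentially stored in the cache and returned to the viewer; if this trigger modifies the response, the modified response is stored in the CloudFront cache (if cacheable) and returned to the viewer; only fires on cache misses.
  • viewer-response: immediately before the response is returned to the viewer, whether from the origin or from the cache; fires for every non-error response, unless that response was spontaneously emitted by a viewer-request trigger, or is the result of a custom error document that sets the status code to 200 (a definite anti-pattern, but still possible), or is a CloudFront-generated HTTP to HTTPS redirect.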

Any of the trigger points can assume control of the signal flow, generate its own spontaneous response, and thus change what CloudFront would ordinarily have done. For example, if you generate a response directly from an origin-request trigger, CloudFront doesn't actually contact the origin. So what you could theoretically do is check S3 in the origin-request trigger to see whether the request will succeed, and generate a custom error response if it won't. The AWS JavaScript SDK is automatically bundled into the Lambda@Edge environment. Although technically legitimate, this is probably a terrible idea in almost any case, since it will increase both cost and latency due to the extra "look-ahead" requests to S3.

Another option is to write a separate origin-response trigger that checks for errors and, if one occurs, replaces it with a customized response from the trigger code. But this idea also qualifies as non-viable, since that trigger will fire for all responses to cache misses, whether success or failure, increasing cost and latency and wasting time in the majority of cases.
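For completeness, here is a minimal sketch of what such an origin-response trigger could look like, written in the same callback style as the question's code. The HTML body is a placeholder and the choice to rewrite only 403 responses is an assumption based on the error shown in the question; it is not code from the original answer.

```javascript
'use strict';

// Hypothetical static error page; replace with your own markup.
const not_found_body = '<!DOCTYPE html><html><body><h1>Page not found</h1></body></html>';

const handler = (event, context, callback) => {
    const response = event.Records[0].cf.response;

    // Lambda@Edge passes the status code as a string.
    if (response.status === '403') {
        response.status = '404';
        response.statusDescription = 'Not Found';
        response.headers['content-type'] = [
            { key: 'Content-Type', value: 'text/html; charset=utf-8' }
        ];
        // The origin's body is not exposed to the trigger, but it can be replaced.
        response.body = not_found_body;
    }

    // return control to CloudFront with the (possibly modified) response
    return callback(null, response);
};

exports.handler = handler;
```

Note that this function runs on every cache miss, which is exactly the overhead described above.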

A better idea (in cost, performance, and ease of use) is CloudFront Custom Error Pages, which allow you to define a specific HTML document that CloudFront will use for every error matching the specified code (e.g. 403 for access denied, as in the original question). CloudFront can also change that 403 to a 404 when handling those errors. When the source of the error file is a bucket, this requires that you do several things:

  • create a second CloudFront origin pointing to the bucket
  • create a new cache behavior that routes exactly one path, the path of the error file (e.g. /shared/errors/not-found.html), to the new origin (this means you can't use that path on any of the subdomains, because it will always go directly to the error file any time it's requested)
  • configure a CloudFront custom error response for code 403 to use the path /shared/errors/not-found.html
  • set Error Caching Minimum TTL to 0, at least while testing, to avoid some frustration for yourself. See my write-up on this feature, but disregard the part where I said "Leave Customize Error Response set to No".
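The steps above can be sketched as fragments of a CloudFront distribution configuration, written here as JavaScript objects. The field names follow the CloudFront DistributionConfig structure as I understand it; the origin ID 'error-pages-bucket' is a hypothetical name, and the other fields a real cache behavior requires are omitted, so treat this as an illustration rather than a complete config.

```javascript
'use strict';

// Cache behavior fragment: route exactly one path to the second origin.
// 'error-pages-bucket' is a hypothetical origin ID.
const errorPageBehavior = {
    PathPattern: '/shared/errors/not-found.html',
    TargetOriginId: 'error-pages-bucket'
    // ...remaining required cache behavior fields omitted
};

// Custom error response: serve the error file for 403 and report it as 404.
const customErrorResponses = {
    Quantity: 1,
    Items: [
        {
            ErrorCode: 403,
            ResponsePagePath: '/shared/errors/not-found.html',
            ResponseCode: '404',
            ErrorCachingMinTTL: 0 // keep at 0 while testing
        }
    ]
};
```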

But... that may or may not be needed, since S3's web site hosting feature also includes optional Custom Error Document support. You'll need to create a single HTML file in your original bucket, enable the web site hosting feature on the bucket, and change the CloudFront Origin Domain Name to the bucket's web site hosting endpoint, which is shown in the S3 console and takes the form ${bucket}.s3-website.${region}.amazonaws.com. In some regions, the hostname might have a dash (-) rather than a dot (.) after s3-website for legacy reasons, but the dot format should work in any region.
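As a small illustration of the endpoint format and of the error-document setting, the sketch below builds the hostname and shows the website configuration shape used by S3's PutBucketWebsite operation. The function name websiteEndpoint and the object keys 'index.html' and 'error.html' are assumptions chosen for the example.

```javascript
'use strict';

// Build the website-hosting endpoint hostname in the dot format.
const websiteEndpoint = (bucket, region) =>
    `${bucket}.s3-website.${region}.amazonaws.com`;

// Website configuration with a custom error document.
// 'index.html' and 'error.html' are hypothetical object keys.
const websiteConfiguration = {
    IndexDocument: { Suffix: 'index.html' },
    ErrorDocument: { Key: 'error.html' }
};

console.log(websiteEndpoint('example-bucket', 'us-east-2'));
```

With the bucket and region from the question, this yields the same hostname that the question's code assigns to origin_hostname.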

I almost hesitate to mention one other option that comes to mind, since it's fairly advanced and I fear the description might seem quite convoluted... but you could also do the following, and it would be pretty slick, since it would allow you to potentially generate a custom HTML page for each erroneous URL requested.

Create a CloudFront Origin Group with your main bucket as the primary and a second, empty, "placeholder" bucket as the secondary. The only purpose of the second bucket is to give CloudFront a valid name that it plans to connect to, even though we won't actually connect to it, as may become clear below.

When a request to the primary origin fails with one of the configured error status codes, the secondary origin is contacted. This is intended for handling the case where an origin fails, but we can leverage it for our purposes, because before actually contacting the failover origin, the same origin request trigger fires a second time.

If the primary origin returns an HTTP status code that you’ve configured for failover, the Lambda function is triggered again, when CloudFront re-routes the request to the second origin.

https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/high_availability_origin_failover.html#concept_origin_groups.lambda

(It would be more accurate to say "...when CloudFront is preparing to re-route the request to the second origin," because the trigger fires first.)

When the trigger fires a second time, the specific reason it fired isn't preserved, but there is a way to identify whether you're running in the first or second invocation: one of these two values will contain the hostname of the origin server CloudFront is preparing to contact:

event.Records[0].cf.request.origin.s3.domainName     // S3 REST endpoints
event.Records[0].cf.request.origin.custom.domainName // non-S3 origins and S3 website-hosting endpoints

So we can test the appropriate value (depending on origin type) in the trigger code, looking for the name of the second "placeholder" bucket. If it's there, bypass the current logic and generate the 404 response from inside the Lambda function. This could be dynamic, customized HTML, for example including the page URI, or varying depending on whether / or some other page was requested. As noted above, spontaneously generating a response from an origin-request trigger prevents CloudFront from actually contacting the origin. Generated responses from an origin-request trigger are limited to 1MB, but that should be more than sufficient for this use case.
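A minimal sketch of the question's function extended this way might look like the following. The placeholder hostname, the escapeHtml helper, the 60-second Cache-Control value, and the HTML markup are all assumptions made for illustration; only the origin check and the generated-response idea come from the description above.

```javascript
'use strict';

const remove_suffix = '.example.com';
const origin_hostname = 'example-bucket.s3-website.us-east-2.amazonaws.com';

// Hypothetical hostname of the empty secondary bucket in the origin group.
const placeholder_hostname = 'example-placeholder.s3.amazonaws.com';

// Escape characters that are significant in HTML before echoing the URI.
const html_entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
const escapeHtml = (s) => s.replace(/[&<>"']/g, (c) => html_entities[c]);

const handler = (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const headers = request.headers;

    // Either origin.s3 or origin.custom is present, depending on origin type.
    const origin = request.origin || {};
    const origin_domain = (origin.s3 || origin.custom || {}).domainName;

    if (origin_domain === placeholder_hostname) {
        // Second invocation: CloudFront is failing over, so answer directly.
        return callback(null, {
            status: '404',
            statusDescription: 'Not Found',
            headers: {
                'content-type': [{ key: 'Content-Type', value: 'text/html; charset=utf-8' }],
                'cache-control': [{ key: 'Cache-Control', value: 'max-age=60' }]
            },
            body: '<!DOCTYPE html><html><body><h1>Not found</h1><p>Nothing exists at '
                + escapeHtml(request.uri) + '</p></body></html>'
        });
    }

    // First invocation: the original logic from the question.
    const host_header = headers.host[0].value;
    if (host_header.endsWith(remove_suffix)) {
        request.uri = '/' + host_header.substring(0, host_header.length - remove_suffix.length) + request.uri;
    }
    headers.host[0].value = origin_hostname;
    return callback(null, request);
};

exports.handler = handler;
```

Because the generated response never reaches the placeholder bucket, that bucket can stay empty.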
