403 Forbidden when trying to upload PDF as blob to S3 bucket using PUT

Problem Description

Upload a PDF file from a browser client without exposing any credentials or anything unsavory. Based on this, I thought it could be done, but it doesn't seem to work for me.

The premise is:

  • you request a pre-signed URL from an S3 Bucket based on a set of parameters supplied to a function that is part of the JavaScript AWS SDK

  • you supply this URL to the frontend, which can use it to place a file in the S3 Bucket without needing to use any credentials or authentication on the frontend.

This part is simple and it works for me. I just request a URL from S3 with this little JS nugget:

const s3Params = {
    Bucket: uploadBucket,
    Key: `${fileId}.pdf`,
    ContentType: 'application/pdf',
    Expires: 60,
    ACL: 'public-read',
}

let uploadUrl = s3.getSignedUrl('putObject', s3Params);

Use the Pre-Signed URL to Upload a File to S3

This is the part that doesn't work, and I can't figure out why. This little chunk of code basically sends a blob of data to the S3 bucket pre-signed URL using a PUT request.

const result = await fetch(response.data.uploadURL, {
        method: 'put',
        body: blobData,
});
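
(For what it's worth: since ContentType is baked into the signed URL above, my understanding is that the PUT has to carry the same Content-Type header, or S3 rejects the signature. A minimal sketch of what I mean, same names as above:)

const result = await fetch(response.data.uploadURL, {
        method: 'PUT',
        // must match s3Params.ContentType from the signing step; otherwise
        // fetch derives a Content-Type from the Blob, which may not match
        headers: { 'Content-Type': 'application/pdf' },
        body: blobData,
});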

PUT or POST?

I've found that using any POST request results in a 400 Bad Request, so PUT it is.
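
(If POST is ever genuinely needed, my understanding is it goes through the SDK's separate presigned-POST API rather than a putObject URL; a rough sketch, reusing the bucket and key names from my backend code below:)

// sketch only: presigned POST is a different API from getSignedUrl('putObject')
const postParams = {
  Bucket: uploadBucket,
  Fields: { key: `${fileId}.pdf`, 'Content-Type': 'application/pdf' },
  Expires: 60,
};

s3.createPresignedPost(postParams, (err, data) => {
  if (err) throw err;
  // data.url and data.fields would feed a multipart/form-data POST from the browser
  console.log(data.url, data.fields);
});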

Content-Type (in my case, it'd be application/pdf, so blobData.type) -- they match between the backend and frontend.

x-amz-acl header

More Content-Type

Similar use case. Looking at this one, it appears that no headers need to be supplied in the PUT request and the signed URL itself is all that is necessary for the file upload.

Something weird that I don't understand. It looks like I may need to pass the length and type of the file to the getSignedUrl call to S3.

Exposing my Bucket to the public (no bueno)

Uploading the file to S3 via POST

...

uploadFile: async function(e) {
      /* receives file from simple input element -> this.file */
      // get signed URL
      const response = await axios({
        method: 'get',
        url: API_GATEWAY_URL
      });

      console.log('upload file response:', response);

      let binary = atob(this.file.split(',')[1]);
      let array = [];

      for (let i = 0; i < binary.length; i++) {
        array.push(binary.charCodeAt(i));
      }

      let blobData = new Blob([new Uint8Array(array)], {type: 'application/pdf'});
      console.log('uploading to:', response.data.uploadURL);
      console.log('blob type sanity check:', blobData.type);

      const result = await fetch(response.data.uploadURL, {
        method: 'put',
        headers: {
          'Access-Control-Allow-Methods': '*',
          'Access-Control-Allow-Origin': '*',
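          // note: the two Access-Control-Allow-* entries above are response
          // headers; on a request they mainly just force a CORS preflight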
          'x-amz-acl': 'public-read',
          'Content-Type': blobData.type
        },
        body: blobData,
      });

      console.log('PUT result:', result);

      this.uploadUrl = response.data.uploadURL.split('?')[0];
    }

Backend (fileReceiver.js):

'use strict';

const uuidv4 = require('uuid/v4');
const aws = require('aws-sdk');
const s3 = new aws.S3();

const uploadBucket = 'the-chumiest-bucket';
const fileKeyPrefix = 'path/to/where/the/file/should/live/';

const getUploadUrl = async () => {
  const fileId = uuidv4();
  const s3Params = {
    Bucket: uploadBucket,
    Key: `${fileId}.pdf`,
    ContentType: 'application/pdf',
    Expires: 60,
    ACL: 'public-read',
  }

  return new Promise((resolve, reject) => {
    let uploadUrl = s3.getSignedUrl('putObject', s3Params);
    resolve({
      'statusCode': 200,
      'isBase64Encoded': false,
      'headers': { 
        'Access-Control-Allow-Origin': '*',
        'Access-Control-Allow-Headers': '*',
        'Access-Control-Allow-Credentials': true,
      },
      'body': JSON.stringify({
        'uploadURL': uploadUrl,
        'filename': `${fileId}.pdf`
      })
    });
  });
};

exports.handler = async (event, context) => {
  console.log('event:', event);
  const result = await getUploadUrl();
  console.log('result:', result);

  return result;
}

Serverless config (serverless.yml):

service: ocr-space-service

provider:
  name: aws
  region: ca-central-1
  stage: ${opt:stage, 'dev'}
  timeout: 20

plugins:
  - serverless-plugin-existing-s3
  - serverless-step-functions
  - serverless-pseudo-parameters
  - serverless-plugin-include-dependencies

layers:
  spaceOcrLayer:
    package:
      artifact: spaceOcrLayer.zip
    allowedAccounts:
      - "*"

functions:
  fileReceiver:
    handler: src/node/fileReceiver.handler
    events:
      - http:
          path: /doc-parser/get-url
          method: get
          cors: true
  startStateMachine:
    handler: src/start_state_machine.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
    events:
      - existingS3:
          bucket: ingenio-documents
          events:
            - s3:ObjectCreated:*
          rules:
            - prefix: 
            - suffix: .pdf
  startOcrSpaceProcess:
    handler: src/start_ocr_space.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseOcrSpaceOutput:
    handler: src/parse_ocr_space_output.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  renamePdf:
    handler: src/rename_pdf.lambda_handler
    role: 
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseCorpSearchOutput:
    handler: src/node/pdfParser.handler
    role: 
    runtime: nodejs10.x
  saveFileToProcessed:
    handler: src/node/saveFileToProcessed.handler
    role: 
    runtime: nodejs10.x

stepFunctions:
  stateMachines:
    ocrSpaceStepFunc:
      name: ocrSpaceStepFunc
      definition:
        StartAt: StartOcrSpaceProcess
        States:
          StartOcrSpaceProcess:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-startOcrSpaceProcess"
            Next: IsDocCorpSearchChoice
            Catch:
            - ErrorEquals: ["HandledError"]
              Next: HandledErrorFallback
          IsDocCorpSearchChoice:
            Type: Choice
            Choices:
              - Variable: $.docIsCorpSearch
                NumericEquals: 1
                Next: ParseCorpSearchOutput
              - Variable: $.docIsCorpSearch
                NumericEquals: 0
                Next: ParseOcrSpaceOutput
          ParseCorpSearchOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseCorpSearchOutput"
            Next: SaveFileToProcessed
            Catch:
              - ErrorEquals: ["SqsMessageError"]
                Next: CorpSearchSqsErrorFallback
              - ErrorEquals: ["DownloadFileError"]
                Next: CorpSearchDownloadFileErrorFallback
              - ErrorEquals: ["HandledError"]
                Next: HandledNodeErrorFallback
          SaveFileToProcessed:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-saveFileToProcessed"
            End: true
          ParseOcrSpaceOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseOcrSpaceOutput"
            Next: RenamePdf
            Catch:
            - ErrorEquals: ["HandledError"]
              Next: HandledErrorFallback
          RenamePdf:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-renamePdf"
            End: true
            Catch:
              - ErrorEquals: ["HandledError"]
                Next: HandledErrorFallback
              - ErrorEquals: ["AccessDeniedException"]
                Next: AccessDeniedFallback
          AccessDeniedFallback:
            Type: Fail
            Cause: "Access was denied for copying an S3 object"
          HandledErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"
          CorpSearchSqsErrorFallback:
            Type: Fail
            Cause: "SQS Message send action resulted in error"
          CorpSearchDownloadFileErrorFallback:
            Type: Fail
            Cause: "Downloading file from S3 resulted in error"
          HandledNodeErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"

Error:

403 Forbidden

PUT response:

Response {type: "cors", url: "https://{bucket-name}.s3.{region-id}.amazonaw…nedHeaders=host%3Bx-amz-acl&x-amz-acl=public-read", redirected: false, status: 403, ok: false, …}
  body: (...)
  bodyUsed: false
  headers: Headers {}
  ok: false
  redirected: false
  status: 403
  statusText: "Forbidden"
  type: "cors"
  url: "https://{bucket-name}.s3.{region-id}.amazonaws.com/actionID.pdf?Content-Type=application%2Fpdf&X-Amz-Algorithm=SHA256&X-Amz-Credential=CREDZ-&X-Amz-Date=20190621T192558Z&X-Amz-Expires=900&X-Amz-Security-Token={token}&X-Amz-SignedHeaders=host%3Bx-amz-acl&x-amz-acl=public-read"
  __proto__: Response

What I'm Thinking

I'm thinking the parameters supplied to the getSignedUrl call using the AWS S3 SDK aren't correct, though they follow the structure suggested by AWS' docs (explained here). Aside from that, I'm really lost as to why my request is rejected. I've even tried exposing my Bucket to the public fully and it still didn't work.

After reading this, I tried to structure my PUT request like this:

      let authFromGet = response.config.headers.Authorization;      

      const putHeaders = {
        'Authorization': authFromGet,
        'Content-Type': blobData,
        'Expect': '100-continue',
      };

      ...

      const result = await fetch(response.data.uploadURL, {
        method: 'put',
        headers: putHeaders,
        body: blobData,
      });

This resulted in a 400 Bad Request instead of a 403; different, but still wrong. It's apparent that putting any headers on the request is wrong.

Accepted Answer

Digging into this, it's because you are trying to upload an object with a public ACL into a bucket that doesn't allow public objects.

  1. Optionally remove the public ACL statement or...

  2. Ensure the bucket is set to either:

  • be publicly visible, or
  • confirm there is no other policy blocking public access (for example, an account policy that forbids publicly viewable objects while you attempt to upload an object with a public ACL?); a sketch of the bucket-side fix follows this list.
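
If you go the second route, the bucket's public access block settings are the usual blocker. A rough sketch of relaxing them with the same aws-sdk the question already uses (whether these four flags are what's actually set on your bucket is an assumption; check your bucket and account configuration first):

// sketch: allow public ACLs on the bucket so a public-read PUT can succeed
const aws = require('aws-sdk');
const s3 = new aws.S3();

s3.putPublicAccessBlock({
  Bucket: 'the-chumiest-bucket', // the bucket from the question
  PublicAccessBlockConfiguration: {
    BlockPublicAcls: false,      // allow PUTs that carry a public ACL
    IgnorePublicAcls: false,     // honor public ACLs on objects
    BlockPublicPolicy: false,
    RestrictPublicBuckets: false,
  },
}).promise()
  .then(() => console.log('public ACLs allowed'))
  .catch(console.error);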

Basically, you cannot upload objects with a public ACL into a bucket where there is some restriction preventing that - you'll get the 403 error you describe. HTH.
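
If you take option 1 instead, the change is just to stop requesting the public ACL on both sides. A sketch against the question's own code (the frontend fragment belongs inside uploadFile):

// backend: sign the URL without an ACL, so nothing public is requested
const s3Params = {
  Bucket: uploadBucket,
  Key: `${fileId}.pdf`,
  ContentType: 'application/pdf',
  Expires: 60,
  // ACL: 'public-read',  // removed: the bucket rejects public ACLs
};
const uploadUrl = s3.getSignedUrl('putObject', s3Params);

// frontend: send only the Content-Type that was signed; no x-amz-acl header,
// and no Access-Control-Allow-* headers (those are response headers anyway)
const result = await fetch(response.data.uploadURL, {
  method: 'PUT',
  headers: { 'Content-Type': blobData.type },
  body: blobData,
});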
