How does axios handle blob vs arraybuffer as responseType?

Question

I'm downloading a zip file with axios. For further processing, I need to get the "raw" data that has been downloaded. As far as I can see, in JavaScript there are two types for this: Blobs and ArrayBuffers. Both can be specified as responseType in the request options.

In the next step, the zip file needs to be uncompressed. I've tried two libraries for this: js-zip and adm-zip. Both want the data to be an ArrayBuffer. So far so good, I can convert the blob to a buffer. After this conversion, adm-zip always happily extracts the zip file. However, js-zip complains about a corrupted file unless the zip has been downloaded with 'arraybuffer' as the axios responseType. js-zip does not work on a buffer that has been taken from a blob.

This was very confusing to me. I thought both ArrayBuffer and Blob are essentially just views on the underlying memory. There might be a difference in performance between downloading something as a blob vs a buffer. But the resulting data should be the same, right?

Well, I decided to experiment and found this:

If you specify responseType: 'blob', axios converts response.data to a string. Let's say you hash this string and get hash A. Then you convert it to a buffer. For this conversion, you need to specify an encoding. Depending on the encoding, you will get a variety of new hashes; let's call them B1, B2, B3, ... When specifying 'utf8' as the encoding, I get back the original hash A.

So I guess when downloading data as a 'blob', axios implicitly converts it to a string encoded with utf8. This seems very reasonable.

Now you specify responseType: 'arraybuffer'. Axios provides you with a buffer as response.data. Hash the buffer and you get a hash C. This hash does not correspond to any of A, B1, B2, ...

So when downloading data as an 'arraybuffer', you get entirely different data?

It now makes sense to me that the unzipping library js-zip complains if the data is downloaded as a 'blob'. It probably actually is corrupted somehow. But then how is adm-zip able to extract it? I checked the extracted data, and it is correct. This might only be the case for this specific zip archive, but it nevertheless surprises me.

Here is the sample code I used for my experiments:

//typescript import syntax, this is executed in nodejs
import axios from 'axios';
import * as crypto from 'crypto';

axios.get(
    "http://localhost:5000/folder.zip", //hosted with serve
    { responseType: 'blob' }) // replace this with 'arraybuffer' and response.data will be a buffer
    .then((response) => {
        console.log(typeof (response.data));

        // first hash the response itself
        console.log(crypto.createHash('md5').update(response.data).digest('hex'));

        // then convert to a buffer and hash again
        // replace 'binary' with any valid encoding name
        let buffer = Buffer.from(response.data, 'binary');
        console.log(crypto.createHash('md5').update(buffer).digest('hex'));
        //...
    });

What creates the difference here, and how do I get the 'true' downloaded data?

Recommended answer

From the axios documentation:

// `responseType` indicates the type of data that the server will respond with
// options are: 'arraybuffer', 'document', 'json', 'text', 'stream'
//   browser only: 'blob'
responseType: 'json', // default

'blob' is a "browser only" option.

So from node.js, when you set responseType: "blob", "json" will actually be used, which I guess falls back to "text" when no parseable JSON data has been fetched.
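
In Node.js, the practical consequence is to request 'arraybuffer' and treat response.data as raw bytes. The sketch below is one defensive way to do that, not something taken from axios itself: toBuffer and looksLikeZip are hypothetical helper names, and the axios call in the trailing comment simply reuses the URL from the question. The signature check assumes a non-empty archive, which starts with the bytes 0x50 0x4B 0x03 0x04 ("PK" followed by 3 and 4).

```javascript
// Hypothetical helper: normalise a binary container into a Node Buffer.
function toBuffer(data) {
  if (Buffer.isBuffer(data)) return data;
  if (data instanceof ArrayBuffer) return Buffer.from(data);
  if (ArrayBuffer.isView(data)) {
    // Typed arrays and DataViews: wrap exactly the bytes they cover.
    return Buffer.from(data.buffer, data.byteOffset, data.byteLength);
  }
  // A string here means the body was already decoded as text and may be corrupted.
  throw new TypeError('Expected binary data, got ' + typeof data);
}

// Hypothetical sanity check: does the buffer start with the zip local file header signature?
function looksLikeZip(buffer) {
  const signature = Buffer.from([0x50, 0x4b, 0x03, 0x04]);
  return buffer.length >= 4 && buffer.subarray(0, 4).equals(signature);
}

// Usage with axios in Node.js (not executed here):
// const response = await axios.get('http://localhost:5000/folder.zip', { responseType: 'arraybuffer' });
// const zipBytes = toBuffer(response.data);
// if (!looksLikeZip(zipBytes)) throw new Error('Not a zip archive, or the bytes were altered');
```

Rejecting strings outright is a deliberate choice in this sketch: once the body has been decoded as text there is no reliable way to recover the original bytes, so failing early is safer than guessing an encoding.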

Fetching binary data as text is prone to generate corrupted data. Because the text returned by Body.text() and many other APIs is a USVString (it doesn't allow unpaired surrogate code points) and because the response is decoded as UTF-8, some bytes from the binary file can't be mapped to characters correctly and will thus be replaced by the � (U+FFFD) replacement character, with no way to get back what that data was before: your data is corrupted.

Here is a snippet explaining this, using the header of a .png file (0x89 0x50 0x4E 0x47) as an example.

(async () => {

  const url = 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png';
  // fetch as binary
  const buffer = await fetch( url ).then(resp => resp.arrayBuffer());

  const header = new Uint8Array( buffer ).slice( 0, 4 );
  console.log( 'binary header', header ); // [ 137, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( header ) );
  // [ "U+0089", "U+0050", "U+004E", "U+0047" ]
  // You can read more about the U+0089 character here
  // https://www.fileformat.info/info/unicode/char/0089/index.htm
  // The table there shows that this character needs two bytes in UTF-8 (0xC2 0x89).
  // A lone 0x89 byte is therefore not valid UTF-8: the decoder cannot map it
  // to a character and converts it to the replacement character.
  
  // read as UTF-8 
  const utf8_str = await new Blob( [ header ] ).text();
  console.log( 'read as UTF-8', utf8_str ); // "�PNG"
  // build back a binary array from that string
  const utf8_binary = [ ...utf8_str ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', utf8_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( utf8_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // You can read more about character � (U+FFFD) here
  // https://www.fileformat.info/info/unicode/char/0fffd/index.htm
  //
  // P (U+0050), N (U+004E) and G (U+0047) are plain ASCII: each is a single byte in UTF-8
  // For these nothing is lost in the decoding
  // (that's how base64 encoding makes it possible to send binary data as text)
  
  // now let's see what fetching as text holds
  const fetched_as_text = await fetch( url ).then( resp => resp.text() );
  const header_as_text = fetched_as_text.slice( 0, 4 );
  console.log( 'fetched as "text"', header_as_text ); // "�PNG"
  const as_text_binary = [ ...header_as_text ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', as_text_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( as_text_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // It's been read as UTF-8, we lost the first byte.
  
})();

function entityEncode( arr ) {
  return Array.from( arr ).map( val => 'U+' + toHex( val ) );
}
function toHex( num ) {
  return num.toString( 16 ).padStart(4, '0').toUpperCase();
}
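
The same loss can be reproduced without any network access. The sketch below is an illustration rather than axios's internal code: it decodes the four PNG signature bytes as UTF-8 with the standard TextDecoder, re-encodes the resulting string with Buffer.from, and compares the result against the original bytes. The helper name roundTripUtf8 is made up for this example.

```javascript
// Hypothetical helper: decode bytes as UTF-8 text, then encode that text back to bytes.
function roundTripUtf8(bytes) {
  // Byte sequences that are not valid UTF-8 are replaced by U+FFFD while decoding.
  const text = new TextDecoder('utf-8').decode(bytes);
  return Buffer.from(text, 'utf8');
}

const pngHeader = Uint8Array.from([0x89, 0x50, 0x4e, 0x47]);
const roundTripped = roundTripUtf8(pngHeader);

console.log([...pngHeader]);    // [ 137, 80, 78, 71 ]
console.log([...roundTripped]); // [ 239, 191, 189, 80, 78, 71 ]
console.log(Buffer.from(pngHeader).equals(roundTripped)); // false

// Pure ASCII bytes survive the same round trip unchanged.
const ascii = Uint8Array.from([0x50, 0x4e, 0x47]);
console.log(Buffer.from(ascii).equals(roundTripUtf8(ascii))); // true
```

The first byte comes back as 0xEF 0xBF 0xBD, the UTF-8 encoding of U+FFFD. This is consistent with the hashes in the question: Node's hash.update() encodes strings as UTF-8 by default, so hashing the string and hashing Buffer.from(string, 'utf8') agree with each other, while neither matches the hash of the original bytes.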

At the time this was written, node.js had no native Blob object, so it makes sense that axios didn't monkey-patch one just so it could return a response no one else would be able to consume anyway.

From a browser, you'd have exactly the same responses:

function fetchAs( type ) {
  return axios( {
    method: 'get',
    url: 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png',
    responseType: type
  } );
}

function loadImage( data, type ) {
  // all of these response types can be passed to the Blob constructor directly
  const new_blob = new Blob( [ data ], { type: 'image/png' } );
  // with blob: URI, the browser will try to load 'data' as-is
  const url = URL.createObjectURL( new_blob );
  
  const img = document.getElementById( type + '_img' );
  img.src = url;
  return new Promise( (res, rej) => { 
    img.onload = e => res(img);
    img.onerror = rej;
  } );
}

[
  'json', // will fail
  'text', // will fail
  'arraybuffer',
  'blob'
].forEach( type =>
  fetchAs( type )
   .then( resp => loadImage( resp.data, type ) )
   .then( img => console.log( type, 'loaded' ) )
   .catch( err => console.error( type, 'failed' ) )
);

<script src="https://unpkg.com/axios/dist/axios.min.js"></script>

<figure>
  <figcaption>json</figcaption>
  <img id="json_img">
</figure>
<figure>
  <figcaption>text</figcaption>
  <img id="text_img">
</figure>
<figure>
  <figcaption>arraybuffer</figcaption>
  <img id="arraybuffer_img">
</figure>
<figure>
  <figcaption>blob</figcaption>
  <img id="blob_img">
</figure>
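
Because 'arraybuffer' and 'blob' both carry the untouched bytes, converting between the two is lossless wherever Blob exists (browsers, and recent Node.js releases that expose Blob as a global). A minimal sketch using the standard Blob.arrayBuffer() method follows; the helper names sameBytes and blobRoundTrip are made up for illustration.

```javascript
// Hypothetical helper: compare two byte sequences element by element.
function sameBytes(a, b) {
  return a.length === b.length && a.every((value, i) => value === b[i]);
}

// Hypothetical helper: wrap bytes in a Blob, then read them back out.
async function blobRoundTrip(bytes) {
  const blob = new Blob([bytes]);          // stores the bytes as-is, no text decoding
  const buffer = await blob.arrayBuffer(); // reads them back as an ArrayBuffer
  return new Uint8Array(buffer);
}

const original = Uint8Array.from([0x89, 0x50, 0x4e, 0x47]);
blobRoundTrip(original).then(copy => {
  console.log(sameBytes(original, copy)); // true
});
```

Unlike the text round trip, no decoding step is involved here, which is why a library that accepts an ArrayBuffer can work with data obtained from a genuine Blob.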
