How does axios handle blob vs arraybuffer as responseType?


Question


I'm downloading a zip file with axios. For further processing, I need to get the "raw" data that has been downloaded. As far as I can see, in JavaScript there are two types for this: Blobs and ArrayBuffers. Both can be specified as responseType in the request options.

In a next step, the zip file needs to be uncompressed. I've tried two libraries for this: js-zip and adm-zip. Both want the data to be an ArrayBuffer. So far so good, I can convert the blob to a buffer. And after this conversion adm-zip always happily extracts the zip file. However, js-zip complains about a corrupted file, unless the zip has been downloaded with 'arraybuffer' as the axios responseType. js-zip does not work on a buffer that has been taken from a blob.

This was very confusing to me. I thought both ArrayBuffer and Blob are essentially just views on the underlying memory. There might be a difference in performance between downloading something as a blob vs a buffer. But the resulting data should be the same, right?

Well, I decided to experiment and found this:

If you specify responseType: 'blob', axios converts the response.data to a string. Let's say you hash this string and get hashcode A. Then you convert it to a buffer. For this conversion, you need to specify an encoding. Depending on the encoding, you will get a variety of new hashes, let's call them B1, B2, B3, ... When specifying 'utf8' as the encoding, I get back to the original hash A.

So I guess when downloading data as a 'blob', axios implicitly converts it to a string encoded with utf8. This seems very reasonable.

Now you specify responseType: 'arraybuffer'. Axios provides you with a buffer as response.data. Hash the buffer and you get a hashcode C. This code does not correspond to any code in A, B1, B2, ...

So when downloading data as an 'arraybuffer', you get entirely different data?

It now makes sense to me that the unzipping library js-zip complains if the data is downloaded as a 'blob'. It probably actually is corrupted somehow. But then how is adm-zip able to extract it? And I checked the extracted data, it is correct. This might only be the case for this specific zip archive, but nevertheless surprises me.

Here is the sample code I used for my experiments:

//typescript import syntax, this is executed in nodejs
import axios from 'axios';
import * as crypto from 'crypto';

axios.get(
    "http://localhost:5000/folder.zip", //hosted with serve
    { responseType: 'blob' }) // replace this with 'arraybuffer' and response.data will be a buffer
    .then((response) => {
        console.log(typeof (response.data));

        // first hash the response itself
        console.log(crypto.createHash('md5').update(response.data).digest('hex'));

        // then convert to a buffer and hash again
        // replace 'binary' with any valid encoding name
        let buffer = Buffer.from(response.data, 'binary');
        console.log(crypto.createHash('md5').update(buffer).digest('hex'));
        //...
    });

What creates the difference here, and how do I get the 'true' downloaded data?

Solution

From axios docs:

// `responseType` indicates the type of data that the server will respond with
// options are: 'arraybuffer', 'document', 'json', 'text', 'stream'
//   browser only: 'blob'
responseType: 'json', // default

'blob' is a "browser only" option.

So from node.js, when you set responseType: "blob", "json" will actually be used, which I guess falls back to "text" when no parseable JSON data has been fetched.

Fetching binary data as text is prone to generate corrupted data. Because the text returned by Body.text() and many other APIs is a USVString (which doesn't allow unpaired surrogate code points), and because the response is decoded as UTF-8, some bytes from the binary file can't be mapped to characters and are replaced by the � (U+FFFD) replacement character, with no way to get back what that data was before: your data is corrupted.
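
As a minimal sketch of that replacement behaviour in isolation, the following Node.js code decodes a few made-up bytes with the standard TextDecoder. The sample bytes and the isValidUtf8 helper are assumptions for illustration only; they are not part of axios or of the original experiment.

```javascript
// Sketch: how a UTF-8 decoder treats bytes that are not valid UTF-8.
// The sample bytes are made up: a zip-like "PK\x03\x04" signature followed by
// two bytes (0xFF, 0x89) that cannot stand on their own in UTF-8.
const sample = Uint8Array.from([0x50, 0x4b, 0x03, 0x04, 0xff, 0x89]);

// Default (lenient) decoding: every invalid byte is replaced by U+FFFD.
const lenient = new TextDecoder('utf-8').decode(sample);
const replaced = [...lenient].filter((ch) => ch === '\uFFFD').length;
console.log('characters decoded:', lenient.length, 'replaced:', replaced);

// Strict decoding: with { fatal: true } the same input throws a TypeError
// instead of being silently altered.
// isValidUtf8 is a hypothetical helper name used only in this sketch.
function isValidUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

console.log('whole sample valid UTF-8?', isValidUtf8(sample));
console.log('ASCII-only prefix valid UTF-8?', isValidUtf8(sample.slice(0, 4)));
```

The ASCII prefix survives while the two trailing bytes are replaced, which is why a file read as text can look partly intact and still be unusable as binary data.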

Here is a snippet explaining this, using the header of a .png file 0x89 0x50 0x4E 0x47 as an example.

(async () => {

  const url = 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png';
  // fetch as binary
  const buffer = await fetch( url ).then(resp => resp.arrayBuffer());

  const header = new Uint8Array( buffer ).slice( 0, 4 );
  console.log( 'binary header', header ); // [ 137, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( header ) );
  // [ "U+0089", "U+0050", "U+004E", "U+0047" ]
  // You can read more about the U+0089 character here
  // https://www.fileformat.info/info/unicode/char/0089/index.htm
  // The table there shows that U+0089 is encoded in UTF-8 as two bytes (0xC2 0x89).
  // A lone 0x89 byte is therefore not a valid UTF-8 sequence:
  // the decoder rejects it and emits the replacement character instead.
  
  // read as UTF-8 
  const utf8_str = await new Blob( [ header ] ).text();
  console.log( 'read as UTF-8', utf8_str ); // "�PNG"
  // build back a binary array from that string
  const utf8_binary = [ ...utf8_str ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', utf8_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( utf8_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // You can read more about character � (U+FFFD) here
  // https://www.fileformat.info/info/unicode/char/0fffd/index.htm
  //
  // P (U+0050), N (U+004E) and G (U+0047) are ASCII characters:
  // each is a single byte in UTF-8 and decodes to the same code point, so nothing is lost for these
  // (that's how base64 encoding makes it possible to send binary data as text)
  
  // now let's see what fetching as text holds
  const fetched_as_text = await fetch( url ).then( resp => resp.text() );
  const header_as_text = fetched_as_text.slice( 0, 4 );
  console.log( 'fetched as "text"', header_as_text ); // "�PNG"
  const as_text_binary = [ ...header_as_text ].map( char => char.charCodeAt( 0 ) );
  console.log( 'Which is binary', as_text_binary ); // [ 65533, 80, 78, 71 ]
  console.log( 'entity encoded', entityEncode( as_text_binary ) );
  // [ "U+FFFD", "U+0050", "U+004E", "U+0047" ]
  // It's been read as UTF-8, we lost the first byte.
  
})();

function entityEncode( arr ) {
  return Array.from( arr ).map( val => 'U+' + toHex( val ) );
}
function toHex( num ) {
  return num.toString( 16 ).padStart(4, '0').toUpperCase();
}
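
The hash experiment from the question can be reproduced in Node.js without any network request. The sketch below is an illustration under stated assumptions: the four PNG header bytes stand in for the downloaded file, SHA-256 replaces the MD5 used in the question, and the sha256 helper is a name made up for this example.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: hex SHA-256 of a string or Buffer.
// Strings passed to update() are encoded as UTF-8 by default.
const sha256 = (data) => crypto.createHash('sha256').update(data).digest('hex');

// Stand-in for the downloaded bytes: the 4-byte PNG header used above.
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47]);

// What a text-based responseType effectively hands back: the bytes decoded as UTF-8.
const asText = original.toString('utf8');

const hashC = sha256(original);                            // raw bytes, as with 'arraybuffer'
const hashA = sha256(asText);                              // the decoded string
const hashUtf8 = sha256(Buffer.from(asText, 'utf8'));      // string back to a Buffer via utf8
const hashLatin1 = sha256(Buffer.from(asText, 'latin1'));  // 'binary' is an alias of latin1

console.log('A equals utf8 round trip:', hashA === hashUtf8);
console.log('A equals C:', hashA === hashC);
console.log('latin1 round trip equals C:', hashLatin1 === hashC);
console.log('bytes after utf8 round trip:', Buffer.from(asText, 'utf8'));
```

Hashing the string and hashing its UTF-8 re-encoding agree because both hash the same bytes, but neither matches the hash of the original bytes: the 0x89 byte was already replaced when the data became a string, so no choice of encoding can bring it back.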


There was natively no Blob object in node.js when this was written (a global Blob only arrived in Node.js 18), so it makes sense that axios didn't monkey-patch one in just to return a response nothing else would be able to consume anyway.
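
In Node.js, then, one way to get the bytes untouched is to request 'arraybuffer', which the question already observed yields a buffer. The following is a minimal sketch, not a definitive implementation: downloadRaw and stubClient are hypothetical names, and the client is passed in as a parameter so the example runs without axios installed or a server available.

```javascript
// Sketch: request raw bytes and normalise them to a Node.js Buffer.
// `client` is anything with an axios-style get(url, config) method,
// for example the axios module itself.
async function downloadRaw(client, url) {
  const response = await client.get(url, { responseType: 'arraybuffer' });
  // Buffer.from() also accepts an ArrayBuffer, so both shapes are handled.
  return Buffer.isBuffer(response.data) ? response.data : Buffer.from(response.data);
}

// Stub client (an assumption for this sketch) that records the request
// and answers with the 4-byte PNG header instead of hitting the network.
const requests = [];
const stubClient = {
  get: async (url, config) => {
    requests.push({ url, config });
    return { data: Uint8Array.from([0x89, 0x50, 0x4e, 0x47]).buffer };
  },
};

const pending = downloadRaw(stubClient, 'http://localhost:5000/folder.zip').then((buf) => {
  console.log('received', buf.length, 'bytes:', buf);
  return buf;
});
```

With the real library the call would be downloadRaw(require('axios'), url), and the resulting Buffer can be handed to an unzip library as-is, with no string conversion in between.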

From a browser, you'd have exactly the same responses:

function fetchAs( type ) {
  return axios( {
    method: 'get',
    url: 'https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png',
    responseType: type
  } );
}

function loadImage( data, type ) {
  // all of these can be passed to the Blob constructor directly
  const new_blob = new Blob( [ data ], { type: 'image/png' } );
  // with blob: URI, the browser will try to load 'data' as-is
  const url = URL.createObjectURL( new_blob );
  
  const img = document.getElementById( type + '_img' );
  img.src = url;
  return new Promise( (res, rej) => { 
    img.onload = e => res(img);
    img.onerror = rej;
  } );
}

[
  'json', // will fail
  'text', // will fail
  'arraybuffer',
  'blob'
].forEach( type =>
  fetchAs( type )
   .then( resp => loadImage( resp.data, type ) )
   .then( img => console.log( type, 'loaded' ) )
   .catch( err => console.error( type, 'failed' ) )
);

<script src="https://unpkg.com/axios/dist/axios.min.js"></script>

<figure>
  <figcaption>json</figcaption>
  <img id="json_img">
</figure>
<figure>
  <figcaption>text</figcaption>
  <img id="text_img">
</figure>
<figure>
  <figcaption>arraybuffer</figcaption>
  <img id="arraybuffer_img">
</figure>
<figure>
  <figcaption>blob</figcaption>
  <img id="blob_img">
</figure>
