Retrieving binary file content using Javascript, base64 encode it and reverse-decode it using Python

Views: 287 Published: 2016/8/1 21:05:56 Tags: javascript python encoding xmlhttprequest base64

Question

I'm trying to download a binary file using XMLHttpRequest (using a recent Webkit) and base64-encode its contents using this simple function:

function getBinary(file){
    var xhr = new XMLHttpRequest();  
    xhr.open("GET", file, false);  
    xhr.overrideMimeType("text/plain; charset=x-user-defined");  
    xhr.send(null);
    return xhr.responseText;
}

function base64encode(binary) {
    return btoa(unescape(encodeURIComponent(binary)));
}

var binary = getBinary('http://some.tld/sample.pdf');
var base64encoded = base64encode(binary);

As a side note, everything above is standard Javascript, including btoa() and encodeURIComponent(): https://developer.mozilla.org/en/DOM/window.btoa

This works pretty smoothly, and I can even decode the base64 contents using Javascript:

function base64decode(base64) {
    return decodeURIComponent(escape(atob(base64)));
}

var decodedBinary = base64decode(base64encoded);
decodedBinary === binary // true

Now, I want to decode the base64-encoded contents using Python, which consumes a JSON string to get the base64-encoded string value. Naively, this is what I do:

import urllib
import base64
# ... retrieving of base64 encoded string through JSON
encoded = "77+9UE5HDQ……………oaCgA="  # not named base64, which would shadow the module
source_contents = urllib.unquote(base64.b64decode(encoded))
destination_file = open(destination, 'wb')
destination_file.write(source_contents)
destination_file.close()

But the resulting file is invalid; it looks like the operation messed something up with UTF-8 or the encoding, and it's still unclear to me what.

If I try to decode the contents as UTF-8 before writing them to the destination file, an error is raised:

import urllib
import base64
# ... retrieving of base64 encoded string through JSON
encoded = "77+9UE5HDQ……………oaCgA="  # not named base64, which would shadow the module
source_contents = urllib.unquote(base64.b64decode(encoded)).decode('utf-8')
destination_file = open(destination, 'wb')
destination_file.write(source_contents)
destination_file.close()

$ python test.py
// ...
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 0: ordinal not in range(128)
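
A quick check hints at where the damage happens. The sketch below is Python 3 (an assumption; the snippets above are Python 2) and decodes only the first eight characters of the string above, the part that is not elided. The result is the UTF-8 encoding of U+FFFD, the Unicode replacement character, followed by "PNG". A PNG file starts with the byte 0x89, which suggests that byte was already lost in the browser, before the base64 step:

```python
import base64

# First eight characters of the base64 string shown above (the rest is elided).
prefix = "77+9UE5H"

raw = base64.b64decode(prefix)
print(raw)  # b'\xef\xbf\xbdPNG'

# The first three bytes are the UTF-8 form of U+FFFD, the replacement character.
first_char = raw.decode("utf-8")[0]
print(hex(ord(first_char)))  # 0xfffd

# A genuine PNG file begins with the byte 0x89 followed by b"PNG".
png_magic = b"\x89PNG"
print(raw == png_magic)  # False
```

If that reading is right, no amount of decoding on the Python side can restore the file, because the original byte values are no longer in the string.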

As a side note, here's a screenshot of two textual representations of the same file; left: the original; right: the one created from the base64-decoded string: http://cl.ly/0U3G34110z3c132O2e2x

Is there a known trick to circumvent these encoding problems when attempting to recreate the file? How would you achieve this yourself?

Any help or hint much appreciated :)

Recommended answer

So I'm answering my own question, and sorry for that, but I think it might be useful for someone as lost as I was ;)

So you have to use ArrayBuffer and set the responseType property of your XMLHttpRequest instance to arraybuffer to retrieve a native array of bytes, which can be converted to base64 using the following convenient function (found there, author may be blessed here):

function base64ArrayBuffer(arrayBuffer) {
  var base64    = ''
  var encodings = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

  var bytes         = new Uint8Array(arrayBuffer)
  var byteLength    = bytes.byteLength
  var byteRemainder = byteLength % 3
  var mainLength    = byteLength - byteRemainder

  var a, b, c, d
  var chunk

  // Main loop deals with bytes in chunks of 3
  for (var i = 0; i < mainLength; i = i + 3) {
    // Combine the three bytes into a single integer
    chunk = (bytes[i] << 16) | (bytes[i + 1] << 8) | bytes[i + 2]

    // Use bitmasks to extract 6-bit segments from the triplet
    a = (chunk & 16515072) >> 18 // 16515072 = (2^6 - 1) << 18
    b = (chunk & 258048)   >> 12 // 258048   = (2^6 - 1) << 12
    c = (chunk & 4032)     >>  6 // 4032     = (2^6 - 1) << 6
    d = chunk & 63               // 63       = 2^6 - 1

    // Convert the raw binary segments to the appropriate ASCII encoding
    base64 += encodings[a] + encodings[b] + encodings[c] + encodings[d]
  }

  // Deal with the remaining bytes and padding
  if (byteRemainder == 1) {
    chunk = bytes[mainLength]

    a = (chunk & 252) >> 2 // 252 = (2^6 - 1) << 2

    // Set the 4 least significant bits to zero
    b = (chunk & 3)   << 4 // 3   = 2^2 - 1

    base64 += encodings[a] + encodings[b] + '=='
  } else if (byteRemainder == 2) {
    chunk = (bytes[mainLength] << 8) | bytes[mainLength + 1]

    a = (chunk & 64512) >> 10 // 64512 = (2^6 - 1) << 10
    b = (chunk & 1008)  >>  4 // 1008  = (2^6 - 1) << 4

    // Set the 2 least significant bits to zero
    c = (chunk & 15)    <<  2 // 15    = 2^4 - 1

    base64 += encodings[a] + encodings[b] + encodings[c] + '='
  }

  return base64
}
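
For readers who want to cross-check the bit arithmetic, here is a rough Python 3 counterpart of the function above. The name base64_from_bytes is hypothetical and the code is only a sketch of the same grouping (three bytes in, four 6-bit indices out, with "=" padding for a trailing one or two bytes), compared against the standard library's base64.b64encode:

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def base64_from_bytes(data: bytes) -> str:
    """Hypothetical Python counterpart of base64ArrayBuffer, for illustration."""
    out = []
    main_length = len(data) - len(data) % 3
    # Main loop: combine three bytes into one 24-bit integer, emit four 6-bit indices.
    for i in range(0, main_length, 3):
        chunk = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2]
        out.append(ALPHABET[(chunk >> 18) & 63])
        out.append(ALPHABET[(chunk >> 12) & 63])
        out.append(ALPHABET[(chunk >> 6) & 63])
        out.append(ALPHABET[chunk & 63])
    rest = data[main_length:]
    if len(rest) == 1:
        # 8 bits left: 6 bits, then 2 bits padded with four zero bits.
        chunk = rest[0]
        out.append(ALPHABET[chunk >> 2])
        out.append(ALPHABET[(chunk & 3) << 4])
        out.append("==")
    elif len(rest) == 2:
        # 16 bits left: 6 bits, 6 bits, then 4 bits padded with two zero bits.
        chunk = (rest[0] << 8) | rest[1]
        out.append(ALPHABET[chunk >> 10])
        out.append(ALPHABET[(chunk >> 4) & 63])
        out.append(ALPHABET[(chunk & 15) << 2])
        out.append("=")
    return "".join(out)

# Compare with the standard library for lengths covering every remainder case.
sample = bytes(range(256))
for n in (0, 1, 2, 3, 254, 255, 256):
    expected = base64.b64encode(sample[:n]).decode("ascii")
    assert base64_from_bytes(sample[:n]) == expected
```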

So here's some working code:

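var xhr = new XMLHttpRequest();
// Asynchronous request: browsers reject responseType on synchronous requests made from a page
xhr.open('GET', 'http://some.tld/favicon.png', true);
xhr.responseType = 'arraybuffer';
xhr.onload = function(e) {
    // e.currentTarget.response is the ArrayBuffer holding the raw bytes
    console.log(base64ArrayBuffer(e.currentTarget.response));
};
xhr.send();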

This will log a valid base64-encoded string representing the binary file contents.
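
On the Python side, which is what the question asked about, a string produced this way should need no unquoting or UTF-8 decoding: base64.b64decode returns the original bytes, and they can be written to a file opened in binary mode. A minimal sketch in Python 3, assuming the string arrives in a JSON object under a hypothetical key named data; the payload here is an in-memory stand-in and the destination is a temporary file:

```python
import base64
import json
import os
import tempfile

# Stand-in for the JSON received from the browser; the key name "data" and the
# sample bytes (the 8-byte PNG signature) are assumptions for this sketch.
original = b"\x89PNG\r\n\x1a\n"
payload = json.dumps({"data": base64.b64encode(original).decode("ascii")})

encoded = json.loads(payload)["data"]
decoded = base64.b64decode(encoded)  # bytes, identical to what was encoded

# Write in binary mode so that no text encoding is applied.
destination = os.path.join(tempfile.mkdtemp(), "copy.png")
with open(destination, "wb") as destination_file:
    destination_file.write(decoded)

with open(destination, "rb") as check:
    print(check.read() == original)  # True
```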

Edit: For older browsers that lack ArrayBuffer and where btoa() fails on some characters, here's another way to get a base64-encoded version of any binary:

function getBinary(file){
    var xhr = new XMLHttpRequest();
    xhr.open("GET", file, false);
    xhr.overrideMimeType("text/plain; charset=x-user-defined");
    xhr.send(null);
    return xhr.responseText;
}

function base64Encode(str) {
    var CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    var out = "", i = 0, len = str.length, c1, c2, c3;
    while (i < len) {
        c1 = str.charCodeAt(i++) & 0xff;
        if (i == len) {
            out += CHARS.charAt(c1 >> 2);
            out += CHARS.charAt((c1 & 0x3) << 4);
            out += "==";
            break;
        }
        c2 = str.charCodeAt(i++);
        if (i == len) {
            out += CHARS.charAt(c1 >> 2);
            out += CHARS.charAt(((c1 & 0x3)<< 4) | ((c2 & 0xF0) >> 4));
            out += CHARS.charAt((c2 & 0xF) << 2);
            out += "=";
            break;
        }
        c3 = str.charCodeAt(i++);
        out += CHARS.charAt(c1 >> 2);
        out += CHARS.charAt(((c1 & 0x3) << 4) | ((c2 & 0xF0) >> 4));
        out += CHARS.charAt(((c2 & 0xF) << 2) | ((c3 & 0xC0) >> 6));
        out += CHARS.charAt(c3 & 0x3F);
    }
    return out;
}

console.log(base64Encode(getBinary('http://www.google.fr/images/srpr/logo3w.png')));
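
The masking is what makes the charset=x-user-defined trick work. With that charset, browsers leave bytes below 0x80 unchanged and map each byte from 0x80 to 0xFF to a code point from U+F780 to U+F7FF, so the low eight bits of every character code are the original byte. The code above masks c1 explicitly with & 0xff, and the masks applied to c2 and c3 only ever select bits from the low byte. The Python 3 sketch below simulates that mapping by hand, since Python has no such codec built in; both helper names are hypothetical:

```python
def x_user_defined_decode(data: bytes) -> str:
    """Simulate how a browser decodes bytes under charset=x-user-defined."""
    return "".join(chr(b) if b < 0x80 else chr(0xF780 + b - 0x80) for b in data)

def recover_bytes(text: str) -> bytes:
    """Keep the low eight bits of each code point, like charCodeAt(i) & 0xff."""
    return bytes(ord(ch) & 0xFF for ch in text)

original = bytes(range(256))
text = x_user_defined_decode(original)

print(hex(ord(text[0x89])))             # 0xf789
print(recover_bytes(text) == original)  # True
```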

Hope this helps others as it did for me.
