Javascript -> Download CSV file encoded in ISO-8859-1 / Latin1 / Windows-1252
Problem description
I have hacked together a small tool to extract shipping data from Amazon CSV order data. It works so far. Here is a simple version as a JS Bin: http://output.jsbin.com/jarako
For printing stamps/shipping labels, I need a file to upload to Deutsche Post and to other parcel services. I used a small function, saveTextAsFile, which I found on Stack Overflow. Everything is good so far: special characters (äöüß...) are displayed correctly in the output textarea and in the downloaded files.
All of these German post / parcel service sites accept only latin1 / iso-8859-1 encoded files for upload, but my downloaded file is always UTF-8. If I upload it, all special characters (äöüß...) come out wrong.
How can I change this? I have already searched a lot. I have tried, for example:
Setting the charset of the tool to iso-8859-1:
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
But the result is that I now have wrong special characters in the output textarea as well as in the downloaded file. If I upload it to the post site, I get even more wrong characters. And if I check the encoding in the CODA editor, it still says the downloaded file is UTF-8.
The saveTextAsFile function uses var textFileAsBlob = new Blob([textToWrite], {type:'text/plain'});
function saveTextAsFile()
{
    var textToWrite = $('#dataOutput').val();
    var textFileAsBlob = new Blob([textToWrite], {type:'text/plain'});
    var fileNameToSaveAs = "Brief.txt";
    var downloadLink = document.createElement("a");
    downloadLink.download = fileNameToSaveAs;
    downloadLink.innerHTML = "Download File";
    if (window.webkitURL != null)
    {
        // Chrome allows the link to be clicked
        // without actually adding it to the DOM.
        downloadLink.href = window.webkitURL.createObjectURL(textFileAsBlob);
    }
    else
    {
        // Firefox requires the link to be added to the DOM
        // before it can be clicked.
        downloadLink.href = window.URL.createObjectURL(textFileAsBlob);
        downloadLink.onclick = destroyClickedElement;
        downloadLink.style.display = "none";
        document.body.appendChild(downloadLink);
    }
    downloadLink.click();
}
Still, there has to be a way to download a file in an encoding other than the one the site itself uses. The Amazon site I download the CSV file from is UTF-8 encoded, yet the CSV file downloaded from there is Latin1 (iso-8859-1) when I check it in CODA...
Recommended answer
Scroll down to the update for the real solution!
Because I got no answer, I kept searching. It looked like there was NO SOLUTION in JavaScript: every test download I generated in JavaScript was UTF-8 encoded. It seems JavaScript is built only for Unicode / UTF-8, or that another encoding would (possibly) only apply if the data were sent again over an HTTP transport. But for JavaScript running on the client, no additional HTTP transport happens, because the data never leaves the client.
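A small sketch of why every download ended up as UTF-8, based on my understanding of the Blob constructor rather than on anything in the original tool: a string part is always stored as UTF-8, while a typed array is stored byte for byte. The `type` option is only a label and does not re-encode anything. The variable names below are made up for illustration.

```javascript
// A string handed to the Blob constructor is stored as UTF-8,
// so each of ä, ö, ü, ß takes two bytes.
const fromString = new Blob(["äöüß"], { type: "text/plain" });
console.log(fromString.size); // 8

// A typed array is stored as-is. These are the ISO-8859-1 byte values
// for ä, ö, ü, ß, one byte per character.
const latin1Bytes = new Uint8Array([0xE4, 0xF6, 0xFC, 0xDF]);
const fromBytes = new Blob([latin1Bytes], { type: "text/plain" });
console.log(fromBytes.size); // 4
```

So the encoding has to be applied before the data reaches the Blob: whatever produces the bytes decides the encoding of the downloaded file.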
I have worked around this by building a small PHP script on my server, to which I send the data via a GET or POST request. It converts the encoding to latin1 / ISO-8859-1 and returns the result as a file download. The result is an ISO-8859-1 file with correctly encoded special characters, which I can upload to the postal and parcel service sites mentioned above, and everything looks good.
latin-download.php: (It is VERY IMPORTANT to save the PHP file itself in ISO-8859-1 as well, to make it work!)
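<?php
// PHP has already URL-decoded $_REQUEST values; a second urldecode() would turn "+" into spaces.
$data = $_REQUEST["a"];
$converted_to_latin = mb_convert_encoding($data, 'ISO-8859-1', 'UTF-8');
// Strip path components, quotes and line breaks so the filename cannot break the header.
$filename = str_replace(array('"', "\r", "\n"), '', basename($_REQUEST["filename"]));
// Content-Type and Content-Disposition are two separate headers.
header('Content-Type: text/plain; charset=iso-8859-1');
header('Content-Disposition: attachment; filename="'.$filename.'"');
echo $converted_to_latin;
?>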
In my JavaScript code I use:
<a id="downloadlink">Download File</a>
<script>
var mydata = "this is testdata containing äöüß";
document.getElementById("downloadlink").addEventListener("click", function() {
var mydataToSend = encodeURIComponent(mydata);
window.open("latin-download.php?a=" + mydataToSend + "&filename=letter-max.csv");
}, false);
</script>
For larger amounts of data you have to switch from GET to POST...
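One way the POST variant might look is a hidden form that is built and submitted from script. This is only a sketch: `postDownload` and its `doc` parameter are hypothetical names, and it assumes latin-download.php reads the fields `a` and `filename` from the request as shown above. Because the browser encodes form fields itself, no encodeURIComponent call is needed here.

```javascript
// Hypothetical helper: send the data to latin-download.php via POST.
// `doc` defaults to the page's document; it is a parameter only so the
// function can be exercised outside a browser.
function postDownload(data, filename, doc = document) {
  const form = doc.createElement("form");
  form.method = "POST";
  form.action = "latin-download.php";
  form.acceptCharset = "UTF-8"; // the PHP script converts from UTF-8

  // One hidden field per request parameter expected by the PHP script.
  for (const [name, value] of [["a", data], ["filename", filename]]) {
    const input = doc.createElement("input");
    input.type = "hidden";
    input.name = name;
    input.value = value;
    form.appendChild(input);
  }

  // The form has to be in the DOM to be submitted; remove it afterwards.
  doc.body.appendChild(form);
  form.submit();
  form.remove();
  return form;
}
```

In the click handler above, the window.open(...) line would then be replaced by something like postDownload(mydata, "letter-max.csv"). Since the response is sent as an attachment, the browser should offer the file without leaving the page.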
Update, 8 February 2016
Half a year later, I have now found a solution in PURE JAVASCRIPT, using inexorabletash/text-encoding. This is a polyfill for the Encoding Living Standard. The standard includes decoding of legacy encodings such as latin1 ("windows-1252"), but it forbids encoding into these legacy encodings. So if you use the window.TextEncoder function implemented by the browser, it only offers UTF encoding. BUT the polyfill offers a legacy mode, which also ALLOWS encoding into legacy encodings such as latin1.
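The asymmetry is easy to see with the built-in classes. This is a small sketch, assuming a runtime whose TextDecoder supports legacy labels (current browsers do, as far as I know): the native encoder ignores the requested encoding, while the native decoder accepts it.

```javascript
// The built-in TextEncoder takes no encoding argument; anything passed is ignored.
const encoder = new TextEncoder("windows-1252");
console.log(encoder.encoding);                // "utf-8"
console.log(Array.from(encoder.encode("ä"))); // [195, 164], the UTF-8 bytes C3 A4

// Decoding a legacy encoding, on the other hand, is part of the standard.
const decoder = new TextDecoder("windows-1252");
console.log(decoder.decode(new Uint8Array([0xE4]))); // "ä"
```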
I use it like this:
<!DOCTYPE html>
<script>
    // Keep a reference to the browser's built-in TextEncoder as TextEncoderOrg
    // (it can NOT encode windows-1252, but stays usable as TextEncoderOrg()) ...
    var TextEncoderOrg = window.TextEncoder;
    // ... and deactivate it, to make sure only the polyfill encoder loaded below is used
    window.TextEncoder = null;
</script>
<script src="lib/encoding-indexes.js"></script> <!-- needed to support encoding to legacy encodings -->
<script src="lib/encoding.js"></script> <!-- encoding polyfill -->
<script>
    function download(content, filename, contentType) {
        if (!contentType) contentType = 'application/octet-stream';
        var a = document.createElement('a');
        var blob = new Blob([content], {'type': contentType});
        a.href = window.URL.createObjectURL(blob);
        a.download = filename;
        a.click();
    }
    var text = "Es wird ein schöner Tag!";
    // Do the encoding (the result is a Uint8Array of windows-1252 bytes)
    var encoded = new TextEncoder("windows-1252", { NONSTANDARD_allowLegacyEncoding: true }).encode(text);
    // Download 2 files to see the difference
    download(encoded, "windows-1252-encoded-text.txt");
    download(text, "utf-8-original-text.txt");
</script>
The encoding-indexes.js file is about 500 KB, because it contains all the encoding tables. Because I only need windows-1252 encoding, I deleted the other encodings from this file for my use, so now only 632 bytes are left.
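If windows-1252 is the only target, the table is small enough to write out by hand. The following is a sketch of an alternative, not part of the polyfill: `encodeWindows1252` and `CP1252_EXTRA` are hypothetical names, and replacing unmappable characters with "?" is my own choice. Windows-1252 matches ISO-8859-1 everywhere except the byte range 0x80-0x9F, so only those entries need a lookup.

```javascript
// Code points that windows-1252 places in the byte range 0x80-0x9F.
const CP1252_EXTRA = new Map([
  [0x20AC, 0x80], [0x201A, 0x82], [0x0192, 0x83], [0x201E, 0x84],
  [0x2026, 0x85], [0x2020, 0x86], [0x2021, 0x87], [0x02C6, 0x88],
  [0x2030, 0x89], [0x0160, 0x8A], [0x2039, 0x8B], [0x0152, 0x8C],
  [0x017D, 0x8E], [0x2018, 0x91], [0x2019, 0x92], [0x201C, 0x93],
  [0x201D, 0x94], [0x2022, 0x95], [0x2013, 0x96], [0x2014, 0x97],
  [0x02DC, 0x98], [0x2122, 0x99], [0x0161, 0x9A], [0x203A, 0x9B],
  [0x0153, 0x9C], [0x017E, 0x9E], [0x0178, 0x9F]
]);

// Hypothetical helper: encode a string as windows-1252 bytes.
// Characters without a windows-1252 byte become the fallback character.
function encodeWindows1252(text, fallback = "?") {
  const fallbackByte = fallback.charCodeAt(0);
  const bytes = [];
  for (const ch of text) { // for...of iterates by code point
    const cp = ch.codePointAt(0);
    if (cp < 0x80 || (cp >= 0xA0 && cp <= 0xFF)) {
      bytes.push(cp); // identical to ISO-8859-1 in these ranges
    } else if (CP1252_EXTRA.has(cp)) {
      bytes.push(CP1252_EXTRA.get(cp));
    } else {
      bytes.push(fallbackByte);
    }
  }
  return Uint8Array.from(bytes);
}

const bytes = encodeWindows1252("Es wird ein schöner Tag!");
console.log(bytes.length); // 24, one byte per character
```

The resulting Uint8Array can be passed to the download function above in place of the polyfill output, for example download(encodeWindows1252(text), "windows-1252-encoded-text.txt").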