JavaScript: how can we read a local text file that contains accented letters?
Problem description
I need to read a local file. I have been studying several threads and have seen various ways to handle it, but in most cases they rely on a file input element.
I need to load the file directly through code.
I studied this thread:
With it, I was able to read the file.
The surprising part came when I tried to split the text into lines and words: the output showed � in place of the accented letters.
The code I currently have is:
myFileReader.js
function readTextFile(file) {
    var rawFile = new XMLHttpRequest();
    // Third argument false = synchronous request
    rawFile.open("GET", file, false);
    rawFile.onreadystatechange = function () {
        if (rawFile.readyState === 4) {
            // Status 0 is what a file:// request reports on success
            if (rawFile.status === 200 || rawFile.status == 0) {
                let allText = rawFile.responseText;
                console.log('The complete text is', allText);
                let lineArr = intoLines(allText);
                let firstLineWords = intoWords(lineArr[0]);
                let secondLineWords = intoWords(lineArr[1]);
                console.log('Our first line is: ', lineArr[0]);
                // Map each word of the first line to the word at the same position in the second line
                let atlas = {};
                for (let i = 0; i < firstLineWords.length; i++) {
                    console.log(`Our ${i} word in the first line is : ${firstLineWords[i]}`);
                    console.log(`Our ${i} word in the SECOND line is : ${secondLineWords[i]}`);
                    atlas[firstLineWords[i]] = secondLineWords[i];
                }
                console.log('The atlas is: ', atlas);
                let atlasJson = JSON.stringify(atlas);
                console.log('Atlas as json is: ', atlasJson);
                download(atlasJson, 'atlasJson.txt', 'text/plain');
            }
        }
    };
    rawFile.send(null);
}

// Expects an <a id="a"> element in the page and points it at the generated file
function download(text, name, type) {
    var a = document.getElementById("a");
    var file = new Blob([text], {type: type});
    a.href = URL.createObjectURL(file);
    a.download = name;
}

function intoLines(text) {
    // Split the whole text on "\n" so that each line becomes one array element
    var lineArr = text.split('\n');
    return lineArr;
}

function intoWords(lines) {
    // Split one line on the literal separator `" "` (quote, space, quote)
    var wordsArr = lines.split('" "');
    return wordsArr;
}
My question is: how can we handle these special characters, that is, the accented vowels?
I ask because even in the IDE those question marks appeared when the txt file was opened as UTF-8; when I switched the encoding to ISO-8859-1, it loaded correctly.
I have also studied:
Reading UTF-8 special characters from an external file using JavaScript
Also, could you explain whether there is a shorter way to load files in client-side JavaScript? In Java, for example, there are FileReader / FileWriter / BufferedWriter.
Thank you for your help!
Recommended answer
It sounds like the file is encoded with ISO-8859-1 (or possibly the very similar Windows-1252).
Those encodings have no BOM (byte order mark), so the browser cannot detect the encoding from the file contents alone.
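To make the diagnosis concrete, here is a minimal sketch of what happens to one accented letter. The sample word "café" and the variable names are assumptions for illustration only; they are not taken from the asker's file.

```javascript
// Bytes of the word "café" as stored in an ISO-8859-1 file:
// the letter é is the single byte 0xE9.
const latin1Bytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);

// Decoded as UTF-8, a lone 0xE9 is not a valid sequence,
// so the decoder substitutes U+FFFD (the � replacement character).
const asUtf8 = new TextDecoder('utf-8').decode(latin1Bytes);

// Decoded with the matching legacy encoding, the letter is recovered.
const asLatin1 = new TextDecoder('iso-8859-1').decode(latin1Bytes);

console.log(asUtf8);   // "caf\uFFFD"
console.log(asLatin1); // "café"
```

The bytes are the same in both cases; only the assumed encoding differs, which is why the text looked fine in the IDE once ISO-8859-1 was selected.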
The only solutions I can see are:
- Use a (local) server and have it return the HTTP Content-Type header with the encoding identified as a charset, e.g. Content-Type: text/plain; charset=ISO-8859-1
- Use UTF-8 instead (e.g., open the file in an editor as ISO-8859-1, then save it as UTF-8), as that's the default encoding for XHR response bodies.
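Both options can be sketched in a few lines of Node.js. This is only an illustration under stated assumptions: the function names (latin1ToUtf8, contentTypeFor, createTextServer) and the file path in the usage note are hypothetical, and the server is deliberately minimal.

```javascript
// Option 2: re-encode ISO-8859-1 bytes as UTF-8, which is roughly what
// "open as ISO-8859-1, save as UTF-8" does in an editor.
function latin1ToUtf8(latin1Bytes) {
  const text = new TextDecoder('iso-8859-1').decode(latin1Bytes);
  return new TextEncoder().encode(text); // TextEncoder always emits UTF-8
}

// Option 1: build the header value that tells the browser how to decode the body.
function contentTypeFor(charset) {
  return `text/plain; charset=${charset}`;
}

// Hypothetical local server that serves one file with that header.
// The modules are loaded lazily so the helpers above also work in a browser.
function createTextServer(filePath, charset) {
  const http = require('node:http');
  const fs = require('node:fs');
  return http.createServer((req, res) => {
    fs.readFile(filePath, (err, bytes) => {
      if (err) {
        res.writeHead(404, { 'Content-Type': 'text/plain; charset=utf-8' });
        res.end('Not found');
        return;
      }
      // The bytes are sent unchanged; only the declared charset matters.
      res.writeHead(200, { 'Content-Type': contentTypeFor(charset) });
      res.end(bytes);
    });
  });
}
```

A possible usage, assuming a file named atlas.txt next to the script: createTextServer('./atlas.txt', 'ISO-8859-1').listen(8080). The page calling readTextFile would need to be served from the same origin, or the server would need to send CORS headers, which this sketch does not do.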