Converting a string from utf8 to latin1 in NodeJS
Question
I'm using a Latin1-encoded DB and can't change it to UTF-8, which means I run into issues with certain application data. I'm using Tesseract to OCR a document (Tesseract encodes its output in UTF-8) and tried to use iconv-lite; however, it creates a buffer, and I then have to convert that buffer into a string. But again, the buffer-to-string conversion does not allow the "latin1" encoding.
I've read a bunch of questions and answers; however, all I find is advice about setting the client encoding and things like that.
Any ideas?
Recommended answer
I've found a way to convert a text file in any encoding to UTF-8:
var fs = require('fs'),
    charsetDetector = require('node-icu-charset-detector'),
    iconvlite = require('iconv-lite');

/* Text files in a git repo have
 * different encodings, but they
 * always need to be served as
 * standard 'utf-8'.
 */
function getFileContentsInUTF8(file_path) {
    // No encoding argument, so readFileSync returns the raw bytes as a Buffer.
    var content = fs.readFileSync(file_path);
    // Detect the charset of those bytes.
    var original_charset = charsetDetector.detectCharset(content);
    // Decode the bytes into a JavaScript string using the detected charset.
    var jsString = iconvlite.decode(content, original_charset.toString());
    return jsString;
}
I've also posted it here: https://gist.github.com/jacargentina/be454c13fa19003cf9f48175e82304d5
Maybe you can try this, where content should be your database buffer data (in latin1 encoding).
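The function above decodes bytes into a JavaScript string; the question also needs the opposite direction, from a string to Latin-1 bytes. A minimal sketch of that round trip follows, using only Node's built-in Buffer rather than iconv-lite. It assumes Node.js 6.4 or later, where 'latin1' is accepted as an encoding name (older versions use 'binary' for the same thing). The helper names toLatin1Buffer and fromLatin1Buffer are hypothetical, and replacing unrepresentable characters with '?' is one possible policy, not the only one.

```javascript
// Hypothetical helper: encode a JavaScript string as Latin-1 bytes.
// Latin-1 only covers U+0000..U+00FF, so any other code point is
// replaced with '?' before encoding (an assumed policy for this sketch).
function toLatin1Buffer(text) {
    var safe = text.replace(/[^\u0000-\u00ff]/gu, '?');
    return Buffer.from(safe, 'latin1');
}

// Hypothetical helper: decode Latin-1 bytes (e.g. read from the DB)
// back into a JavaScript string.
function fromLatin1Buffer(buf) {
    return buf.toString('latin1');
}

// Simulate OCR output arriving as UTF-8 bytes.
var utf8Bytes = Buffer.from('café', 'utf8');   // 'é' takes two bytes in UTF-8
var text = utf8Bytes.toString('utf8');         // decode to a JavaScript string first
var latin1Bytes = toLatin1Buffer(text);        // 'é' is the single byte 0xE9 in Latin-1

console.log(utf8Bytes.length, latin1Bytes.length);      // prints: 5 4
console.log(fromLatin1Buffer(latin1Bytes) === text);    // prints: true
```

The key point is that a JavaScript string has no encoding of its own; the encoding only matters when converting to or from bytes. Whether the database driver expects a Buffer or a string depends on the driver, so which of the two values to pass on is left open here.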