Automatic UTF-8 encoding in Node.js HTTP client


Question

I am trying to load XML content from a remote host using Node.js.

The problem is that German umlauts such as "ä" come out broken. As in the browser, this is usually a simple encoding problem. But since the XML content on the remote host is encoded in "iso-8859-2", I have had no success getting the letters to display correctly.

The functionality is very simple. I use the default HTTP client built into Node.js to connect to the remote host with a plain GET request.

Some facts about the environment:

  • The remote system uses "iso-8859-2" encoding.
  • The encoding is currently set in the response header.
  • The characters are unrecoverably broken in the data (chunk) received by response.onData(chunk).

Node.js is running at version 0.2 on a default Debian server.

The code is based on the default httpClient as described in the Node.js documentation.

I tried the following:

response.defaultAsciiEncoding true/false
response.encoding = UTF-8/ascii

I used a UTF-8 encoder/decoder to encode/decode each chunk. After this failed, I tried to encode/decode the whole response body.

I am not very familiar with buffers, and I guess the problem lies in that direction. My second guess is that Node.js (or the httpClient) simply can't handle other encodings by default. In that case I think I would need to write my own HTTP client using the net lib. I just want to make sure I am not heading in the wrong direction :)

Recommended answer

I had a quick poke around the Node.js source and it seems svick is right: Node.js doesn't support the iso-8859-2 encoding. You can, however, get at the response as a binary stream and then either return it to the browser with your own encoding or convert it with node-iconv (again, as svick suggested).

Here is a small example: http://gist.github.com/576884
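For readers on a current Node.js release, the "binary stream, then convert" idea can be sketched roughly as follows. This is not the code from the gist, and it does not apply to Node.js 0.2: it assumes a modern Node.js build with full ICU, where the global TextDecoder accepts "iso-8859-2". The names decodeBody and fetchDecoded are hypothetical helpers made up for this illustration. The key point is that no encoding is set on the response, so every chunk arrives as a raw Buffer and the bytes are converted exactly once.

```javascript
const http = require('http');

// Concatenate the raw Buffer chunks, then decode them once with the
// charset the server actually used.
// Assumption: TextDecoder knows 'iso-8859-2' (Node.js built with full ICU).
function decodeBody(chunks, charset) {
  const raw = Buffer.concat(chunks);
  return new TextDecoder(charset).decode(raw);
}

// Hypothetical helper: fetch a URL and decode its body with `charset`.
// response.setEncoding() is deliberately NOT called, so each chunk
// stays a Buffer instead of being turned into a UTF-8 string.
function fetchDecoded(url, charset) {
  return new Promise((resolve, reject) => {
    http.get(url, (response) => {
      const chunks = [];
      response.on('data', (chunk) => chunks.push(chunk));
      response.on('end', () => {
        try {
          resolve(decodeBody(chunks, charset));
        } catch (err) {
          reject(err);
        }
      });
      response.on('error', reject);
    }).on('error', reject);
  });
}

// Bytes as they would arrive on the wire, split across two chunks:
// "Bär" in iso-8859-2 (0xE4 is "ä").
const sample = [Buffer.from([0x42, 0xe4]), Buffer.from([0x72])];
console.log(decodeBody(sample, 'iso-8859-2')); // Bär
```

If the same bytes are instead read as UTF-8, the lone 0xE4 byte is not a valid sequence and is replaced by U+FFFD, which would explain why the characters cannot be recovered after the fact. A call such as fetchDecoded(url, 'iso-8859-2') would take the charset as an argument; reading it from the Content-Type response header is left out of this sketch.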
