Ruby 1.9 - Invalid multibyte character (utf-8)

Question

I have a ruby file with only these two lines:

# encoding: utf-8
puts "—"

When I run it with ruby test_enc.rb it fails with:

test_enc.rb:2: invalid multibyte char (UTF-8)
test_enc.rb:2: unterminated string meets end of file

I don't know how to properly specify the character code of the em dash, but Vim tells me it is 151, hex 97, octal 227. It fails the same way with other characters such as ã, so I doubt it is specific to that character. I am running on Windows XP, and the version of Ruby I'm using is:

ruby 1.9.1p430 (2010-08-16 revision 28998) [i386-mingw32]

I feel like there is something very obvious I am missing here. Any ideas?

EDIT: Learned a valuable lesson about assumptions today - specifically assuming your editor IS using UTF-8 without actually checking it. Oops!

Thanks for the quick and accurate replies all!

EDIT AGAIN: The 'setting up vim properly for utf-8' grew too big and wasn't really relevant to this question, so it is now a separate question.

Solution

Given that Ruby is explicitly calling your attention to UTF-8, I strongly suspect that you haven't actually written out a UTF-8 file to start with. Make sure that Vim (or whatever text editor you're using to create the file) is really set to write out UTF-8.
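
One way to check this without trusting the editor is to ask Ruby itself whether the file's raw bytes form valid UTF-8. The following is a minimal sketch, assuming Ruby 1.9 or later (for `String#force_encoding` and `String#valid_encoding?`); the helper name `utf8_file?` is made up for illustration.

```ruby
# Hypothetical helper: true if the raw bytes of the file are valid UTF-8.
def utf8_file?(path)
  # Open in binary mode so Ruby does not transcode or interpret the bytes.
  bytes = File.open(path, "rb") { |f| f.read }
  # Relabel the bytes as UTF-8 (no conversion) and check them for validity.
  bytes.force_encoding("UTF-8").valid_encoding?
end

# Usage: ruby check_utf8.rb test_enc.rb   (the script name is an assumption)
if __FILE__ == $0 && ARGV[0]
  puts utf8_file?(ARGV[0]) ? "valid UTF-8" : "NOT valid UTF-8"
end
```

If this reports that the file is not valid UTF-8, the `# encoding: utf-8` comment is promising something the bytes on disk do not deliver, and the fix belongs in the editor settings rather than in the Ruby code.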

Note that in UTF-8, any non-ASCII character will be represented by multiple bytes, not a single byte as you've described from the Vim diagnostics. I'd recommend using a binary file editor (or dump, or whatever) to really show what's in the text file though. Something that doesn't already have some preconceived notion of the encoding - something that isn't even trying to think of it as a text file.
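
If no hex editor is at hand, Ruby can produce the dump. The sketch below (the helper name `hex_bytes` is hypothetical) prints the UTF-8 bytes of the two characters from the question next to the single byte Vim reported. A lone byte 0x97 is a UTF-8 continuation byte with no lead byte, which is consistent with the "invalid multibyte char" error; it also happens to be the Windows-1252 code for an em dash, although the question does not confirm which encoding the file was saved in.

```ruby
# Hypothetical helper: the bytes of a string as space-separated uppercase hex.
def hex_bytes(data)
  data.unpack("C*").map { |b| format("%02X", b) }.join(" ")
end

# Em dash (U+2014) encoded as UTF-8 takes three bytes.
puts hex_bytes([0x2014].pack("U"))   # E2 80 94
# a-tilde (U+00E3) encoded as UTF-8 takes two bytes.
puts hex_bytes([0xE3].pack("U"))     # C3 A3
# The single byte reported by Vim in the question.
puts hex_bytes([0x97].pack("C"))     # 97

# Dump a real file when a path is given on the command line.
if __FILE__ == $0 && ARGV[0]
  puts hex_bytes(File.open(ARGV[0], "rb") { |f| f.read })
end
```

Seeing `97` rather than `E2 80 94` between the quotes in the dump of test_enc.rb would confirm that the file was not saved as UTF-8.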

Notepad lets you write out a file in UTF-8, so you might want to try that just to see what happens. (I don't have Ruby installed myself, otherwise I'd try it for you.)
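
As an alternative to re-saving the file from an editor, the bytes can be converted in Ruby. This is only a sketch under a stated assumption: it treats the existing file as Windows-1252, which fits the 0x97 byte from the question but is not confirmed there. The helper name `cp1252_to_utf8` is made up for illustration.

```ruby
# Hypothetical helper: reinterpret raw bytes as Windows-1252 and convert them
# to UTF-8. The source encoding is an assumption, not something the question states.
def cp1252_to_utf8(raw_bytes)
  raw_bytes.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# Usage: ruby to_utf8.rb input.rb output.rb   (script and file names are assumptions)
if __FILE__ == $0 && ARGV.length == 2
  raw = File.open(ARGV[0], "rb") { |f| f.read }
  File.open(ARGV[1], "wb") { |f| f.write(cp1252_to_utf8(raw)) }
end
```

Writing to a separate output file keeps the original intact, so the conversion can be checked before anything is overwritten.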
