Is there a way in Ruby 1.9 to remove invalid byte sequences from strings?

Views: 149 Published: 2016/11/19 13:05:10 Tags: ruby encoding character-encoding ruby-1.9 utf


Question

Suppose you have a string like "€foo\xA0", encoded as UTF-8. Is there a way to remove invalid byte sequences from this string (so that you get "€foo")?

In Ruby 1.8 you could use Iconv.iconv('UTF-8//IGNORE', 'UTF-8', "€foo\xA0"), but that is now deprecated. "€foo\xA0".encode('UTF-8') doesn't do anything, since the string is already UTF-8. I tried:

"€foo\xA0".force_encoding('BINARY').encode('UTF-8', :undef => :replace, :replace => '')

which yields

"foo"

But that also loses the valid multibyte character €.
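
A short sketch of why the BINARY round trip drops the euro sign, assuming a UTF-8 source file: once the string is relabelled as BINARY, each of the three bytes that make up € is an individual byte above 0x7F with no defined mapping to UTF-8, so :undef => :replace removes them along with the stray \xA0. The variable names are illustrative only.

```ruby
# encoding: utf-8

# The euro sign is three bytes in UTF-8.
euro_bytes = "€".bytes.to_a
puts euro_bytes.inspect          # [226, 130, 172]

# dup so the literal itself is not mutated by force_encoding.
as_binary = "€foo\xA0".dup.force_encoding('BINARY')

# Every byte >= 0x80 is "undefined" when converting BINARY to UTF-8,
# so the bytes of the euro sign are replaced together with \xA0.
stripped = as_binary.encode('UTF-8', :undef => :replace, :replace => '')
puts stripped                    # foo
```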

Solution

"€foo\xA0".chars.select(&:valid_encoding?).join
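
The one-liner works because String#chars yields each invalid byte as its own one-character string, which valid_encoding? can then reject. Below is a minimal runnable sketch, assuming a UTF-8 source file; the helper name strip_invalid_bytes is hypothetical and not part of the original answer. The last lines also show String#scrub, which is only available on Ruby 2.1 and later, so it does not apply to Ruby 1.9 itself.

```ruby
# encoding: utf-8

# Hypothetical helper wrapping the one-liner from the answer.
def strip_invalid_bytes(str)
  # Keep only the characters whose bytes form a valid sequence
  # in the string's own encoding, then glue them back together.
  str.chars.select(&:valid_encoding?).join
end

dirty = "€foo\xA0"
clean = strip_invalid_bytes(dirty)

puts dirty.valid_encoding?   # false
puts clean.valid_encoding?   # true
puts clean                   # €foo

# On Ruby 2.1+, String#scrub with an empty replacement gives the same result.
puts dirty.scrub('') if dirty.respond_to?(:scrub)
```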
