Convert Unicode Number to Integer in Ruby

Question

I have some numbers coming in as strings using non-ASCII digits unfortunately. I need to convert them to regular Ruby numbers to do some math on them. So for example if the number-as-a-string "۱۹" comes in, which is 19 but as the characters "extended arabic indic digit one" followed by "extended arabic indic digit nine", I need a way to convert that to the Ruby integer Fixnum 19.

The problem is, according to this, there are 55 groups of 0-9 of these extended digits, i.e. 550 total codepoints I need to handle.

I already know that for a given group, the codepoints for consecutive digits are contiguous, so for example extended arabic indic digit 0 is U+06F0 and extended arabic indic digit 9 is U+06F9, so I can test each digit to see which range it's in and then subtract the zero codepoint as an integer from the codepoint of the character I'm looking at, to give me the regular Ruby integer. For example, 6F9 - 6F0 = 9 (in rough terms, once they're converted to their integer code points).
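The subtraction described above can be sketched for a single digit group as follows. This is only an illustration: the constant and method names are hypothetical, and it assumes every character in the input is an Extended Arabic-Indic digit.

```ruby
# Hypothetical helper covering one digit group only.
# Assumes every character of str is an Extended Arabic-Indic digit (U+06F0..U+06F9).
EXTENDED_ARABIC_INDIC_ZERO = 0x06F0

def extended_arabic_indic_to_i(str)
  str.each_char.reduce(0) do |total, ch|
    # Digit value = code point of the character minus the code point of that group's zero.
    total * 10 + (ch.ord - EXTENDED_ARABIC_INDIC_ZERO)
  end
end

extended_arabic_indic_to_i("۱۹") # => 19
```

Handling all groups this way would still require knowing the zero code point of each group, which is the lookup problem described next.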

But to do this, I need to create a giant lookup hash for these 55 ranges and that's a lot of typing. I suppose I could translate the HTML table at the link above into a ruby map, but that feels hacky.

I already know that

"۱۹" =~ /[[:digit:]]+/

will be a match, but the question is "How to turn those Unicode digits back into regular Ruby integers?"

There has to be a better way! Any ideas?

Thanks!

Recommended answer

This is relatively painless.

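class DecimalToIntegerConverter
  # Code points of the zero digit in each supported script. This list is
  # incomplete: add the zero of every digit group you need to accept.
  altzeros = [0x06f0, 0xff10]
  # The ten digits of each group, concatenated in the same order as @@replacements.
  @@digits = altzeros.flat_map { |z| (z..z + 9).map { |c| c.chr(Encoding::UTF_8) } }.join
  @@replacements = "0123456789" * altzeros.size
  def self.convert(str)
    # tr maps each alternate digit to the ASCII digit at the same position.
    str.tr(@@digits, @@replacements).to_i
  end
end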

str = "۱۹ and 25?"
str.scan(/[[:digit:]]+/).map do |s|
  DecimalToIntegerConverter.convert(s)
end
# => [19, 25]
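One way to avoid typing the zero code points by hand might be to derive the digit table from Ruby's own Unicode data. The sketch below is not part of the original answer. It assumes that the characters matched by `\p{Nd}`, taken in code point order, always fall into runs of exactly ten digits ordered 0 through 9, which should be checked against the Unicode version shipped with your Ruby. The names `ND_CHARS`, `DIGIT_VALUE` and `unicode_to_i` are hypothetical.

```ruby
# Collect every character in the Unicode "Nd" (decimal digit) category.
# Surrogates (U+D800..U+DFFF) are skipped because Integer#chr rejects them.
ND_CHARS = (0..0x10FFFF).each_with_object([]) do |cp, acc|
  next if (0xD800..0xDFFF).cover?(cp)
  ch = cp.chr(Encoding::UTF_8)
  acc << ch if ch.match?(/\p{Nd}/)
end

# Assumption: in code point order, Nd characters form runs of ten digits, 0 through 9.
# Each character is mapped to the ASCII digit for its position within its run.
DIGIT_VALUE = ND_CHARS.each_slice(10).flat_map do |run|
  run.each_with_index.map { |ch, i| [ch, i.to_s] }
end.to_h

def unicode_to_i(str)
  # Replace each decimal digit with its ASCII equivalent, then parse as usual.
  str.gsub(/\p{Nd}/) { |ch| DIGIT_VALUE[ch] }.to_i
end

"۱۹ and 25?".scan(/[[:digit:]]+/).map { |s| unicode_to_i(s) }
```

Building the table scans the whole code point range once, so it is best done a single time at load, as the constants here do.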
