How to determine if a character is a Chinese character

Question

How can I determine whether a character is a Chinese character using Ruby?

Recommended answer

An interesting article on encodings in Ruby: http://blog.grayproductions.net/articles/bytes_and_characters_in_ruby_18 (it is part of a series, so also check the table of contents at the start of the article).

I haven't worked with Chinese characters before, but this appears to be the list supported by Unicode: http://en.wikipedia.org/wiki/List_of_CJK_Unified_Ideographs . Note that it is a unified system that also covers Japanese and Korean characters (some characters are shared between the languages), so I'm not sure you can distinguish which ones are Chinese only.

I think you can check whether it is a CJK character by calling this with the string str and the index n of the character:

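# Returns true if the character at index n of str is a CJK ideograph.
def check_char(str, n)
  # Decode the UTF-8 string into an array of Unicode code points
  list_of_chars = str.unpack("U*")
  char = list_of_chars[n]
  # Index out of range: there is no character to test
  return false if char.nil?
  # Main block (CJK Unified Ideographs)
  return true if char >= 0x4E00 && char <= 0x9FFF
  # Extension A
  return true if char >= 0x3400 && char <= 0x4DBF
  # Extension B
  return true if char >= 0x20000 && char <= 0x2A6DF
  # Extension C
  return true if char >= 0x2A700 && char <= 0x2B73F
  false
end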
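
As an alternative to listing code point ranges by hand, the same kind of check might be sketched with the Unicode script property \p{Han}, which Ruby's regular expressions support from 1.9 onward. This is not the method from the answer above, only a possible variant; the helper name han_char? is hypothetical, and the sketch assumes str is a UTF-8 string. Like the range check, it matches Han ideographs in general and cannot tell Chinese apart from Japanese or Korean usage.

```ruby
# Hypothetical helper (not part of the original answer).
# Returns true if the character at index n of str belongs to the Han script.
# Assumes str is UTF-8 encoded and Ruby 1.9 or later.
def han_char?(str, n)
  char = str[n]                 # character at index n, or nil if out of range
  return false if char.nil?
  !!(char =~ /\p{Han}/)         # \p{Han} matches any Han-script character
end

puts han_char?("a\u4E2Db", 1)   # U+4E2D is a Han ideograph => true
puts han_char?("a\u4E2Db", 0)   # "a" is Latin               => false
puts han_char?("\u3042", 0)     # U+3042 is Hiragana          => false
```

One design difference worth noting: the script property follows the Unicode data shipped with the Ruby interpreter, so it also covers ideographs outside the four blocks listed above, while the explicit ranges make the accepted set visible in the code.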
