Does multibyte character interfere with end-line character within a regex?

Views: 138  Posted: 2017/8/16 22:31:01  Tags: ruby regex encoding multibyte ruby-2.0


Problem description

With this regex:

    regex1 = /\z/

the following strings match:

    "hello" =~ regex1 # => 5
    "こんにちは" =~ regex1 # => 5

but with these regexes:

    regex2 = /#$/?\z/
    regex3 = /\n?\z/

they show a difference:

    "hello" =~ regex2 # => 5
    "hello" =~ regex3 # => 5
    "こんにちは" =~ regex2 # => nil
    "こんにちは" =~ regex3 # => nil

What is interfering? The string encoding is UTF-8, and the OS is Linux (i.e., $/ is "\n"). Are the multibyte characters interfering with $/? How?
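
To check whether a particular Ruby build shows this behavior, a small script along the following lines can report what each regex returns for both strings. This is only a reproduction sketch: the names `SAMPLES`, `REGEXES` and `match_positions` are made up for illustration, and `regex2` is built with `Regexp.new` from `$/` at run time, which is assumed to be equivalent to `/#$/?\z/` when `$/` is "\n".

```ruby
# Reproduction sketch; constant and method names are illustrative only.
SAMPLES = [
  "hello",
  "\u3053\u3093\u306B\u3061\u306F" # "こんにちは", written with escapes so it is always UTF-8
]

REGEXES = {
  "regex1" => /\z/,
  # Built from $/ at run time; assumed equivalent to /#$/?\z/ when $/ is "\n".
  "regex2" => Regexp.new(Regexp.escape($/) + "?\\z"),
  "regex3" => /\n?\z/
}

# Returns { regex name => match index (in characters) or nil } for one string.
def match_positions(str)
  REGEXES.each_with_object({}) { |(name, re), out| out[name] = (str =~ re) }
end

puts "Ruby #{RUBY_VERSION}, $/ = #{$/.inspect}"
SAMPLES.each do |s|
  puts "#{s.inspect} (#{s.encoding}, #{s.length} chars, #{s.bytesize} bytes): #{match_positions(s)}"
end
```

The script also prints the character and byte counts, since the indices returned by `=~` are character offsets (5 for both strings) even though the Japanese string occupies 15 bytes in UTF-8.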
Solution
In Ruby trunk, the issue has now been accepted as a bug. Hopefully, it will be fixed.

Update: Two patches have been posted in Ruby trunk.
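
On an affected build, one possible stopgap is to compute the same position without a regex. The sketch below is not taken from the bug report or the patches; `end_index_ignoring_newline` is a hypothetical helper that returns the character index where `/\n?\z/` would be expected to match, using only `String#end_with?` and `String#length`.

```ruby
# Workaround sketch (hypothetical helper, not from the Ruby bug report):
# character index at which /\n?\z/ is expected to match, i.e. the end of
# the string, or the position of a single trailing "\n" if there is one.
def end_index_ignoring_newline(str)
  str.end_with?("\n") ? str.length - 1 : str.length
end

japanese = "\u3053\u3093\u306B\u3061\u306F" # "こんにちは"

p end_index_ignoring_newline("hello")         # => 5
p end_index_ignoring_newline(japanese)        # => 5
p end_index_ignoring_newline(japanese + "\n") # => 5
```

Because `String#length` counts characters rather than bytes, the result lines up with the indices that `=~` reports for UTF-8 strings.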
