How to use ruby gsub Regexp with many matches?

Problem Description

I have a CSV file whose contents contain double quotes inside quoted text:

test,first,line,"you are a "kind" man",thanks
again,second,li,"my "boss" is you",good

I need to replace every double quote that is not preceded or followed by a comma with two double quotes:

test,first,line,"you are a ""kind"" man",thanks
again,second,li,"my ""boss"" is you",good

so each inner " is replaced by ""

I tried:

x.gsub(/([^,])"([^,])/, "#{$1}\"\"#{$2}")

but it didn't work.

Solution

Your regex needs to be a little more bold, in case the quotes occur at the start of the first value, or at the end of the last value:

csv = <<ENDCSV
test,first,line,"you are a "kind" man",thanks
again,second,li,"my "boss" is you",good
more,""Someone" said that you're "cute"",yay
"watch out for this",and,also,"this test case"
ENDCSV

puts csv.gsub(/(?<!^|,)"(?!,|$)/,'""')
#=> test,first,line,"you are a ""kind"" man",thanks
#=> again,second,li,"my ""boss"" is you",good
#=> more,"""Someone"" said that you're ""cute""",yay
#=> "watch out for this",and,also,"this test case"

The above regex uses negative lookbehind and negative lookahead assertions (anchors), which are available in Ruby 1.9 and later.

  • (?<!^|,) — immediately preceding this spot there must not be either a start of line (^) or a comma
  • " — find a double quote
  • (?!,|$) — immediately following this spot there must not be either a comma or end of line ($)
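
To see what each assertion contributes, here is a small sketch that applies the lookbehind alone, the lookahead alone, and then both together. The sample line is made up for illustration and is not from the question's file.

```ruby
# Hypothetical sample line (an assumption, not taken from the question's data).
line = %q{a,"b "c" d",e}

# Lookbehind only: skips quotes that follow a comma or start the line,
# but still doubles the closing quote before the comma.
only_behind = line.gsub(/(?<!^|,)"/, '""')

# Lookahead only: skips quotes that precede a comma or end the line,
# but still doubles the opening quote after the comma.
only_ahead = line.gsub(/"(?!,|$)/, '""')

# Both together: only the inner quotes are doubled.
both = line.gsub(/(?<!^|,)"(?!,|$)/, '""')

puts only_behind  # a,"b ""c"" d"",e
puts only_ahead   # a,""b ""c"" d",e
puts both         # a,"b ""c"" d",e
```

Neither assertion is enough by itself; the field delimiters are left alone only when both conditions are checked.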

As a bonus, since you didn't actually capture the characters on either side, you don't need to worry about using \1 correctly in your replacement string.

For more information, see the section "Anchors" in the official Ruby regex documentation.


However, for the case where you do need to replace matches in your output, you can use any of the following:

"hello".gsub /([aeiou])/, '<\1>'            #=> "h<e>ll<o>"
"hello".gsub /([aeiou])/, "<\\1>"           #=> "h<e>ll<o>"
"hello".gsub(/([aeiou])/){ |m| "<#{$1}>" }  #=> "h<e>ll<o>"

You can't use String interpolation in the replacement string, as you did:

"hello".gsub /([aeiou])/, "<#{$1}>"
 #=> "h<previousmatch>ll<previousmatch>"

…because that string interpolation happens once, before the gsub has been run. Using the block form of gsub re-invokes the block for each match, at which point the global $1 has been appropriately populated and is available for use.
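
The timing difference can be made visible with a rough sketch like the one below. It deliberately runs an unrelated match first (the "xyz" string is only an illustration) so that the stale value of $1 is predictable.

```ruby
# Set $1 on purpose so the "previous match" is predictable.
"xyz" =~ /(y)/   # $1 is now "y"

# The double-quoted argument is built before gsub starts, so gsub only ever
# sees the fixed string "<y>".
stale = "hello".gsub(/([aeiou])/, "<#{$1}>")

# The block runs once per match, after $1 has been set for that match.
fresh = "hello".gsub(/([aeiou])/) { "<#{$1}>" }

# A backreference in a single-quoted string is resolved by gsub itself.
backref = "hello".gsub(/([aeiou])/, '<\1>')

puts stale    # h<y>ll<y>
puts fresh    # h<e>ll<o>
puts backref  # h<e>ll<o>
```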


Edit: For Ruby 1.8 (why on earth are you using that?) you can use:

puts csv.gsub(/([^,\n\r])"([^,\n\r])/,'\1""\2')
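
One caveat with the capture-group form, shown here as a sketch on a made-up line: each match consumes the character on either side of the quote, so two inner quotes separated by a single character are not both replaced. The lookaround form only inspects its neighbours and does not have this problem.

```ruby
# Hypothetical input (an assumption): two inner quotes separated by one character.
line = %q{x,"a"b"c",y}

# Capture-group form: the match for the first inner quote consumes "b",
# so the second inner quote has no preceding character left to match against.
captured = line.gsub(/([^,\n\r])"([^,\n\r])/, '\1""\2')

# Lookaround form: neighbours are checked but not consumed.
lookaround = line.gsub(/(?<!^|,)"(?!,|$)/, '""')

puts captured    # x,"a""b"c",y
puts lookaround  # x,"a""b""c",y
```

For the data in the question the two forms give the same result; the difference only shows up with closely spaced quotes like these.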
