Ruby -- looking for some sort of "Regexp unescape" method


Problem description


I have a bunch of strings containing special escape codes that I want to store unescaped. For example, the interpreter shows

"\\014\"\\000\"\\016smoothing\"\\011mean\"\\022color\"\\011zero@\\016"

but I want it to show (when inspected) as

"\014\"\000\"\016smoothing\"\011mean\"\022color\"\011zero@\016"

What's the method to unescape them? I imagine that I could make a regex to remove 1 backslash from every consecutive n backslashes, but I don't have a lot of regex experience and it seems there ought to be a "more elegant" way to do it.

For example, when I puts MyString it displays the output I'd like, but I don't know how I might capture that into a variable.

Thanks!

Edited to add context: I have this class that is being used to marshal / restore some stuff, but when I restore some old strings it spits out a type error, which I've determined is because they weren't -- for some inexplicable reason -- stored as base64. They instead appear to have just been escaped, which I don't want, because trying to restore them similarly gives the TypeError

TypeError: incompatible marshal file format (can't be read)
format version 4.8 required; 92.48 given

because Marshal looks at the first characters of the string to determine the format.

require 'base64'
class MarshaledStuff < ActiveRecord::Base

  validates_presence_of :marshaled_obj

  def contents
    obj = self.marshaled_obj
    return Marshal.restore(Base64.decode64(obj))
  end

  def contents=(newcontents)
    self.marshaled_obj = Base64.encode64(Marshal.dump(newcontents))
  end
end
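
For what it's worth, the odd-looking 92.48 in that error can be reproduced in isolation. The following is a minimal sketch, assuming a Ruby whose Marshal format version is 4.8; the sample strings are made up for illustration and are not the stored data from the post.

```ruby
# Marshal reads the first two bytes of its input as the major.minor version.
good = Marshal.dump("smoothing")
p good.bytes.first(2)   # prints [4, 8]

# An escaped string starts with a literal backslash followed by "0".
bad = "\\014\"\\000"    # illustrative sample, not the real stored data
p bad.bytes.first(2)    # prints [92, 48]

# Loading the escaped string fails before any object is read.
message =
  begin
    Marshal.load(bad)
    nil
  rescue TypeError => e
    e.message
  end
puts message
```

Byte 92 is the ASCII code for a backslash and 48 is the code for "0", which is why the error reports 92.48 instead of 4.8.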

Edit 2: Changed wording -- I was thinking they were "double-escaped" but it was only single-escaped. Whoops!

Solution

If your strings give you the correct output when you print them, then they are already escaped correctly. The extra backslashes you see are probably there because you are displaying the strings in the interactive interpreter, which shows the inspected form of a value and escapes backslashes so that the output is unambiguous.

> x
=> "\\"
> puts x
\
=> nil
> x.length
=> 1

Note that even though it looks like x contains two backslashes, the length of the string is one. The extra backslash is added by the interpreter and is not really part of the string.
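
The same point can be checked outside the interpreter. This is a small sketch (the variable name shown is only for illustration) comparing the string itself with the text that inspect produces for display:

```ruby
x = "\\"            # one backslash; the literal needs two to write it
shown = x.inspect   # the text irb prints after "=>"

puts x              # prints \
puts shown          # prints "\\"
p x.length          # prints 1
p shown.length      # prints 4 (quote, backslash, backslash, quote)
```

In other words, the variable already holds what puts displays; only the inspected representation carries the extra backslash and the surrounding quotes.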

If you still think there's a problem, please be more specific about how you are displaying the strings that you mentioned in your question.


Edit: In your example the only things that need unescaping are the octal escape codes. You could try this:

x = x.gsub(/\\[0-2][0-7]{2}/){ |c| c[1,3].to_i(8).chr }
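
As a rough end-to-end check of that idea, here is a sketch under a few assumptions: unescape_octal and escape_octal are hypothetical helpers, not part of the original post, and escape_octal only stands in for however the old rows were actually escaped. The pattern here uses [0-3] for the first digit so that bytes up to \377 are covered, which is a small departure from the one-liner above.

```ruby
# Hypothetical helper: replaces a literal backslash plus three octal digits
# with the single byte those digits encode.
def unescape_octal(str)
  # String#b returns a binary (ASCII-8BIT) copy, so bytes above 127 can be inserted.
  str.b.gsub(/\\([0-3][0-7]{2})/) { [$1.to_i(8)].pack("C") }
end

# Hypothetical stand-in for the unknown escaping step: printable ASCII is
# kept, everything else (and the backslash itself) becomes \NNN.
def escape_octal(bytes)
  bytes.each_byte.map { |b|
    b.between?(32, 126) && b != 92 ? b.chr : format("\\%03o", b)
  }.join
end

original = ["smoothing", "mean", "color", "zero"]
escaped  = escape_octal(Marshal.dump(original))
restored = Marshal.load(unescape_octal(escaped))

puts escaped              # the escaped text, starting with \004\010
p restored == original    # prints true
```

If the stored data also contains other escape styles (for example \b or \n rather than three octal digits), this sketch would leave them untouched, so it is worth inspecting a few real rows first.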
