url_encode in Ruby

Published: 2020/7/24 22:19:28. Tags: ruby, urlencode

Problem description

I read the documentation of url_encode.

Is there a table that tells me exactly which character is encoded to what, using url_encode?

Solution

ERB's url_encode can be tweaked to take the regex as a parameter. Change it from:

def url_encode(s)
  s.to_s.dup.force_encoding("ASCII-8BIT").gsub(/[^a-zA-Z0-9_\-.]/) {
    sprintf("%%%02X", $&.unpack("C")[0])
  }
end

to:

def url_encode(s, regex=/[^a-zA-Z0-9_\-.]/)
  s.to_s.dup.force_encoding("ASCII-8BIT").gsub(regex) {
    sprintf("%%%02X", $&.unpack("C")[0])
  }
end

url_encode('pop', /./)
=> "%70%6F%70"
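
A table like the one asked for can be generated rather than looked up. The following is a minimal sketch, assuming the stock ERB::Util.url_encode from the standard library; url_encode_table is a name made up here for illustration. The output reflects whichever Ruby version runs it, so treat it as accurate only for that version.

```ruby
require "erb"

# Hypothetical helper: maps each printable ASCII character (0x20..0x7E)
# to whatever ERB::Util.url_encode returns for it.
def url_encode_table
  (0x20..0x7E).each_with_object({}) do |code, table|
    char = code.chr
    table[char] = ERB::Util.url_encode(char)
  end
end

# Print only the characters that are actually changed by the encoder.
url_encode_table.each do |char, encoded|
  puts format("%-4s => %s", char.inspect, encoded) unless char == encoded
end
```

Widening the range (for example to 0x00..0xFF with a binary string) would extend the table to non-printable bytes.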

In addition, Ruby's CGI and URI modules can encode URLs, converting reserved characters to percent-escapes, so don't overlook what they offer.

For instance, escaping characters for URL parameters:

CGI.escape('http://www.example.com')
=> "http%3A%2F%2Fwww.example.com"

CGI.escape('<body><p>foo</p></body>')
=> "%3Cbody%3E%3Cp%3Efoo%3C%2Fp%3E%3C%2Fbody%3E"

Ruby CGI's escape also uses a small regex to figure out which characters should be escaped in a URL. This is the method's definition from the documentation:

def CGI::escape(string)
  string.gsub(/([^ a-zA-Z0-9_.-]+)/) do
    '%' + $1.unpack('H2' * $1.bytesize).join('%').upcase
  end.tr(' ', '+')
end

You can also override that and change the regex, or expose the regex as a parameter in your redefinition of the method:

def CGI::escape(string, escape_regex=/([^ a-zA-Z0-9_.-]+)/)
  string.gsub(escape_regex) do
    '%' + $1.unpack('H2' * $1.bytesize).join('%').upcase
  end.tr(' ', '+')
end
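
If redefining a library method feels too invasive, the same idea can live in a standalone method. This is only a sketch: escape_with is a hypothetical helper, not part of CGI, and its default pattern is simply the character class shown above without the capture group.

```ruby
# Hypothetical helper (not part of CGI): percent-encodes every run of
# characters matched by pattern, byte by byte, then turns spaces into "+".
def escape_with(string, pattern = /[^ a-zA-Z0-9_.-]+/)
  string.gsub(pattern) do |run|
    # One "H2" directive per byte yields a two-digit hex string per byte.
    run.unpack("H2" * run.bytesize).map { |hex| "%#{hex.upcase}" }.join
  end.tr(" ", "+")
end

puts escape_with("a b/c")                            # default pattern
puts escape_with("a b/c", /[^ a-zA-Z0-9_.\/-]+/)     # leaves "/" alone
```

Passing the pattern as an argument keeps CGI.escape untouched for other code that relies on its normal behavior.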

URI.encode_www_form_component does a similar encoding. Comparing the two regexes, the only differences in their character classes are * and the space:

URI.encode_www_form_component('<p>foo</p>')
=> "%3Cp%3Efoo%3C%2Fp%3E"
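
To see exactly where the three encoders disagree, they can be run side by side over the printable ASCII range. This is a rough sketch using only standard-library calls; encoders and encoder_differences are names invented for this example, and the full list printed depends on the Ruby version in use.

```ruby
require "erb"
require "cgi"
require "uri"

# Hypothetical helper: the encoders to compare, in a fixed order.
def encoders
  [
    ->(s) { ERB::Util.url_encode(s) },
    ->(s) { CGI.escape(s) },
    ->(s) { URI.encode_www_form_component(s) }
  ]
end

# Returns { char => [erb, cgi, uri] } for each printable ASCII character
# on which the three encoders do not all produce the same output.
def encoder_differences
  (0x20..0x7E).map(&:chr).each_with_object({}) do |char, diff|
    results = encoders.map { |encode| encode.call(char) }
    diff[char] = results if results.uniq.size > 1
  end
end

encoder_differences.each do |char, results|
  puts "#{char.inspect}: #{results.join('  ')}"
end
```

Each printed row lists the ERB, CGI, and URI results in that order, so the space and * rows show the differences discussed above.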

And, similarly to overriding CGI::escape, you can override the regex in URI.encode_www_form_component:

def self.encode_www_form_component(str, regex=/[^*\-.0-9A-Z_a-z]/)
  str = str.to_s
  if HTML5ASCIIINCOMPAT.include?(str.encoding)
    str = str.encode(Encoding::UTF_8)
  else
    str = str.dup
  end
  str.force_encoding(Encoding::ASCII_8BIT)
  str.gsub!(regex, TBLENCWWWCOMP_)
  str.force_encoding(Encoding::US_ASCII)
end
