Rails friendly_id with Arabic slug
Question
My question is closely related to this one: "Rails friendly id with non-Latin characters". Following the suggested answer there, I implemented a slightly different solution (I know it's primitive, but I just want to make sure it works before adding more complex behavior).
In my User model I have:
    extend FriendlyId
    friendly_id :slug_candidates, :use => [:slugged]

    def slug_candidates
      [
        [:first_name, :last_name],
        [:first_name, :last_name, :uid]
      ]
    end

    def should_generate_new_friendly_id?
      first_name_changed? || last_name_changed? || uid_changed? || super
    end

    def normalize_friendly_id(value)
      ERB::Util.url_encode(value.to_s.gsub("\s","-"))
    end
Now, when I submit "مرحبا" as :first_name through the browser, the slug value is set to "%D9%85%D8%B1%D8%AD%D8%A8%D8%A7-" in the database, which is what I expect (apart from the trailing "-").
However, the URL shown in the browser looks like this: http://localhost:3000/en/users/%25D9%2585%25D8%25B1%25D8%25AD%25D8%25A8%25D8%25A7- , which is not what I want. Does anyone know where these extra %25s come from, and why?
Recommended answer
In the meantime I got a bit further, so I am posting my solution here in case it helps someone else. The 25s in the URL seem to be the result of URL-encoding the '%' characters in my slug: '%' itself is encoded as %25. I don't know where this happens, but I modified my normalize_friendly_id method so that it no longer affects me. Here it is:
    def normalize_friendly_id(value)
      sep = '-'
      # Strip out tashkeel etc...
      parameterized_string = value.to_s.gsub(/[\u0610-\u061A\u064B-\u065F\u06D6-\u06DC\u06DF-\u06E8\u06EA-\u06ED]/,''.freeze)
      # Turn unwanted chars into the separator
      parameterized_string.gsub!(/[^0-9A-Za-zÀ-ÖØ-öø-ÿ\u0620-\u064A\u0660-\u0669\u0671-\u06D3\u06F0-\u06F9\u0751-\u077F]+/,sep)
      unless sep.nil? || sep.empty?
        re_sep = Regexp.escape(sep)
        # No more than one of the separator in a row.
        parameterized_string.gsub!(/#{re_sep}{2,}/, sep)
        # Remove leading/trailing separator.
        parameterized_string.gsub!(/^#{re_sep}|#{re_sep}$/, ''.freeze)
      end
      parameterized_string.downcase
    end
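As an aside on the %25s from the question: the effect can be reproduced outside Rails by percent-encoding the slug twice. The sketch below is only an illustration of that mechanism. It assumes that some step in URL generation escapes the already-encoded slug a second time; where exactly that happens in the Rails stack is not established here, and the variable names are made up for the example.

```ruby
require "erb"

first_name = "مرحبا"

# What the question's normalize_friendly_id stores in the slug column
# (the trailing "-" stands in for the empty last_name).
stored_slug = ERB::Util.url_encode("#{first_name}-")

# Assumption: URL generation escapes the stored slug once more,
# so every "%" in it becomes "%25".
url_segment = ERB::Util.url_encode(stored_slug)

puts stored_slug   # %D9%85%D8%B1%D8%AD%D8%A8%D8%A7-
puts url_segment   # %25D9%2585%25D8%25B1%25D8%25AD%25D8%25A8%25D8%25A7-
```

Storing the raw (unencoded) characters in the slug, as the method above does, avoids the second round of encoding acting on "%" signs.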
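Some remarks on normalize_friendly_id:
- I only took Latin and Arabic letters into account.
- I decided that if I allow Arabic characters in the URL, there is no point in keeping friendly_id's transliteration behavior, e.g. "ü" to "ue", "ö" to "oe", and so on. So I leave such characters in the URL as they are.
- I also tried to keep characters that are probably not used in Arabic itself but are used in other languages written in the Arabic script (such as Farsi or Urdu). I only speak Arabic, so I guessed which characters might count as regular letters in other languages. For example, is "ڿ" a regular character in any language? I don't know, but I guess it could well be.
- Again, since I speak Arabic, I strip the "tashkil" from the text. I would say that text without tashkil is generally easier to read than text with it. However, I don't know whether I should handle something similar for other languages. Any hints are much appreciated.
- Finally: adding another alphabet is as simple as adding the appropriate ranges to the regex. One only needs to know which characters should be whitelisted.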
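
To try the behavior described in these remarks outside a Rails model, here is a condensed standalone sketch. It is not the code from the answer: the method name normalize_slug, the constants, and the allowed: parameter are made up for illustration, the Latin-1 ranges are written as \u escapes, and it anchors with \A and \z instead of ^ and $. GREEK is a hypothetical extra alphabet, included only to show what "adding the appropriate ranges" could look like.

```ruby
# Whitelisted character ranges, written as regex character-class fragments.
LATIN  = '0-9A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF' # À-Ö, Ø-ö, ø-ÿ
ARABIC = '\u0620-\u064A\u0660-\u0669\u0671-\u06D3\u06F0-\u06F9\u0751-\u077F'
GREEK  = '\u0370-\u03FF' # hypothetical addition, not part of the answer

# Diacritics (tashkeel etc.) that are removed before anything else.
TASHKEEL = /[\u0610-\u061A\u064B-\u065F\u06D6-\u06DC\u06DF-\u06E8\u06EA-\u06ED]/

def normalize_slug(value, allowed: LATIN + ARABIC, sep: "-")
  re_sep = Regexp.escape(sep)
  value.to_s
       .gsub(TASHKEEL, "")                    # strip diacritics
       .gsub(/[^#{allowed}]+/, sep)           # runs of unwanted chars -> one separator
       .gsub(/\A#{re_sep}|#{re_sep}\z/, "")   # drop leading/trailing separator
       .downcase
end

# "marhaban" written with tashkeel, followed by a trailing space
with_tashkeel = "\u0645\u064E\u0631\u0652\u062D\u064E\u0628\u064B\u0627 "

puts normalize_slug(with_tashkeel)      # مرحبا
puts normalize_slug("Jürgen  Müller!")  # jürgen-müller
puts normalize_slug("Καλημέρα κόσμε", allowed: LATIN + ARABIC + GREEK)
```

Because each run of non-whitelisted characters collapses into a single separator, the separate "no more than one separator in a row" step of the original is not needed in this variant.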
I appreciate any comments or suggestions for improvement.