Symbols in query-string for elasticsearch
Problem description
I have "documents" (ActiveRecord models) with an attribute called deviation. The attribute has values like "Bin X", "Bin $", "Bin q", "Bin %", etc.
I am trying to use Tire/Elasticsearch to search the attribute. I am using the whitespace analyzer to index the deviation attribute. Here is my code for creating the indexes:
settings :analysis => {
  :filter => {
    :ngram_filter => {
      :type => "nGram",
      :min_gram => 2,
      :max_gram => 255
    },
    :deviation_filter => {
      :type => "word_delimiter",
      :type_table => ['$ => ALPHA']
    }
  },
  :analyzer => {
    :ngram_analyzer => {
      :type => "custom",
      :tokenizer => "standard",
      :filter => ["lowercase", "ngram_filter"]
    },
    :deviation_analyzer => {
      :type => "custom",
      :tokenizer => "whitespace",
      :filter => ["lowercase"]
    }
  }
} do
  mapping do
    indexes :id, :type => 'integer'
    [:equipment, :step, :recipe, :details, :description].each do |attribute|
      indexes attribute, :type => 'string', :analyzer => 'ngram_analyzer'
    end
    indexes :deviation, :analyzer => 'whitespace'
  end
end
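To illustrate why the tokenizer choice matters for symbols, the difference between splitting on whitespace and keeping only word characters can be approximated in plain Ruby. This is a rough sketch, not Elasticsearch itself: the helper names `whitespace_lowercase_tokens` and `word_only_tokens` are hypothetical, and the second one only loosely imitates how a standard-style tokenizer discards characters such as `$` and `%`.

```ruby
# Rough stand-ins for two kinds of analysis. These are approximations for
# illustration only; they are not the real Elasticsearch tokenizers.

# Approximates a whitespace tokenizer followed by a lowercase filter:
# every whitespace-separated chunk is kept, symbols included.
def whitespace_lowercase_tokens(text)
  text.split.map(&:downcase)
end

# Approximates a tokenizer that keeps only runs of letters and digits,
# so standalone symbols produce no token at all.
def word_only_tokens(text)
  text.scan(/[[:alnum:]]+/).map(&:downcase)
end

["Bin X", "Bin $", "Bin %"].each do |value|
  puts "#{value.inspect}: whitespace=#{whitespace_lowercase_tokens(value).inspect} " \
       "word-only=#{word_only_tokens(value).inspect}"
end
```

Under these assumptions, "Bin $" keeps the `$` token only with whitespace splitting, which is why a symbol can only match if both the indexed field and the query text are analyzed in a way that preserves it.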
The search seems to work fine when the query string contains no special characters. For example, Bin X returns only those records that contain both Bin and X. However, searching for something like Bin $ or Bin % returns all records that contain the word Bin, almost ignoring the symbol (results with the symbol do rank higher than results without it).
Here is the search method I have created:
def self.search(params)
  tire.search(load: true) do
    query { string "#{params[:term].downcase}:#{params[:query]}", default_operator: "AND" }
    size 1000
  end
end
Here is how I am building the search form:
<div>
  <%= form_tag issues_path, :class => "formtastic issue", method: :get do %>
    <fieldset class="inputs">
      <ol>
        <li class="string input medium search query optional stringish inline">
          <% opts = ["Description", "Detail", "Deviation", "Equipment", "Recipe", "Step"] %>
          <%= select_tag :term, options_for_select(opts, params[:term]) %>
          <%= text_field_tag :query, params[:query] %>
          <%= submit_tag "Search", name: nil, class: "btn" %>
        </li>
      </ol>
    </fieldset>
  <% end %>
</div>
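One thing that may be worth checking (an assumption, not something confirmed in this thread): in Lucene query syntax a field prefix applies only to the term that directly follows it, so a query string such as `deviation:Bin $` looks for `Bin` in `deviation` and for `$` in the default field. Grouping the terms in parentheses scopes all of them to the named field. The sketch below only builds the query strings; the helper name `field_scoped_query` is hypothetical, and user input would still need its Lucene special characters escaped before being interpolated.

```ruby
# Hypothetical helper (not from the original post): wraps the user input in
# parentheses so that the field prefix applies to every term, not just the
# first one.
def field_scoped_query(field, text)
  "#{field.downcase}:(#{text})"
end

# What the search method currently builds: only "Bin" is tied to the field.
puts "deviation:Bin $"

# Grouped form: both "Bin" and "$" are tied to the deviation field.
puts field_scoped_query("Deviation", "Bin $")
```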
You can sanitize your query string. Here is a sanitizer that works for everything that I've tried throwing at it:
def sanitize_string_for_elasticsearch_string_query(str)
  # Escape special characters
  # http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html#Escaping Special Characters
  escaped_characters = Regexp.escape('\\+-&|!(){}[]^~*?:')
  str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')

  # AND, OR and NOT are used by Lucene as logical operators. We need
  # to escape them
  ['AND', 'OR', 'NOT'].each do |word|
    escaped_word = word.split('').map { |char| "\\#{char}" }.join('')
    str = str.gsub(/\s*\b(#{word.upcase})\b\s*/, " #{escaped_word} ")
  end

  # Escape the last quote when the number of quotes is odd.
  # The pattern has two capture groups, so the text after the quote is \2.
  quote_count = str.count '"'
  str = str.gsub(/(.*)"(.*)/, '\1\"\2') if quote_count % 2 == 1

  str
end

params[:query] = sanitize_string_for_elasticsearch_string_query(params[:query])
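As a quick check, the sanitizer can be exercised on its own in plain Ruby without Elasticsearch. The sketch below is a self-contained copy of the function, in which the odd-quote replacement refers to the second capture group, followed by a few sample inputs chosen for illustration.

```ruby
# Self-contained copy of the sanitizer for experimentation.
def sanitize_string_for_elasticsearch_string_query(str)
  # Escape characters that have a special meaning in Lucene query syntax.
  escaped_characters = Regexp.escape('\\+-&|!(){}[]^~*?:')
  str = str.gsub(/([#{escaped_characters}])/, '\\\\\1')

  # Escape the boolean operators letter by letter.
  ['AND', 'OR', 'NOT'].each do |word|
    escaped_word = word.split('').map { |char| "\\#{char}" }.join('')
    str = str.gsub(/\s*\b(#{word.upcase})\b\s*/, " #{escaped_word} ")
  end

  # Escape the last quote when the number of quotes is odd.
  quote_count = str.count('"')
  str = str.gsub(/(.*)"(.*)/, '\1\"\2') if quote_count % 2 == 1

  str
end

puts sanitize_string_for_elasticsearch_string_query('Bin (X)')    # Bin \(X\)
puts sanitize_string_for_elasticsearch_string_query('Bin AND X')  # Bin \A\N\D X
puts sanitize_string_for_elasticsearch_string_query('Bin "X')     # Bin \"X
puts sanitize_string_for_elasticsearch_string_query('Bin $')      # Bin $
```

Note that `$` and `%` are not in the escaped character list, so inputs such as `Bin $` pass through unchanged; the sanitizer only protects against characters and keywords that the query parser would otherwise interpret as syntax.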