SOLR - Regex in Filter query


Problem description

I want to use a regex in fq, but I have never done this before.

I have the value below in a property, and the field type is "lowercase": Prop=company1@city1@state1@country1@senior analytical chemist, chicago

I want to filter the results based on a regex. The filter should match the value above when the prefix "company1@city1@state1@country1@" matches exactly and the regex finds both chicago and analytical anywhere after the last @ symbol.

My requirement is to match the exact values before the last @ and then use a regex to match the remaining string, because I want free-text search only on the last part. I can't split the data into multiple columns because it's a multi-valued field.

I tried the regex below in code to match the string after the last @. It works fine in code, but I'm not sure how to implement the same thing in SOLR:

/([^@]+(?=.*IL)(?=.*chicago)(?=.*analytical))/ig 

Can someone please let me know how to use the above regex with SOLR?

Recommended answer

Regular expressions in Solr are available by searching with q=field:/regex/. This assumes that the field type in question is a string field (or at least a field with a KeywordTokenizer), because the matching happens at the token level. If you have an analyzed field, the value might be split into separate tokens, and the regex won't match.
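To illustrate why the tokenizer matters, here is a small Python sketch that only simulates the idea; it does not talk to Solr. The helper names (keyword_tokens, word_tokens, any_token_matches) are hypothetical, the word splitter is a crude stand-in for an analyzed text field, and Python's re module is used in place of Lucene's regex engine, whose syntax differs. It uses re.fullmatch because a Lucene regex query has to match the whole indexed term.

```python
import re

VALUE = "company1@city1@state1@country1@senior analytical chemist, chicago"

def keyword_tokens(value):
    """Mimic KeywordTokenizer + LowerCaseFilter: one lowercased token."""
    return [value.lower()]

def word_tokens(value):
    """Crude stand-in for an analyzed text field: split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]

def any_token_matches(tokens, pattern):
    """A regex query has to match an entire token, hence fullmatch."""
    return any(re.fullmatch(pattern, t) is not None for t in tokens)

# Illustrative pattern: exact prefix, then "analytical" somewhere in the last segment.
PATTERN = r"company1@city1@state1@country1@[^@]*analytical[^@]*"

print(any_token_matches(keyword_tokens(VALUE), PATTERN))  # True: one whole-value token
print(any_token_matches(word_tokens(VALUE), PATTERN))     # False: only small tokens exist
```

With a single token per value, the pattern can see the prefix and the free-text tail together; once the value is split into words, no individual token contains both.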

Something like q=field:/([^@]+(?=.*IL)(?=.*chicago)(?=.*analytical))/ could work, but the i modifier in your original regex indicates that you don't care about case. I'd use a field with a KeywordTokenizer and a LowerCaseFilter, and then search with a lowercase regex:

<analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>           
    <filter class="solr.LowerCaseFilterFactory" />
</analyzer>

And query with:

q=field:/([^@]+(?=.*il)(?=.*chicago)(?=.*analytical))/
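Since the question asks about fq, the same field:/regex/ syntax can be passed as a filter query. The Python sketch below shows one way to assemble such a parameter. It rests on several assumptions: the field name prop, the helper build_regex_fq, and the use of q=*:* are illustrative only. It also avoids lookaheads by spelling out every ordering of the terms, on the assumption that Lucene's regex syntax (which is not PCRE) does not support (?=...), and it escapes @ outside character classes because Lucene can treat a bare @ as an "any string" operator. Please verify both points against your Solr version.

```python
from itertools import permutations
from urllib.parse import urlencode

def build_regex_fq(field, prefix, terms):
    """Build field:/regex/ requiring the exact prefix, then all terms in any order.

    Assumes prefix and terms contain only letters, digits, spaces and '@'.
    """
    # Escape '@' in the literal prefix (assumed to be a Lucene regex operator).
    escaped_prefix = prefix.lower().replace("@", r"\@")
    gap = "[^@]*"  # any run of characters that stays inside the last segment
    # One alternative per ordering of the terms, e.g. a...b | b...a.
    orderings = [gap.join(p) for p in permutations(t.lower() for t in terms)]
    body = "|".join(orderings)
    return f"{field}:/{escaped_prefix}{gap}({body}){gap}/"

fq = build_regex_fq("prop", "company1@city1@state1@country1@",
                    ["chicago", "analytical"])
print(fq)
# URL-encoded request parameters; append to your collection's select URL.
print(urlencode({"q": "*:*", "fq": fq}))
```

The number of alternatives grows factorially with the number of terms, so this approach is only reasonable for a handful of words.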
