Solr Facet Multiple Words with Comma Separated Values

Question

I'm pulling data into Solr from MySQL. One of the fields is generated with the group_concat function, which produces a comma-separated field listing all the bands for an event. At the time I believed this was the best way to store multiple bands for one event. However, I'm finding that I cannot facet on this field across all events.

I've set the band field to string and multivalued to true.

<field name="bands" type="string" indexed="true" stored="true" multiValued="true"/>

As expected, each concatenated string is faceted as one long value:

"Pearl Jam,Alice,Screaming Trees,Everclear",1, "Primus,Gaga,Bacon Bits",1, "Roosters,Wings,Drumsticks,Tail Feathers",1,

The biggest problem with this approach is that when the field type is string, the field does not appear to be searchable. It seems I need a duplicate field of type text_general for searching, plus one for faceting. Is that right?

Is there a way to declare a delimiter for the band field to facet this properly, or is my approach wrong?

Recommended answer

Tokenizing your field will not solve your facet problem. You would be able to search by a single band name and get results, but the facets would be even worse, because each token would become its own facet value. The basic rule is not to apply any tokenization or text analysis to a field used for faceting.
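
A common way to keep faceting and searching separate is sketched below, assuming the stock text_general type is available in your schema. The untokenized bands field stays as the facet field, and a copyField fills a second, analyzed field for searching. The name bands_text is hypothetical and not part of the original question.

```xml
<!-- Facet field: untokenized string, one value per band -->
<field name="bands" type="string" indexed="true" stored="true" multiValued="true"/>

<!-- Search field (hypothetical name): analyzed copy of the same values -->
<field name="bands_text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<!-- Copy every value of bands into bands_text at index time -->
<copyField source="bands" dest="bands_text"/>
```

With this layout you would facet on bands and query against bands_text.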

Using a multiValued field is the right idea, but you are actually putting a single value into it, namely the whole list of bands. Your query returns that list as a single column, which is mapped to a single value of the corresponding Solr field.

You can keep the group_concat output and solve your problem with a simple change to your data-config.xml, telling Solr to split those band names using a separator. Have a look at the RegexTransformer and its splitBy parameter:

splitBy : Used to split a String to obtain multiple values, returns a list of values

If you configure splitBy with the same separator you use for group_concat, the trick is done: the field will hold multiple values, one per band, and your facets will look right.
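
As a rough sketch of that change, the fragment of data-config.xml below enables RegexTransformer on the entity and sets splitBy on the bands column. The entity name, table names, join columns, and SQL are assumptions for illustration only. The dataSource definition is omitted, so adapt the query to your own schema.

```xml
<!-- Fragment of data-config.xml; entity, table, and column names are hypothetical -->
<entity name="event"
        transformer="RegexTransformer"
        query="SELECT e.id, GROUP_CONCAT(b.name SEPARATOR ',') AS bands
               FROM events e
               JOIN event_bands eb ON eb.event_id = e.id
               JOIN bands b ON b.id = eb.band_id
               GROUP BY e.id">
  <field column="id" name="id"/>
  <!-- splitBy is a regular expression; it must match the GROUP_CONCAT separator -->
  <field column="bands" name="bands" splitBy=","/>
</entity>
```

After a full re-import, each band name should appear as its own facet value. Because splitBy is a regular expression, a separator such as | would need to be escaped. If a band name can itself contain a comma, choose a separator that never occurs in the data and use it in both places.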
