Solr: index multiple values as one field

Question

I want to query double-valued fields where four values form a group, and each document can have multiple such groups. So what I need is a field where I can store something like this:

<doc>
   <field name="id">id</field>
   <field name="valueGroup">1 2 3 4</field>
   <field name="valueGroup">5 6 7 8</field>
</doc>

And then do range queries in this way: valueGroup:[0,0,0,0 to 3,8,8,8]. I cannot index this as separate fields with multiValued="true", because each group needs to be treated as a unit. I know there is a LatLon field type, but that holds only two values. How can I get fields with more than two dimensions?

Recommended answer

As I mentioned in a response to your comment on my SO question, I also had quite niche requirements for performing some complex filtering. Eventually, I had to create a custom field class, which allowed me to override the method responsible for returning the query object, so that the query contains the custom logic used to filter results. This approach should suit you well:

public class MyCustomFieldType extends FieldType {
    /**
     * {@inheritDoc}
     */
    @Override
    protected void init(final IndexSchema schema, final Map<String, String> args) {
        trueProperties |= TOKENIZED;
        super.init(schema, args);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void write(final XMLWriter xmlWriter, final String name, final Fieldable fieldable)
        throws IOException
    {
        xmlWriter.writeStr(name, fieldable.stringValue());
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void write(final TextResponseWriter writer, final String name, final Fieldable fieldable)
        throws  IOException
    {
        writer.writeStr(name, fieldable.stringValue(), true);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public SortField getSortField(final SchemaField field, final boolean reverse) {
        return getStringSort(field, reverse);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void setAnalyzer(final Analyzer analyzer) {
        this.analyzer = analyzer;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public void setQueryAnalyzer(final Analyzer queryAnalyzer) {
        this.queryAnalyzer = queryAnalyzer;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public Query getFieldQuery(
        final QParser parser, final SchemaField field, final String externalVal)
    {
        // Parse the user's input from the query string (externalVal), if necessary
        final String parsedInput = ...

        // Instantiate your custom filter, taking care to wrap it in a caching wrapper!
        final Filter filter = new CachingWrapperFilter(
            new MyCustomFilter(field, parsedInput));

        // Return a query that runs your filter against all docs in the index
        // NOTE: depending on your needs, you may be able to do a more fine grained query here
        // instead of a MatchAllDocsQuery!!
        return new FilteredQuery(new MatchAllDocsQuery(), filter);
    }
}

Now you need a custom Filter:

public class MyCustomFilter extends Filter {
    /**
     * The field that is being filtered.
     */
    private final SchemaField field;

    /**
     *  The value to filter against.
     */
    private final String filterBy;

    /**
     * Creates a filter for the given field and value.
     *
     * @param field     The field to perform filtering against.
     * @param filterBy  A value to filter by.
     */
    public MyCustomFilter(
        final SchemaField field,
        final String filterBy)
    {
        this.field = field;
        this.filterBy = filterBy;
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {

        final FixedBitSet bitSet = new FixedBitSet(reader.maxDoc());

        // find all the docs you want to run the filter against
        final Weight weight = new IndexSearcher(reader).createNormalizedWeight(
            new SOME_QUERY_TYPE_HERE());

        final Scorer docIterator = weight.scorer(reader, true, false);

        if (docIterator == null) {
            return bitSet;
        }

        int docId;

        while ((docId = docIterator.nextDoc()) != Scorer.NO_MORE_DOCS) {

            final Document doc = reader.document(docId);

            for (final String indexFieldValue : doc.getValues(field.getName())) {
                // CUSTOM LOGIC GOES HERE

                // If your criteria are met, consider the doc a match
                bitSet.set(docId);
            }
        }

        return bitSet;
    }

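    /**
     * {@inheritDoc}
     */
    @Override
    public boolean equals(final Object other) {
        // NEEDED FOR CACHING: filters on the same field and value must compare equal
        if (this == other) {
            return true;
        }
        if (!(other instanceof MyCustomFilter)) {
            return false;
        }
        final MyCustomFilter that = (MyCustomFilter) other;
        return field.getName().equals(that.field.getName())
            && filterBy.equals(that.filterBy);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public int hashCode() {
        // NEEDED FOR CACHING: must be consistent with equals
        return 31 * field.getName().hashCode() + filterBy.hashCode();
    }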
}
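
For the "CUSTOM LOGIC GOES HERE" part, here is a minimal sketch of what the per-value check could look like for the question's use case. It assumes each stored value is a whitespace-separated group such as "1 2 3 4" and that a group matches when every component lies within the corresponding lower and upper bound (inclusive). The class name GroupRangeMatcher and that matching rule are my own assumptions, not part of Solr or Lucene, so adjust them to whatever semantics you actually need.

```java
// Hypothetical helper (not a Solr/Lucene class): checks one stored group
// against component-wise inclusive bounds.
final class GroupRangeMatcher {
    private final double[] lower;
    private final double[] upper;

    GroupRangeMatcher(final double[] lower, final double[] upper) {
        if (lower.length != upper.length) {
            throw new IllegalArgumentException("Bounds must have the same number of components");
        }
        this.lower = lower.clone();
        this.upper = upper.clone();
    }

    /**
     * Returns true if the stored value (e.g. "1 2 3 4") has the expected number
     * of components and each one lies within [lower[i], upper[i]].
     */
    boolean matches(final String indexFieldValue) {
        final String[] parts = indexFieldValue.trim().split("\\s+");
        if (parts.length != lower.length) {
            return false;
        }
        for (int i = 0; i < parts.length; i++) {
            final double value;
            try {
                value = Double.parseDouble(parts[i]);
            } catch (NumberFormatException e) {
                // Treat malformed stored values as non-matching
                return false;
            }
            if (value < lower[i] || value > upper[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Inside the loop over doc.getValues(...), you would then set the bit only when matches(indexFieldValue) returns true, so each group is tested on its own rather than mixed with the other groups in the document.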

The example above is obviously very basic, but if you use it as a template, tweak it to improve performance, and add your custom logic, you should get what you need. Also be sure to implement the hashCode and equals methods in your filter, as these will be used for caching. In the query string, you can supply the fq param like so: `?q=some query&fq=myfield:[0,0,0,0 to 3,8,8,8]`.
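
As for parsing the user's input, here is a rough sketch of how a string like "[0,0,0,0 to 3,8,8,8]" could be split into lower and upper bounds. The GroupRange class is hypothetical, and the accepted syntax (square brackets, comma-separated numbers, a case-insensitive "to" between the two tuples) is an assumption based on the example above. It only handles the raw string and says nothing about how Solr's query parser hands the value to your field type.

```java
// Hypothetical value class (not a Solr/Lucene class) holding parsed bounds.
final class GroupRange {
    final double[] lower;
    final double[] upper;

    private GroupRange(final double[] lower, final double[] upper) {
        this.lower = lower;
        this.upper = upper;
    }

    /** Parses input of the assumed form "[a,b,c,d to e,f,g,h]". */
    static GroupRange parse(final String input) {
        final String s = input.trim();
        if (s.length() < 2 || !s.startsWith("[") || !s.endsWith("]")) {
            throw new IllegalArgumentException("Expected a range in square brackets: " + input);
        }
        // Split the bracket contents on a case-insensitive "to" surrounded by whitespace
        final String[] bounds = s.substring(1, s.length() - 1).trim().split("\\s+(?i:to)\\s+");
        if (bounds.length != 2) {
            throw new IllegalArgumentException("Expected '<lower> to <upper>': " + input);
        }
        final double[] lower = parseTuple(bounds[0]);
        final double[] upper = parseTuple(bounds[1]);
        if (lower.length != upper.length) {
            throw new IllegalArgumentException("Bounds differ in number of components: " + input);
        }
        return new GroupRange(lower, upper);
    }

    private static double[] parseTuple(final String tuple) {
        final String[] parts = tuple.trim().split("\\s*,\\s*");
        final double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Throws NumberFormatException (an IllegalArgumentException) on bad input
            values[i] = Double.parseDouble(parts[i].trim());
        }
        return values;
    }
}
```

Parsing once in the field type and passing the numeric bounds to the filter avoids re-parsing the same string for every document.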

As I mentioned, this approach worked great for me and my team, as we had quite specific requirements around the filtering of our content.

Good luck. :)
