Filter a static RavenDB map/reduce index

Problem description

Scenario / context

  • Raven 2.0 on RavenHQ
  • Web application, so async is preferred

My application is a survey application. Each Survey has an array of Questions; and conversely, each Submission (an individual's response to a survey) has an array of Answers.

I have a static index that aggregates all answers so that I can display a chart based on the responses (e.g. for each question on each survey, how many people selected each option). These data are used to render, for example, a pie chart. This aggregation index (discussed in this question) basically gives an object per question per survey, with sums for each option.

The question

I would like to filter these aggregated values. Some of them are trivial because they are fields in the result (e.g. filter by SurveyId or QuestionId). However, I'd also like to filter by Submission date (from metadata), or by LocationId, which are fields in the individual Submissions, but obviously not in the aggregation results.

In other words, I need to be able to ask Raven for the results of a question for a specific LocationId, or for the current month.

Classes

Here is what a single Submission basically looks like:

{
  "SurveyId": 1,
  "LocationId": 1,
  "Answers": [
    {
      "QuestionId": 1,
      "Values": [2,8,32],
      "Comment": null
    },
    {
      "QuestionId": 2,
      "Values": [4],
      "Comment": "Lorem ipsum"
    },
    ...more answers...
  ]
}

Currently, this is the aggregated result:

public class Result
{
    public int SurveyId { get; set; } // 1
    public int QuestionId { get; set; } // 1
    public int NumResponses { get; set; } // 576
    public int NumComments { get; set; } // 265
    public IList<KeyValuePair<int,int>> Values { get; set; } // [{Key:1, Value:264}, {Key:2, Value:163}, {Key:4, Value:391}, ...]
}

And this is the aggregation index:

// Map: emit one entry per answer, with a count of 1 for each selected value.
Map = submissions => 
    from submission in submissions
    from answer in submission.Answers
    select new
    {
        submission.SurveyId,
        answer.QuestionId,
        NumResponses = 1,
        NumComments = answer.Comment == null ? 0 : 1,
        // Property is named Values, matching both the document and the Result class.
        Values = answer.Values.Select(x => new KeyValuePair<int, int>(x, 1))
    };

// Reduce: group by survey and question, then sum the counts per selected value.
Reduce = results => 
    from result in results
    group result by new { result.SurveyId, result.QuestionId }
        into g
        select new Result
        {
            SurveyId = g.Key.SurveyId,
            QuestionId = g.Key.QuestionId,
            NumResponses = g.Sum(x => x.NumResponses),
            NumComments = g.Sum(x => x.NumComments),
            Values = g.SelectMany(x => x.Values)
                        .GroupBy(x => x.Key)
                        .Select(x => new KeyValuePair<int, int>(x.Key, x.Sum(y => y.Value)))
                        .ToList() // Result.Values is declared as IList<...>
        };

I'm inclined, conceptually, to "pass in" these filters to the query, but from what I've read, this will not work because the index values are indexed (stored) asynchronously without the individual Submission dates or LocationIds.

Does this mean that I'll need to create an index of all answers, and then have the aggregation index query this new AllAnswers index, or something? I've done a little bit of searching for having one index query another, with no luck. Or is this what Fields are used for?

Any guidance is appreciated!

Recommended answer

The index you have currently aggregates all of the data together, by the SurveyId and QuestionId. If you want it broken out by date or location, those are new indexes. You would simply add the fields you wanted to the map, include them in the grouping key, and pass them through in the result. Then you can easily query by those keys.

When you have different grouping keys, you can't consolidate that into a single index. You have to have multiple indexes. For example, I might call the index you have above Submission_TotalsBySurveyAndQuestion, and another index might be Submission_TotalsBySurveyAndQuestionPerLocation.

Think of it this way - right now, you can include a Where or OrderBy in your query that goes against the SurveyId or QuestionId - because those are grouping keys in your index. If you want to filter or sort by LocationId, that has to be included.
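As a rough sketch of the per-location variant described above, the index below adds LocationId to the map, to the grouping key, and to the result, and a small helper then filters on it. The Submission and Answer classes are inferred from the JSON sample in the question, the index name is the one suggested in this answer, and the LocationTotals helper is hypothetical. It assumes the RavenDB .NET client is referenced and has not been checked against a running server.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;
using Raven.Client.Indexes;

// Document shape, inferred from the JSON sample in the question.
public class Answer
{
    public int QuestionId { get; set; }
    public List<int> Values { get; set; }
    public string Comment { get; set; }
}

public class Submission
{
    public int SurveyId { get; set; }
    public int LocationId { get; set; }
    public List<Answer> Answers { get; set; }
}

// Same aggregation as the question's index, with LocationId as a third grouping key.
public class Submission_TotalsBySurveyAndQuestionPerLocation
    : AbstractIndexCreationTask<Submission, Submission_TotalsBySurveyAndQuestionPerLocation.Result>
{
    public class Result
    {
        public int SurveyId { get; set; }
        public int QuestionId { get; set; }
        public int LocationId { get; set; }
        public int NumResponses { get; set; }
        public int NumComments { get; set; }
        public IList<KeyValuePair<int, int>> Values { get; set; }
    }

    public Submission_TotalsBySurveyAndQuestionPerLocation()
    {
        Map = submissions =>
            from submission in submissions
            from answer in submission.Answers
            select new
            {
                submission.SurveyId,
                answer.QuestionId,
                submission.LocationId, // 1. add the field to the map
                NumResponses = 1,
                NumComments = answer.Comment == null ? 0 : 1,
                Values = answer.Values.Select(x => new KeyValuePair<int, int>(x, 1))
            };

        Reduce = results =>
            from result in results
            // 2. include the field in the grouping key
            group result by new { result.SurveyId, result.QuestionId, result.LocationId }
            into g
            select new
            {
                g.Key.SurveyId,
                g.Key.QuestionId,
                g.Key.LocationId, // 3. pass the field through in the result
                NumResponses = g.Sum(x => x.NumResponses),
                NumComments = g.Sum(x => x.NumComments),
                Values = g.SelectMany(x => x.Values)
                          .GroupBy(x => x.Key)
                          .Select(x => new KeyValuePair<int, int>(x.Key, x.Sum(y => y.Value)))
            };
    }
}

// Hypothetical helper: filter and sort on the grouping keys of the index.
public static class LocationTotals
{
    public static List<Submission_TotalsBySurveyAndQuestionPerLocation.Result> ForLocation(
        IDocumentSession session, int surveyId, int locationId)
    {
        return session
            .Query<Submission_TotalsBySurveyAndQuestionPerLocation.Result,
                   Submission_TotalsBySurveyAndQuestionPerLocation>()
            .Where(x => x.SurveyId == surveyId && x.LocationId == locationId)
            .OrderBy(x => x.QuestionId)
            .ToList();
    }
}
```

The index has to be deployed before it can be queried, for example with IndexCreation.CreateIndexes at application startup. The original index can stay in place for queries that do not need the location breakdown.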

One caveat. You said:

I'd also like to filter by Submission date (from metadata)

The only date RavenDB gives you in the metadata (by default) is the Last-Modified date. Any edit to the document will update this. So if the submission date is important to you, then you should probably keep it in your own property.
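One possible way to act on this, sketched under assumptions: store the date in a property of the Submission itself and use its year and month as grouping keys in a separate index. The SubmittedAt property and the index name Submission_TotalsBySurveyAndQuestionPerMonth are hypothetical and do not appear in the original question. The per-option Values aggregation is left out to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Indexes;

public class Answer
{
    public int QuestionId { get; set; }
    public string Comment { get; set; }
}

public class Submission
{
    public int SurveyId { get; set; }
    // Hypothetical property: set once when the submission is created,
    // so later edits to the document do not change it.
    public DateTime SubmittedAt { get; set; }
    public List<Answer> Answers { get; set; }
}

// Hypothetical index: totals per survey and question, broken out by month.
public class Submission_TotalsBySurveyAndQuestionPerMonth
    : AbstractIndexCreationTask<Submission, Submission_TotalsBySurveyAndQuestionPerMonth.Result>
{
    public class Result
    {
        public int SurveyId { get; set; }
        public int QuestionId { get; set; }
        public int Year { get; set; }
        public int Month { get; set; }
        public int NumResponses { get; set; }
        public int NumComments { get; set; }
    }

    public Submission_TotalsBySurveyAndQuestionPerMonth()
    {
        Map = submissions =>
            from submission in submissions
            from answer in submission.Answers
            select new
            {
                submission.SurveyId,
                answer.QuestionId,
                Year = submission.SubmittedAt.Year,
                Month = submission.SubmittedAt.Month,
                NumResponses = 1,
                NumComments = answer.Comment == null ? 0 : 1
            };

        Reduce = results =>
            from result in results
            group result by new { result.SurveyId, result.QuestionId, result.Year, result.Month }
            into g
            select new
            {
                g.Key.SurveyId,
                g.Key.QuestionId,
                g.Key.Year,
                g.Key.Month,
                NumResponses = g.Sum(x => x.NumResponses),
                NumComments = g.Sum(x => x.NumComments)
            };
    }
}
```

With an index like this, "this month" becomes a filter on the Year and Month keys, in the same way SurveyId and QuestionId are filtered today.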
