RavenDB map/reduce grouping by multiple fields


Problem description

We have a site that contains streaming video and we want to display three reports of most watched videos in the last week, month and year (a rolling window).

We store a document in ravendb each time a video is watched:

public class ViewedContent
{
    public string Id { get; set; }
    public int ProductId { get; set; }
    public DateTime DateViewed { get; set; }
}

We're having trouble figuring out how to define the indexes / mapreduces that would best support generating those three reports.

We have tried the following map / reduce.

public class ViewedContentResult
{
    public int ProductId { get; set; }
    public DateTime DateViewed { get; set; }
    public int Count { get; set; }
}

public class ViewedContentIndex :
        AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
    public ViewedContentIndex()
    {
        Map = docs => from doc in docs
                      select new
                                 {
                                     doc.ProductId,
                                     DateViewed = doc.DateViewed.Date,
                                     Count = 1
                                 };

        Reduce = results => from result in results
                            group result by result.DateViewed
                            into agg
                            select new
                                       {
                                           ProductId = agg.Key,
                                           Count = agg.Sum(x => x.Count)
                                       };
    }
}

But, this query throws an error:

var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
                .Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) );

Error: "DateViewed is not indexed"

Ultimately, we want to query something like:

var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
                .Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) )
                .GroupBy( x => x.ProductId )
                .OrderBy( x => x.Count )

This doesn't actually compile, because the OrderBy is wrong; Count is not a valid property here.

Any help here would be appreciated.

Solution

If you're in SQL land, each report is a different GROUP BY, which tells you that you need three indexes: one with entries by week, one by month, and one by year (or maybe something slightly different, depending on how you're actually going to do the query).

Now, you have a DateTime there, and that presents some problems. What you actually want to do is index the Year component of the DateTime, the Month component, and the Day component (or just one or two of these, depending on which report you want to generate).

I'm only paraphrasing your code here and haven't compiled it, but:

public class ViewedContentIndex :
    AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
    // Assumes ViewedContentResult exposes int Day, Month and Year
    // properties in place of DateViewed.
    public ViewedContentIndex()
    {
        // Map: emit one entry per view, split into its date components.
        Map = docs => from doc in docs
                      select new
                                 {
                                     doc.ProductId,
                                     Day = doc.DateViewed.Day,
                                     Month = doc.DateViewed.Month,
                                     Year = doc.DateViewed.Year,
                                     Count = 1
                                 };

        // Reduce: group by every field that makes a result unique.
        Reduce = results => from result in results
                            group result by new {
                                 result.ProductId,
                                 result.Day,
                                 result.Month,
                                 result.Year
                            }
                            into agg
                            select new
                                       {
                                           ProductId = agg.Key.ProductId,
                                           Day = agg.Key.Day,
                                           Month = agg.Key.Month,
                                           Year = agg.Key.Year,
                                           Count = agg.Sum(x => x.Count)
                                       };
    }
}

Hopefully you can see what I'm trying to achieve by this - you want ALL the components in your group by, as they are what make your grouping unique.
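
To see why every component has to be part of the key, here is a minimal in-memory sketch using plain LINQ to Objects rather than RavenDB. The Entry record and the CompositeKeyDemo class are hypothetical names made up for illustration, and the record syntax assumes C# 9 or later. Entries are merged only when all four key fields match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CompositeKeyDemo
{
    // Hypothetical stand-in for one map output entry (not a RavenDB type).
    public record Entry(int ProductId, int Year, int Month, int Day, int Count);

    // Mirrors the Reduce step: group by every key component, then sum counts.
    public static List<Entry> Reduce(IEnumerable<Entry> entries) =>
        (from e in entries
         group e by new { e.ProductId, e.Year, e.Month, e.Day } into agg
         select new Entry(agg.Key.ProductId, agg.Key.Year, agg.Key.Month,
                          agg.Key.Day, agg.Sum(x => x.Count)))
        .ToList();

    public static void Main()
    {
        var views = new[]
        {
            new Entry(1, 2012, 5, 1, 1),
            new Entry(1, 2012, 5, 1, 1), // same product, same day: merged
            new Entry(1, 2012, 5, 2, 1), // same product, next day: separate row
            new Entry(2, 2012, 5, 1, 1), // other product, same day: separate row
        };

        foreach (var r in Reduce(views))
            Console.WriteLine($"{r.ProductId} {r.Year}-{r.Month}-{r.Day}: {r.Count}");
    }
}
```

Dropping ProductId from the key would fold both products into one row per day, which is the kind of collision the full key avoids.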

I can't remember if RavenDB lets you do this with DateTimes and I haven't got it on this computer so can't verify this, but the theory remains the same.

So, to reiterate:

You want an index for your report by week + product id.
You want an index for your report by month + product id.
You want an index for your report by year + product id.
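
Since the question asks for a rolling window, one possible variation (an assumption on my part, not something verified against RavenDB) is to keep per-day counts per product and sum the days that fall inside the window. The sketch below does this in memory with plain LINQ to Objects; DailyCount, MostWatched and RollingWindowDemo are hypothetical names, and none of this is RavenDB API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RollingWindowDemo
{
    // Hypothetical shape of one reduced row: views of a product on one day.
    public record DailyCount(int ProductId, DateTime Day, int Count);

    // Sum the daily counts strictly after the cutoff, most-watched first.
    public static List<(int ProductId, int Count)> MostWatched(
        IEnumerable<DailyCount> rows, DateTime today, int windowDays)
    {
        var cutoff = today.Date.AddDays(-windowDays);
        return rows
            .Where(r => r.Day > cutoff)
            .GroupBy(r => r.ProductId)
            .Select(g => (ProductId: g.Key, Count: g.Sum(r => r.Count)))
            .OrderByDescending(x => x.Count)
            .ToList();
    }

    public static void Main()
    {
        var today = new DateTime(2012, 5, 10);
        var rows = new[]
        {
            new DailyCount(1, new DateTime(2012, 5, 9), 3),
            new DailyCount(2, new DateTime(2012, 5, 8), 5),
            new DailyCount(1, new DateTime(2012, 5, 4), 4),
            new DailyCount(2, new DateTime(2012, 4, 1), 9), // outside the window
        };

        foreach (var (productId, count) in MostWatched(rows, today, 7))
            Console.WriteLine($"Product {productId}: {count} views");
    }
}
```

Passing 30 or 365 as windowDays would give the month and year reports from the same daily rows, at the cost of summing more rows per query than a dedicated per-month or per-year index would.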

I hope this helps, sorry I can't give you a compilable example, lack of raven makes it a bit difficult :-)
