RavenDB map/reduce grouping by multiple fields


Question

We have a site that streams video, and we want to display three reports of the most watched videos over the last week, month and year (a rolling window).

We store a document in RavenDB each time a video is watched:

public class ViewedContent
{
    public string Id { get; set; }
    public int ProductId { get; set; }
    public DateTime DateViewed { get; set; }
}

We're having trouble figuring out how to define the indexes / map-reduces that would best support generating those three reports.

We have tried the following map/reduce:

public class ViewedContentResult
{
    public int ProductId { get; set; }
    public DateTime DateViewed { get; set; }
    public int Count { get; set; }
}

public class ViewedContentIndex :
        AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
    public ViewedContentIndex()
    {
        Map = docs => from doc in docs
                      select new
                                 {
                                     doc.ProductId,
                                     DateViewed = doc.DateViewed.Date,
                                     Count = 1
                                 };

        Reduce = results => from result in results
                            group result by result.DateViewed
                            into agg
                            select new
                                       {
                                           ProductId = agg.Key,
                                           Count = agg.Sum(x => x.Count)
                                       };
    }
}

But this query throws an error:

var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
                .Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) );

Error: "DateViewed is not indexed"

Ultimately, we want to query something like:

var lastSevenDays = session.Query<ViewedContent, ViewedContentIndex>()
                .Where( x => x.DateViewed > DateTime.UtcNow.Date.AddDays(-7) )
                .GroupBy( x => x.ProductId )
                .OrderBy( x => x.Count )

This doesn't actually compile, because the OrderBy is wrong: Count is not a valid property here.

Any help here would be appreciated.

Recommended answer

In SQL terms, each report is a different GROUP BY, which tells you that you need three indexes: one with entries by week, one by month, and one by year (or maybe something slightly different, depending on how you're actually going to run the query).

Now, you have a DateTime there, which presents some problems. What you actually want to do is index the Year, Month and Day components of that DateTime (or just one or two of them, depending on which report you want to generate).

I'm only paraphrasing your code here, so treat it as an unverified sketch rather than something tested against RavenDB:

public class ViewedContentIndex :
    AbstractIndexCreationTask<ViewedContent, ViewedContentResult>
{
    // Assumes ViewedContentResult exposes ProductId, Day, Month, Year and Count.
    public ViewedContentIndex()
    {
        Map = docs => from doc in docs
                      select new
                      {
                          doc.ProductId,
                          Day = doc.DateViewed.Day,
                          Month = doc.DateViewed.Month,
                          Year = doc.DateViewed.Year,
                          Count = 1
                      };

        // Group on the fields emitted by Map, not on the original document.
        Reduce = results => from result in results
                            group result by new
                            {
                                result.ProductId,
                                result.Day,
                                result.Month,
                                result.Year
                            }
                            into agg
                            select new
                            {
                                ProductId = agg.Key.ProductId,
                                Day = agg.Key.Day,
                                Month = agg.Key.Month,
                                Year = agg.Key.Year,
                                Count = agg.Sum(x => x.Count)
                            };
    }
}

Hopefully you can see what I'm trying to achieve with this: you want ALL of the components in your group by, because together they are what make each group unique.
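To make the composite-key idea concrete without a RavenDB server, here is a minimal in-memory sketch using LINQ to Objects. It is not RavenDB code: the `View` and `DailyCount` records and the `DailyViewCounts` class are hypothetical names invented for this illustration, and it only shows that grouping on every key component (product, year, month, day) yields one count per product per day.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the documents; these are not RavenDB types.
public record View(int ProductId, DateTime DateViewed);
public record DailyCount(int ProductId, int Year, int Month, int Day, int Count);

public static class DailyViewCounts
{
    // "Map" step: one entry per view, with the date split into its components.
    public static IEnumerable<DailyCount> Map(IEnumerable<View> views) =>
        views.Select(v => new DailyCount(
            v.ProductId, v.DateViewed.Year, v.DateViewed.Month, v.DateViewed.Day, 1));

    // "Reduce" step: every field that identifies a bucket is part of the group key.
    public static List<DailyCount> Reduce(IEnumerable<DailyCount> mapped) =>
        mapped
            .GroupBy(m => new { m.ProductId, m.Year, m.Month, m.Day })
            .Select(g => new DailyCount(
                g.Key.ProductId, g.Key.Year, g.Key.Month, g.Key.Day,
                g.Sum(x => x.Count)))
            .ToList();
}
```

Calling `DailyViewCounts.Reduce(DailyViewCounts.Map(views))` returns one `DailyCount` per distinct product and calendar day. Leaving any component out of the key (for example, grouping on the day alone) would merge views of different products into the same bucket.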

I can't remember whether RavenDB lets you do this with DateTimes, and I don't have it on this computer so I can't verify it, but the theory remains the same.

So, to reiterate:

You want an index for your report by week + product id.
You want an index for your report by month + product id.
You want an index for your report by year + product id.
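The three groupings differ only in their key. The sketch below shows the three keys side by side using in-memory LINQ rather than RavenDB index definitions. The `ViewEvent` record and `ReportBuckets` class are hypothetical names, and using ISO 8601 week numbering via `System.Globalization.ISOWeek` (available from .NET Core 3.0) is an assumption: the answer does not say how a "week" should be defined, and this helper may not be usable inside an actual RavenDB index.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical in-memory stand-in for a view document; not a RavenDB type.
public record ViewEvent(int ProductId, DateTime DateViewed);

public static class ReportBuckets
{
    // Weekly report key: product + ISO week-numbering year + ISO week (assumed definition of "week").
    public static Dictionary<(int ProductId, int Year, int Week), int> ByWeek(
        IEnumerable<ViewEvent> views) =>
        views.GroupBy(v => (v.ProductId,
                            Year: ISOWeek.GetYear(v.DateViewed),
                            Week: ISOWeek.GetWeekOfYear(v.DateViewed)))
             .ToDictionary(g => g.Key, g => g.Count());

    // Monthly report key: product + calendar year + month.
    public static Dictionary<(int ProductId, int Year, int Month), int> ByMonth(
        IEnumerable<ViewEvent> views) =>
        views.GroupBy(v => (v.ProductId, v.DateViewed.Year, v.DateViewed.Month))
             .ToDictionary(g => g.Key, g => g.Count());

    // Yearly report key: product + calendar year.
    public static Dictionary<(int ProductId, int Year), int> ByYear(
        IEnumerable<ViewEvent> views) =>
        views.GroupBy(v => (v.ProductId, v.DateViewed.Year))
             .ToDictionary(g => g.Key, g => g.Count());
}
```

Each method plays the role of one index: the same view events go in, and only the set of fields in the group key changes.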

I hope this helps. Sorry I can't give you a compilable example; not having RavenDB to hand makes it a bit difficult :-)
