Counting Unique Users using Mapreduce for Java Appengine


Question

I'm trying to count the number of unique users per day on my Java appengine app. I have decided to use the mapreduce framework for Java appengine (mapreduce.appspot.com) to do this calculation offline. I've managed to create a mapreduce job that goes through all of my entities, each of which represents a single user's session event. I can use a simple counter as well. I have several questions though:

1) How do I increment a counter only once for each user id? I am currently mapping over entities which contain a user id property, but many of these entities may contain the same user id, so how do I count each id only once?

2) Once the results of the job are stored in these counters, how can I persist them to the datastore? I see the counter results on the mapreduce status page, but I want them automatically persisted to the datastore.

Ideas?

Answer

I haven't actually used the MapReduce functionality yet, but my theoretical understanding is that you can write things to the datastore from within your mapper. You could create an Entity type called something like UniqueCount, and insert one entity every time your mapper sees an ID that it hasn't seen before. Then you can count how many unique IDs you have. In fact, you can just update a counter every time you find a new unique entity. You may want to google "sharded counter" for hints on creating a counter in the datastore that can handle high throughput.
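To make the "one entity per unseen ID" idea concrete, here is a rough sketch in plain Java. It is not App Engine code: a `HashMap` stands in for the datastore, and the class name, method names and the `day|userId` key format are all made up for illustration. The point it tries to show is that if the marker entity's key is derived from the day and the user id, writing it a second time overwrites rather than adds, so the mapper does not need to remember which ids it has already seen. With the real datastore, a `UniqueCount` entity with a key name built the same way should presumably behave similarly, but I have not tested that.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: an in-memory map plays the role of the datastore.
// All names here are hypothetical, not part of any App Engine API.
final class UniqueUserCounter {
    // Stand-in for a "UniqueCount" kind: key name -> day the marker belongs to.
    private final Map<String, String> uniqueCountEntities = new HashMap<>();

    // Called once per session-event entity, the way a mapper's map() would be.
    void map(String day, String userId) {
        // The key is derived from (day, userId), so a repeated write
        // replaces the existing marker instead of creating a second one.
        uniqueCountEntities.put(day + "|" + userId, day);
    }

    // Counting marker entities per day gives the unique users per day.
    Map<String, Integer> uniqueUsersPerDay() {
        Map<String, Integer> counts = new TreeMap<>();
        for (String day : uniqueCountEntities.values()) {
            counts.merge(day, 1, Integer::sum);
        }
        return counts;
    }
}
```

Calling `map(day, userId)` for every session event and then reading `uniqueUsersPerDay()` yields one count per day. In a real job the final counting step would be a query or a separate counter update rather than a loop over an in-memory map.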

Eventually, when they finish the Reduce functionality, I imagine this whole task will become pretty trivial.
