MongoDB data statistics visualization using matplotlib


Question

I want to get visualized statistics from my data in MongoDB using matplotlib, but the way I'm doing it now is really awkward.

I query MongoDB 30 times to get day-by-day data, which is already slow and messy, especially when I fetch the results from a remote machine instead of on the server itself. Is there a better, cleaner way to get hour-by-hour, day-by-day, month-by-month and year-by-year statistics?

Here is the code I'm using now (it gets the day-by-day statistics):

from datetime import datetime, date, time, timedelta
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from my_conn import my_mongodb

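t1 = []  # all documents per day, oldest day first
t2 = []  # non-deleted documents per day, oldest day first
today = datetime.combine(date.today(), time())  # local midnight of today
with my_mongodb() as m:
    for i in range(30):
        day = today - timedelta(days=i)
        in_day = {"$gte": day, "$lt": day + timedelta(days=1)}
        # Two count queries per day, each a separate round trip to the server.
        # Note: Cursor.count() was removed in PyMongo 4; count_documents() replaces it.
        t1.insert(0, m.data.find({"time": in_day}).count())
        t2.insert(0, m.data.find({"deleted": 0, "time": in_day}).count())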

x = range(30)
N = len(x)

def format_date(x, pos=None):
    day = today - timedelta(days = (N - x - 1))
    return day.strftime('%m/%d')

plt.bar(range(len(t1)), t1, align='center', color="#4788d2") #All
plt.bar(range(len(t2)), t2, align='center', color="#0c3688") #Not-deleted

plt.xticks(range(len(x)), [format_date(i) for i in x], size='small', rotation=30)
plt.grid(axis = "y")

plt.show()

Recommended answer

Thanks to @Blubber, I've found a better way to handle this using Map/Reduce.

The data-fetching part has been rewritten as:

from dateutil import parser
parse_time = lambda s: parser.parse(s, ignoretz = True)

func_map = """
function() {
    if (this.hasOwnProperty("time"))
        emit(this.time.getUTCFullYear() + "/" + (this.time.getUTCMonth() + 1) + "/" + this.time.getUTCDate(),
        {
            count: 1,
            not_deleted: (1 - this.deleted)
        });
}
"""

func_reduce = """
function(key, values) {
    var result = {count: 0, not_deleted: 0};

    values.forEach(function(value) {
        result.count += value.count;
        result.not_deleted += value.not_deleted;
    });

    return result;
}
"""

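# my_mongodb is assumed to be the same context manager imported from my_conn in the question
with my_mongodb() as m:
    # A single round trip: each result document looks like {"_id": "Y/M/D", "value": {...}}
    result = m.data.inline_map_reduce(func_map, func_reduce)
    dataset = {parse_time(day['_id']): day['value']['not_deleted'] for day in result}
    dataset2 = {parse_time(day['_id']): day['value']['count'] for day in result}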
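The two dictionaries only contain days that actually have documents, so they need to be aligned to a fixed date range before they can replace `t1` and `t2` in the plotting code. A minimal sketch of that step, assuming the keys are naive `datetime` values at midnight (as `parse_time` produces here); `to_daily_series` is a hypothetical helper name, not part of any library:

```python
from datetime import datetime, timedelta


def to_daily_series(counts, last_day, days=30):
    """Return (labels, values) for the `days` days ending at `last_day`.

    `counts` maps a midnight datetime to a number; days missing from
    the mapping are reported as 0 so every bar position is filled.
    """
    labels, values = [], []
    for offset in range(days - 1, -1, -1):  # oldest day first
        day = last_day - timedelta(days=offset)
        labels.append(day.strftime("%m/%d"))
        values.append(counts.get(day, 0))
    return labels, values


# Made-up sample data standing in for `dataset2` (illustrative only)
sample = {datetime(2013, 5, 1): 4, datetime(2013, 5, 3): 7}
labels, values = to_daily_series(sample, datetime(2013, 5, 3), days=3)
```

The resulting lists could then be passed to `plt.bar(range(len(values)), values, align='center')` and `plt.xticks(range(len(labels)), labels)`. Keep in mind that the map function groups by UTC date, while `today` in the question is local midnight, so the day boundaries may differ by the local UTC offset.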

Since I'm quite new to JS, there is probably a better way to write the map and reduce functions above :)
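One possible way to avoid JS altogether is MongoDB's aggregation framework. The sketch below only builds the pipeline as plain Python data. It assumes that `time` is stored as a BSON date, that `deleted` is 0 for live documents (as in the question's query), and that the server supports `$dateToString` and the `$type: "date"` alias (MongoDB 3.2 or later). `build_stats_pipeline` is a hypothetical helper name.

```python
from datetime import datetime

# Bucket formats for $dateToString; it formats in UTC by default,
# which matches the getUTC* calls in the map function.
_FORMATS = {
    "hour": "%Y-%m-%d %H",
    "day": "%Y-%m-%d",
    "month": "%Y-%m",
    "year": "%Y",
}


def build_stats_pipeline(granularity="day", since=None):
    """Build an aggregation pipeline that counts documents per time bucket."""
    if granularity not in _FORMATS:
        raise ValueError("unknown granularity: %r" % (granularity,))

    # Keep only documents whose `time` is a real date (like hasOwnProperty above)
    time_filter = {"$type": "date"}
    if since is not None:
        time_filter["$gte"] = since

    return [
        {"$match": {"time": time_filter}},
        {"$group": {
            "_id": {"$dateToString": {"format": _FORMATS[granularity],
                                      "date": "$time"}},
            "count": {"$sum": 1},
            # Assumption: deleted == 0 marks a document that is not deleted
            "not_deleted": {"$sum": {"$cond": [{"$eq": ["$deleted", 0]}, 1, 0]}},
        }},
        {"$sort": {"_id": 1}},
    ]


pipeline = build_stats_pipeline("day", since=datetime(2013, 1, 1))
```

Assuming `m.data` is a PyMongo collection, the pipeline would be run with `m.data.aggregate(pipeline)`, and each result document would carry `_id` (a zero-padded string such as `2013-05-01`), `count` and `not_deleted`. Changing the `granularity` argument covers the hourly, monthly and yearly cases without a second query shape.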
