MongoDB data statistics visualization using matplotlib
Question
I want to produce visual statistics from my MongoDB data with matplotlib, but my current approach feels really clumsy.
I query MongoDB 30 times to get day-by-day data, which is already slow and messy, especially when I fetch the results from a remote machine rather than on the server itself. Is there a better, cleaner way to get hour-by-hour, day-by-day, month-by-month and year-by-year statistics?
Here is the code I'm using now (it gets day-by-day statistics):
from datetime import datetime, date, time, timedelta
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from my_conn import my_mongodb

t1 = []
t2 = []
today = datetime.combine(date.today(), time())
with my_mongodb() as m:
    for i in range(30):
        day = today - timedelta(days = i)
        t1 = [m.data.find({"time": {"$gte": day, "$lt": day + timedelta(days = 1)}}).count()] + t1
        t2 = [m.data.find({"deleted": 0, "time": {"$gte": day, "$lt": day + timedelta(days = 1)}}).count()] + t2

x = range(30)
N = len(x)

def format_date(x, pos=None):
    day = today - timedelta(days = (N - x - 1))
    return day.strftime('%m/%d')

plt.bar(range(len(t1)), t1, align='center', color="#4788d2") #All
plt.bar(range(len(t2)), t2, align='center', color="#0c3688") #Not-deleted
plt.xticks(range(len(x)), [format_date(i) for i in x], size='small', rotation=30)
plt.grid(axis = "y")
plt.show()
Recommended answer
Thanks to @Blubber, I've now found a better way to handle this using Map/Reduce.
The data-fetching part has been rewritten as:
from dateutil import parser

# Parse the "YYYY/M/D" keys emitted by the map function into naive datetimes
parse_time = lambda s: parser.parse(s, ignoretz = True)

# Map: emit one record per document, keyed by the UTC date of its "time" field
func_map = """
function() {
    if (this.hasOwnProperty("time"))
        emit(this.time.getUTCFullYear() + "/" + (this.time.getUTCMonth() + 1) + "/" + this.time.getUTCDate(),
            {
                count: 1,
                not_deleted: (1 - this.deleted)
            });
}
"""

# Reduce: sum the counters of all records that share the same date key
func_reduce = """
function(key, values) {
    var result = {count: 0, not_deleted: 0};
    values.forEach(function(value) {
        result.count += value.count;
        result.not_deleted += value.not_deleted;
    });
    return result;
}
"""

# my_mongodb is the connection helper imported from my_conn in the question
with my_mongodb() as m:
    result = m.data.inline_map_reduce(func_map, func_reduce)

# One dict per series, keyed by day
dataset = {parse_time(day['_id']): day['value']['not_deleted'] for day in result}
dataset2 = {parse_time(day['_id']): day['value']['count'] for day in result}
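The answer only covers fetching, so here is a minimal sketch of how dictionaries like dataset and dataset2 might be turned back into the fixed 30-day lists that the plotting code in the question expects. The helper name fill_daily_series and the sample values are hypothetical, and the sketch assumes the dictionary keys are midnight datetimes, as parse_time produces for the date-only keys above. Days with no documents never appear in the Map/Reduce output, so they are filled with 0.

```python
from datetime import datetime, timedelta

def fill_daily_series(dataset, last_day, days=30):
    """Return (labels, values) for `days` consecutive days ending at last_day.

    dataset maps midnight datetimes to counts; missing days become 0.
    (Hypothetical helper, not part of the original answer.)
    """
    labels, values = [], []
    for offset in range(days - 1, -1, -1):
        day = last_day - timedelta(days=offset)
        labels.append(day.strftime('%m/%d'))   # same label format as the question
        values.append(dataset.get(day, 0))     # 0 when the day has no documents
    return labels, values

# Made-up sample data standing in for the Map/Reduce result
sample = {datetime(2013, 5, 1): 4.0, datetime(2013, 5, 3): 7.0}
labels, values = fill_daily_series(sample, datetime(2013, 5, 3), days=3)
```

The returned values can be passed to plt.bar and the labels to plt.xticks in the same way as t1, t2 and format_date in the question. Note that the map function groups by UTC date, so the last_day passed in should be a UTC midnight as well.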
Since I'm quite new to JS, there is probably a better way to write those JS functions :)
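One possible way to avoid writing JS at all is MongoDB's aggregation framework, which can do the same grouping with $group and $dateToString. This is a different technique from the Map/Reduce approach above, not the author's method. The sketch below only builds the pipeline; the function name build_stats_pipeline and the BUCKET_FORMATS table are hypothetical, and it assumes the same "time" and "deleted" fields as the question. Changing the format string also gives hour-by-hour, month-by-month or year-by-year buckets.

```python
from datetime import datetime, date, time, timedelta

# strftime-style formats accepted by the $dateToString operator (UTC by default)
BUCKET_FORMATS = {
    "hour": "%Y-%m-%d %H",
    "day": "%Y-%m-%d",
    "month": "%Y-%m",
    "year": "%Y",
}

def build_stats_pipeline(start, end, bucket="day"):
    """Build a pipeline counting all and non-deleted documents per time bucket.

    (Hypothetical helper; field names "time" and "deleted" come from the question.)
    """
    return [
        # Keep only documents inside the requested time window
        {"$match": {"time": {"$gte": start, "$lt": end}}},
        # Group by the formatted date string and sum both counters
        {"$group": {
            "_id": {"$dateToString": {"format": BUCKET_FORMATS[bucket], "date": "$time"}},
            "count": {"$sum": 1},
            "not_deleted": {"$sum": {"$cond": [{"$eq": ["$deleted", 0]}, 1, 0]}},
        }},
        # Zero-padded keys sort chronologically as plain strings
        {"$sort": {"_id": 1}},
    ]

today = datetime.combine(date.today(), time())
pipeline = build_stats_pipeline(today - timedelta(days=29), today + timedelta(days=1))

# Assumed usage with the connection helper from the question:
# with my_mongodb() as m:
#     rows = list(m.data.aggregate(pipeline))
```

Each resulting row would carry the bucket key in _id plus the count and not_deleted totals, so a single round trip replaces the 30 separate queries.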