Produce a summary ("pivot"?) table
Question
I'd like a way to summarise a database table so that rows sharing a common ID are summarised into one row of output.
My tools are SQLite and Python 2.x.
For example, given the following table of fruit prices at my local supermarkets...
+------------+------------+--------+
| Fruit      | Shop       | Price  |
+------------+------------+--------+
| Apple      | Coles      | $1.50  |
| Apple      | Woolworths | $1.60  |
| Apple      | IGA        | $1.70  |
| Banana     | Coles      | $0.50  |
| Banana     | Woolworths | $0.60  |
| Banana     | IGA        | $0.70  |
| Cherry     | Coles      | $5.00  |
| Date       | Coles      | $2.00  |
| Date       | Woolworths | $2.10  |
| Elderberry | IGA        | $10.00 |
+------------+------------+--------+
... I want to produce a summary table showing me the price of each fruit at each supermarket. Blank spaces should be filled by NULLs.
+------------+--------+------------+--------+
| Fruit      | Coles  | Woolworths | IGA    |
+------------+--------+------------+--------+
| Apple      | $1.50  | $1.60      | $1.70  |
| Banana     | $0.50  | $0.60      | $0.70  |
| Cherry     | $5.00  | NULL       | NULL   |
| Date       | $2.00  | $2.10      | NULL   |
| Elderberry | NULL   | NULL       | $10.00 |
+------------+--------+------------+--------+
I believe the literature calls this a "pivot table" or a "pivot query", but apparently SQLite doesn't support PIVOT. (The solution in that question uses hardcoded LEFT JOINs. This doesn't really appeal to me because I don't know the "column" names in advance.)
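One way around hardcoding the column names is to read the distinct shops first and build the query from them. The following is only a sketch under assumptions that are not in the question: the table is called `prices`, its columns are `Fruit`, `Shop` and `Price`, prices are stored as numbers rather than "$1.50" strings, and `pivot_prices` is a hypothetical helper name. It uses conditional aggregation (`MAX(CASE ... END)`) instead of LEFT JOINs, and it runs on Python 2.x as well as Python 3.

```python
import sqlite3


def pivot_prices(conn):
    """Return (headers, rows): one row per fruit, one column per shop."""
    cur = conn.cursor()
    # Discover the "column" names at run time instead of hardcoding them.
    shops = [r[0] for r in cur.execute(
        "SELECT DISTINCT Shop FROM prices ORDER BY Shop")]
    # One aggregate per shop. A CASE without ELSE yields NULL when no row
    # matches, which gives the NULL padding asked for in the question.
    select_list = ["Fruit"] + [
        "MAX(CASE WHEN Shop = ? THEN Price END)" for _ in shops]
    sql = "SELECT %s FROM prices GROUP BY Fruit ORDER BY Fruit" % (
        ", ".join(select_list))
    # Shop names are passed as bound parameters, not spliced into the SQL.
    rows = cur.execute(sql, shops).fetchall()
    return ["Fruit"] + shops, rows


# Hypothetical in-memory copy of the table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (Fruit TEXT, Shop TEXT, Price REAL)")
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", [
    ("Apple", "Coles", 1.50), ("Apple", "Woolworths", 1.60),
    ("Apple", "IGA", 1.70), ("Banana", "Coles", 0.50),
    ("Banana", "Woolworths", 0.60), ("Banana", "IGA", 0.70),
    ("Cherry", "Coles", 5.00), ("Date", "Coles", 2.00),
    ("Date", "Woolworths", 2.10), ("Elderberry", "IGA", 10.00),
])

headers, rows = pivot_prices(conn)
```

Here `headers` lists the shops in alphabetical order after "Fruit", and each entry in `rows` is a tuple in which a missing price comes back as `None`, sqlite3's representation of NULL.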
Right now I do this by iterating through the entire table in Python and accumulating a dict of dicts, which is a bit klutzy. I am open to better solutions, either in Python or SQLite, that will give the data in tabular form.
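For reference, the dict-of-dicts approach described above might look roughly like the sketch below. The question does not show the actual code, so the variable names and the use of `collections.defaultdict` are assumptions; the rows are the ones from the table, with prices as plain numbers.

```python
from collections import defaultdict

data = [
    ("Apple", "Coles", 1.50), ("Apple", "Woolworths", 1.60),
    ("Apple", "IGA", 1.70), ("Banana", "Coles", 0.50),
    ("Banana", "Woolworths", 0.60), ("Banana", "IGA", 0.70),
    ("Cherry", "Coles", 5.00), ("Date", "Coles", 2.00),
    ("Date", "Woolworths", 2.10), ("Elderberry", "IGA", 10.00),
]

# Outer key: fruit. Inner key: shop. Value: price.
table = defaultdict(dict)
for fruit, shop, price in data:
    table[fruit][shop] = price

# Fix a column order, then flatten each fruit into one row.
shops = sorted(set(shop for _, shop, _ in data))
rows = [[fruit] + [table[fruit].get(shop) for shop in shops]
        for fruit in sorted(table)]
```

Using `dict.get` fills the gaps with `None`, which plays the role of NULL in the summary table.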
Recommended answer
On the Python side, you could use some itertools magic to rearrange your data:
from itertools import groupby, islice
from operator import itemgetter
from collections import defaultdict

data = [('Apple', 'Coles', 1.50),
        ('Apple', 'Woolworths', 1.60),
        ('Apple', 'IGA', 1.70),
        ('Banana', 'Coles', 0.50),
        ('Banana', 'Woolworths', 0.60),
        ('Banana', 'IGA', 0.70),
        ('Cherry', 'Coles', 5.00),
        ('Date', 'Coles', 2.00),
        ('Date', 'Woolworths', 2.10),
        ('Elderberry', 'IGA', 10.00)]

# Column headers: every distinct shop, in alphabetical order.
stores = sorted(set(row[1] for row in data))

# groupby() only groups adjacent rows, so sort by fruit first.
pivot = []
for fruit, rows in groupby(sorted(data), key=itemgetter(0)):
    # islice(row, 1, None) drops the fruit and leaves a (shop, price) pair;
    # the defaultdict returns None for shops that do not stock the fruit.
    prices = defaultdict(lambda: None, (islice(row, 1, None) for row in rows))
    pivot.append((fruit, prices))

# Single-argument print(...) behaves the same on Python 2.x and Python 3.
print('Fruit'.ljust(12) + ' ' + '\t'.join(stores))
for fruit, prices in pivot:
    print(fruit.ljust(12) + ' ' + '\t'.join(str(prices[s]) for s in stores))
Output:
Fruit        Coles   IGA     Woolworths
Apple        1.5     1.7     1.6
Banana       0.5     0.7     0.6
Cherry       5.0     None    None
Date         2.0     None    2.1
Elderberry   None    10.0    None