Filtering based on the "rows" data after creating a pivot table in python pandas
Question
I have a set of data that I'm getting from a SQL database and reading into a pandas dataframe. The resulting df is about 250M rows and growing every day. Therefore, I'd like to pivot the table to give me a much smaller table to work with (a few thousand rows).
The table looks something like this but much bigger:
data
  report_date  item_id  views category
0  2013-06-01        2      3        a
1  2013-06-01        2      2        b
2  2013-06-01        5     16        a
3  2013-06-01        2      4        c
4  2013-06-01        2      5        d
I'd like to make this much smaller by ignoring the "category" column and just getting a total for views by date and item_id.
I'm doing it like this:
pivot = data.pivot_table(values=['views'], rows=['report_date','item_id'], aggfunc='sum')
                     views
report_date item_id
2013-06-01  2           14
2013-06-01  5           16
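For anyone reproducing this on a recent pandas release: the `rows` keyword of `pivot_table` was later renamed to `index`. The following is a minimal sketch of the same aggregation on the five sample rows above. The way the DataFrame is built here is only an illustration; in the real case it would come from the SQL query.

```python
import pandas as pd

# The five sample rows shown above (built by hand for illustration only).
data = pd.DataFrame({
    "report_date": ["2013-06-01"] * 5,
    "item_id": [2, 2, 5, 2, 2],
    "views": [3, 2, 16, 4, 5],
    "category": ["a", "b", "a", "c", "d"],
})

# Sum views per (report_date, item_id); the category column is ignored.
# `index=` is the current name of the old `rows=` argument.
pivot = data.pivot_table(
    values="views",
    index=["report_date", "item_id"],
    aggfunc="sum",
)
print(pivot)
```

The result is indexed by a two-level MultiIndex (`report_date`, `item_id`), which is what the "rows" section of the pivot table refers to.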
Now imagine this is much bigger, with the date range going on for months and thousands of item_ids. I'd like to select the total views for item_id = 2 and report_date between '2013-06-01' and '2013-06-10', or something along those lines.
I've searched for several hours straight but I can't see how to select and/or filter off of values in my "rows" (i.e. report_date and item_id) section. I can only filter/select data in the "values" section (ex: views). This question is similar, and at the very end the asker commented the same question I'm asking but was never answered. I just wanted to try and draw attention to it.
I appreciate all the help. This site and the community have been absolutely invaluable.
Recommended answer
You should be able to slice it like this:
In [11]: pivot.ix[('2013-06-01', 3):('2013-06-01', 6)]
Out[11]:
                     views
report_date item_id
2013-06-01  5           16
See the advanced indexing documentation.
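Note that `.ix` was removed in pandas 1.0, so on current versions the same idea is written with `.loc`. Below is a minimal sketch of the selection asked for in the question (item_id = 2, report_date between 2013-06-01 and 2013-06-10). The sample rows are made up for illustration, the dates are assumed to be parsed with `pd.to_datetime`, and the pivot is assumed to have a sorted index (which `pivot_table` produces by default).

```python
import pandas as pd

# Made-up sample data spanning several dates (illustration only).
data = pd.DataFrame({
    "report_date": pd.to_datetime([
        "2013-06-01", "2013-06-01", "2013-06-01", "2013-06-05",
        "2013-06-10", "2013-06-10", "2013-06-15",
    ]),
    "item_id": [2, 2, 5, 2, 2, 5, 2],
    "views": [3, 2, 16, 7, 1, 4, 9],
})

pivot = data.pivot_table(
    values="views", index=["report_date", "item_id"], aggfunc="sum"
)

start, end = pd.Timestamp("2013-06-01"), pd.Timestamp("2013-06-10")

# Option 1: label-based slicing on both index levels with pd.IndexSlice.
# This relies on the MultiIndex being sorted.
idx = pd.IndexSlice
by_slice = pivot.loc[idx[start:end, 2], :]

# Option 2: a boolean mask built from the index level values.
# This does not depend on the index being sorted.
dates = pivot.index.get_level_values("report_date")
items = pivot.index.get_level_values("item_id")
by_mask = pivot[(items == 2) & (dates >= start) & (dates <= end)]

# Total views for item_id = 2 within the date range.
total_views = by_slice["views"].sum()
print(by_slice)
print(total_views)
```

Both options select the same rows here. The mask version is a little more verbose but is often easier to extend with extra conditions.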