Julia DataFrame group by and pivot table functions
Question
How do you do group by and pivot tables with Julia DataFrames?
Let's say I have the DataFrame
using DataFrames
df = DataFrame(Location = ["NY", "SF", "NY", "NY", "SF", "SF", "TX", "TX", "TX", "DC"],
               Class = ["H", "L", "H", "L", "L", "H", "H", "L", "L", "M"],
               Address = ["12 Silver", "10 Fak", "12 Silver", "1 North", "10 Fak", "2 Fake", "1 Red", "1 Dog", "2 Fake", "1 White"],
               Score = ["4", "5", "3", "2", "1", "5", "4", "3", "2", "1"])
and I want to do the following:
1) a pivot table with Location and Class, which should output
Class     H  L  M
Location
DC        0  0  1
NY        2  1  0
SF        1  2  0
TX        1  2  0
2) group by "Location" and count the number of records in each group, which should output
     Pop
DC   1
NY   3
SF   3
TX   3
Recommended answer
You can use unstack to get most of the way there (DataFrames don't have an index, so Class has to remain a column, rather than being an Index as it would be in pandas). This seems to be DataFrames.jl's answer to pivot_table:
julia> unstack(df, :Location, :Class, :Score)
WARNING: Duplicate entries in unstack.
4x4 DataFrames.DataFrame
| Row | Class | H | L | M |
|-----|-------|-----|-----|-----|
| 1 | "DC" | NA | NA | "1" |
| 2 | "NY" | "3" | "2" | NA |
| 3 | "SF" | "5" | "1" | NA |
| 4 | "TX" | "4" | "2" | NA |
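Note the duplicate-entries warning: several rows share the same Location and Class, so each cell holds a single Score value rather than the count the question asks for.
I'm not sure how you would do the equivalent of fillna here (unstack doesn't have this option)...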
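One possible way to get the counts from the question, sketched under the assumption of a recent DataFrames.jl release (1.x) where groupby, combine, and broadcasting over a DataFrame are available: count the rows per (Location, Class) pair first, then unstack those counts, so no duplicate entries remain. The names counts, wide, and n are illustrative and do not come from the original answer.

```julia
using DataFrames

df = DataFrame(Location = ["NY", "SF", "NY", "NY", "SF", "SF", "TX", "TX", "TX", "DC"],
               Class = ["H", "L", "H", "L", "L", "H", "H", "L", "L", "M"])

# Count rows for each (Location, Class) pair; :n is an illustrative column name.
counts = combine(groupby(df, [:Location, :Class]), nrow => :n)

# Spread the Class values into columns; pairs that never occur become `missing`.
wide = unstack(counts, :Location, :Class, :n)

# Replace `missing` with 0 (the fillna step) and sort rows by Location.
wide = sort(coalesce.(wide, 0), :Location)
```

Counting before unstacking avoids the duplicate-entries problem, and coalesce plays the role of fillna here.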
You can do the group by using by with the nrow (number of rows) function:
julia> by(df, :Location, nrow)
4x2 DataFrames.DataFrame
| Row | Location | x1 |
|-----|----------|----|
| 1 | "DC" | 1 |
| 2 | "NY" | 3 |
| 3 | "SF" | 3 |
| 4 | "TX" | 3 |
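In DataFrames.jl 1.0 and later the by function is no longer available; the same grouping can be sketched with groupby and combine. This assumes a 1.x release, and the name pop_by_location is illustrative. The nrow => :Pop pair names the count column Pop, as in the question, instead of the default x1.

```julia
using DataFrames

df = DataFrame(Location = ["NY", "SF", "NY", "NY", "SF", "SF", "TX", "TX", "TX", "DC"],
               Class = ["H", "L", "H", "L", "L", "H", "H", "L", "L", "M"])

# Group by Location, count the rows in each group, and name the result column Pop.
pop_by_location = combine(groupby(df, :Location), nrow => :Pop)

# Sort by Location so the rows appear in the order shown in the question.
pop_by_location = sort(pop_by_location, :Location)
```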