How to assign a value_counts output to a dataframe
Problem description
I am trying to assign the output of value_counts to a new DataFrame. My code follows.
import pandas as pd
import glob
df = pd.concat((pd.read_csv(f, names=['date','bill_id','sponsor_id']) for f in glob.glob('/home/jayaramdas/anaconda3/df/s11?_s_b')))
column_list = ['date', 'bill_id']
df = df.set_index(column_list, drop = True)
df = df['sponsor_id'].value_counts()
df.columns=['sponsor', 'num_bills']
print (df)
The value counts are not being assigned the column headers I specified, 'sponsor' and 'num_bills'. This is the output I get when I print the head:
1036 426
791 408
1332 401
1828 388
136 335
Name: sponsor_id, dtype: int64
Recommended answer
Your column lengths don't match. You read 3 columns from the CSV and then set 2 of them as the index, leaving a single column. Calling value_counts on that column produces a Series, not a DataFrame: the distinct column values become the index and the counts become the values. A Series has no columns to rename, so you need to call reset_index first, which turns it into a two-column DataFrame, and then overwrite the column names:
df = df.reset_index()
df.columns=['sponsor', 'num_bills']
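The same fix can be sketched end to end without the original CSV files. The snippet below is only an illustration: the `bills` frame and its values are hypothetical stand-ins for the concatenated CSV data, and only the `sponsor_id` column matters for the count.

```python
import pandas as pd

# Hypothetical stand-in for the data read from the CSV files
bills = pd.DataFrame({
    "date": ["2009-01-06", "2009-01-06", "2009-01-07",
             "2009-01-07", "2009-01-08", "2009-01-08"],
    "bill_id": ["s1", "s2", "s3", "s4", "s5", "s6"],
    "sponsor_id": [1036, 1036, 791, 1036, 791, 1332],
})

# value_counts returns a Series: sponsor ids in the index, counts as values
counts = bills["sponsor_id"].value_counts()

# reset_index turns the Series into a two-column DataFrame,
# after which the column names can be overwritten
result = counts.reset_index()
result.columns = ["sponsor", "num_bills"]
print(result)
```

Because value_counts sorts by count in descending order, the most frequent sponsor ends up in the first row.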
Example:
In [276]:
df = pd.DataFrame({'col_name':['a','a','a','b','b']})
df
Out[276]:
col_name
0 a
1 a
2 a
3 b
4 b
In [277]:
df['col_name'].value_counts()
Out[277]:
a 3
b 2
Name: col_name, dtype: int64
In [278]:
type(df['col_name'].value_counts())
Out[278]:
pandas.core.series.Series
In [279]:
df = df['col_name'].value_counts().reset_index()
df.columns = ['col_name', 'count']
df
Out[279]:
col_name count
0 a 3
1 b 2
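As a possible alternative to overwriting `df.columns` by position, the index and the counts can be named while resetting. This is a minimal sketch on hypothetical sample data, assuming only the standard `Series.rename_axis` method and the `name` argument of `Series.reset_index`:

```python
import pandas as pd

# Hypothetical sample data, same shape as the example above
sample = pd.DataFrame({"col_name": ["a", "a", "a", "b", "b"]})

counts_df = (
    sample["col_name"]
    .value_counts()
    .rename_axis("col_name")     # name the index that holds the distinct values
    .reset_index(name="count")   # move the index into a column and name the counts
)
print(counts_df)
```

Naming the columns explicitly this way does not depend on the default names that value_counts and reset_index happen to produce, which may differ between pandas versions.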