Taking a proportion of a dataframe based on column values

Views: 64 | Posted: 2020/5/24 4:18:20 | Tags: python, pandas, numpy

Question

I have a Pandas dataframe with ~50,000 rows and I want to randomly select a proportion of rows from that dataframe based on a number of conditions. Specifically, I have a column called 'type of use' and, for each field in that column, I want to select a different proportion of rows.

For example:

df[df['type of use'] == 'housing'].sample(frac=0.2)

This code returns 20% of all the rows which have 'housing' as their 'type of use'. The problem is I do not know how to do this for the remaining fields in a way that is 'idiomatic'. I also do not know how I could take the result from this sampling to form a new dataframe.

Recommended answer

You can get the distinct values in the column with list(df['type of use'].unique()) and iterate over them like this:

for i in list(df['type of use'].unique()):
    print(df[df['type of use'] == i].sample(frac=0.2))
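The loop above uses the same fraction for every value, while the question asks for a different proportion per 'type of use'. A minimal sketch of one way to do that follows, assuming a hand-written mapping from each value to its fraction. The toy dataframe, the `fractions` dictionary, and its numbers are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Toy data standing in for the real ~50,000-row dataframe (assumption).
df = pd.DataFrame({
    'type of use': ['housing'] * 10 + ['office'] * 10 + ['retail'] * 10,
    'value': range(30),
})

# Hypothetical fraction for each 'type of use'; replace with your own values.
fractions = {'housing': 0.2, 'office': 0.5, 'retail': 0.1}

samples = {}
for use_type, frac in fractions.items():
    subset = df[df['type of use'] == use_type]
    # random_state makes the draw repeatable; drop it for a fresh sample each run.
    samples[use_type] = subset.sample(frac=frac, random_state=0)

for use_type, sample in samples.items():
    print(use_type, len(sample))
```

Any value missing from `fractions` is simply not sampled, which may or may not be what you want.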

Or:

# Compute the distinct values once instead of on every iteration.
use_types = list(df['type of use'].unique())
i = 0
while i < len(use_types):
    df1 = df[df['type of use'] == use_types[i]].sample(frac=0.2)
    print(df1.head())
    i = i + 1
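When every group gets the same fraction, the explicit loops can probably be replaced by a single groupby call. This is a sketch under the assumption that pandas 1.1 or newer is installed, since that is where `DataFrameGroupBy.sample` was added. The toy dataframe is made up for illustration.

```python
import pandas as pd

# Toy data standing in for the real dataframe (assumption).
df = pd.DataFrame({
    'type of use': ['housing'] * 10 + ['office'] * 5,
    'value': range(15),
})

# Sample 20% of the rows within each 'type of use' group in one call.
# DataFrameGroupBy.sample needs pandas 1.1 or newer.
sampled = df.groupby('type of use').sample(frac=0.2, random_state=0)
print(sampled)
```

The result is already a single dataframe that keeps the original row index, so no extra step is needed to combine the groups.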

To store the samples, you can build a dictionary:

# One key per distinct value, e.g. 'dfhousing'.
use_types = list(df['type of use'].unique())
dicdf = dict()
for use_type in use_types:
    # Filter on df itself; the original snippet referred to an undefined df2.
    dicdf['df' + str(use_type)] = df[df['type of use'] == use_type].sample(frac=0.2)
print(dicdf)

This prints a dictionary of dataframes. To look at a single sample, for example the housing one, use print(dicdf['dfhousing']).
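The question also asks how to turn the sampled pieces into a new dataframe. One possible way is to stack the dictionary values with `pd.concat`, sketched below. The toy dataframe is an assumption, and `new_df` is a name chosen here for illustration.

```python
import pandas as pd

# Toy data standing in for the real dataframe (assumption).
df = pd.DataFrame({
    'type of use': ['housing'] * 10 + ['office'] * 10,
    'value': range(20),
})

# Same layout as the dictionary above: one sampled dataframe per key.
dicdf = {
    'df' + str(use_type): df[df['type of use'] == use_type].sample(frac=0.2, random_state=0)
    for use_type in df['type of use'].unique()
}

# Stack the per-type samples into a single new dataframe.
new_df = pd.concat(list(dicdf.values()))
print(new_df.shape)
```

The rows keep their original index labels, which makes it easy to trace them back to `df`. Pass `ignore_index=True` to `pd.concat` if a fresh 0..n-1 index is preferred.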
