Pandas sample with weights

Views: 42 Published: 2021/6/13 20:05:16 pandas sample


Question

I have a DataFrame df and I'd like to sample from it so that the sample follows the distribution of some variable. Let's say df['type'].value_counts(normalize=True) returns:

0.3 A
0.5 B
0.2 C

I'd like to do something like sampledf = df.sample(weights=df['type'].value_counts()) such that sampledf['type'].value_counts(normalize=True) returns almost the same distribution. How can I pass a dict of frequencies here?

Recommended answer

The weights argument needs one weight per row, i.e. a Series of the same length as the original df, so the simplest option is to add it as a column:

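# per-row weight: the number of rows that share this row's type
df['freq'] = df.groupby('type')['type'].transform('count')
# pass n=... or frac=... to draw more than the default single row
sampledf = df.sample(weights=df['freq'])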
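To try this end to end, here is a minimal runnable sketch. The toy DataFrame, the sample size n=20 and the random_state are assumptions made for illustration; they are not part of the original question. pandas normalizes the weights so they sum to 1, and a Series passed as weights is aligned with df on the index.

```python
import pandas as pd

# Toy data (hypothetical): 30% A, 50% B, 20% C
df = pd.DataFrame({"type": ["A"] * 30 + ["B"] * 50 + ["C"] * 20})

# Per-row weight: the size of the group each row belongs to
df["freq"] = df.groupby("type")["type"].transform("count")

# n and random_state are illustrative; without n, sample() returns a single row
sampledf = df.sample(n=20, weights=df["freq"], random_state=0)

# Inspect the distribution of the drawn sample
print(sampledf["type"].value_counts(normalize=True))
```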

Or without adding a column:

sampledf = df.sample(weights=df.groupby('type')['type'].transform('count'))
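One caveat worth checking before relying on count-based weights: a row's chance of being picked becomes proportional to the size of its group, so a group's share of a single draw is proportional to its count squared, not its count. With 300/500/200 rows of A/B/C that works out to roughly 24%/66%/11% instead of 30%/50%/20%. Plain uniform sampling already preserves the original proportions in expectation, and groupby(...).sample(frac=...) (available from pandas 1.1) draws the same fraction from every group. The sketch below compares the options on a hypothetical toy DataFrame; the sizes, frac=0.1 and random_state are assumptions.

```python
import pandas as pd

# Toy data (hypothetical): 300 A, 500 B, 200 C
df = pd.DataFrame({"type": ["A"] * 300 + ["B"] * 500 + ["C"] * 200})

counts = df["type"].value_counts()

# Chance that one count-weighted draw lands in each type: count**2 / sum(count**2)
weighted_share = (counts ** 2) / (counts ** 2).sum()
print(weighted_share)

# Uniform sampling: every row is equally likely, so shares match in expectation
uniform_sample = df.sample(frac=0.1, random_state=0)
print(uniform_sample["type"].value_counts(normalize=True))

# Stratified sampling: the same fraction is drawn from every group (pandas >= 1.1)
stratified_sample = df.groupby("type").sample(frac=0.1, random_state=0)
print(stratified_sample["type"].value_counts(normalize=True))
```

If the sample only needs to be approximately representative, df.sample(n=...) without weights is usually enough; the stratified version is for cases where the group proportions should be matched as closely as rounding allows.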
