Cartesian product of arbitrary lists in pandas

Problem description

Given an arbitrary number of lists, I'd like to produce a pandas DataFrame as the Cartesian product. For example, given:

a = [1, 2, 3]
b = ['val1', 'val2']
c = [100, 101]

I'd like to end up with a DataFrame with columns a, b, and c, and all 3x2x2=12 combinations.

Unlike the question "cartesian product in pandas", I need to be able to provide more than two inputs, and I don't want to pass DataFrames, which would keep the values within each DataFrame together rather than taking combinations of them. Answers to this question will likely not overlap with answers to that one.

Unlike the question "Cartesian product of x and y array points into single array of 2D points", I'm seeking a pandas DataFrame result with named columns, rather than a two-dimensional NumPy array.

Solution

Building on this answer to a related question (Cartesian product of two DataFrames), this function takes a dictionary of lists and returns the Cartesian product:

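import pandas as pd

def cartesian_product(d):
    # Build a MultiIndex with one level per key, then turn the levels into columns.
    index = pd.MultiIndex.from_product(d.values(), names=d.keys())
    return pd.DataFrame(index=index).reset_index()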
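
To see why this works, it can help to look at the two steps separately. The sketch below uses the inputs from the question; the names lists, index, n_levels, n_rows, and df are only for illustration and are not part of the original answer. The dictionary views are wrapped in list() here purely as a precaution.

```python
import pandas as pd

lists = {'a': [1, 2, 3], 'b': ['val1', 'val2'], 'c': [100, 101]}

# Step 1: from_product creates one index level per key and
# one index entry per combination of the input values.
index = pd.MultiIndex.from_product(list(lists.values()),
                                   names=list(lists.keys()))
n_levels = index.nlevels
n_rows = len(index)

# Step 2: a DataFrame that holds only this index has no columns yet;
# reset_index() moves each named level into a regular column.
df = pd.DataFrame(index=index).reset_index()
columns = list(df.columns)
```

Because the level names come from the dictionary keys, the column order of the result follows the insertion order of the dictionary.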

Example:

cartesian_product({'a': [1, 2, 3],
                   'b': ['val1', 'val2'],
                   'c': [100, 101]})
    a      b      c
0   1   val1    100
1   1   val1    101
2   1   val2    100
3   1   val2    101
4   2   val1    100
5   2   val1    101
6   2   val2    100
7   2   val2    101
8   3   val1    100
9   3   val1    101
10  3   val2    100
11  3   val2    101

I've added this to my microdf package.
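
As a cross-check, roughly the same result can probably be obtained with itertools.product from the standard library. This is a sketch of an alternative, not the approach from the answer above; the function name cartesian_product_itertools is hypothetical.

```python
import itertools

import pandas as pd


def cartesian_product_itertools(d):
    # Hypothetical alternative: itertools.product yields one tuple per
    # combination, with the last input varying fastest.
    rows = list(itertools.product(*d.values()))
    return pd.DataFrame(rows, columns=list(d.keys()))


result = cartesian_product_itertools({'a': [1, 2, 3],
                                      'b': ['val1', 'val2'],
                                      'c': [100, 101]})
first_row = result.iloc[0].tolist()
last_row = result.iloc[-1].tolist()
```

This version materializes all combinations as Python tuples before building the DataFrame, so for large inputs the MultiIndex-based function may be the better fit.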
