Most efficient way to un-dummy variables in Pandas DF

Views: 115 Published: 2020/5/24 4:23:00 Tags: python pandas dataframe


Question

In the screenshot below, we have 3 different energy sites: ID01, ID18, and ID31. They are stored in a dummy-variable format, and for visualization purposes I just want to create a single column named 'Sites' that I can use. You'll see the loop I quickly wrote to do this, but it seems very inefficient. Any pointers on how to achieve this in the fastest way possible?

Recommended answer

Setup

import pandas as pd

# Three one-hot site columns plus two unrelated columns, A and B
data = pd.DataFrame([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0]
], columns=['ID01', 'ID18', 'ID31']).assign(A=1, B=2)

data

   ID01  ID18  ID31  A  B
0     1     0     0  1  2
1     0     1     0  1  2
2     0     0     1  1  2
3     1     0     0  1  2
4     0     1     0  1  2

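`dot` product with strings and objects

This works if these are truly dummy values, 0 or 1. Multiplying a column name by 1 keeps it and multiplying by 0 yields an empty string, so the row-wise sum is the name of the one flagged column.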

def undummy(d):
    return d.dot(d.columns)

data.assign(Site=data.filter(regex='^ID').pipe(undummy))

   ID01  ID18  ID31  A  B  Site
0     1     0     0  1  2  ID01
1     0     1     0  1  2  ID18
2     0     0     1  1  2  ID31
3     1     0     0  1  2  ID01
4     0     1     0  1  2  ID18
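
To see why the 0/1 caveat matters, here is a small sketch of the same `dot` trick applied to rows that are not strictly one-hot. The `messy` frame and the comma separator are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical data: row 1 has two flags set, row 2 has none.
messy = pd.DataFrame(
    [[1, 0, 0],
     [1, 1, 0],
     [0, 0, 0]],
    columns=['ID01', 'ID18', 'ID31'],
)

# Each name is repeated 1 or 0 times, then the pieces are concatenated,
# so a multi-flag row yields several names glued together.
plain = messy.dot(messy.columns)

# Appending a separator before the dot product keeps such rows readable.
joined = messy.dot(messy.columns + ',').str.rstrip(',')

print(plain.tolist())   # ['ID01', 'ID01ID18', '']
print(joined.tolist())  # ['ID01', 'ID01,ID18', '']
```

An all-zero row comes back as an empty string rather than a missing value, so it may need to be handled separately before plotting.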

`argmax` slicing

This works, but it can produce unexpected results if the data is not as represented in the question.

def undummy(d):
    return d.columns[d.values.argmax(1)]

data.assign(Site=data.filter(regex='^ID').pipe(undummy))

   ID01  ID18  ID31  A  B  Site
0     1     0     0  1  2  ID01
1     0     1     0  1  2  ID18
2     0     0     1  1  2  ID31
3     1     0     0  1  2  ID01
4     0     1     0  1  2  ID18
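
The unexpected results are easy to reproduce: `argmax` returns the position of the first maximum, so a row with no flag set silently maps to the first column, and a row with several flags reports only the first of them. Below is a minimal sketch of a guarded variant. The names `undummy_safe` and `messy` are hypothetical, and returning NaN for rows that are not one-hot is only one possible policy.

```python
import pandas as pd

# Hypothetical data: row 1 has two flags set, row 2 has none.
messy = pd.DataFrame(
    [[1, 0, 0],
     [1, 1, 0],
     [0, 0, 0]],
    columns=['ID01', 'ID18', 'ID31'],
)

# Plain argmax: ties and all-zero rows both resolve to the first maximum.
naive = messy.columns[messy.values.argmax(1)]

def undummy_safe(d):
    """Return the flagged column name per row, NaN unless exactly one flag is set."""
    names = d.columns.to_numpy()[d.to_numpy().argmax(axis=1)]
    labels = pd.Series(names, index=d.index)
    # Keep the label only where the row is a valid one-hot encoding.
    return labels.where(d.sum(axis=1) == 1)

safe = undummy_safe(messy)

print(list(naive))           # ['ID01', 'ID01', 'ID01']
print(safe.isna().tolist())  # [False, True, True]
```

The extra row sum costs one more pass over the data, which is usually a small price for not mislabeling rows.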
