Efficiently creating additional columns in a pandas DataFrame using .map()
Question
I am analyzing a data set that is similar in shape to the following example. I have two different types of data (abc data and xyz data):
   abc1  abc2  abc3  xyz1  xyz2  xyz3
0     1     2     2     2     1     2
1     2     1     1     2     1     1
2     2     2     1     2     2     2
3     1     2     1     1     1     1
4     1     1     2     1     2     1
I want to create a function that adds a categorizing column for each abc column that exists in the dataframe. Using lists of column names and a category mapping dictionary, I was able to get my desired result.
abc_columns = ['abc1', 'abc2', 'abc3']
xyz_columns = ['xyz1', 'xyz2', 'xyz3']
abc_category_columns = ['abc1_category', 'abc2_category', 'abc3_category']
categories = {1: 'Good', 2: 'Bad', 3: 'Ugly'}
# Map each abc column through the dictionary into its matching category column
for i in range(len(abc_category_columns)):
    df3[abc_category_columns[i]] = df3[abc_columns[i]].map(categories)
print(df3)
The end result:
   abc1  abc2  abc3  xyz1  xyz2  xyz3 abc1_category abc2_category abc3_category
0     1     2     2     2     1     2          Good           Bad           Bad
1     2     1     1     2     1     1           Bad          Good          Good
2     2     2     1     2     2     2           Bad           Bad          Good
3     1     2     1     1     1     1          Good           Bad          Good
4     1     1     2     1     2     1          Good          Good           Bad
While the for loop at the end works fine, I feel like I should be using Python's lambda function, but can't seem to figure it out.
Is there a more efficient way to map in a dynamic number of abc-type columns?
Recommended answer
You can use applymap with the dictionary get method:
In [11]: df[abc_columns].applymap(categories.get)
Out[11]:
   abc1  abc2  abc3
0  Good   Bad   Bad
1   Bad  Good  Good
2   Bad   Bad  Good
3  Good   Bad  Good
4  Good  Good   Bad
And put this into the specified columns:
In [12]: abc_categories = list(map(lambda x: x + '_category', abc_columns))
In [13]: abc_categories
Out[13]: ['abc1_category', 'abc2_category', 'abc3_category']
In [14]: df[abc_categories] = df[abc_columns].applymap(categories.get)
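For readers on a recent pandas release, where applymap is deprecated in favor of DataFrame.map (as of pandas 2.1, to the best of my knowledge), the same idea can be sketched with apply and Series.map, which should behave the same across versions. The DataFrame below is rebuilt from the example data in the question. Using add_suffix and join to attach the new columns is a choice made for this sketch, not part of the original answer.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    'abc1': [1, 2, 2, 1, 1],
    'abc2': [2, 1, 2, 2, 1],
    'abc3': [2, 1, 1, 1, 2],
    'xyz1': [2, 2, 2, 1, 1],
    'xyz2': [1, 1, 2, 1, 2],
    'xyz3': [2, 1, 2, 1, 1],
})
abc_columns = ['abc1', 'abc2', 'abc3']
categories = {1: 'Good', 2: 'Bad', 3: 'Ugly'}

# apply calls the lambda once per column; Series.map looks each value up
# in the dictionary. add_suffix renames abc1 -> abc1_category, and so on.
mapped = (
    df[abc_columns]
    .apply(lambda column: column.map(categories))
    .add_suffix('_category')
)

# join aligns on the row index and appends the new columns on the right
df = df.join(mapped)
print(df)
```

One difference worth knowing: for a value that is missing from the dictionary, Series.map produces NaN, whereas categories.get returns None.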
Note: you can construct abc_columns relatively efficiently using a list comprehension:
abc_columns = [col for col in df.columns if str(col).startswith('abc')]
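Since the question asks for a function that handles a dynamic number of abc-type columns, the list comprehension and the mapping step can be combined into one helper. This is only a sketch: the name add_category_columns and its prefix, mapping and suffix parameters are hypothetical, and the small DataFrame is made up for illustration.

```python
import pandas as pd

def add_category_columns(df, prefix, mapping, suffix='_category'):
    """Return a copy of df with one mapped column per column starting with prefix.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Pick up however many matching columns the frame happens to have
    source_columns = [col for col in df.columns if str(col).startswith(prefix)]
    # Build {new column name: mapped Series} for each matching column
    new_columns = {str(col) + suffix: df[col].map(mapping) for col in source_columns}
    # assign returns a new DataFrame and leaves the input untouched
    return df.assign(**new_columns)

# Small made-up frame with two abc columns and one xyz column
df = pd.DataFrame({'abc1': [1, 2], 'abc2': [2, 1], 'xyz1': [2, 2]})
result = add_category_columns(df, 'abc', {1: 'Good', 2: 'Bad', 3: 'Ugly'})
print(result)
```

Returning a new DataFrame rather than modifying the argument keeps the helper free of side effects. Assign the result back to the same name if in-place behavior is wanted.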