How to create a dictionary of key: column_name and value: unique values in column in Python from a DataFrame

Problem description

I am trying to create a dictionary of key:value pairs where the key is the column name of a dataframe and the value is a list containing all the unique values in that column. Ultimately I want to be able to filter the key:value pairs from the dict based on conditions. This is what I have been able to do so far:

for col in col_list[1:]:
    _list = []
    _list.append(footwear_data[col].unique())
    list_name = ''.join([str(col),'_list'])

product_list = ['shoe','footwear']
color_list = []
size_list = []

Here product, color, and size are all column names, and the dict keys should be named accordingly, like color_list etc. Ultimately I will need to access each key:value_list in the dictionary. Expected output:

KEY          VALUE
color_list:  ["red", "blue", "black"]
size_list:   ["9", "XL", "32", "10 inches"]

Can someone please help me with this? A snapshot of the data is attached.

Recommended answer

Using a DataFrame like this:

import pandas as pd
df = pd.DataFrame([["Women", "Slip on", 7, "Black", "Clarks"], ["Women", "Slip on", 8, "Brown", "Clarcks"], ["Women", "Slip on", 7, "Blue", "Clarks"]], columns= ["Category", "Sub Category", "Size", "Color", "Brand"])

print(df)

Output:

  Category Sub Category  Size  Color    Brand
0    Women      Slip on     7  Black   Clarks
1    Women      Slip on     8  Brown  Clarcks
2    Women      Slip on     7   Blue   Clarks

You can build the new dict directly from the DataFrame by mapping each key to the values of the corresponding column, like this example:

new_dict = {"color_list": list(df["Color"]), "size_list": list(df["Size"])}
# OR:
#new_dict = {"color_list": [k for k in df["Color"]], "size_list": [k for k in df["Size"]]}

print(new_dict)

Output:

{'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8, 7]}

To keep only the unique values, you can use set, as in this example:

new_dict = {"color_list": list(set(df["Color"])), "size_list": list(set(df["Size"]))}
print(new_dict)

Output:

{'color_list': ['Brown', 'Blue', 'Black'], 'size_list': [8, 7]}
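
Because a set is unordered, the order of the unique values shown above may differ from the order in which they appear in the column. If the original order matters, one possible alternative (a sketch that is not part of the original answer) is `dict.fromkeys`, which drops duplicates while keeping the first occurrence of each value:

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame(
    [
        ["Women", "Slip on", 7, "Black", "Clarks"],
        ["Women", "Slip on", 8, "Brown", "Clarcks"],
        ["Women", "Slip on", 7, "Blue", "Clarks"],
    ],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

# dict.fromkeys keeps insertion order (Python 3.7+), so duplicates are
# removed but the first-seen order of the values is preserved.
ordered_dict = {
    "color_list": list(dict.fromkeys(df["Color"])),
    "size_list": list(dict.fromkeys(df["Size"])),
}
print(ordered_dict)
```

The name `ordered_dict` is only illustrative; the keys follow the `color_list` / `size_list` naming used in the question.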

Or, as @Ami Tavory said in his answer, to get the unique values for every column of your DataFrame, you can simply do this:

new_dict = {k:list(df[k].unique()) for k in df.columns}
print(new_dict)

Output:

{'Brand': ['Clarks', 'Clarcks'],
 'Category': ['Women'],
 'Color': ['Black', 'Brown', 'Blue'],
 'Size': [7, 8],
 'Sub Category': ['Slip on']}
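
The question also asks for keys named like `color_list` and for a way to filter the key:value pairs by a condition. A minimal sketch of both steps follows. The key naming scheme (lower-cased column name plus `_list`) and the filter condition (more than one unique value) are assumptions chosen for illustration, not something the original answer specifies. `.tolist()` is used so that the values are plain Python objects rather than NumPy scalars.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame(
    [
        ["Women", "Slip on", 7, "Black", "Clarks"],
        ["Women", "Slip on", 8, "Brown", "Clarcks"],
        ["Women", "Slip on", 7, "Blue", "Clarks"],
    ],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

# Build keys such as "color_list" from the column names.
# The naming scheme is an assumption based on the question's example.
unique_values = {
    f"{col.lower().replace(' ', '_')}_list": df[col].unique().tolist()
    for col in df.columns
}

# Hypothetical condition: keep only columns with more than one unique value.
filtered = {
    key: values for key, values in unique_values.items() if len(values) > 1
}
print(filtered)
```

With the sample data this prints `{'size_list': [7, 8], 'color_list': ['Black', 'Brown', 'Blue'], 'brand_list': ['Clarks', 'Clarcks']}`. Any other predicate on `key` or `values` can replace the `len(values) > 1` check.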
