Easy way to apply transformation from `pandas.get_dummies` to new data?

Question

Suppose I have a data frame, data, with string columns that I want converted to indicators. I use pandas.get_dummies(data) to convert it into a dataset that I can use for building a model.

Now I have a single new observation that I want to run through my model. Obviously I can't use pandas.get_dummies(new_data), because the new observation doesn't contain all of the classes and so won't produce the same indicator columns. Is there a good way to do this?
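To make the mismatch concrete, here is a minimal sketch. The frames `train` and `new_obs` are made-up illustration data, not taken from the asker's dataset. It compares the columns that `pandas.get_dummies` produces for the full data with those it produces for one new row:

```python
import pandas as pd

# Hypothetical training data with four categories
train = pd.DataFrame({'cat': ['a', 'b', 'c', 'd'], 'val': [1, 2, 5, 10]})
# Hypothetical single new observation containing only category 'a'
new_obs = pd.DataFrame({'cat': ['a'], 'val': [1]})

train_cols = list(pd.get_dummies(train).columns)
new_cols = list(pd.get_dummies(new_obs).columns)

print(train_cols)  # ['val', 'cat_a', 'cat_b', 'cat_c', 'cat_d']
print(new_cols)    # ['val', 'cat_a']
```

The new observation yields only two columns, so it cannot be passed directly to a model fitted on five.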

Recommended answer

You can create the dummies from the single new observation, and then reindex that frame's columns using the columns from the original indicator matrix. Any indicator column missing from the new observation is added and filled with 0:

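import pandas as pd

# Training data and its indicator matrix (dtype=int gives 0/1 instead of booleans)
df = pd.DataFrame({'cat': ['a', 'b', 'c', 'd'], 'val': [1, 2, 5, 10]})
dummies_frame = pd.get_dummies(df, dtype=int)

# Encode the single new observation, then align it to the training columns
df1 = pd.get_dummies(pd.DataFrame({'cat': ['a'], 'val': [1]}), dtype=int)
df1.reindex(columns=dummies_frame.columns, fill_value=0)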

This returns:

        val     cat_a   cat_b   cat_c   cat_d
  0     1       1       0       0       0
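If the same alignment is needed repeatedly, it could be wrapped in a small function. The sketch below is one possible way to do that. The name `encode_like_training` and the example rows are hypothetical, not part of pandas or of the original answer. It also shows what reindexing does with a category that never appeared in the training data: its column is dropped, so every indicator for that row is 0.

```python
import pandas as pd

def encode_like_training(new_data, training_columns):
    """Hypothetical helper: one-hot encode new_data and align it to training_columns."""
    dummies = pd.get_dummies(new_data, dtype=int)
    # Columns absent from new_data are added as 0;
    # columns not present in training (unseen categories) are dropped.
    return dummies.reindex(columns=training_columns, fill_value=0)

train = pd.DataFrame({'cat': ['a', 'b', 'c', 'd'], 'val': [1, 2, 5, 10]})
training_columns = pd.get_dummies(train, dtype=int).columns

# A category seen in training: only cat_b is set
seen = encode_like_training(pd.DataFrame({'cat': ['b'], 'val': [7]}), training_columns)
# A category never seen in training: all indicator columns are 0
unseen = encode_like_training(pd.DataFrame({'cat': ['z'], 'val': [3]}), training_columns)

print(seen)
print(unseen)
```

Silently mapping unseen categories to all zeros may or may not be what a given model should receive, so it is worth checking for them explicitly before relying on this behavior.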
