Apply StandardScaler to parts of a data set

Published: 2021/4/29 18:43:30 | Tags: python, pandas, scikit-learn, scale, data-science


Question

I want to use sklearn's StandardScaler. Is it possible to apply it to some feature columns but not others?

For example, say my data is:

import pandas as pd
data = pd.DataFrame({'Name': [3, 4, 6], 'Age': [18, 92, 98], 'Weight': [68, 59, 49]})

   Age  Name  Weight
0   18     3      68
1   92     4      59
2   98     6      49


col_names = ['Name', 'Age', 'Weight']
features = data[col_names]

I fit and transform the data:

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler().fit(features.values)
features = scaler.transform(features.values)
scaled_features = pd.DataFrame(features, columns = col_names)

       Name       Age    Weight
0 -1.069045 -1.411004  1.202703
1 -0.267261  0.623041  0.042954
2  1.336306  0.787964 -1.245657

But of course the names are not really integers but strings and I don't want to standardize them. How can I apply the fit and transform methods only on the columns Age and Weight?

Answer

Update:

Currently the best way to handle this is to use scikit-learn's ColumnTransformer.
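
As a rough illustration of the ColumnTransformer route, here is a minimal sketch that reuses the example data from the question. The transformer label 'scale' and the choice of remainder='passthrough' are assumptions made for this example, not something the original answer spells out.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame({'Name': [3, 4, 6], 'Age': [18, 92, 98], 'Weight': [68, 59, 49]})

# Scale only Age and Weight; every other column is passed through unchanged.
# 'scale' is just an arbitrary label for this step.
ct = ColumnTransformer(
    [('scale', StandardScaler(), ['Age', 'Weight'])],
    remainder='passthrough',
)
result = ct.fit_transform(data)

# The transformed columns come first and the passthrough columns are appended
# on the right, so the output order is Age, Weight, Name.
scaled_features = pd.DataFrame(
    result, columns=['Age', 'Weight', 'Name'], index=data.index
)
print(scaled_features)
```

Note that the output column order differs from the input order, which is why the column names are listed explicitly when rebuilding the dataframe.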

First create a copy of your dataframe:

scaled_features = data.copy()

Don't include the Name column in the transformation:

col_names = ['Age', 'Weight']
features = scaled_features[col_names]
scaler = StandardScaler().fit(features.values)
features = scaler.transform(features.values)

Now, don't create a new dataframe but assign the result to those two columns:

scaled_features[col_names] = features
print(scaled_features)


        Age  Name    Weight
0 -1.411004     3  1.202703
1  0.623041     4  0.042954
2  0.787964     6 -1.245657
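
The same copy-and-assign idea can be written more compactly with fit_transform, followed by a quick sanity check. This is only a sketch on the example data above; the check relies on StandardScaler dividing by the population standard deviation (ddof=0), which is why ddof=0 is passed to pandas here.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame({'Name': [3, 4, 6], 'Age': [18, 92, 98], 'Weight': [68, 59, 49]})
col_names = ['Age', 'Weight']

# Work on a copy so the original dataframe stays untouched.
scaled_features = data.copy()

# Fit and transform in one step, then write the result back into
# the same two columns; Name is never passed to the scaler.
scaled_features[col_names] = StandardScaler().fit_transform(data[col_names])

# Sanity check: each scaled column should have mean ~0 and
# population standard deviation ~1.
means = scaled_features[col_names].mean()
stds = scaled_features[col_names].std(ddof=0)
print(means)
print(stds)
```

When the scaler has to be reused later, for instance on a test set, keeping the fitted StandardScaler object in a variable as in the answer above is the safer choice.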
