python pandas standardize column for regression

Views: 343 Published: 2020/5/24 4:08:40 Tags: python pandas scale normalize standardized

Question

I have the following df:

Date       Event_Counts   Category_A  Category_B
20170401      982457          0           1
20170402      982754          1           0
20170402      875786          0           1

I am preparing the data for a regression analysis and want to standardize the Event_Counts column so that it is on a scale similar to the categories.

I use the following code:

from sklearn import preprocessing
df['scaled_event_counts'] = preprocessing.scale(df['Event_Counts'])

However, I get this warning:

DataConversionWarning: Data with input dtype int64 was converted to float64 by the scale function.
  warnings.warn(msg, _DataConversionWarning)

It seems to have worked; there is a new column. However, it contains negative numbers such as -1.3.

I thought the scale function subtracts the mean from each value and divides the result by the standard deviation, and then adds the minimum of the result to every row.

Does it not work that way for pandas? Or should I use the normalize() function or the StandardScaler() function? I wanted the standardized column to be on a scale of 0 to 1.

Thanks

Recommended answer

I think you are looking for sklearn.preprocessing.MinMaxScaler. It lets you scale to a given range.

So in your case it would be:

# fit_transform expects 2-D input, so select the column with double brackets
scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
df['scaled_event_counts'] = scaler.fit_transform(df[['Event_Counts']])
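The negative values in the question are expected: as far as I know, preprocessing.scale computes z-scores (mean 0, standard deviation 1) and never adds the minimum back, whereas min-max scaling maps the column onto 0 to 1. The sketch below is only an illustration using the three rows from the question; the variable names (counts, z_manual, minmax_manual and so on) are made up for this example.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing

# The three Event_Counts values from the question.
df = pd.DataFrame({"Event_Counts": [982457, 982754, 875786]})
counts = df["Event_Counts"].astype(float)

# Z-score standardization: (x - mean) / std, using the population std (ddof=0).
# The result is centered on 0, so roughly half the values are negative.
z_sklearn = preprocessing.scale(counts.to_numpy())
z_manual = ((counts - counts.mean()) / counts.std(ddof=0)).to_numpy()

# Min-max scaling: (x - min) / (max - min), which lands in [0, 1].
minmax_manual = ((counts - counts.min()) / (counts.max() - counts.min())).to_numpy()
minmax_sklearn = (
    preprocessing.MinMaxScaler(feature_range=(0, 1))
    .fit_transform(df[["Event_Counts"]])
    .ravel()
)

print(z_sklearn)
print(minmax_sklearn)
```

Casting to float first also avoids the DataConversionWarning about int64 input.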

To scale the whole df:

scaled_df = scaler.fit_transform(df)
print(scaled_df)
[[ 0.          0.99722347  0.          1.        ]
 [ 1.          1.          1.          0.        ]
 [ 1.          0.          0.          1.        ]]
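Note that fit_transform returns a NumPy array and, applied to the whole df, also rescales the Date column. If that is not wanted, one possible approach is to scale only selected columns and keep the result as a DataFrame. This is a sketch under that assumption; cols_to_scale is a hypothetical name, and the data is copied from the question.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Data copied from the question.
df = pd.DataFrame({
    "Date": [20170401, 20170402, 20170402],
    "Event_Counts": [982457, 982754, 875786],
    "Category_A": [0, 1, 0],
    "Category_B": [1, 0, 1],
})

# Hypothetical choice: only the count column needs rescaling.
cols_to_scale = ["Event_Counts"]

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_values = scaler.fit_transform(df[cols_to_scale].astype(float))

# Copy the frame and overwrite just the scaled columns,
# so Date and the 0/1 category columns stay untouched.
scaled_df = df.copy()
for i, col in enumerate(cols_to_scale):
    scaled_df[col] = scaled_values[:, i]

print(scaled_df)
```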
