神经网络和二元分类指导 [英] Neural Network and Binary classification Guidance

查看:25
本文介绍了神经网络和二元分类指导的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下数据 (X) 存储在一个 numpy 数组中:

I have the following data (X) that is stored in a numpy array:

array([[ 0.82737724, -0.5924806 ,  0.43279337, ...,  0.91896631,
        -0.28188124,  0.58595414],
       [-1.56610693,  0.63878901,  0.43279337, ...,  1.28262456,
         1.16154512, -1.9423032 ],
       [ 0.82737724, -0.2846632 , -0.4745452 , ...,  1.64628282,
        -0.28188124,  0.58595414],
       ...,
       [ 0.82737724,  0.        ,  0.43279337, ...,  1.67617254,
        -0.28188124,  0.58595414],
       [-1.56610693, -0.2846632 , -0.4745452 , ..., -1.64656796,
         0.27001707, -1.9423032 ],
       [ 0.82737724,  0.17706291, -0.4745452 , ...,  0.63501397,
        -0.28188124, -0.67817453]])

数组要大得多,它被输入到这个神经网络中:

The array is much larger, and it gets fed into this neural network:

def base_model1():
    input_dim = X.shape[1]
    output_dim = 1
    model = Sequential()
    model.add(Dense(10, input_dim= input_dim,kernel_initializer ='normal', activation= 'tanh'))
    model.add(Dense(1, input_dim = 100, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['MeanSquaredError',
        'AUC',])
    
    return model
NN_clf = KerasClassifier(build_fn=base_model1, epochs=100, verbose=1)
NN_clf._estimator_type = "classifier"
trained = NN_clf.fit(X,y.values.reshape(-1,1))

Y 是二进制的 1 和 0.其中 1 表示会乘坐出租车或 0 表示不会乘坐出租车.

Y is binary ones and zeroes. Where 1 means that will ride a taxi or 0 that will not ride a taxi.

predictions1 = trained.model.predict(X_test, verbose=1)
predictions1[:5]
array([[0.09048176],
       [0.34411064],
       [0.08842686],
       [0.0986585 ],
       [0.58971184]], dtype=float32)

我的问题源于这里,如果 Sigmoid 是执行二元分类或这些概率输出的激活层?因为我期待 1 和 0,所以我最终假设这些是我创建的概率输出:

My question stems from here if Sigmoid is an activation layer that performs binary classification or these probability outputs? Because I was expecting 1's and 0's I eventually assuming that these are probability outputs I created the following:

blank = []
for i in pd.DataFrame(predictions1)[0].to_list():
    if i > .50:
        blank.append(1)
    else:
        blank.append(0)

我的大部分困惑在于二元分类神经网络如何处理它们,以及如何得到 1 和 0.

Much of my confusion lies in binary classification how does a neural network handle them, and how does one get 1's and 0's.

推荐答案

当你将一些 input 用于预测传递给你的二元分类器(sigmoid 在其最后一层激活),它将为您提供矩阵,其中每一行代表这些输入的概率属于class 1.在你的情况下:

When you pass some input for prediction to your binary classifier (sigmoid activation in its last layer), it will give you matrices in which each row represents the probability of those inputs to be in class 1. In your case:

predictions1 = trained.model.predict(X_test, verbose=1)
predictions1[:5]
array([[0.09048176],
       [0.34411064],
       [0.08842686],
       [0.0986585 ],
       [0.58971184]],

这里,每个分数代表X_test[:5]中每个样本属于class 1的可能性.从这一点来看,为了得到类标签(例如10),API默认使用0.5阈值来考虑每个分数属于class 1class 0;更具体地说,分数大于 0.5 被认为是 class 1.但当然,我们可以调整阈值.这是一个虚拟示例

Here, each score represents the possibility of each sample in X_test[:5] to be in class 1. From this point, in order to get class labels (e.g. 1 and 0), by default API uses the 0.5 threshold to consider each score belong to class 1 and class 0; more specifically, score greater than 0.5 are considered to class 1. But of course, we can tweak the threshold. Here is one dummy example

import tensorflow as tf
import numpy as np  

img = tf.random.normal([20, 32], 0, 1, tf.float32)
tar = np.random.randint(2, size=(20, 1))

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim = 32, 
                       kernel_initializer ='normal', activation= 'relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', 
              optimizer='adam', metrics=['accuracy'])
model.fit(img, tar, epochs=5, verbose=2)

Epoch 1/5
1/1 - 0s - loss: 0.7058 - accuracy: 0.5500
Epoch 2/5
1/1 - 0s - loss: 0.6961 - accuracy: 0.5500
Epoch 3/5
1/1 - 0s - loss: 0.6869 - accuracy: 0.5500
Epoch 4/5
1/1 - 0s - loss: 0.6779 - accuracy: 0.6000
Epoch 5/5
1/1 - 0s - loss: 0.6692 - accuracy: 0.6000

概率

y_pred = model.predict(img)
print(y_pred.shape)
y_pred[:10]

(20, 1)
array([[0.5317636 ],
       [0.4592613 ],
       [0.5876541 ],
       [0.47071406],
       [0.56284094],
       [0.5025074 ],
       [0.46471453],
       [0.38649547],
       [0.43361676],
       [0.4667967 ]], dtype=float32)

类标签

(model.predict(img) > 0.5).astype("int32")
array([[1],
       [0],
       [1],
       [0],
       [1],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
....
....

这篇关于神经网络和二元分类指导的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆