Neural Network and Binary Classification Guidance


Question

I have the following data (X) stored in a NumPy array:

array([[ 0.82737724, -0.5924806 ,  0.43279337, ...,  0.91896631,
        -0.28188124,  0.58595414],
       [-1.56610693,  0.63878901,  0.43279337, ...,  1.28262456,
         1.16154512, -1.9423032 ],
       [ 0.82737724, -0.2846632 , -0.4745452 , ...,  1.64628282,
        -0.28188124,  0.58595414],
       ...,
       [ 0.82737724,  0.        ,  0.43279337, ...,  1.67617254,
        -0.28188124,  0.58595414],
       [-1.56610693, -0.2846632 , -0.4745452 , ..., -1.64656796,
         0.27001707, -1.9423032 ],
       [ 0.82737724,  0.17706291, -0.4745452 , ...,  0.63501397,
        -0.28188124, -0.67817453]])

The actual array is much larger than shown, and it is fed into this neural network:

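# Imports are an assumption: the question omits them. The build_fn argument
# suggests the legacy scikit-learn wrapper shipped with older TF 2.x releases.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

def base_model1():
    input_dim = X.shape[1]  # number of features in X
    model = Sequential()
    model.add(Dense(10, input_dim=input_dim, kernel_initializer='normal', activation='tanh'))
    # Single sigmoid unit: outputs the estimated probability of class 1.
    # (The original passed input_dim=100 here, which does not match the 10-unit layer above.)
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['MeanSquaredError', 'AUC'])
    return model

NN_clf = KerasClassifier(build_fn=base_model1, epochs=100, verbose=1)
NN_clf._estimator_type = "classifier"
trained = NN_clf.fit(X, y.values.reshape(-1, 1))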

y is binary (ones and zeros): 1 means the person will ride a taxi, and 0 means they will not.

predictions1 = trained.model.predict(X_test, verbose=1)
predictions1[:5]
array([[0.09048176],
       [0.34411064],
       [0.08842686],
       [0.0986585 ],
       [0.58971184]], dtype=float32)

My question stems from here: does a sigmoid activation layer perform the binary classification itself, or are these probability outputs? I was expecting 1's and 0's, so I eventually assumed these are probabilities and wrote the following:

blank = []
for i in pd.DataFrame(predictions1)[0].to_list():
    if i > .50:
        blank.append(1)
    else:
        blank.append(0)

Most of my confusion is about binary classification: how does a neural network handle it, and how does one get 1's and 0's?

Recommended answer

When you pass input to your binary classifier for prediction (with a sigmoid activation in its last layer), it returns an array in which each row is the predicted probability that the corresponding input belongs to class 1. In your case:

predictions1 = trained.model.predict(X_test, verbose=1)
predictions1[:5]
array([[0.09048176],
       [0.34411064],
       [0.08842686],
       [0.0986585 ],
       [0.58971184]], dtype=float32)

Here, each score is the predicted probability that the corresponding sample in X_test[:5] belongs to class 1. The predict call itself does not return labels. To get class labels (1 and 0) you apply a threshold to these scores. The conventional default is 0.5, which is also the default threshold of Keras's binary accuracy metric: scores greater than 0.5 are assigned to class 1, and the rest to class 0. You can, of course, adjust the threshold. Here is a dummy example:

import tensorflow as tf
import numpy as np  

img = tf.random.normal([20, 32], 0, 1, tf.float32)
tar = np.random.randint(2, size=(20, 1))

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim = 32, 
                       kernel_initializer ='normal', activation= 'relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', 
              optimizer='adam', metrics=['accuracy'])
model.fit(img, tar, epochs=5, verbose=2)

Epoch 1/5
1/1 - 0s - loss: 0.7058 - accuracy: 0.5500
Epoch 2/5
1/1 - 0s - loss: 0.6961 - accuracy: 0.5500
Epoch 3/5
1/1 - 0s - loss: 0.6869 - accuracy: 0.5500
Epoch 4/5
1/1 - 0s - loss: 0.6779 - accuracy: 0.6000
Epoch 5/5
1/1 - 0s - loss: 0.6692 - accuracy: 0.6000

Probabilities

y_pred = model.predict(img)
print(y_pred.shape)
y_pred[:10]

(20, 1)
array([[0.5317636 ],
       [0.4592613 ],
       [0.5876541 ],
       [0.47071406],
       [0.56284094],
       [0.5025074 ],
       [0.46471453],
       [0.38649547],
       [0.43361676],
       [0.4667967 ]], dtype=float32)
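
The values above lie strictly between 0 and 1 because the last layer applies a sigmoid to its pre-activation value. As a minimal sketch of that mapping in plain Python (the `logits` values are made up for illustration and do not come from the model above):

```python
import math

def sigmoid(z: float) -> float:
    """Map a real-valued pre-activation (logit) to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical pre-activation values, chosen only for illustration
logits = [-2.0, 0.0, 2.0]
probs = [sigmoid(z) for z in logits]

# A logit of 0 maps to exactly 0.5, so thresholding the probability at 0.5
# is equivalent to checking whether the logit is positive.
labels = [int(p > 0.5) for p in probs]

print(probs)
print(labels)  # [0, 0, 1]
```

So the sigmoid only produces a score; the decision between class 0 and class 1 is a separate thresholding step.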

Class labels

(model.predict(img) > 0.5).astype("int32")
array([[1],
       [0],
       [1],
       [0],
       [1],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
....
....
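
To illustrate adjusting the threshold, here is a small sketch that applies two cutoffs to the five scores from the question. The helper name `to_labels` and the 0.3 cutoff are assumptions made for this example, not part of Keras or of the original code:

```python
import numpy as np

# Scores copied from the question's predictions1[:5] output
scores = np.array([[0.09048176],
                   [0.34411064],
                   [0.08842686],
                   [0.0986585],
                   [0.58971184]], dtype=np.float32)

def to_labels(probabilities, threshold=0.5):
    """Return 1 where the probability exceeds the threshold, else 0."""
    return (probabilities > threshold).astype("int32")

default_labels = to_labels(scores)                 # conventional 0.5 cutoff
lenient_labels = to_labels(scores, threshold=0.3)  # hypothetical lower cutoff

print(default_labels.ravel())  # [0 0 0 0 1]
print(lenient_labels.ravel())  # [0 1 0 0 1]
```

Lowering the threshold labels more samples as class 1, which typically trades precision for recall. This one-liner also replaces the explicit loop over a DataFrame shown in the question.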
