Neural Network and Binary Classification Guidance
Question
I have the following data (X) stored in a NumPy array:
array([[ 0.82737724, -0.5924806 , 0.43279337, ..., 0.91896631,
-0.28188124, 0.58595414],
[-1.56610693, 0.63878901, 0.43279337, ..., 1.28262456,
1.16154512, -1.9423032 ],
[ 0.82737724, -0.2846632 , -0.4745452 , ..., 1.64628282,
-0.28188124, 0.58595414],
...,
[ 0.82737724, 0. , 0.43279337, ..., 1.67617254,
-0.28188124, 0.58595414],
[-1.56610693, -0.2846632 , -0.4745452 , ..., -1.64656796,
0.27001707, -1.9423032 ],
[ 0.82737724, 0.17706291, -0.4745452 , ..., 0.63501397,
-0.28188124, -0.67817453]])
The actual array is much larger, and it is fed into this neural network:
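# Imports are assumed; the original post does not show them.
# The scikit_learn wrapper below ships only with older TensorFlow 2.x releases.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

def base_model1():
    input_dim = X.shape[1]
    output_dim = 1
    model = Sequential()
    model.add(Dense(10, input_dim=input_dim, kernel_initializer='normal', activation='tanh'))
    # A single sigmoid unit; its input size is inferred from the previous layer.
    model.add(Dense(output_dim, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['MeanSquaredError', 'AUC'])
    return model

NN_clf = KerasClassifier(build_fn=base_model1, epochs=100, verbose=1)
NN_clf._estimator_type = "classifier"
trained = NN_clf.fit(X, y.values.reshape(-1, 1))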
y is binary (ones and zeros), where 1 means the person will ride a taxi and 0 means they will not.
predictions1 = trained.model.predict(X_test, verbose=1)
predictions1[:5]
array([[0.09048176],
[0.34411064],
[0.08842686],
[0.0986585 ],
[0.58971184]], dtype=float32)
This is where my question comes from: is sigmoid an activation that performs binary classification directly, or are these outputs probabilities? I was expecting 1's and 0's. I eventually assumed that these are probability outputs, so I wrote the following:
blank = []
for i in pd.DataFrame(predictions1)[0].to_list():
    if i > .50:
        blank.append(1)
    else:
        blank.append(0)
Most of my confusion is about binary classification: how does a neural network handle it, and how does one get 1's and 0's out of it?
Recommended answer
When you pass some input for prediction to your binary classifier (one with a sigmoid activation in its last layer), it returns a matrix in which each row is the predicted probability that the corresponding input belongs to class 1. In your case:
predictions1 = trained.model.predict(X_test, verbose=1)
predictions1[:5]
array([[0.09048176],
[0.34411064],
[0.08842686],
[0.0986585 ],
[0.58971184]], dtype=float32)
Here, each score is the predicted probability that the corresponding sample in X_test[:5] belongs to class 1. To turn these scores into class labels (1 and 0), you apply a threshold. The usual default is 0.5: a score greater than 0.5 is assigned to class 1, and any other score to class 0. Of course, the threshold can be tweaked. Here is a dummy example:
import tensorflow as tf
import numpy as np

# Random inputs and random binary targets, for illustration only
img = tf.random.normal([20, 32], 0, 1, tf.float32)
tar = np.random.randint(2, size=(20, 1))

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_dim=32,
                                kernel_initializer='normal', activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.fit(img, tar, epochs=5, verbose=2)
Epoch 1/5
1/1 - 0s - loss: 0.7058 - accuracy: 0.5500
Epoch 2/5
1/1 - 0s - loss: 0.6961 - accuracy: 0.5500
Epoch 3/5
1/1 - 0s - loss: 0.6869 - accuracy: 0.5500
Epoch 4/5
1/1 - 0s - loss: 0.6779 - accuracy: 0.6000
Epoch 5/5
1/1 - 0s - loss: 0.6692 - accuracy: 0.6000
Probabilities
y_pred = model.predict(img)
print(y_pred.shape)
y_pred[:10]
(20, 1)
array([[0.5317636 ],
[0.4592613 ],
[0.5876541 ],
[0.47071406],
[0.56284094],
[0.5025074 ],
[0.46471453],
[0.38649547],
[0.43361676],
[0.4667967 ]], dtype=float32)
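To see why these values can be read as probabilities, it may help to look at the sigmoid function on its own. The following is a minimal NumPy sketch, not the Keras implementation; the `sigmoid` helper and the `logits` values are made up for illustration. It shows that sigmoid squashes any real number into (0, 1), and that a probability above 0.5 corresponds to a positive pre-activation value.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued pre-activation (logit) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical pre-activation values of the final Dense(1) unit.
logits = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
probs = sigmoid(logits)

# Thresholding the probability at 0.5 is the same as thresholding the logit at 0.
labels_from_probs = (probs > 0.5).astype("int32")
labels_from_logits = (logits > 0.0).astype("int32")

print(probs)
print(labels_from_probs)  # [0 0 0 1 1]
```

So the sigmoid layer itself never emits hard 0/1 values; the labels only appear once a threshold is applied to its output.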
Class labels
(model.predict(img) > 0.5).astype("int32")
array([[1],
[0],
[1],
[0],
[1],
[1],
[0],
[0],
[0],
[0],
[0],
....
....
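The same thresholding can be applied to the five scores from the question, without the explicit loop. This is a small sketch under stated assumptions: `to_labels` is a hypothetical helper, not a Keras function, and the 0.3 threshold is arbitrary, chosen only to show how the labels change when the cut-off moves.

```python
import numpy as np

# The five predicted probabilities shown in the question.
scores = np.array([[0.09048176],
                   [0.34411064],
                   [0.08842686],
                   [0.0986585],
                   [0.58971184]], dtype=np.float32)

def to_labels(probabilities, threshold=0.5):
    """Return 1 where the probability exceeds the threshold, else 0."""
    return (probabilities > threshold).astype("int32")

default_labels = to_labels(scores)                 # conventional 0.5 cut-off
lenient_labels = to_labels(scores, threshold=0.3)  # arbitrary lower cut-off

print(default_labels.ravel())  # [0 0 0 0 1]
print(lenient_labels.ravel())  # [0 1 0 0 1]
```

Lowering the threshold labels more samples as class 1, which may be preferable when missing a positive case costs more than a false alarm.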