Difference between np.dot and np.multiply with np.sum in binary cross-entropy loss calculation


Problem description

I have tried the following code, but I couldn't find the difference between np.dot and np.multiply with np.sum.

Here is the code using np.dot:

# Y, A2: 2D arrays of shape (1, m); m: the number of examples
logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)
print(logprobs.shape)
print(logprobs)
cost = (-1/m) * logprobs
print(cost.shape)
print(type(cost))
print(cost)

Its output is:

(1, 1)
[[-2.07917628]]
(1, 1)
<class 'numpy.ndarray'>
[[ 0.693058761039 ]]

Here is the code using np.multiply with np.sum:

logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))
print(logprobs.shape)         
print(logprobs)
cost = - logprobs / m
print(cost.shape)
print(type(cost))
print(cost)

Its output is:

()
-2.07917628312
()
<class 'numpy.float64'>
0.693058761039

I'm unable to understand the type and shape difference, whereas the resulting value is the same in both cases.

Even after squeezing the cost from the former code, its value becomes the same as the latter, but the type stays the same:

cost = np.squeeze(cost)
print(type(cost))
print(cost)

The output is:

<class 'numpy.ndarray'>
0.6930587610394646

Answer

What you're doing is calculating the binary cross-entropy loss, which measures how bad the predictions of the model (here: A2) are compared to the true outputs (here: Y).
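
In symbols, both of your snippets implement the same formula over the m examples:

cost = -(1/m) * sum_i [ y_i * log(a_i) + (1 - y_i) * log(1 - a_i) ]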

Here is a reproducible example for your case, which should explain why you get a scalar in the second case when using np.sum:

In [88]: Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])

In [89]: A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])

In [90]: logprobs = np.dot(Y, (np.log(A2)).T) + np.dot((1.0-Y),(np.log(1 - A2)).T)

# `np.dot` returns 2D array since its arguments are 2D arrays
In [91]: logprobs
Out[91]: array([[-0.78914626]])

# `m` is the number of examples, here Y.shape[1] == 8
In [92]: cost = (-1/m) * logprobs

In [93]: cost
Out[93]: array([[ 0.09864328]])

In [94]: logprobs = np.sum(np.multiply(np.log(A2), Y) + np.multiply((1 - Y), np.log(1 - A2)))

# np.sum returns scalar since it sums everything in the 2D array
In [95]: logprobs
Out[95]: -0.78914625761870361

Note that np.dot sums only along the inner dimensions, which match here: (1x8) and (8x1). So the 8s disappear during the dot product (i.e. matrix multiplication), yielding a (1x1) result, which is just a scalar but is returned as a 2D array of shape (1, 1).
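
To make the shape bookkeeping concrete, here is a minimal sketch (reusing the Y and A2 values from above) contrasting the two routes:

import numpy as np

Y = np.array([[1, 0, 1, 1, 0, 1, 0, 0]])                        # shape (1, 8)
A2 = np.array([[0.8, 0.2, 0.95, 0.92, 0.01, 0.93, 0.1, 0.02]])  # shape (1, 8)

print(np.dot(Y, np.log(A2).T).shape)             # (1, 8) @ (8, 1) -> (1, 1): still 2D
print(np.multiply(Y, np.log(A2)).shape)          # elementwise: stays (1, 8)
print(np.sum(np.multiply(Y, np.log(A2))).shape)  # (): np.sum collapses to a 0-d scalar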

Also, and most importantly, note that here np.dot is exactly the same as np.matmul, since the inputs are 2D arrays (i.e. matrices):

In [107]: logprobs = np.matmul(Y, (np.log(A2)).T) + np.matmul((1.0-Y),(np.log(1 - A2)).T)

In [108]: logprobs
Out[108]: array([[-0.78914626]])

In [109]: logprobs.shape
Out[109]: (1, 1)
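
As a side note, in Python 3.5+ the @ operator dispatches to np.matmul, so the same (1, 1) result can be written more compactly:

logprobs = Y @ np.log(A2).T + (1.0 - Y) @ np.log(1 - A2).T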


Return the result as a scalar value

np.dot or np.matmul returns whatever shape the resulting array has, based on the input arrays. Even with the out= argument, it's not possible to get back a scalar when the inputs are 2D arrays. However, we can call np.asscalar() on the result to convert it to a scalar, if the result array has shape (1, 1) (or, more generally, a scalar value wrapped in an nD array):

In [123]: np.asscalar(logprobs)
Out[123]: -0.7891462576187036

In [124]: type(np.asscalar(logprobs))
Out[124]: float


This works for a size-1 ndarray of any dimensionality:

In [127]: np.asscalar(np.array([[[23.2]]]))
Out[127]: 23.2

In [128]: np.asscalar(np.array([[[[23.2]]]]))
Out[128]: 23.2
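
Note that np.asscalar() was deprecated in NumPy 1.16 and later removed; ndarray.item() does the same job and is the recommended replacement. A minimal sketch:

import numpy as np

logprobs = np.array([[-0.78914626]])
print(logprobs.item())        # -0.78914626, a plain Python float
print(type(logprobs.item()))  # <class 'float'>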

