What does the implementation of keras.losses.sparse_categorical_crossentropy look like?

Question

I found that tf.keras.losses.sparse_categorical_crossentropy is an amazing function that helps me create a loss for a neural network with a large number of output classes. Without it, training the model was impossible: I found that tf.keras.losses.categorical_crossentropy gave an out-of-memory error because it converts each label index into a very large one-hot vector.

However, I have trouble understanding how sparse_categorical_crossentropy avoids the big memory issue. I took a look at the TF source code, but it is really not easy to see what goes on under the hood.

So, could anyone give a high-level idea of how this is implemented? What does the implementation look like? Thank you!
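
For context, here is a small illustration (my own example, not from the question) of the difference between the two losses: categorical_crossentropy expects one-hot targets of shape (batch, num_classes), while sparse_categorical_crossentropy takes plain integer class indices of shape (batch,).

```python
import tensorflow as tf

y_pred = tf.constant([[0.1, 0.8, 0.1],
                      [0.3, 0.3, 0.4]])

# categorical_crossentropy needs one-hot targets: shape (batch, num_classes)
y_true_one_hot = tf.constant([[0.0, 1.0, 0.0],
                              [0.0, 0.0, 1.0]])
cce = tf.keras.losses.categorical_crossentropy(y_true_one_hot, y_pred)

# sparse_categorical_crossentropy takes integer indices: shape (batch,)
y_true_sparse = tf.constant([1, 2])
scce = tf.keras.losses.sparse_categorical_crossentropy(y_true_sparse, y_pred)

print(cce.numpy())   # per-sample losses, e.g. [0.223, 0.916]
print(scce.numpy())  # same values
```

With, say, 10^6 samples and 10^5 classes, pre-computing float32 one-hot targets for the whole dataset would take on the order of 400 GB, which is where the out-of-memory error comes from.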

Answer

It does not do anything special: it just produces the one-hot encoded labels inside the loss, for one batch of data at a time (not for all of the data at once), when they are needed, and then discards the result. So it's just a classic trade-off between memory and computation.
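
To make that concrete, here is a minimal sketch (my own illustration of the idea described above, not the actual TF source; sparse_cce_sketch is a hypothetical helper). The one-hot tensor only ever exists for the current batch, with shape (batch_size, num_classes), and is discarded as soon as the batch's loss has been computed:

```python
import tensorflow as tf

def sparse_cce_sketch(y_true, y_pred):
    """Sketch of sparse categorical crossentropy (hypothetical helper).

    y_true: integer class indices, shape (batch,)
    y_pred: predicted probabilities, shape (batch, num_classes)
    """
    num_classes = tf.shape(y_pred)[-1]
    # One-hot encode only this batch's labels: a small temporary tensor
    # of shape (batch, num_classes), not one for the whole dataset.
    y_true_one_hot = tf.one_hot(tf.cast(y_true, tf.int32), depth=num_classes)
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)  # avoid log(0)
    return -tf.reduce_sum(y_true_one_hot * tf.math.log(y_pred), axis=-1)
```

The same result can also be obtained without any one-hot tensor at all, for example by gathering the predicted probability of the true class in each row with tf.gather_nd, or by using tf.nn.sparse_softmax_cross_entropy_with_logits on logits; either way, a one-hot encoding of the full dataset is never materialized.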
