How do I implement a cost function that requires all outputs at once?

Question

Suppose I have a cost function that requires all outputs of a neural network (i.e. over some range of training time steps) in order to calculate its cost.

An example of this is where the behaviour of the network on future training data affects the cost. E.g. the network might be trained to drive a simulated car around a track, and the cost is the time to finish or the time to crash.

What is the way to achieve this in TensorFlow?

Recommended answer

The standard approaches would be to use recurrent neural networks (for sequence data, where you can compute a loss function at some or all of the sequence steps) or reinforcement learning, where you only receive a reward at some indeterminate point in the future (e.g. at the end of the course you get a better reward for being faster).

Here's a good tutorial on implementing RNNs in TensorFlow:

https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/dynamic_rnn.ipynb

Here's an introduction to reinforcement learning that I found:

https://medium.com/@curiousily/getting-your-feet-rewarded-deep-reinforcement-learning-for-hackers-part-0-900ca5bb83e5

Both of these are types of models you might employ, depending on how you want to structure your problem. TensorFlow is a generic math library that provides automatic differentiation and GPU support, so you can build either kind of model on top of it.
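As a rough illustration of the reinforcement-learning framing, here is a minimal sketch of a cost that is only known once a whole episode has finished. It is written in plain Python rather than TensorFlow so that it runs without dependencies. The toy track, the single-parameter policy, and the names `run_episode` and `train` are all assumptions made up for this example, not part of any library. It uses the REINFORCE (score-function) gradient estimator with a running-average baseline, which is one common choice and not the only one.

```python
import math
import random


def run_episode(theta, rng, track_length=20):
    """Drive one episode; return (cost, summed score-function term).

    Hypothetical toy environment: at every step the policy either moves
    1 unit (action 0) or 2 units (action 1). The cost is the number of
    steps needed to finish, so it is only known once the episode ends.
    """
    p_fast = 1.0 / (1.0 + math.exp(-theta))  # Bernoulli policy with logit theta
    position, steps, score = 0, 0, 0.0
    while position < track_length:
        action = 1 if rng.random() < p_fast else 0
        position += 1 + action
        steps += 1
        # d/d(theta) of log pi(action) for a Bernoulli policy with logit theta
        score += action - p_fast
    return steps, score


def train(episodes=2000, learning_rate=0.01, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    baseline = None  # running average of past episode costs
    costs = []
    for _ in range(episodes):
        cost, score = run_episode(theta, rng)
        if baseline is None:
            baseline = cost
        # REINFORCE estimate of d E[cost] / d theta, using the whole-episode cost
        grad = (cost - baseline) * score
        theta -= learning_rate * grad  # gradient descent on the expected cost
        baseline = 0.9 * baseline + 0.1 * cost
        costs.append(cost)
    return theta, costs


theta, costs = train()
p_fast = 1.0 / (1.0 + math.exp(-theta))
early = sum(costs[:100]) / 100
late = sum(costs[-100:]) / 100
print(f"P(fast) = {p_fast:.3f}, mean cost: first 100 = {early:.2f}, last 100 = {late:.2f}")
```

The point of the sketch is that the per-step outputs are only accumulated during the episode, and the single cost computed at the end is what scales the update. In TensorFlow the same idea would typically be expressed by recording the log-probabilities of the sampled actions and letting automatic differentiation produce the gradient.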
