Is deep learning bad at fitting simple non-linear functions outside training scope (extrapolating)?


Question

I am trying to create a simple deep-learning based model to predict y = x**2. But it looks like deep learning is not able to learn the general function outside the scope of its training set.

Intuitively, I can think that a neural network might not be able to fit y = x**2, as there is no multiplication involved between the inputs.

Please note I am not asking how to create a model to fit x**2. I have already achieved that. I want to know the answers to the following questions:

  1. Is my analysis correct?
  2. If the answer to 1 is yes, then isn't the prediction scope of deep learning very limited?
  3. Is there a better algorithm for predicting functions like y = x**2 both inside and outside the scope of training data?

Path to complete notebook: https://github.com/krishansubudhi/MyPracticeProjects/blob/master/KerasBasic-nonlinear.ipynb

Training input:

import numpy as np

x = np.random.random((10000, 1)) * 1000 - 500  # uniform samples in [-500, 500)
y = x**2
x_train = x

Training code:

from keras import layers, optimizers, regularizers
from keras.callbacks import EarlyStopping
from keras.models import Sequential

def getSequentialModel():
    model = Sequential()
    model.add(layers.Dense(8, kernel_regularizer=regularizers.l2(0.001),
                           activation='relu', input_shape=(1,)))
    model.add(layers.Dense(1))
    print(model.summary())
    return model

def runmodel(model):
    model.compile(optimizer=optimizers.RMSprop(lr=0.01), loss='mse')
    early_stopping_monitor = EarlyStopping(patience=5)
    h = model.fit(x_train, y, validation_split=0.2,
                  epochs=300,
                  batch_size=32,
                  verbose=False,
                  callbacks=[early_stopping_monitor])
    return h


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_18 (Dense)             (None, 8)                 16        
_________________________________________________________________
dense_19 (Dense)             (None, 1)                 9         
=================================================================
Total params: 25
Trainable params: 25
Non-trainable params: 0
_________________________________________________________________

Random test set evaluation:

Deep learning in this example is not good at predicting a simple non-linear function, but it is good at predicting values in the sample space of the training data.
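The in-range vs. out-of-range gap can be reproduced without Keras at all. Below is a minimal numpy stand-in (not the model from the notebook): a tiny 1-16-1 ReLU network trained by plain gradient descent on y = x**2 over [-1, 1] (inputs rescaled for stability), then evaluated inside and outside that range. The architecture, learning rate, and iteration count are illustrative choices, not the question's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: y = x**2 on [-1, 1] only.
x = rng.uniform(-1.0, 1.0, size=(256, 1))
y = x ** 2

# Tiny 1-16-1 ReLU network with hand-written gradients.
W1 = rng.normal(0, 1.0, size=(1, 16))
b1 = np.zeros(16)
W2 = rng.normal(0, 0.3, size=(16, 1))
b2 = np.zeros(1)

def forward(x):
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    return h, h @ W2 + b2

lr = 0.05
for _ in range(3000):
    h, pred = forward(x)
    err = pred - y                      # gradient of 0.5 * MSE w.r.t. pred
    gW2 = h.T @ err / len(x)
    gb2 = err.mean(0)
    dh = (err @ W2.T) * (h > 0)         # backprop through the ReLU
    gW1 = x.T @ dh / len(x)
    gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

x_in = np.linspace(-1, 1, 101).reshape(-1, 1)    # interpolation range
x_out = np.linspace(3, 5, 101).reshape(-1, 1)    # extrapolation range
err_in = np.abs(forward(x_in)[1] - x_in ** 2).mean()
err_out = np.abs(forward(x_out)[1] - x_out ** 2).mean()
print(err_in, err_out)  # extrapolation error dwarfs interpolation error
```

The same qualitative picture emerges as in the notebook: the fit is close inside the training interval and falls apart well outside it.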

Answer

  1. Is my analysis correct?

Given my remarks in the comments that your network is certainly not deep, let's accept that your analysis is indeed correct (after all, your model does seem to do a good job inside its training scope), in order to get to your 2nd question, which is the interesting one.

  2. If the answer to 1 is yes, then isn't the prediction scope of deep learning very limited?

Well, this is the kind of question not exactly suitable for SO, since the exact meaning of "very limited" is arguably unclear...

So, let's try to rephrase it: should we expect DL models to predict such numerical functions outside the numeric domain on which they have been trained?

An example from a different domain may be enlightening here: suppose we have built a model able to detect & recognize animals in photos with very high accuracy (this is not hypothetical; such models do indeed exist); should we complain when the very same model cannot detect and recognize airplanes (or trees, refrigerators, etc. - you name it) in these same photos?

Put like that, the answer is a clear & obvious no - we should not complain, and in fact we are certainly not even surprised by such behavior in the first place.

It is tempting for us humans to think that such models should be able to extrapolate, especially in the numeric domain, since this is something we do very "easily" ourselves; but ML models, while exceptionally good at interpolating, fail miserably in extrapolation tasks, such as the one you present here.

Trying to make it more intuitive, consider that the whole "world" of such models is confined to the domain of their training sets: my example model above would be able to generalize and recognize animals in unseen photos as long as these animals are "between" (mind the quotes) the ones it has seen during training; in a similar manner, your model does a good job predicting the function value for arguments between the samples you used for training. But in neither case are these models expected to go beyond their training domain (i.e. extrapolate). There is no "world" for my example model beyond animals, and similarly for your model beyond [-500, 500]...
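For the specific architecture in the question there is also a mechanical way to see this (my addition, not part of the original answer): a network of Dense layers with ReLU activations computes a piecewise-linear function of its input, so far beyond any bounded training range every ReLU is frozen on or off and the network becomes exactly linear - it cannot bend to follow x**2 no matter how it was trained. A sketch with a random 1-8-1 ReLU net (same shape as the question's model, untrained weights for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# A random 1-8-1 ReLU network, the same shape as the model in the question.
W1, b1 = rng.normal(size=(1, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)

def net(x):
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

# Far outside any bounded training range each ReLU's on/off state no longer
# changes, so the network is exactly linear: second differences vanish.
xs = np.array([[1e6], [2e6], [3e6]])
y0, y1, y2 = net(xs).ravel()
second_diff = (y2 - y1) - (y1 - y0)
print(second_diff)  # effectively 0: linear out here, so it can never track x**2
# For comparison, x**2 over the same grid has second difference 2 * (1e6)**2.
```

Since the true function's curvature never vanishes, any ReLU network's extrapolation error on y = x**2 must grow without bound.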

For corroboration, consider the very recent paper Neural Arithmetic Logic Units, by DeepMind; quoting from the abstract:

Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training.

See also the relevant tweet:

Your third question:

  3. Is there a better algorithm for predicting functions like y = x**2 both inside and outside the scope of training data?

As should be clear by now, this is a (hot) area of current research; see the paper above for starters...
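That said, for a target you suspect is polynomial, a common practical workaround (a standard technique, not something proposed in the answer above) is feature engineering: feed the model x**2 as an input feature, so the target becomes linear in the features, and an ordinary least-squares fit then extrapolates exactly. A sketch using the question's training range:

```python
import numpy as np

rng = np.random.default_rng(0)

# Train only on x in [-500, 500], as in the question.
x = rng.uniform(-500, 500, size=(1000, 1))
y = (x ** 2).ravel()

# Feature engineering: hand the model x**2 (plus x and a bias) as features.
# The target is now *linear* in the features, so least squares recovers it.
features = np.hstack([x ** 2, x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(features, y, rcond=None)

# Far outside the training range:
x_new = 10_000.0
pred = coef @ np.array([x_new ** 2, x_new, 1.0])
print(pred)  # ~1e8, i.e. x_new**2, despite never training beyond |x| = 500
```

The catch, of course, is that this bakes in prior knowledge of the functional form; approaches like NALU aim to learn such arithmetic structure rather than have it supplied by hand.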

So, are DL models limited? Definitely - forget the scary tales about AGI for the foreseeable future. Are they very limited, as you put it? Well, I don't know... But, given their limitation in extrapolating, are they useful?

This is arguably the real question of interest, and the answer is obviously - hell, yeah!
