Is deep learning bad at fitting simple non linear functions outside training scope (extrapolating)?


Question

I am trying to create a simple deep-learning based model to predict y=x**2, but it looks like deep learning is not able to learn the general function outside the scope of its training set.

Intuitively, I can see that a neural network might not be able to fit y=x**2, as there is no multiplication involved between the inputs.
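
(An illustrative aside, not part of the original question.) Even without any multiplication between inputs, a single hidden ReLU layer can fit x**2 arbitrarily well on a bounded interval, since any piecewise-linear function is a sum of ReLU hinges. The hand-built sketch below interpolates x**2 on [0, 1] with 10 hinges, and also shows the linear growth that dooms it outside that interval:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Piecewise-linear interpolant of x**2 on [0, 1] at knots k/n, written
# purely as a sum of ReLU "hinges" -- exactly the function class a
# one-hidden-layer ReLU network computes, with no multiplication of inputs.
n = 10
knots = np.arange(n) / n
slope_changes = np.full(n, 2.0 / n)   # slope grows by 2/n at each knot...
slope_changes[0] = 1.0 / n            # ...except the first segment (slope 1/n)

def approx_sq(x):
    return sum(c * relu(x - k) for c, k in zip(slope_changes, knots))

xs = np.linspace(0.0, 1.0, 1001)
err_inside = np.max(np.abs(approx_sq(xs) - xs**2))
print(err_inside)                         # ~0.0025, i.e. (1/n)**2 / 4
print(approx_sq(np.array([2.0]))[0])      # ~2.9, far from 2**2 = 4
```

The maximum in-range error shrinks like 1/n**2 as hinges are added, but outside [0, 1] the hinge sum can only grow linearly, no matter how large n is.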

Please note that I am not asking how to create a model that fits x**2; I have already achieved that. I want to know the answers to the following questions:

  1. Is my analysis correct?
  2. If the answer to 1 is yes, then isn't the prediction scope of deep learning very limited?
  3. Is there a better algorithm for predicting functions like y = x**2 both inside and outside the scope of training data?

Path to complete notebook: https://github.com/krishansubudhi/MyPracticeProjects/blob/master/KerasBasic-nonlinear.ipynb

training input:

x = np.random.random((10000,1))*1000-500
y = x**2
x_train= x

training code

from keras.models import Sequential
from keras import layers, optimizers, regularizers
from keras.callbacks import EarlyStopping

def getSequentialModel():
    model = Sequential()
    # single hidden layer: 8 ReLU units, L2-regularized
    model.add(layers.Dense(8, kernel_regularizer=regularizers.l2(0.001),
                           activation='relu', input_shape=(1,)))
    model.add(layers.Dense(1))
    print(model.summary())
    return model

def runmodel(model):
    model.compile(optimizer=optimizers.RMSprop(lr=0.01), loss='mse')
    early_stopping_monitor = EarlyStopping(patience=5)
    h = model.fit(x_train, y, validation_split=0.2,
                  epochs=300,
                  batch_size=32,
                  verbose=False,
                  callbacks=[early_stopping_monitor])
    return h


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_18 (Dense)             (None, 8)                 16        
_________________________________________________________________
dense_19 (Dense)             (None, 1)                 9         
=================================================================
Total params: 25
Trainable params: 25
Non-trainable params: 0
_________________________________________________________________

Evaluation on random test set

Deep learning in this example is not good at predicting a simple non-linear function, but it is good at predicting values within the sample space of the training data.

Solution

  1. Is my analysis correct?

Given my remarks in the comments that your network is certainly not deep, let's accept that your analysis is indeed correct (after all, your model does seem to do a good job inside its training scope), in order to get to your 2nd question, which is the interesting one.

  2. If the answer to 1 is yes, then isn't the prediction scope of deep learning very limited?

Well, this is the kind of question not exactly suitable for SO, since the exact meaning of "very limited" is arguably unclear...

So, let's try to rephrase it: should we expect DL models to predict such numerical functions outside the numeric domain on which they have been trained?

An example from a different domain may be enlightening here: suppose we have built a model able to detect & recognize animals in photos with very high accuracy (it is not hypothetical; such models do exist indeed); should we complain when the very same model cannot detect and recognize airplanes (or trees, refrigerators etc - you name it) in these same photos?

Put like that, the answer is a clear & obvious no - we should not complain, and in fact we are certainly not even surprised by such a behavior in the first place.

It is tempting for us humans to think that such models should be able to extrapolate, especially in the numeric domain, since this is something we do very "easily" ourselves; but ML models, while exceptionally good at interpolating, fail miserably in extrapolation tasks, such as the one you present here.

Trying to make it more intuitive, think of the whole "world" of such models as confined to the domain of their training sets: my example model above would be able to generalize and recognize animals in unseen photos, as long as these animals are "between" (mind the quotes) the ones it has seen during training; in a similar manner, your model does a good job predicting the function value for arguments between the samples you used for training. But in neither case are these models expected to go beyond their training domain (i.e. to extrapolate). There is no "world" for my example model beyond animals, and similarly none for your model beyond [-500, 500]...
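
This can be made concrete for the 1-8-1 ReLU architecture used in the question: beyond its outermost ReLU kink, such a network is exactly linear, so it cannot track the curvature of x**2. A minimal numpy sketch with hand-picked, untrained weights (illustrative only):

```python
import numpy as np

# A tiny 1-8-1 ReLU network with hand-picked (not trained) weights --
# same shape as the model in the question; purely illustrative.
W1 = np.array([[1.0, -1.0, 2.0, -2.0, 0.5, -0.5, 1.5, -1.5]])
b1 = np.linspace(-4.0, 4.0, 8)
W2 = np.ones((8, 1))
b2 = np.zeros(1)

def net(x):
    h = np.maximum(0.0, x[:, None] @ W1 + b1)  # ReLU hidden layer
    return (h @ W2 + b2).ravel()

# Every ReLU "kink" of this net lies inside [-8, 8]; beyond the last kink
# the net is exactly linear, so its second finite difference vanishes,
# while x**2 keeps a constant curvature of 2.
x = np.array([1000.0, 1001.0, 1002.0])
f = net(x)
print(abs(f[0] - 2 * f[1] + f[2]))      # ~0.0 -- the net is linear out here
print(x[0]**2 - 2 * x[1]**2 + x[2]**2)  # 2.0 -- x**2 is not
```

The same argument applies to any trained weights: whatever the fit looks like inside [-500, 500], far enough outside it the prediction is a straight line.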

For corroboration, consider the very recent paper Neural Arithmetic Logic Units, by DeepMind; quoting from the abstract:

Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training.

See also a relevant tweet by a prominent practitioner.

On to your third question:

  3. Is there a better algorithm for predicting functions like y = x**2 both inside and outside the scope of training data?

As it should be clear by now, this is a (hot) area of current research; see the above paper for starters...
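
That said, the general trick such learned arithmetic units exploit can be shown with nothing but least squares: pick a representation in which the target is linear, and extrapolation comes for free. A hypothetical sketch (not from the answer, and not NALU itself), fitting log y against log x:

```python
import numpy as np

# In log space, y = x**2 is the straight line log y = 2*log x + 0,
# so ordinary least squares recovers the exponent exactly.
rng = np.random.default_rng(42)
x_train = rng.uniform(1.0, 500.0, size=1000)   # training range only
y_train = x_train ** 2

A = np.column_stack([np.log(x_train), np.ones_like(x_train)])
coef, intercept = np.linalg.lstsq(A, np.log(y_train), rcond=None)[0]

def predict(x):
    return np.exp(coef * np.log(x) + intercept)

print(round(coef, 6))                  # 2.0 -- the exponent is recovered
print(predict(np.array([2000.0]))[0])  # ~4e6, well outside [1, 500]
```

Because log y = 2 log x holds exactly, this model extrapolates to any x > 0; the hard research problem, of course, is learning such representations rather than hand-picking them.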


So, are DL models limited? Definitely; forget the scary tales about AGI for the foreseeable future. Are they very limited, as you put it? Well, I don't know... But, given their limitations in extrapolating, are they useful?

This is arguably the real question of interest, and the answer is obviously - hell, yeah!
