How to fine-tune a Keras model with existing plus newer classes?


Question

Good day!

I have a celebrity dataset on which I want to fine-tune a Keras built-in model. So far, from what I have explored and done, we remove the top layers of the original model (or, preferably, pass include_top=False) and add our own layers, then train the newly added layers while keeping the previous layers frozen. This whole thing is pretty intuitive.
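For reference, here is a minimal sketch of that setup in tf.keras; the base model, input size, and head layers are illustrative assumptions, not taken from the question:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

NUM_NEW_CLASSES = 2  # e.g. the two celebrities

# ImageNet-pretrained base without its original 1000-way classifier head.
base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the previously trained layers frozen

# New head that is trained on the celebrity data only.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(NUM_NEW_CLASSES, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(celebrity_train_ds, validation_data=celebrity_val_ds, epochs=5)
```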

Now what I require is that my model learns to identify the celebrity faces while also being able to detect all the other objects it was trained on before. Originally, the models trained on ImageNet come with an output layer of 1000 neurons, each representing a separate class. I'm confused about how the model should be able to detect the new classes. All the transfer-learning and fine-tuning articles and blogs tell us to replace the original 1000-neuron output layer with a different N-neuron layer (N = number of new classes). In my case, I have two celebrities, so if I have a new layer with 2 neurons, I don't know how the model is going to classify the original 1000 ImageNet objects.

I need a pointer on this whole thing: how exactly can I have a pre-trained model learn two new celebrity faces while also maintaining its ability to recognize all 1000 ImageNet objects?

Thanks!

Answer

CNNs are prone to forgetting previously learned knowledge when retrained for a new task on a novel domain. This phenomenon is often called catastrophic forgetting and is an active and challenging research area.

Coming to the point, one obvious way to enable a model to classify the new classes along with the old ones is to train from scratch on the accumulated (old + new) dataset, which is time consuming.
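As a sketch of that option, assuming the combined dataset is labelled with the 1000 ImageNet classes plus the 2 celebrity classes (the architecture and hyper-parameters below are illustrative, not from the answer):

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import ResNet50

TOTAL_CLASSES = 1000 + 2  # old ImageNet classes + new celebrity classes

# weights=None means training truly from scratch on the accumulated data;
# weights="imagenet" would merely warm-start the convolutional base instead.
base = ResNet50(weights=None, include_top=False, input_shape=(224, 224, 3))

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(TOTAL_CLASSES, activation="softmax"),  # one unit per old + new class
])

model.compile(optimizer=optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(combined_old_plus_new_ds, epochs=...)
```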

In contrast, several alternative approaches have been proposed in the (class-incremental) continual-learning literature in recent years to tackle this scenario:

  1. Firstly, you can use a small subset of the old dataset along with the new dataset to train your new model; this is referred to as a rehearsal-based approach. Note that instead of storing a subset of raw samples, you can train a GAN to generate pseudo-samples of the old classes. While training, a distillation loss is used to make the new model mimic the predictions of the old model (whose weights are frozen), which helps avoid forgetting the old knowledge (a minimal sketch of such a distillation loss follows this list).
  2. Secondly, since the contributions of the neurons in a model are not equal, while training the new model you may instead update only the neurons that are less important for the old classes, so that the old knowledge is retained. You can check out the Elastic Weight Consolidation (EWC) paper for more details (a rough sketch of the EWC penalty also follows this list).
  3. Thirdly, you can grow your model dynamically to extract features that are specific to the new classes without harming the weights that are important for the old classes. You can check out Dynamically Expandable Networks (DEN) for more details.
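For the rehearsal/distillation idea in item 1, this is a minimal sketch of what such a distillation loss could look like in tf.keras; the temperature T, the weight ALPHA, and the logit names are assumptions for illustration, not from the original answer:

```python
import tensorflow as tf

T = 2.0      # temperature used to soften both models' predictions
ALPHA = 0.5  # balance between the new-task loss and the distillation term

def distillation_loss(old_logits, new_logits_old_classes):
    """Make the new model mimic the frozen old model on the old 1000 classes."""
    old_soft = tf.nn.softmax(old_logits / T)
    new_soft = tf.nn.softmax(new_logits_old_classes / T)
    return tf.reduce_mean(
        tf.keras.losses.categorical_crossentropy(old_soft, new_soft))

def total_loss(y_true_new, new_logits_new_classes, old_logits, new_logits_old_classes):
    # Ordinary cross-entropy on the new (celebrity) labels ...
    ce = tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(
            y_true_new, tf.nn.softmax(new_logits_new_classes)))
    # ... plus the distillation term that preserves the old knowledge.
    return ALPHA * ce + (1.0 - ALPHA) * distillation_loss(
        old_logits, new_logits_old_classes)
```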
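And for item 2, a rough sketch of the EWC penalty; old_params and fisher (per-parameter importance estimates computed on the old task) as well as LAMBDA are hypothetical names and values used here only for illustration:

```python
import tensorflow as tf

LAMBDA = 1000.0  # how strongly weights important for the old classes are protected

def ewc_penalty(model, old_params, fisher):
    """Quadratic penalty pulling important weights back toward their old values."""
    penalty = 0.0
    for w, w_old, f in zip(model.trainable_variables, old_params, fisher):
        penalty += tf.reduce_sum(f * tf.square(w - w_old))
    return (LAMBDA / 2.0) * penalty

# During training, the objective would then be roughly:
#   loss = new_task_loss + ewc_penalty(model, old_params, fisher)
```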

