Keras: Categorical vs Continuous input to a LSTM


Question

I am new to Keras and deep learning, and after going through several tutorials and answers on stackoverflow, I am still unclear about how the input is manipulated once it enters the network.

I am using the functional API of Keras to develop complex models, so my first layer is always an Input layer. Something like:

Input()
LSTM()
Dense()
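In runnable form, a minimal version of that stack could look like the sketch below. The question only fixes the 6,000 time steps; the single feature per step, the 64 LSTM units, the single output, and the loss are illustrative assumptions.

# Minimal sketch of the Input -> LSTM -> Dense stack above.
# Assumptions: 6,000 time steps with 1 feature each; 64 units, one output,
# and mean squared error are placeholder choices, not from the question.
from keras.layers import Input, LSTM, Dense
from keras.models import Model

inputs = Input(shape=(6000, 1))   # (time steps, features per step)
x = LSTM(64)(inputs)              # sequence summarised into a 64-dim vector
outputs = Dense(1)(x)             # e.g. one target per training example

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='mse')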

Now let's say I have 2 training datasets, A and B. Each dataset is an identical 10,000 by 6,000 matrix with 200 distinct values in it, i.e. 10,000 rows each representing a training example, and 6,000 time steps per sequence. The values in both look like [[3,50,1,22,7,5,3,1,5,..], [55,32,54,21,15, ...], .... ] The only difference between A and B is that the values in A are real numbers (continuous variables), while the values in B are discrete (categorical variables).

I have the following 3 possible options which I can use to differentiate between categorical and continuous input, and wanted to ask which of these will work, and which are better than the others.

1- Given A is real valued and B is categorical, convert A with .astype(float) and B with .astype(float), feed both to the network, and the network will treat them accordingly.

2- Given B has categorical values, convert B to a one-hot encoding, i.e. changing 10,000 by 6,000 to 10,000 by 6,000 by 200. Keep A as it is.

3- If we are using B, then add an embedding layer after the input, making the network like:

Input()
Embedding()
LSTM()
Dense()

If we are using A, then don't add the embedding layer. (A sketch of options 2 and 3 follows below.)
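For concreteness, here is a small sketch of what options 2 and 3 could look like in Keras. The sample count, LSTM width, embedding size, and output layer are assumptions made only for illustration and are not values from the question.

import numpy as np
from keras.utils import to_categorical
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

num_classes = 200
B = np.random.randint(0, num_classes, size=(10, 6000))   # stand-in for the real B

# Option 2: one-hot encode B, so (samples, 6000) becomes (samples, 6000, 200)
B_onehot = to_categorical(B, num_classes=num_classes)
onehot_in = Input(shape=(6000, num_classes))
onehot_model = Model(onehot_in, Dense(1)(LSTM(64)(onehot_in)))

# Option 3: keep B as integer indices and let an Embedding layer learn dense vectors
embed_in = Input(shape=(6000,))
embedded = Embedding(input_dim=num_classes, output_dim=32)(embed_in)  # (6000, 32) per sample
embed_model = Model(embed_in, Dense(1)(LSTM(64)(embedded)))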

Answer

It seems the categorical input is confusing you. To embed or not to embed:

  1. We embed categorical input with an Embedding layer for two reasons: to reduce the dimensionality of the space and to capture any similarities between the inputs. So when you have billions of words in a language, it makes sense to embed them into a 300-dimensional vector to make them manageable. But one-hot always gives the most distinction, so in your case 200 is not a large number per se, and one-hot is the way to go.
  2. For the continuous input, we often normalise with a simple min-max normalisation, so the max becomes 1 and the min becomes 0. But there are many ways of doing it, depending on the nature of your dataset.
  3. For the actual model, you can have 2 inputs that process the continuous and the categorical data differently and maybe share layers upstream; otherwise, creating 2 different models might make sense (a sketch follows after this list).
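As a rough illustration of points 2 and 3, the sketch below min-max normalises the continuous data, one-hot encodes the categorical data, and feeds both into one model with two inputs. The random stand-in arrays, the LSTM widths, the concatenate merge, and the single output are assumptions for illustration, not part of the answer.

import numpy as np
from keras.utils import to_categorical
from keras.layers import Input, LSTM, Dense, concatenate
from keras.models import Model

num_classes = 200
A = np.random.rand(10, 6000)                              # stand-in for the real A (continuous)
B = np.random.randint(0, num_classes, size=(10, 6000))    # stand-in for the real B (categorical)

# Point 2: simple min-max normalisation, so the min maps to 0 and the max to 1
A_norm = (A - A.min()) / (A.max() - A.min())
A_norm = A_norm[..., np.newaxis]                          # (samples, 6000, 1) for the LSTM

# Point 3: two inputs processed separately, then shared layers
cont_in = Input(shape=(6000, 1))
cat_in = Input(shape=(6000, num_classes))
cont_feat = LSTM(64)(cont_in)
cat_feat = LSTM(64)(cat_in)
merged = concatenate([cont_feat, cat_feat])
outputs = Dense(1)(merged)

model = Model(inputs=[cont_in, cat_in], outputs=outputs)
model.compile(optimizer='adam', loss='mse')
# model.fit([A_norm, to_categorical(B, num_classes=num_classes)], y, ...)  # y: targets, not given here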

You can find more information online that covers input encoding.
