Keras: correctness of model and issues with custom metric


Question

I'm trying to create an autoencoder for processes. Each process is a sequence of events, and each event is represented as a number from 0 to 461 (importantly, events with close numbers are not similar; the numbers were assigned randomly). Each process has length 60 and the total number of processes is n, so my input data is an array of shape (n, 60).

First, I created an Embedding layer to convert event numbers to a one-hot representation:

from keras.layers import Input, Embedding
from keras.models import Model

BLOCK_LEN = 60
EVENTS_CNT = 462

input = Input(shape=(BLOCK_LEN,))
embedded = Embedding(input_dim=EVENTS_CNT+1, input_length=BLOCK_LEN, output_dim=200)(input)
emb_model = Model(input, embedded)
emb_model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 60)                0         
_________________________________________________________________
embedding_1 (Embedding)      (None, 60, 200)           92600     
=================================================================
Total params: 92,600
Trainable params: 92,600
Non-trainable params: 0
_________________________________________________________________
None
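
As a quick sanity check on the summary above, the Embedding layer's parameter count is simply the size of its lookup table, `input_dim * output_dim`. The arithmetic below is a minimal sketch; the variable names are only for illustration.

```python
# Embedding weights form a lookup table with one row per possible index.
EVENTS_CNT = 462
input_dim = EVENTS_CNT + 1   # number of rows, as passed to Embedding above
output_dim = 200             # length of each embedding vector

embedding_params = input_dim * output_dim
print(embedding_params)  # 92600, matching the "Param #" column
```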

Second, I created the main Seq2Seq model (using that library):

seq_model = Seq2Seq(batch_input_shape=(None, BLOCK_LEN, 200), hidden_dim=200, output_length=BLOCK_LEN, output_dim=EVENTS_CNT)

The resulting model:

model = Sequential()
model.add(emb_model)
model.add(seq_model)
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
model_1 (Model)              (None, 60, 200)           92600     
_________________________________________________________________
model_12 (Model)             (None, 60, 462)           1077124   
=================================================================
Total params: 1,169,724
Trainable params: 1,169,724
Non-trainable params: 0
_________________________________________________________________

I also have my own accuracy metric (because the library's accuracy is not appropriate for my data):

def symbol_acc(y_true, y_pred):
    isEqual = K.cast(K.equal(y_true, y_pred), K.floatx())
    return K.mean(isEqual)

And compile:

model.compile(loss=tf.losses.sparse_softmax_cross_entropy,optimizer='adam', target_tensors=[tf.placeholder(tf.int32, [None, 60])], metrics=[symbol_acc])

Why compile looks like that: at first, the model had one more layer, model.add(TimeDistributed(Dense(EVENTS_CNT, activation='softmax'))), and the compile call was model.compile(loss=custom_categorical_crossentropy, optimizer='rmsprop', metrics=[symbol_acc]). But that model produced the error "ValueError: Error when checking target: expected time_distributed_2 to have 3 dimensions, but got array with shape (2714, 60)". Now all the shapes fit.

But now I have a new problem (the key point of my story): the shapes inside the metric symbol_acc are different:

Shapes (symbol_acc): (?, 60) (?, ?, 462)

So the true array has shape (?, 60) and the predicted one has shape (?, ?, 462). Each of the 60 values in true is a number from 0 to 461 (the actual event number), and each of the 60 values in predicted is a vector of size 462 holding a probability distribution over the 462 events. I want to give true the same shape as predicted: for each of the 60 values, build a vector of size 462 with a 1 at the event number's position and 0s elsewhere.

So my questions:

  1. How can I change the shape of an array inside the metric if I have no data before fitting the model? The furthest I got is K.gather(K.eye(462), tf.cast(number, tf.int32)): that code creates a one-hot array with a 1 at position number. But I don't understand how to apply it to an array without knowing that array in advance.
  2. Maybe there is a simpler way to solve this problem?

I'm new to Keras and neural networks, so I'm not sure that all the steps are correct. If you see any mistake, please point it out.

Recommended answer

As I tested before, using target_tensors will not work unless its shape is the same as the shape of the model's predictions.

So, this general rule cannot be violated:

Your output data must have the same shape as your model's output.

This guarantees that y_true and y_pred have the same shape.

You need to adapt your output data to your model's output shape with to_categorical():

from keras.utils import to_categorical
one_hot_X = to_categorical(X_train,462)
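
To see what this conversion does to the shapes, here is a minimal NumPy sketch of the same idea (indexing an identity matrix, much like the `K.gather(K.eye(462), ...)` attempt in the question). The tiny `labels` array and the names `NUM_CLASSES` and `one_hot` are made up for illustration; this is not the Keras implementation.

```python
import numpy as np

NUM_CLASSES = 462

# Hypothetical stand-in for X_train: 2 processes, 3 events each (instead of 60).
labels = np.array([[0, 5, 461],
                   [7, 7, 2]])

# Row i of the identity matrix is the one-hot vector for class i, so indexing
# it with the whole label array appends a class axis to the label shape.
one_hot = np.eye(NUM_CLASSES, dtype="float32")[labels]

print(labels.shape)   # (2, 3)
print(one_hot.shape)  # (2, 3, 462)
```

Taking the argmax over the last axis recovers the original integer labels, which is what the accuracy metric further down relies on.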

With that, you simply train your model normally, without having to create workarounds in the loss and accuracy functions:

model.fit(X_train, one_hot_X,...)


If you run into memory problems by doing this, you may consider creating a generator that converts only one batch of the data at a time:

def batch_generator(batch_size):
    while True:  # Keras generators must be infinite
        # you may want to manually shuffle X_train here
        for i in range(len(X_train) // batch_size):  # make sure len is a multiple of batch_size
            x = X_train[i*batch_size:(i+1)*batch_size]
            y = to_categorical(x, 462)
            yield (x, y)

Train with:

model.fit_generator(batch_generator(size),....)
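
For experimenting with the generator idea outside of Keras, the following is a self-contained sketch under a few assumptions: the data is passed in as an argument rather than read from a global, the one-hot step uses `np.eye` in place of `to_categorical`, and `X_demo` is random stand-in data. The number of batches per pass, computed at the end, is the kind of value `fit_generator` expects as `steps_per_epoch`.

```python
import numpy as np

def batch_generator(data, batch_size, num_classes):
    """Yield (inputs, one-hot targets) forever, one batch at a time."""
    steps = len(data) // batch_size  # a trailing partial batch is dropped
    while True:
        for i in range(steps):
            x = data[i * batch_size:(i + 1) * batch_size]
            # One-hot only this batch, so the full (n, 60, 462) array is never built.
            y = np.eye(num_classes, dtype="float32")[x]
            yield x, y

# Hypothetical data: 10 processes of length 60 with event ids in [0, 462).
X_demo = np.random.randint(0, 462, size=(10, 60))
gen = batch_generator(X_demo, batch_size=4, num_classes=462)

x, y = next(gen)
print(x.shape, y.shape)  # (4, 60) (4, 60, 462)

steps_per_epoch = len(X_demo) // 4  # 2 full batches per pass over the data
```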


Fixing your accuracy for this case

Now that we know better what you're doing, your accuracy should use K.argmax to get exact results (rather than comparing all 462 options at each position, where there should be a single outcome: correct or not).

(My old answer was wrong, because I forgot that y_true is exact, but y_pred is approximate.)

def symbol_acc(y_true, y_pred):
    y_true = K.argmax(y_true) #this gets the class as an integer (comparable to X_train)
    y_pred = K.argmax(y_pred) #transforming (any,60,462) into (any,60)

    isEqual = K.cast(K.equal(y_true, y_pred), K.floatx())
    return K.mean(isEqual)
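
To check the logic of this metric by hand, here is a NumPy mirror of it run on toy data. The function name `symbol_acc_np` and the small arrays are illustrative assumptions (4 steps and 3 classes instead of 60 and 462), not part of the model above.

```python
import numpy as np

def symbol_acc_np(y_true, y_pred):
    """NumPy mirror of symbol_acc: compare class indices position by position."""
    true_idx = np.argmax(y_true, axis=-1)  # (batch, steps, classes) -> (batch, steps)
    pred_idx = np.argmax(y_pred, axis=-1)
    return float(np.mean(true_idx == pred_idx))

# Toy data: 1 sequence, 4 steps, 3 classes.
y_true = np.eye(3)[np.array([[0, 1, 2, 1]])]
y_pred = np.array([[[0.7, 0.2, 0.1],    # predicts 0, correct
                    [0.1, 0.8, 0.1],    # predicts 1, correct
                    [0.5, 0.3, 0.2],    # predicts 0, wrong (true is 2)
                    [0.2, 0.6, 0.2]]])  # predicts 1, correct

print(symbol_acc_np(y_true, y_pred))  # 0.75
```

Without the argmax step, comparing the one-hot targets to the raw probabilities element by element would almost never report equality, since predicted probabilities are rarely exactly 0 or 1.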

Just a small correction:

Embeddings don't create a "one-hot" representation; they create a multi-feature representation. (One-hot strictly means that exactly one element of the vector is one and the rest are zero, whereas an embedding is free to take any value in any element.)
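
The difference can be illustrated with a small NumPy sketch. The table `W` below is a hypothetical, randomly filled embedding matrix with tiny dimensions (5 events, 3 features) standing in for the 463 x 200 one in the question.

```python
import numpy as np

rng = np.random.default_rng(0)
num_events, emb_dim = 5, 3  # tiny stand-ins for 463 and 200

# Hypothetical embedding table: one dense, freely valued row per event id.
W = rng.normal(size=(num_events, emb_dim))

event = 2
one_hot = np.eye(num_events)[event]  # exactly one 1, zeros elsewhere
embedded = W[event]                  # an embedding is a row lookup in the table

print(one_hot)  # [0. 0. 1. 0. 0.]
# The lookup is equivalent to multiplying the one-hot vector by the table.
print(np.allclose(one_hot @ W, embedded))  # True
```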
