How to iterate through tensors in custom loss function?


Problem description

I'm using Keras with the TensorFlow backend. My goal is to query the batch size of the current batch in a custom loss function. This is needed to compute values of the custom loss function that depend on the index of particular observations. I'd like to make this clearer with the minimal reproducible examples below.

(BTW: Of course I could use the batch size defined for the training procedure and plug in its value when defining the custom loss function, but there are reasons why it can vary; in particular, if epochsize % batchsize (epoch size modulo batch size) is not zero, the last batch of an epoch has a different size. I didn't find a suitable approach on Stack Overflow, e.g. Tensor indexing in custom loss function, Tensorflow custom loss function in Keras - loop over tensor, and Looping over a tensor, because obviously the shape of any tensor can't be inferred when building the graph, which is when a loss function is built; shape inference is only possible when evaluating given the data, which in turn is only possible given the graph. Hence I need to tell the custom loss function to do something with particular elements along a certain dimension without knowing the length of that dimension.)
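To illustrate the distinction this hinges on, here is a minimal sketch (my own illustration, not from the original post): the static shape of the batch axis is unknown while the graph is built, but the backend can hand it back as a tensor that is resolved per batch at runtime.

from keras import backend as K

def show_batch_size(yPred):
    static_n = yPred.shape[0]      # unknown (None) while the graph is built
    dynamic_n = K.shape(yPred)[0]  # an int32 scalar tensor, resolved per batch
    return static_n, dynamic_n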

from keras.models import Sequential
from keras.layers import Dense, Activation

# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))

Example 1: no issue, nothing special, no custom loss

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])    

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)

(output omitted, runs perfectly)

Example 2: nothing special, a simple custom loss, still no issue

def custom_loss(yTrue, yPred):
    loss = np.abs(yTrue-yPred)  # element-wise absolute error; no indexing needed, so this runs fine
    return loss

model.compile(optimizer='rmsprop',
              loss=custom_loss,
              metrics=['accuracy'])

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)

(output omitted, runs perfectly)

Example 3: a custom loss that indexes into the batch dimension (fails)

def custom_loss(yTrue, yPred):
    print(yPred) # Output: Tensor("dense_2/Sigmoid:0", shape=(?, 1), dtype=float32)
    n = yPred.shape[0]
    for i in range(n): # TypeError: __index__ returned non-int (type NoneType)
        loss = np.abs(yTrue[i]-yPred[int(i/2)])
    return loss

model.compile(optimizer='rmsprop',
              loss=custom_loss,
              metrics=['accuracy'])

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)

Of course the tensor has no shape info yet; it can't be inferred when building the graph, only at training time. Hence for i in range(n) raises an error. Is there any way to perform this?

Traceback of the output: (omitted here; the error is the TypeError shown in the code comment above)

BTW, here's my true custom loss function, in case there are any questions. I skipped it above for clarity and simplicity.

def neg_log_likelihood(yTrue,yPred):
    yStatus = yTrue[:,0]  # event indicator (1 = event, 0 = censored)
    yTime = yTrue[:,1]    # observed time
    n = yTrue.shape[0]    # None at graph-build time -- the same problem as above
    for i in range(n):
        s1 = K.greater_equal(yTime, yTime[i])  # risk set: all j with yTime[j] >= yTime[i]
        s2 = K.exp(yPred[s1])
        s3 = K.sum(s2)
        logsum = K.log(s3)  # fixed a typo: the original read K.log(y3)
        loss = K.sum(yStatus[i] * yPred[i] - logsum)
    return loss

Here's an image of the partial negative log-likelihood of the Cox proportional hazards model. (The image itself did not survive here.)
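For reference, written out to match the loop above (with \theta_i the prediction yPred[i], \delta_i the status yStatus[i], and t_i the time yTime[i]), the formula the image presumably showed is

-\ell(\theta) = -\sum_{i:\,\delta_i = 1} \Big( \theta_i - \log \sum_{j:\, t_j \ge t_i} e^{\theta_j} \Big)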

This is to clarify a question in the comments to avoid confusion. I don't think it is necessary to understand this in detail to answer the question.

Answer

As usual, don't loop. Loops have severe performance drawbacks and invite bugs. Use only backend functions unless it's totally unavoidable (it usually isn't).

So, there is a very weird thing there...

Do you really want to simply ignore half of your model's predictions? (Example 3)

Assuming this is true, just duplicate your tensor along the last dimension, flatten it, and discard half of it. You get exactly the effect you want.

from keras import backend as K

def custom_loss(true, pred):
    n = K.shape(pred)[0:1]                  # dynamic batch size, as a 1-D tensor

    pred = K.concatenate([pred]*2, axis=-1) #duplicate in the last axis
    pred = K.flatten(pred)                  #flatten
    pred = K.slice(pred,                    #take only half (= n samples)
                   K.constant([0], dtype="int32"),
                   n)

    true = K.flatten(true)                  # added: align shapes so the subtraction
                                            # doesn't broadcast to (n, n)
    return K.abs(true - pred)
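As a quick sanity check of the duplicate-flatten-slice trick (my own illustration, not part of the original answer), the same steps in plain NumPy:

import numpy as np

pred = np.array([[0.1], [0.2], [0.3], [0.4]])  # hypothetical batch, shape (4, 1)
n = pred.shape[0]
dup = np.concatenate([pred] * 2, axis=-1)      # shape (4, 2)
half = dup.flatten()[:n]                       # [0.1, 0.1, 0.2, 0.2]
# element i now compares true[i] against pred[i // 2],
# exactly the yPred[int(i/2)] pattern from Example 3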

Solution for your loss function:

If you have the times sorted from greater to lower, just do a cumulative sum.
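A tiny numeric illustration of why the cumulative sum works (my own, assuming the times are already sorted in descending order):

import numpy as np

times_desc = np.array([9.0, 7.0, 3.0])  # sorted from greater to lower
preds = np.array([0.5, -0.2, 1.0])      # aligned with times_desc
risk_sums = np.cumsum(np.exp(preds))    # risk_sums[i] = sum of exp(preds[j])
                                        # over all j with times_desc[j] >= times_desc[i]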

Warning: If you have one time per sample, you cannot train with mini-batches!!!
batch_size = len(labels)

It makes sense to have time in an additional dimension (many times per sample), as is done in recurrent and 1D conv networks. Anyway, considering your example as expressed, that is shape (samples_equal_times,) for yTime:

import tensorflow as tf
from keras import backend as K

def neg_log_likelihood(yTrue,yPred):
    yStatus = yTrue[:,0]
    yTime = yTrue[:,1]
    yPred = K.flatten(yPred)  # added: give yPred shape (None,) so the ops below
                              # broadcast correctly against yStatus
    n = K.shape(yTrue)[0]

    #sort the times and everything else from greater to lower:
    #note: you can have the data sorted already and avoid doing it here, for performance

    #important: yTime will be sorted in the last dimension; make sure it's (None,) in this case,
    #or (None, time_length) in the case of many times per sample
    sortedTime, sortedIndices = tf.math.top_k(yTime, n, True)
    sortedStatus = K.gather(yStatus, sortedIndices)
    sortedPreds = K.gather(yPred, sortedIndices)

    #do the calculations
    exp = K.exp(sortedPreds)
    sums = K.cumsum(exp)  #sums[i] holds the sum of exp over all j with time[j] >= time[i],
                          #i.e. the j >= i condition from the loop
    logsums = K.log(sums)

    return K.sum(sortedStatus * sortedPreds - logsums)
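A hypothetical end-to-end usage sketch (the dummy data and the Dense(1) risk score are my own assumptions, not part of the original answer, and it assumes your Keras version accepts a two-column target for a one-unit output, which is common with custom losses). Note the full-batch training, per the warning above:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

data = np.random.random((1000, 100))
status = np.random.randint(2, size=(1000, 1))  # column 0: event status
times = np.random.random((1000, 1))            # column 1: observed time
labels = np.concatenate([status, times], axis=-1).astype("float32")

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1))  # linear output acting as the risk score

model.compile(optimizer='rmsprop', loss=neg_log_likelihood)
model.fit(data, labels, epochs=10, batch_size=len(labels))  # one time per sample => full batch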
