How do TensorFlow's TripletSemiHardLoss and TripletHardLoss work, and how are they used with a Siamese Network?
Question
As far as I know, Triplet Loss is a loss function that decreases the distance between the anchor and the positive while increasing the distance between the anchor and the negative. A margin is also added to it.
For example, suppose a Siamese Network gives these embeddings:
anchor_output = [1,2,3,4,5...] # embedding given by the CNN model
positive_output = [1,2,3,4,4...]
negative_output= [53,43,33,23,13...]
I think I can compute the triplet loss like this (I assume I have to turn it into a loss using a Lambda layer or something similar):
# calculate triplet loss
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), 1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), 1)
loss = tf.maximum(0., margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)
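To make the arithmetic of the snippet above concrete, here is a minimal plain-Python sketch for a single triplet. It assumes the vectors from the question are truncated to their first five entries and that margin is 1.0; both are illustrative assumptions, and the helper names are hypothetical.

```python
def squared_distance(u, v):
    """Sum of squared differences, mirroring tf.reduce_sum(tf.square(u - v))."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for one (anchor, positive, negative) triplet."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)


# Vectors from the question, truncated to five dimensions (assumption).
anchor_output = [1, 2, 3, 4, 5]
positive_output = [1, 2, 3, 4, 4]
negative_output = [53, 43, 33, 23, 13]

# The negative is already far away, so this triplet contributes no loss.
easy_loss = triplet_loss(anchor_output, positive_output, negative_output)

# Swapping the roles puts the "positive" far away, so the margin is violated.
violated_loss = triplet_loss(anchor_output, negative_output, positive_output)

print(easy_loss, violated_loss)
```

The first triplet is "easy": the loss is already zero, so it gives the model nothing to learn from. This is the motivation for mining harder triplets.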
So what exactly are tfa.losses.TripletHardLoss and tfa.losses.TripletSemiHardLoss?
As far as I know, semi-hard and hard are data generation (triplet mining) techniques for Siamese networks that push the model to learn more.
My thinking: based on what I learned from this post, I think you can do the following:
- Generate a batch of, say, 3 images and form every combination of three indices (i, j, k), which gives 3 x 3 x 3 = 27 candidate triplets
- Discard every invalid triplet (i, j, k must all be distinct); call the remaining batch B
- Get the embeddings for each triplet in batch B
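The enumeration described in the list above can be sketched in a few lines of plain Python. The label list is a hypothetical assumption added for illustration: triplet mining usually also requires the anchor and positive to share a label and the negative to have a different one, which the list above does not mention.

```python
from itertools import product

batch_size = 3

# Every ordered combination of three indices: 3 * 3 * 3 = 27 candidates.
all_triplets = list(product(range(batch_size), repeat=3))

# Keep only triplets whose three indices are distinct.
distinct_triplets = [t for t in all_triplets if len(set(t)) == 3]

# Hypothetical labels for the three images (assumption for illustration).
labels = [0, 0, 1]

# Additionally require label(anchor) == label(positive) != label(negative).
label_valid_triplets = [
    (i, j, k)
    for i, j, k in distinct_triplets
    if labels[i] == labels[j] and labels[i] != labels[k]
]

print(len(all_triplets), len(distinct_triplets), label_valid_triplets)
```

With only three images, very few triplets survive, which is why these losses are normally computed over larger batches.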
So I think TripletHardLoss takes into account only the 3 images per batch that have the largest anchor-positive distance and the smallest anchor-negative distance.
And for semi-hard, I think it discards the losses computed for every image pair whose distance was 0.
If that is wrong, could someone please correct me and tell me how these losses can be used? (I know we can pass them to model.compile(), but my question is different.)
Recommended answer
What is TripletHardLoss?
This loss follows the ordinary TripletLoss form, but when computing the loss it uses the maximum positive distance and the minimum negative distance within the batch, plus the margin constant:
loss = max(max_p D(a, p) - min_n D(a, n) + margin, 0)
Looking at the source code of tfa.losses.TripletHardLoss, we can see that the formula above is implemented exactly:
# Build pairwise binary adjacency matrix.
adjacency = tf.math.equal(labels, tf.transpose(labels))
# Invert so we can select negatives only.
adjacency_not = tf.math.logical_not(adjacency)
adjacency_not = tf.cast(adjacency_not, dtype=tf.dtypes.float32)
# hard negatives: smallest D_an.
hard_negatives = _masked_minimum(pdist_matrix, adjacency_not)
batch_size = tf.size(labels)
adjacency = tf.cast(adjacency, dtype=tf.dtypes.float32)
mask_positives = tf.cast(adjacency, dtype=tf.dtypes.float32) - tf.linalg.diag(
    tf.ones([batch_size])
)
# hard positives: largest D_ap.
hard_positives = _masked_maximum(pdist_matrix, mask_positives)
if soft:
    triplet_loss = tf.math.log1p(tf.math.exp(hard_positives - hard_negatives))
else:
    triplet_loss = tf.maximum(hard_positives - hard_negatives + margin, 0.0)
# Get final mean triplet loss
triplet_loss = tf.reduce_mean(triplet_loss)
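The following plain-Python sketch mirrors the logic of the excerpt above on a toy batch, so the result can be checked by hand. It is not the library implementation: the function names are hypothetical, it uses Euclidean distance, it assumes every anchor has at least one positive and one negative in the batch, and the 1-D embeddings are deliberately not l2-normalized so that the distances are round numbers.

```python
import math


def pairwise_distances(embeddings):
    """Euclidean distance between every pair of embeddings."""
    return [[math.dist(u, v) for v in embeddings] for u in embeddings]


def batch_hard_triplet_loss(labels, embeddings, margin=1.0, soft=False):
    """Mean over anchors of the loss built from the hardest positive and negative.

    Assumes each anchor has at least one positive and one negative in the batch.
    """
    dist = pairwise_distances(embeddings)
    n = len(labels)
    losses = []
    for a in range(n):
        # Hardest positive: the farthest sample with the same label (not itself).
        hardest_positive = max(
            dist[a][p] for p in range(n) if p != a and labels[p] == labels[a]
        )
        # Hardest negative: the closest sample with a different label.
        hardest_negative = min(
            dist[a][k] for k in range(n) if labels[k] != labels[a]
        )
        gap = hardest_positive - hardest_negative
        if soft:
            losses.append(math.log1p(math.exp(gap)))  # soft margin
        else:
            losses.append(max(gap + margin, 0.0))  # hinge with fixed margin
    return sum(losses) / n


labels = [0, 0, 1, 1]
embeddings = [[0.0], [1.0], [3.0], [7.0]]  # toy 1-D embeddings

hard_loss = batch_hard_triplet_loss(labels, embeddings, margin=1.0)
soft_loss = batch_hard_triplet_loss(labels, embeddings, soft=True)
print(hard_loss, soft_loss)
```

Only the anchor at 3.0 violates the margin here (hardest positive at distance 4, hardest negative at distance 2), so the hinge version averages the per-anchor losses 0, 0, 3 and 0.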
Note that the soft parameter of tfa.losses.TripletHardLoss does not make it fall back to the ordinary TripletLoss computed over individual triplets. As the source code above shows, the loss still uses the maximum positive distance and the minimum negative distance; soft only decides whether the soft margin, log(1 + exp(D_ap - D_an)), is used instead of the fixed margin.
What is TripletSemiHardLoss?
This loss also follows the ordinary TripletLoss form. The positive distances are the same as in the ordinary TripletLoss, while the negative distance uses a semi-hard negative:
The minimum negative distance among those that are at least greater than the positive distance plus the margin constant; if no such negative exists, the largest negative distance is used instead.
That is, we first look for a negative distance that satisfies the condition D_an > D_ap (the form used in the source code below, where the comparison is made without the margin term), with p standing for positive and n for negative. If we cannot find a negative distance that satisfies this condition, we use the largest negative distance instead.
This selection is clearly visible in the source code of tfa.losses.TripletSemiHardLoss, where negatives_outside holds the smallest negative distance that satisfies the condition and negatives_inside holds the largest negative distance, used as the fallback:
# Build pairwise binary adjacency matrix.
adjacency = tf.math.equal(labels, tf.transpose(labels))
# Invert so we can select negatives only.
adjacency_not = tf.math.logical_not(adjacency)
batch_size = tf.size(labels)
# Compute the mask.
pdist_matrix_tile = tf.tile(pdist_matrix, [batch_size, 1])
mask = tf.math.logical_and(
    tf.tile(adjacency_not, [batch_size, 1]),
    tf.math.greater(
        pdist_matrix_tile, tf.reshape(tf.transpose(pdist_matrix), [-1, 1])
    ),
)
mask_final = tf.reshape(
    tf.math.greater(
        tf.math.reduce_sum(
            tf.cast(mask, dtype=tf.dtypes.float32), 1, keepdims=True
        ),
        0.0,
    ),
    [batch_size, batch_size],
)
mask_final = tf.transpose(mask_final)
adjacency_not = tf.cast(adjacency_not, dtype=tf.dtypes.float32)
mask = tf.cast(mask, dtype=tf.dtypes.float32)
# negatives_outside: smallest D_an where D_an > D_ap.
negatives_outside = tf.reshape(
    _masked_minimum(pdist_matrix_tile, mask), [batch_size, batch_size]
)
negatives_outside = tf.transpose(negatives_outside)
# negatives_inside: largest D_an.
negatives_inside = tf.tile(
    _masked_maximum(pdist_matrix, adjacency_not), [1, batch_size]
)
semi_hard_negatives = tf.where(mask_final, negatives_outside, negatives_inside)
loss_mat = tf.math.add(margin, pdist_matrix - semi_hard_negatives)
mask_positives = tf.cast(adjacency, dtype=tf.dtypes.float32) - tf.linalg.diag(
    tf.ones([batch_size])
)
# In lifted-struct, the authors multiply 0.5 for upper triangular
# in semihard, they take all positive pairs except the diagonal.
num_positives = tf.math.reduce_sum(mask_positives)
triplet_loss = tf.math.truediv(
    tf.math.reduce_sum(
        tf.math.maximum(tf.math.multiply(loss_mat, mask_positives), 0.0)
    ),
    num_positives,
)
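The tiled masks in the excerpt above can be hard to follow, so here is a loop-based plain-Python sketch of what they appear to compute. It is an illustration rather than the library code: the function name is hypothetical, it uses Euclidean distance, and it assumes every anchor has at least one negative and that the batch contains at least one positive pair.

```python
import math


def semi_hard_triplet_loss(labels, embeddings, margin=1.0):
    """Average semi-hard triplet loss over all (anchor, positive) pairs."""
    dist = [[math.dist(u, v) for v in embeddings] for u in embeddings]
    n = len(labels)
    total, num_positives = 0.0, 0
    for a in range(n):
        # All anchor-negative distances for this anchor.
        negatives = [dist[a][k] for k in range(n) if labels[k] != labels[a]]
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # not a positive pair
            d_ap = dist[a][p]
            # negatives_outside: negatives farther away than the positive.
            outside = [d for d in negatives if d > d_ap]
            # Fall back to the largest negative distance (negatives_inside).
            semi_hard = min(outside) if outside else max(negatives)
            total += max(margin + d_ap - semi_hard, 0.0)
            num_positives += 1
    return total / num_positives


labels = [0, 0, 1, 1]
embeddings = [[0.0], [1.0], [3.0], [7.0]]  # toy 1-D embeddings, not normalized

semi_hard_loss = semi_hard_triplet_loss(labels, embeddings, margin=1.0)
print(semi_hard_loss)
```

For the pair (anchor 3.0, positive 7.0) no negative is farther than the positive, so the fallback picks the largest negative distance (3). That pair is the only one contributing a nonzero term (1 + 4 - 3 = 2), averaged over the four positive pairs.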
How to use these losses?
Both losses expect y_true to be provided as a 1-D integer Tensor of shape [batch_size] containing multi-class integer labels. The embeddings y_pred must be a 2-D float Tensor of l2-normalized embedding vectors.
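As a rough illustration of these input requirements, the plain-Python sketch below builds a label list and l2-normalizes a few made-up embedding vectors. The values and the helper name are assumptions for illustration only; in a Keras model the normalization would typically be done by the last layer of the embedding network. Because y_true holds class labels rather than explicit triplets, the losses form the triplets within each batch themselves.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


# y_true: one integer class label per sample, shape [batch_size].
y_true = [0, 0, 1]

# Raw embeddings as a model might produce them, shape [batch_size, embedding_dim].
raw_embeddings = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]

# y_pred: the same embeddings after l2 normalization.
y_pred = [l2_normalize(v) for v in raw_embeddings]

norms = [math.sqrt(sum(x * x for x in v)) for v in y_pred]
print(y_pred, norms)
```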
Example code to prepare the inputs and labels:
import tensorflow as tf
import tensorflow_addons as tfa
import tensorflow_datasets as tfds
def _normalize_img(img, label):
    img = tf.cast(img, tf.float32) / 255.
    return (img, label)
train_dataset, test_dataset = tfds.load(name="mnist", split=['train', 'test'], as_supervised=True)
# Build your input pipelines
train_dataset = train_dataset.shuffle(1024).batch(16)
train_dataset = train_dataset.map(_normalize_img)
# Take one batch of data
for data in train_dataset.take(1):
    print("Batch of images shape:\n{}\nBatch of labels:\n{}\n".format(data[0].shape, data[1]))
Output:
Batch of images shape:
(16, 28, 28, 1)
Batch of labels:
[8 4 0 3 2 4 5 1 0 5 7 0 2 6 4 9]
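Since both losses mine triplets inside the batch, it can help to check how many positive pairs a batch actually contains. The short sketch below does this for the label batch printed above, counting ordered (anchor, positive) pairs in the same way num_positives does in the semi-hard source code; the variable names are hypothetical.

```python
from collections import Counter

# Labels of the batch printed above.
batch_labels = [8, 4, 0, 3, 2, 4, 5, 1, 0, 5, 7, 0, 2, 6, 4, 9]

counts = Counter(batch_labels)

# Ordered (anchor, positive) pairs with the same label, excluding self-pairs.
num_positive_pairs = sum(c * (c - 1) for c in counts.values())

# Classes that appear only once have no positive partner in this batch.
classes_without_positive = sorted(
    label for label, c in counts.items() if c == 1
)

print(num_positive_pairs, classes_without_positive)
```

With a batch size of 16 and ten classes, six of the samples have no positive in the batch, which suggests that larger batches give the mining step more to work with.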
If you run into problems, follow the official tutorial on how to use TripletSemiHardLoss in general (it applies to TripletHardLoss as well).