Measuring how a new sample contributes to the diversity of a dataset

Question

I am working with a dataset of grayscale images. Is there a way to determine whether a new grayscale image would contribute to the diversity of the dataset? I would like to prevent the dataset from having too many similar samples.

Recommended answer

Well, what do you see when you look at it? If you know what the images in this dataset look like, you can probably judge for yourself whether the new sample repeats a pattern that is already in the dataset or whether it is something unique.

Another idea might be to compare the images analytically. Depending on the case, you may want to compute the per-pixel averages of your training set (each should be between 0 and 255 for 8-bit grayscale) and compare them with the pixel values of the sample image. Other measures may work similarly.
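The analytic comparison could look roughly like the sketch below. It assumes the dataset is a NumPy array of shape (N, H, W) and the sample has shape (H, W). The function names, the mean-absolute-difference measure, the nearest-image distance (one possible "other measure"), and the threshold of 5.0 are all illustrative assumptions, not something prescribed by the answer.

```python
import numpy as np


def mean_image(dataset):
    """Per-pixel average over a stack of grayscale images, shape (N, H, W)."""
    return dataset.astype(np.float64).mean(axis=0)


def distance_to_mean(dataset, sample):
    """Mean absolute per-pixel difference between the sample and the mean image."""
    diff = sample.astype(np.float64) - mean_image(dataset)
    return float(np.abs(diff).mean())


def nearest_image_distance(dataset, sample):
    """Smallest mean absolute difference between the sample and any one dataset image."""
    diffs = np.abs(dataset.astype(np.float64) - sample.astype(np.float64))
    return float(diffs.mean(axis=(1, 2)).min())


def looks_redundant(dataset, sample, threshold=5.0):
    """Flag the sample as a near-duplicate; the threshold is an assumed, tunable value."""
    return nearest_image_distance(dataset, sample) < threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.integers(0, 256, size=(10, 8, 8), dtype=np.uint8)
    candidate = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
    print("distance to mean image:", distance_to_mean(data, candidate))
    print("distance to nearest image:", nearest_image_distance(data, candidate))
    print("redundant:", looks_redundant(data, candidate))
```

The distance to the mean image only says how far the sample is from the "typical" image, so a sample can be far from the mean and still duplicate one existing image. That is why the sketch also checks the nearest single image. Raw pixel differences are sensitive to shifts and lighting changes, so treat the result as a rough signal.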

What I would do, if you have a model trained on your current dataset, is use that model to predict/classify the sample image and see how well it performs and how confident it is. This way, perhaps you can assess whether your model (and the dataset you trained it on) has something to learn from this new sample image.
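A minimal sketch of that check, assuming your model can return a vector of class probabilities for the sample image (for example, a softmax output). The function names and the 0.9 confidence cutoff are hypothetical choices for illustration; the answer does not specify how to score confidence.

```python
import numpy as np


def prediction_confidence(probs):
    """Highest class probability, used here as a simple confidence score."""
    return float(np.asarray(probs, dtype=np.float64).max())


def prediction_entropy(probs):
    """Shannon entropy (in nats) of the predicted distribution; higher means less certain."""
    p = np.asarray(probs, dtype=np.float64)
    nonzero = p[p > 0]
    return float(-(nonzero * np.log(nonzero)).sum())


def is_informative(probs, true_label=None, min_confidence=0.9):
    """Guess whether the model could learn something from this sample.

    The sample counts as informative if the model misclassifies it (when a
    label is known) or if its confidence falls below the assumed cutoff.
    """
    predicted = int(np.asarray(probs).argmax())
    if true_label is not None and predicted != true_label:
        return True
    return prediction_confidence(probs) < min_confidence


if __name__ == "__main__":
    # Probabilities would come from your own trained model; these are made up.
    confident = [0.98, 0.01, 0.01]
    uncertain = [0.40, 0.35, 0.25]
    print(is_informative(confident, true_label=0))
    print(is_informative(uncertain, true_label=0))
    print(prediction_entropy(uncertain))
```

A confident, correct prediction suggests the sample resembles what the dataset already covers, while a wrong or low-confidence prediction suggests it may add something new. Models can be confidently wrong on unfamiliar inputs, so this is a heuristic and works best when a label for the new sample is available.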
