How to get class distribution in TensorFlow Datasets?

Question

Is it possible to retrieve class distribution information when using TensorFlow Datasets (tfds)? If I pass with_info=True to tfds.load, I get a tfds.core.DatasetInfo object, which holds some information about the dataset (splits, labels, etc.).

However, I'm curious whether the exact class-wise distribution can be determined from what this object contains.

From what I've seen, it's not possible without first iterating over the dataset. I just want to double-check this assumption and ask what the fastest or best way to extract this information from the dataset would be.

NOTE: For the purposes of this question, I am only considering image classification datasets.

Recommended answer

I wasn't able to find a way to get the label distribution either. As an alternative, you can iterate over the labels and count them:

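import numpy as np
import tensorflow_datasets as tfds

# Load the training split as (image, label) pairs.
ds = tfds.load('mnist', split='train', as_supervised=True)

# Keep only the labels and collect them into an integer array.
labels = np.fromiter(ds.map(lambda x, y: y).as_numpy_iterator(), dtype=np.int64)

# np.unique returns the distinct labels and how often each one occurs.
vals = np.unique(labels, return_counts=True)

# Print one "label count" pair per line.
for val, count in zip(*vals):
    print(val, count)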

0 5923
1 6742
2 5958
3 6131
4 5842
5 5421
6 5918
7 6265
8 5851
9 5949
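
If NumPy is not wanted for the counting step, or if relative frequencies are more useful than raw counts, the same idea can be sketched with the standard library alone. The helper names class_counts and class_fractions below are hypothetical (they are not part of tfds), and the toy label list is made up purely for illustration:

```python
from collections import Counter
from typing import Dict, Iterable


def class_counts(labels: Iterable[int]) -> Dict[int, int]:
    """Count how many times each integer class label occurs."""
    counts = Counter(int(label) for label in labels)
    # Sort by class id so the result is stable and easy to read.
    return dict(sorted(counts.items()))


def class_fractions(counts: Dict[int, int]) -> Dict[int, float]:
    """Convert absolute counts into fractions of the whole dataset."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Toy labels standing in for a real label stream (not real dataset values).
toy_labels = [0, 1, 1, 2, 2, 2, 1, 0]
toy_counts = class_counts(toy_labels)
print(toy_counts)                   # {0: 2, 1: 3, 2: 3}
print(class_fractions(toy_counts))  # {0: 0.25, 1: 0.375, 2: 0.375}
```

With the dataset loaded as in the answer above, the label stream could presumably be supplied as class_counts(label for _, label in tfds.as_numpy(ds)). This still makes one full pass over the data, so it does not avoid the iteration the question asks about; it only keeps the counting logic independent of where the labels come from.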
