How to get original string data back from TFRecordData

Views: 43 Published: 2021/9/5 19:05:24 Tags: python string tensorflow tfrecord

Problem description

I followed the TensorFlow guide to save my string data using:

def _create_string_feature(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values.encode('utf-8')]))

I also used ["tf.string", "FixedLenFeature"] as my feature's original type, and "tf.string" as my feature's convert type.

However, during training, when I run my session and create iterators, my string feature for a batch size of 2 (for example: ['food fruit', 'cupcake food']) looks like the output below. The problem is that this list is of size 1, not 2 (batch_size=2). Why are the instances in one batch stuck together rather than split?

[b'food fruit' b'cupcake food']

My other features, which are int or float, are NumPy arrays of shape (batch_size, feature_len), which is fine, but I am not sure why the string features are not separated within a single batch.

Any help would be appreciated.

Recommended answer

This converts the first value of a BytesList (bytes_list) object to a string:

my_bytes_list_object.value[0].decode()

Or, if you are extracting the string from a TFRecord Example object:

my_example.features.feature['MyFeatureName'].bytes_list.value[0].decode()

From what I can see, bytes_list returns a BytesList object, from which we can read the value field. This returns a RepeatedScalarContainer, which behaves like a simple list object; in fact, wrapping it in list() converts it to a list. Instead, we can just index it as if it were a list and use [0] to get the zeroth item. The returned item is a bytes object, which can be converted to a standard str object with the decode() method.
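To make the steps above concrete, here is a minimal round-trip sketch. It assumes TensorFlow is installed; the feature name "text" and the helper name create_string_feature are hypothetical and chosen only for illustration, and the final batch is a plain Python list standing in for whatever a session would return.

```python
import tensorflow as tf


def create_string_feature(value: str) -> tf.train.Feature:
    # Same encoding step as in the question: one UTF-8 string per feature.
    return tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[value.encode("utf-8")])
    )


# "text" is a hypothetical feature name used only in this sketch.
example = tf.train.Example(
    features=tf.train.Features(
        feature={"text": create_string_feature("food fruit")}
    )
)

# Serialize as it would be written to a TFRecord file, then parse it back.
serialized = example.SerializeToString()
parsed = tf.train.Example.FromString(serialized)

# The value field is a list-like container of bytes objects.
raw_values = parsed.features.feature["text"].bytes_list.value
decoded = raw_values[0].decode("utf-8")

# Stand-in for a fetched batch (assumption): each element is a bytes
# object and can be decoded the same way.
batch = [b"food fruit", b"cupcake food"]
decoded_batch = [item.decode("utf-8") for item in batch]
```

Regarding the batch in the question: NumPy prints arrays without commas between elements, so output such as [b'food fruit' b'cupcake food'] may well be a two-element array rather than a single joined string. Checking its shape or len() should confirm this.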
