How to read data into TensorFlow?
Question
I'm trying to read data from CSV files into TensorFlow.
The sample code in the official documentation looks like this:
col1, col2, col3, col4, col5 = tf.decode_csv(value, record_defaults=record_defaults)
To read the file, I need to know beforehand how many columns and lines it contains. If there are 1000 columns, I would need to define 1000 variables (col1, col2, col3, col4, col5, ..., col1000), which doesn't look like an efficient way to read data.
My questions:
What is the best way to read CSV files into TensorFlow?
Is there any way to read from a database (such as MongoDB) in TensorFlow?
Recommended answer
You definitely don't need to define col1, col2, ..., col1000. tf.decode_csv returns a list of tensors, one per column, so you can keep the result in a single variable. In general, you might do something like this:
columns = tf.decode_csv(value, record_defaults=record_defaults)
features = tf.stack(columns)  # tf.pack was renamed to tf.stack in TensorFlow 1.0
do_whatever_you_want_to_play_with_features(features)
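As a rough illustration of the snippet above, here is a minimal sketch that assumes TensorFlow 2.x with eager execution, where the CSV decoder is exposed as tf.io.decode_csv. The column count, the sample line, and the float defaults are made up for the example; in practice the number of columns could be derived from the file header.

```python
import tensorflow as tf

# Assumption for illustration: five float columns. Only the count changes
# for a wider file; no per-column variable is needed.
NUM_COLUMNS = 5
record_defaults = [[0.0]] * NUM_COLUMNS  # one default per column, also fixes the dtype

# A single CSV line standing in for a record read from a file.
line = tf.constant("1.0,2.0,3.0,4.0,5.0")

# decode_csv returns a list with one tensor per column.
columns = tf.io.decode_csv(line, record_defaults=record_defaults)

# Stack the per-column scalars into one feature vector of shape (NUM_COLUMNS,).
features = tf.stack(columns)
print(features.numpy())
```

Building record_defaults from a column count keeps the code the same length whether the file has 5 columns or 1000.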
I don't know of any off-the-shelf way to read data directly from MongoDB. You could write a short script that converts the data from MongoDB into a format TensorFlow supports. I would recommend the binary TFRecord format, which is much faster to read than CSV records. This is a good blog post about this topic. Alternatively, you can implement a custom data reader yourself; see the official doc here.
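The conversion script suggested above might look roughly like the following sketch, which assumes TensorFlow 2.x. The records list is a hypothetical stand-in for documents already fetched from a database (no database client is shown), and the field names "features" and "label" are invented for the example.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical records standing in for documents fetched from a database.
records = [
    {"features": [1.0, 2.0, 3.0], "label": 0},
    {"features": [4.0, 5.0, 6.0], "label": 1},
]


def to_example(record):
    """Convert one dict into a tf.train.Example protocol buffer."""
    return tf.train.Example(features=tf.train.Features(feature={
        "features": tf.train.Feature(
            float_list=tf.train.FloatList(value=record["features"])),
        "label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[record["label"]])),
    }))


# Write every record to a TFRecord file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for record in records:
        writer.write(to_example(record).SerializeToString())

# Read the file back; the spec must match the shapes and dtypes written above.
feature_spec = {
    "features": tf.io.FixedLenFeature([3], tf.float32),
    "label": tf.io.FixedLenFeature([], tf.int64),
}
dataset = tf.data.TFRecordDataset(path).map(
    lambda serialized: tf.io.parse_single_example(serialized, feature_spec))

parsed = [(ex["features"].numpy().tolist(), int(ex["label"].numpy()))
          for ex in dataset]
print(parsed)
```

The conversion runs once, offline; training code then only needs the TFRecord file and the feature spec.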