Understanding the structure of Tensorflow datasets


Problem Description


I am trying to experiment with tensorflow_datasets by importing data into Tensorflow 2.0 from Postgres. I have a table aml2 with around 50 records. The table has three columns: v1, v2, and class. I want the class column to be the label, and v1 and v2 to be the features. v1 and v2 are already normalized float values, and class is an integer (0 or 1).

# Initial connections are defined for Postgres.
# I am able to get data; that isn't the problem.
import tensorflow as tf
import tensorflow_io as tfio
from tensorflow.keras import layers

dataset = tfio.experimental.IODataset.from_sql(
    query="SELECT v1, v2, class FROM aml2;",
    endpoint=endpoint)

print(dataset.element_spec)

# Here I am trying to separate out features and labels.
dataset = dataset.map(lambda item: ((item['v1'], item['v2']), item['class']))
dataset = dataset.prefetch(1)

model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(20, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
model.compile('adam', 'binary_crossentropy', ['accuracy'])

model.fit(dataset, epochs=10)

ValueError: Layer sequential expects 1 inputs, but it received 2 input tensors. Inputs received: 
[<tf.Tensor 'IteratorGetNext:0' shape=() dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=() 
dtype=float32>]


My question is: how do I resolve this? And more generally, how should I understand the shape of tensorflow_datasets or IODatasets? Don't I need separate train_data and labels? Tensorflow datasets are a little confusing; can somebody please explain?
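As a rough illustration of why the error appears, here is a plain-Python sketch (no TensorFlow involved; the row values are made up) of what the map lambda above does to each element:

```python
# Plain-Python sketch of the dataset.map transformation above.
# The row values are made up; keys mirror the aml2 columns.
row = {"v1": 0.12, "v2": -0.53, "class": 1}

# The lambda produces a ((v1, v2), class) structure:
features, label = (row["v1"], row["v2"]), row["class"]

# `features` is a tuple of two scalars, so Keras sees TWO input tensors,
# which is exactly what the ValueError complains about.
print(len(features), label)
```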

Answer


It would be easier not to separate your inputs and labels. Instead, you can name your Input layers and use the tf.keras functional API.

Changes:

  • Got rid of the map method that splits FEATURES from LABEL
  • Changed the code to use the functional API instead of the sequential API in tf.keras
  • Named all inputs and outputs
  • model.fit also works on a dictionary structure whose keys match the input and output names. See the Model.fit documentation for the arguments x and y.
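To make the key matching in the last bullet concrete, here is a plain-Python sketch (no TensorFlow; the values are made up) of how a dictionary element lines up with named inputs and the named output:

```python
# One element from the SQL dataset is a dict keyed by column name
# (the values here are made up for illustration).
element = {"v1": 0.12, "v2": -0.53, "class": 1}

FEATURES, LABEL = ["v1", "v2"], "class"

# Keys matching Input-layer names become model inputs; the key matching
# the output layer's name ("class") is used as the label.
inputs = {name: element[name] for name in FEATURES}
label = element[LABEL]
print(inputs, label)
```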

Here are the changes to make:

import tensorflow as tf
import tensorflow_io as tfio
from tensorflow.keras import Model, layers
from tensorflow.keras.layers import Concatenate

FEATURES = ['v1', 'v2']   # feature columns
LABEL = 'class'           # label column
BATCH_SIZE = 32           # pick a batch size that suits your data

dataset = tfio.experimental.IODataset.from_sql(
    query="SELECT v1, v2, class FROM aml2;",
    endpoint=endpoint)

print(dataset.element_spec)

dataset = dataset.batch(BATCH_SIZE).prefetch(1)

# One named Input layer per feature column.
input_layers = []
for col in FEATURES:
    input_ = layers.Input(shape=(1,), name=col)
    input_layers.append(input_)

x = Concatenate()(input_layers)
x = layers.Flatten()(x)
x = layers.Dense(20, activation='relu')(x)
# Naming the output layer after the label column lets model.fit
# pick the label out of the dictionary elements.
output = layers.Dense(1, activation='sigmoid', name=LABEL)(x)

model = Model(input_layers, output)
model.compile('adam', 'binary_crossentropy', ['accuracy'])

model.fit(dataset, epochs=10)
