Understanding the structure of TensorFlow datasets
Question
I am trying to experiment with tensorflow_datasets by importing data from Postgres into TensorFlow 2.0. I have a table aml2 with around 50 records. The table has three columns: v1, v2 and class. I want the class column to be the label and v1 and v2 to be the features. v1 and v2 are already normalized float values, and class is an integer (0 or 1).
import tensorflow as tf
import tensorflow_io as tfio
from tensorflow.keras import layers

# Initial connections (endpoint) are defined for Postgres.
# I am able to get data; that isn't a problem.
dataset = tfio.experimental.IODataset.from_sql(
    query="SELECT v1, v2, class FROM aml2;",
    endpoint=endpoint)
print(dataset.element_spec)
# Here I am trying to separate out features and labels
dataset = dataset.map(lambda item: ((item['v1'], item['v2']), item['class']))
dataset = dataset.prefetch(1)
model = tf.keras.models.Sequential([
    layers.Flatten(),
    layers.Dense(20, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
model.compile('adam','binary_crossentropy',['accuracy'])
model.fit(dataset, epochs=10)
ValueError: Layer sequential expects 1 inputs, but it received 2 input tensors. Inputs received:
[<tf.Tensor 'IteratorGetNext:0' shape=() dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=() dtype=float32>]
My question is how to resolve this and, more generally, how to understand the shape of tensorflow_datasets or IODatasets. Don't I need to have train_data and a label? TensorFlow datasets are a little confusing. Can somebody please explain?
Recommended answer
It would be easier not to separate your inputs and labels. Instead, you could name your Input layers and use the tf.keras Functional API.
Changes:
- Got rid of the map method which splits FEATURES from LABEL
- Changed the code to use the functional API instead of the sequential API in tf.keras
- Named all inputs and outputs

The model.fit method also works on a dictionary structure with keys matching the input and output names. See the model.fit documentation for the arguments x and y.
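To see why the original mapping produced two inputs, it can help to print the element structure of the dataset before and after the map. The following is a minimal sketch, not the asker's actual pipeline: it assumes a small in-memory dictionary (the hypothetical `rows`) as a stand-in for the rows returned by the SQL query, and uses `tf.data.Dataset.from_tensor_slices` in place of `from_sql`, so it runs without a database.

```python
import tensorflow as tf

# Hypothetical in-memory stand-in for the rows returned by the SQL query.
rows = {
    "v1": [0.1, 0.4, 0.9, 0.3],
    "v2": [0.7, 0.2, 0.5, 0.8],
    "class": [0, 1, 1, 0],
}
dataset = tf.data.Dataset.from_tensor_slices(rows)

# Each element is a dict of scalar tensors, one entry per column.
print(dataset.element_spec)

# The mapping from the question: the features become a tuple of two
# scalars, which Keras treats as two separate model inputs.
two_inputs = dataset.map(
    lambda item: ((item["v1"], item["v2"]), item["class"]))
print(two_inputs.element_spec)

# Stacking the columns instead yields one feature tensor of shape (2,),
# which is what a Sequential model with a single input expects.
one_input = dataset.map(
    lambda item: (tf.stack([item["v1"], item["v2"]]), item["class"]))
batched = one_input.batch(2)
print(batched.element_spec)
```

Reading `element_spec` this way shows the nesting (dict, tuple, or single tensor) and the per-element shapes, which is usually enough to tell how many inputs a model will be handed.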
Here are the changes to make.
from tensorflow.keras import Model, layers

FEATURES = ['v1', 'v2']  # feature columns, used as input layer names
LABEL = 'class'          # label column, used as the output layer name
BATCH_SIZE = 8           # assumed value; the original answer does not give one

dataset = tfio.experimental.IODataset.from_sql(
    query="SELECT v1, v2, class FROM aml2;",
    endpoint=endpoint)
print(dataset.element_spec)
dataset = dataset.batch(BATCH_SIZE).prefetch(1)

# One named Input layer per feature column
input_layers = []
for col in FEATURES:
    input_ = layers.Input(shape=(1,), name=col)
    input_layers.append(input_)

x = layers.Concatenate()(input_layers)
x = layers.Flatten()(x)
x = layers.Dense(20, activation='relu')(x)
output = layers.Dense(1, activation='sigmoid', name=LABEL)(x)
model = Model(input_layers, output)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(dataset, epochs=10)
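The Keras documentation for model.fit states that a tf.data dataset should yield (inputs, targets) tuples. If passing the flat dictionary does not train in your TensorFlow version, one possible variant keeps the named inputs but still maps each row to a (features dict, label) pair. The sketch below is an illustration under stated assumptions rather than the answer's exact method: the synthetic `rows` data, the helper `split_features_and_label`, and the `BATCH_SIZE` value are hypothetical, and `from_tensor_slices` replaces `from_sql` so the example runs without Postgres.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, layers

FEATURES = ["v1", "v2"]
LABEL = "class"
BATCH_SIZE = 8  # assumed value

# Synthetic stand-in for the ~50 rows of the aml2 table (hypothetical data).
rng = np.random.default_rng(0)
rows = {
    "v1": rng.random(50).astype("float32"),
    "v2": rng.random(50).astype("float32"),
    "class": rng.integers(0, 2, size=50).astype("int32"),
}
dataset = tf.data.Dataset.from_tensor_slices(rows)


def split_features_and_label(item):
    # Keep the features in a dict keyed by input name, each with shape (1,).
    features = {col: tf.reshape(item[col], (1,)) for col in FEATURES}
    # Cast the integer label to float to match the sigmoid output.
    label = tf.reshape(tf.cast(item[LABEL], tf.float32), (1,))
    return features, label


dataset = dataset.map(split_features_and_label).batch(BATCH_SIZE).prefetch(1)

# Named inputs, passed to the model as a dict so the keys line up
# with the keys of the features dict produced above.
inputs = {col: layers.Input(shape=(1,), name=col) for col in FEATURES}
x = layers.Concatenate()([inputs[col] for col in FEATURES])
x = layers.Dense(20, activation="relu")(x)
output = layers.Dense(1, activation="sigmoid")(x)

model = Model(inputs=inputs, outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
history = model.fit(dataset, epochs=2, verbose=0)

predictions = model.predict(dataset, verbose=0)
print(predictions.shape)
```

Keeping the features in a dictionary means columns can be added or removed by editing FEATURES alone, while the explicit (features, label) tuple leaves no ambiguity about which column is the target.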