ValueError: Data cardinality is ambiguous
Question
I'm trying to train an LSTM network on data taken from a DataFrame.
Here is the code:
x_lstm=x.to_numpy().reshape(1,x.shape[0],x.shape[1])
model = keras.models.Sequential([
    keras.layers.LSTM(x.shape[1], return_sequences=True, input_shape=(x_lstm.shape[1],x_lstm.shape[2])),
    keras.layers.LSTM(NORMAL_LAYER_SIZE, return_sequences=True),
    keras.layers.LSTM(NORMAL_LAYER_SIZE),
    keras.layers.Dense(y.shape[1])
])
optimizer=keras.optimizers.Adadelta()
model.compile(loss="mse", optimizer=optimizer)
for i in range(150):
    history = model.fit(x_lstm, y)
    save_model(model,'tmp.rnn')
This fails with:
ValueError: Data cardinality is ambiguous:
x sizes: 1
y sizes: 99
Please provide data which shares the same first dimension.
When I change the model to
model = keras.models.Sequential([
    keras.layers.LSTM(x.shape[1], return_sequences=True, input_shape=x_lstm.shape),
    keras.layers.LSTM(NORMAL_LAYER_SIZE, return_sequences=True),
    keras.layers.LSTM(NORMAL_LAYER_SIZE),
    keras.layers.Dense(y.shape[1])
])
it fails with the following error:
Input 0 of layer lstm_9 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 99, 1200]
How do I get this to work?
x has shape (99, 1200) (99 items with 1200 features each; this is just a sample of a larger dataset), and y has shape (99, 1).
Recommended answer
As the error suggests, the first dimensions of X and y are different. The first dimension is the batch size (the number of samples), and it must be the same for both arrays.
Make sure y also has a first dimension of 1, i.e. a shape of (1, something), so that it matches X.
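Applied to the shapes from the question, the advice above can be sketched with NumPy alone. This is only an illustration: the arrays are zero-filled placeholders, and the second option (treating each of the 99 rows as its own one-step sample) is an assumption about what the asker may want, not something stated in the question.

```python
import numpy as np

# Placeholder data with the shapes from the question (values are irrelevant here).
x = np.zeros((99, 1200))
y = np.zeros((99, 1))

# Option 1: a single sample holding one 99-step sequence.
# y gets a matching leading axis of size 1, as the answer suggests.
x_one_seq = x.reshape(1, x.shape[0], x.shape[1])  # shape (1, 99, 1200)
y_one_seq = y.reshape(1, -1)                      # shape (1, 99)

# Option 2 (assumption about intent): 99 samples with one timestep each.
# y keeps its original shape because its first dimension is already 99.
x_per_row = x.reshape(x.shape[0], 1, x.shape[1])  # shape (99, 1, 1200)
y_per_row = y                                     # shape (99, 1)

for name, xs, ys in [("one sequence", x_one_seq, y_one_seq),
                     ("one row per sample", x_per_row, y_per_row)]:
    print(name, xs.shape, ys.shape, "first dims match:", xs.shape[0] == ys.shape[0])
```

With option 1, the final Dense layer would need to produce 99 values per sample to line up with a y of shape (1, 99); with option 2, the question's Dense(y.shape[1]) already produces one value per sample.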
I could reproduce your error with the code shown below:
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
import tensorflow as tf
import numpy as np
# define sequences
sequences = [
    [1, 2, 3, 4],
    [1, 2, 3],
    [1]
]
# pad sequence
padded = pad_sequences(sequences)
X = np.expand_dims(padded, axis = 0)
print(X.shape) # (1, 3, 4)
y = np.array([1,0,1])
#y = y.reshape(1,-1)
print(y.shape) # (3,)
model = Sequential()
model.add(LSTM(4, return_sequences=False, input_shape=(None, X.shape[2])))
model.add(Dense(1, activation='sigmoid'))
model.compile(
    loss='mean_squared_error',
    optimizer=tf.keras.optimizers.Adam(0.001))
model.fit(x = X, y = y)
Looking at the output of the print statements:
Shape of X is (1, 3, 4)
Shape of y is (3,)
This error can be fixed by uncommenting the line y = y.reshape(1,-1), which makes the first dimension (the batch size) equal to 1 for both X and y.
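The same mismatch can be caught before calling fit with a small check on the first dimensions. The sketch below uses only NumPy; the helper name first_dims_match is hypothetical and not part of Keras or of the original answer.

```python
import numpy as np

def first_dims_match(x, y):
    """Return True when x and y have the same size along axis 0 (the batch axis)."""
    return np.shape(x)[0] == np.shape(y)[0]

# Same shapes as in the reproduction: X is (1, 3, 4), y is (3,).
X = np.zeros((1, 3, 4))
y = np.array([1, 0, 1])

before = first_dims_match(X, y)        # 1 vs 3
y_fixed = y.reshape(1, -1)             # shape becomes (1, 3)
after = first_dims_match(X, y_fixed)   # 1 vs 1

print("before reshape:", before)
print("after reshape:", after)
```

Running a check like this right before model.fit makes the shape problem visible without having to build the model first.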
The working code is shown below, along with its output:
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
import tensorflow as tf
import numpy as np
# define sequences
sequences = [
    [1, 2, 3, 4],
    [1, 2, 3],
    [1]
]
# pad sequence
padded = pad_sequences(sequences)
X = np.expand_dims(padded, axis = 0)
print('Shape of X is', X.shape) # (1, 3, 4)
y = np.array([1,0,1])
y = y.reshape(1,-1)
print('Shape of y is', y.shape) # (1, 3)
model = Sequential()
model.add(LSTM(4, return_sequences=False, input_shape=(None, X.shape[2])))
model.add(Dense(1, activation='sigmoid'))
model.compile(
    loss='mean_squared_error',
    optimizer=tf.keras.optimizers.Adam(0.001))
model.fit(x = X, y = y)
The output of the above code is:
Shape of X is (1, 3, 4)
Shape of y is (1, 3)
1/1 [==============================] - 0s 1ms/step - loss: 0.2588
<tensorflow.python.keras.callbacks.History at 0x7f5b0d78f4a8>
Hope this helps. Happy learning!