Pandas DataFrame and Keras
Problem description
I'm trying to perform a sentiment analysis in Python using Keras. To do so, I need to do a word embedding of my texts. The problem appears when I try to fit the data to my model:
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense

model_1 = Sequential()
model_1.add(Embedding(1000, 32, input_length=X_train.shape[0]))
model_1.add(Flatten())
model_1.add(Dense(250, activation='relu'))
model_1.add(Dense(1, activation='sigmoid'))
model_1.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
My training data has shape (4834,) and is a pandas Series object. When I try to fit my model and validate it with some other data, I get this error:
model_1.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=2, batch_size=64, verbose=2)
ValueError: Error when checking model input: expected embedding_1_input to have shape (None, 4834) but got array with shape (4834, 1)
How can I reshape my data to make it suited for Keras? I've been trying with np.reshape but I cannot place None elements with that function.
Thanks in advance
Recommended answer
None stands for the number of rows (samples) fed into training, which is not fixed in advance, so you cannot set it yourself. Also, Keras needs a NumPy array as input, not a pandas DataFrame. First convert the df to a NumPy array with df.values, then reshape it with df.values.reshape((-1, 4834)). Note that you should use np.float32 as the dtype; this is important if you train on a GPU.
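The steps in this answer can be sketched roughly as follows. This is only an illustration under assumptions: X_train here is a hypothetical stand-in Series filled with integers, not the asker's real data, and the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's X_train: a pandas Series with 4834 entries.
X_train = pd.Series(np.arange(4834))

# Step 1: convert the Series to a NumPy array.
X_values = X_train.values
print(X_values.shape)    # (4834,)

# Step 2: reshape. The -1 asks NumPy to infer that dimension from the
# total number of elements, which is the closest analogue to the None
# that Keras prints in its expected shape.
X_reshaped = X_values.reshape((-1, 4834))
print(X_reshaped.shape)  # (1, 4834)

# Step 3: cast to float32, as the answer suggests.
X_float = X_reshaped.astype(np.float32)
print(X_float.dtype)     # float32
```

Note that with 4834 values in total, this reshape produces a single row of 4834 columns, so Keras would see one sample of length 4834. Whether that is the intended layout depends on how the texts were encoded. The cast in step 3 also only works if the Series already holds numeric values; it fails on raw strings.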