How to use LSTM for sequence classification using KerasClassifier

Views: 606 | Published: 2020/4/25 10:56:35 | Tags: python tensorflow machine-learning keras scikit-learn

Problem description

I have a binary classification problem where I need to predict the potential future trendy/popular products based on customer interactions during 2010-2015.

Currently, my dataset includes 1000 products and each product is labelled as 0 or 1 (i.e. binary classification). The label was decided based on customer interactions during 2016-2018.

I am calculating how centrality measures changed over time for each product during 2010-2015 as the features for my binary classification problem. For example, consider the below figure that shows how degree centrality changed over time for each product.

More specifically, I analyse the change in the following centrality measures as the features for my binary classification problem.

how degree centrality of each good changed from 2010-2016 (see the above figure)
how betweenness centrality of each good changed from 2010-2016
how closeness centrality of each good changed from 2010-2016
how eigenvector centrality of each good changed from 2010-2016

In a nutshell, my data looks as follows.

product, change_of_degree_centrality, change_of_betweenness_centrality, change_of_closeness_centrality, change_of_eigenvector_centrality, Label
item_1, [1.2, 2.5, 3.7, 4.2, 5.6, 8.8], [8.8, 4.6, 3.2, 9.2, 7.8, 8.6], …, 1
item_2, [5.2, 4.5, 3.7, 2.2, 1.6, 0.8], [1.5, 0, 1.2, 1.9, 2.5, 1.2], …, 0
and so on.
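
As a minimal sketch of how rows like these could be packed into the (samples, time steps, features) array that an LSTM expects: the product names `toy_1` and `toy_2` and all of their values are made up for illustration and are not taken from the dataset above. The only assumption carried over from the question is that each product has four centrality sequences of length 6 and one binary label.

```python
import numpy as np

# Hypothetical rows: product -> (four centrality sequences of length 6, label).
# All values are invented for illustration only.
rows = {
    "toy_1": ([[1, 2, 3, 4, 5, 6],        # degree centrality over time
               [6, 5, 4, 3, 2, 1],        # betweenness centrality over time
               [0, 1, 0, 1, 0, 1],        # closeness centrality over time
               [2, 2, 2, 2, 2, 2]], 1),   # eigenvector centrality over time
    "toy_2": ([[9, 8, 7, 6, 5, 4],
               [1, 1, 2, 2, 3, 3],
               [5, 5, 5, 5, 5, 5],
               [0, 2, 4, 6, 8, 10]], 0),
}

# Stack to (n_products, 4 measures, 6 time steps) ...
per_product = np.array([seqs for seqs, _ in rows.values()], dtype=float)
# ... then swap the last two axes to get (n_products, 6 time steps, 4 measures).
features = per_product.transpose(0, 2, 1)
target = np.array([label for _, label in rows.values()])

print(features.shape)  # (2, 6, 4)
print(target.shape)    # (2,)
```

With the full dataset the same steps would yield arrays of shape (1000, 6, 4) and (1000,).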

I want to use a deep learning model to solve this problem. From reading tutorials, I realised that an LSTM suits my problem.

So, I am using the below mentioned model for my classification.

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
# input_shape=(6, 4): 6 is the length of each centrality sequence and 4 is the number of
# centrality types (degree, betweenness, closeness, and eigenvector centrality)
model.add(LSTM(10, input_shape=(6, 4)))
model.add(Dense(32))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
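
For a sense of scale relative to the 1000 labelled products, the number of trainable parameters in this model can be worked out by hand. The sketch below assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias) and fully connected layers with biases. The helper names `lstm_params` and `dense_params` are hypothetical and not part of Keras.

```python
def lstm_params(units, input_dim):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    # one weight per (input, unit) pair plus one bias per unit
    return units * input_dim + units


counts = {
    "lstm_10": lstm_params(10, 4),     # 4 features per time step
    "dense_32": dense_params(32, 10),  # fed by the 10 LSTM units
    "dense_1": dense_params(1, 32),    # fed by the 32 dense units
}
total = sum(counts.values())
print(counts, total)
```

Under these assumptions the three layers have 600, 352, and 33 parameters, 985 in total, which is close to the number of training examples.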

Since I have a small dataset, I wanted to perform 10-fold cross-validation. So, following this tutorial, I am using KerasClassifier as follows.

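from keras.models import Sequential
from keras.layers import LSTM, Dense
# Legacy scikit-learn wrapper bundled with older Keras releases
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score

print(features.shape)  # (1000, 6, 4)
print(target.shape)    # (1000,)

# Create function returning a compiled network
def create_network():
    model = Sequential()
    model.add(LSTM(10, input_shape=(6, 4)))
    model.add(Dense(32))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    return model

# Wrap Keras model so it can be used by scikit-learn
neural_network = KerasClassifier(build_fn=create_network,
                                 epochs=10,
                                 batch_size=100,
                                 verbose=0)

# cv=5 gives an 800/200 train/test split per fold
print(cross_val_score(neural_network, features, target, cv=5))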

However, I have read that it is wrong to use cross-validation with an LSTM (e.g., this tutorial, this question).

However, I am not sure whether this applies to me, as I am only doing a binary classification prediction to identify products that will be trendy/popular in the future (not forecasting).

I think the data in my problem setting is split point-wise in the cross-validation, not time-wise.

i.e. (point-wise)

1st fold training:
item_1, item_2, ........, item_799, item_800

1st fold testing:
item_801, ........, item_1000

not (time-wise)

1st fold training:
2010, 2011, ........, 2015

1st fold testing:
2016, ........, 2018

Due to this fact, I am assuming that using cross validation is correct in my problem.
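
The point-wise split described above can be sketched with scikit-learn's `StratifiedKFold`. This is only an illustration under stated assumptions: the feature values are random placeholders, the labels are an artificial 500/500 split, and stratification is a choice made here, not necessarily what `cross_val_score` does internally with the wrapped Keras model.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
features = rng.random((1000, 6, 4))  # placeholder values, same shape as in the question
target = np.arange(1000) % 2         # placeholder labels, 500 per class

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_shapes = []
overlaps = []
# Only the labels are needed to generate the splits, so a dummy X is passed.
for fold, (train_idx, test_idx) in enumerate(
        skf.split(np.zeros(len(target)), target), start=1):
    X_train, X_test = features[train_idx], features[test_idx]
    fold_shapes.append((X_train.shape, X_test.shape))
    overlaps.append(len(set(train_idx) & set(test_idx)))
    print(fold, X_train.shape, X_test.shape)
```

Each fold holds out whole products: every training and test sample keeps all 6 time steps and all 4 measures, so the split never cuts along the time axis.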

Please let me know a suitable way to use cross-validation according to my problem and dataset.

NOTE: I am not limited to LSTM and am happy to explore other models as well.

I am happy to provide more details if needed.
