How to use doc2vec embeddings as an input to a neural network

Question

As part of a project, I'm starting work on a Twitter recommender system, which requires me to use some form of deep learning. My goal is to recommend other tweets based on the topical content of a tweet, using unlabelled data.

I have pre-processed my data and trained a few model variations with doc2vec to get both word embeddings and document embeddings. My issue is that I feel a little lost about where to go from here. I've read that doc2vec embeddings can be used as input to a deeper neural network, such as an LSTM or even a CNN, for training.

Could anyone help me understand how these document embeddings (and word embeddings, since I trained the model in DM mode) are used as input, and what the purpose of the neural net would be in this case? Is it for clustering? I understand the question is a little open-ended, but I'm quite new to all this, so any help would be appreciated.

Recommended answer

If you have trained a d-dimensional doc2vec embedding for each document, that embedding is the input vector for that particular tweet. With n documents, the embeddings stack into an n × d matrix, one row per tweet, and this matrix can be given to the neural network. LSTM and CNN models are generally used for supervised learning problems (where you have labeled data).
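
To make the shapes concrete, here is a minimal sketch of the n × d input passing through a single dense softmax layer. It uses only the Python standard library, and everything in it is an assumption for illustration: the sizes `n`, `d`, and `num_classes` are made up, the rows of `X` are random stand-ins for real doc2vec vectors, and the weights are untrained. A real supervised model would need labels to fit `W` and `b`.

```python
import math
import random

random.seed(0)

n, d = 5, 4          # hypothetical: 5 tweets, 4-dimensional embeddings
num_classes = 3      # hypothetical number of topic labels

# Stand-in for doc2vec output: one row per tweet, one column per dimension.
# Assumption: with gensim 4.x these rows would typically be read from a
# trained model's `model.dv`; random values are used here instead.
X = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]

# Untrained weights of one dense layer mapping d inputs to num_classes outputs.
W = [[random.uniform(-0.1, 0.1) for _ in range(num_classes)] for _ in range(d)]
b = [0.0] * num_classes


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(X, W, b):
    """Apply the dense layer and softmax to every row of X."""
    outputs = []
    for row in X:
        logits = [
            sum(row[i] * W[i][k] for i in range(len(row))) + b[k]
            for k in range(len(b))
        ]
        outputs.append(softmax(logits))
    return outputs


probs = forward(X, W, b)
print(len(X), len(X[0]))          # rows and columns of the input matrix
print(len(probs), len(probs[0]))  # one probability vector per tweet
```

Each tweet contributes exactly one row, so the network sees a fixed-size input regardless of tweet length. In practice a framework such as Keras or PyTorch would replace the hand-written layer.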

If you don't have labeled data, then go for unsupervised learning, which is where clustering belongs. You can run different clustering algorithms on the document vectors and recommend tweets based on the resulting clusters.
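
One way the clustering route could look is sketched below with a small hand-written k-means, again using only the standard library. The 2-dimensional `tweet_vectors` are toy stand-ins for doc2vec embeddings, and `kmeans` and `recommend` are hypothetical helper names, not part of any library. Seeding the centroids with the first k vectors is a simplification chosen to keep the run deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def kmeans(vectors, k, iterations=20):
    """Naive k-means; the first k vectors seed the centroids (assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels


def recommend(index, vectors, labels, top_n=2):
    """Return indices of the closest tweets in the same cluster as `index`."""
    same_cluster = [
        i for i, lab in enumerate(labels)
        if lab == labels[index] and i != index
    ]
    same_cluster.sort(key=lambda i: sq_dist(vectors[index], vectors[i]))
    return same_cluster[:top_n]


# Toy 2-d vectors standing in for doc2vec embeddings of five tweets.
tweet_vectors = [
    [0.9, 0.1],    # tweet 0
    [-0.8, 0.2],   # tweet 1
    [1.0, 0.0],    # tweet 2
    [-0.9, 0.1],   # tweet 3
    [0.7, 0.3],    # tweet 4
]

labels = kmeans(tweet_vectors, k=2)
print(labels)
print(recommend(0, tweet_vectors, labels))
```

For real data, a library implementation such as scikit-learn's `KMeans` would be a more robust choice than this hand-rolled loop, and ranking by cosine similarity inside a cluster is a common alternative to Euclidean distance.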
