Neural Network based ranking of documents


Question

I'm planning to implement a document ranker that uses neural networks. How can one rate a document by taking into consideration the ratings of similar articles? Are there any good Python libraries for doing this? Can anyone recommend a good book on AI with Python code?

EDIT

I'm planning to make a recommendation engine that makes recommendations based on similar users and also uses data clustered by tags. Users will be given the chance to vote for articles. There will be about a hundred thousand articles. Documents will be clustered based on their tags. Given a keyword, articles will be fetched based on their tags and passed through a neural network for ranking.

Solution

The problem you are trying to solve is called "collaborative filtering".
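As a rough illustration of rating a document from the ratings of similar articles, here is a minimal item-based collaborative filtering sketch in NumPy. The toy `ratings` matrix, the function names, and the choice of cosine similarity over the `k` most similar articles are all assumptions made for this example.

```python
import numpy as np

def cosine_item_similarity(ratings):
    """Cosine similarity between article columns (0 is treated as 'no vote')."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for articles nobody rated
    normalized = ratings / norms
    return normalized.T @ normalized

def predict_rating(ratings, user, item, k=2):
    """Predict `user`'s rating of `item` from the k most similar articles they rated."""
    sim = cosine_item_similarity(ratings)[item]
    rated = np.flatnonzero(ratings[user])   # articles this user has voted on
    rated = rated[rated != item]
    if rated.size == 0:
        return 0.0                          # nothing to base a prediction on
    top = rated[np.argsort(sim[rated])[::-1][:k]]
    weights = sim[top]
    if weights.sum() == 0:
        return float(ratings[user, top].mean())
    # Similarity-weighted average of the user's own ratings of similar articles.
    return float(weights @ ratings[user, top] / weights.sum())

# Hypothetical toy data: rows are users, columns are articles, 0 = no vote.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

predicted = predict_rating(ratings, user=0, item=2, k=2)
print(f"predicted rating of article 2 for user 0: {predicted:.2f}")
```

Article 2 is rated most like article 3, which user 0 scored low, so the prediction lands at the low end of the scale. Treating a missing vote as 0 in the similarity is a simplification; a real system would compare only co-rated entries.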

Neural Networks

State-of-the-art neural network methods include Deep Belief Networks and Restricted Boltzmann Machines. For a fast Python implementation for a GPU (CUDA), see here. Another option is PyBrain.

Academic papers on your specific problem:

  • This is probably the state of the art for neural networks in collaborative filtering (of movies):

    Salakhutdinov, R., Mnih, A., and Hinton, G. Restricted Boltzmann Machines for Collaborative Filtering. In Proceedings of the 24th International Conference on Machine Learning, 2007.

  • A Hopfield network implemented in Python:

    Huang, Z., Chen, H., and Zeng, D. Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering. ACM Transactions on Information Systems (TOIS), 22(1), 116--142, 2004.

  • A thesis on collaborative filtering with Restricted Boltzmann Machines (they say Python is not practical for the job):

    G. Louppe. Collaborative filtering: Scalable approaches using restricted Boltzmann machines. Master's thesis, Université de Liège, 2010.
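To give a feel for what a Restricted Boltzmann Machine does with vote data, here is a deliberately simplified sketch: a binary RBM trained with one-step contrastive divergence (CD-1) on "liked / not liked" vectors. It is not the model from the papers above. Salakhutdinov et al. use softmax visible units over rating values and treat missing ratings specially, whereas this sketch treats "not voted" as 0. The class name, the toy `likes` matrix, and the hyperparameters are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class BinaryRBM:
    """Minimal binary RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_visible = np.zeros(n_visible)
        self.b_hidden = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hidden)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_visible)

    def train_step(self, v0, learning_rate=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step down to the visible layer and back up.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += learning_rate * (v0.T @ h0 - v1.T @ h1) / n
        self.b_visible += learning_rate * (v0 - v1).mean(axis=0)
        self.b_hidden += learning_rate * (h0 - h1).mean(axis=0)

    def reconstruct(self, v):
        # Deterministic (mean-field) pass: visible -> hidden -> visible.
        return self.visible_probs(self.hidden_probs(v))

# Hypothetical toy data: rows are users, columns are articles, 1 = liked.
# Articles 0-2 and articles 3-5 form two groups with disjoint audiences.
likes = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
], dtype=float)

rbm = BinaryRBM(n_visible=6, n_hidden=2)
for _ in range(2000):
    rbm.train_step(likes)

# Score all articles for a user who liked articles 0 and 1.
scores = rbm.reconstruct(np.array([[1.0, 1.0, 0.0, 0.0, 0.0, 0.0]]))[0]
print(np.round(scores, 2))
```

The reconstructed scores can be used to rank the articles the user has not voted on yet. In this toy setup, article 2 should score well above articles 3 to 5.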

Neural networks are not currently the state of the art in collaborative filtering, and they are neither the simplest nor the most widespread solution. Regarding your comment that the reason for using NNs is having too little data: neural networks have no inherent advantage or disadvantage in that case. Therefore, you might want to consider simpler machine learning approaches.

Other Machine Learning Techniques

The best methods today mix k-Nearest Neighbors and Matrix Factorization.
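To make the matrix factorization half of that statement concrete, here is a minimal sketch that fits user and article factors to the observed votes with stochastic gradient descent. It covers only the factorization step, not the blend with a nearest-neighbour model. The function name, the toy `ratings` matrix, and all hyperparameters are assumptions chosen for the example.

```python
import numpy as np

def factorize(ratings, n_factors=2, n_epochs=1000, learning_rate=0.02,
              regularization=0.02, seed=0):
    """Fit user and item factors to the observed (non-zero) ratings with SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    user_factors = 0.1 * rng.standard_normal((n_users, n_factors))
    item_factors = 0.1 * rng.standard_normal((n_items, n_factors))
    observed = np.argwhere(ratings > 0)  # (user, item) pairs that have a vote
    for _ in range(n_epochs):
        rng.shuffle(observed)            # visit the votes in random order
        for u, i in observed:
            p_u = user_factors[u].copy()
            q_i = item_factors[i].copy()
            error = ratings[u, i] - p_u @ q_i
            # Gradient step on squared error with L2 regularization.
            user_factors[u] += learning_rate * (error * q_i - regularization * p_u)
            item_factors[i] += learning_rate * (error * p_u - regularization * q_i)
    return user_factors, item_factors

# Hypothetical toy data: rows are users, columns are articles, 0 = no vote.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

user_factors, item_factors = factorize(ratings)
predicted = user_factors @ item_factors.T  # dense matrix of predicted ratings
print(np.round(predicted, 2))
```

The zero entries of `ratings` are never used for training, so the corresponding cells of `predicted` are the model's guesses for votes that have not been cast. A production system would add bias terms and tune the hyperparameters on held-out votes.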

If you are tied to Python, take a look at pysuggest (a Python wrapper for the SUGGEST recommendation engine) and PyRSVD (primarily aimed at applications in collaborative filtering, in particular the Netflix competition).

If you are open to trying other open source technologies, look at open source collaborative filtering frameworks and http://www.infoanarchy.org/en/Collaborative_Filtering.
