Hadoop Machine learning/Data mining project idea?

Views: 152  Posted: 2018/5/31 19:05:19  Tags: hadoop machine-learning data-mining

Question

I am a graduate CS student (data mining and machine learning) and have good exposure to core Java (>4 years). I have read a bunch of material on Hadoop and Map/Reduce.

I would now like to do a project on this stuff (in my free time, of course) to get a better understanding.

Any good project ideas would be really appreciated. I just want to do this to learn, so I don't really mind re-inventing the wheel. Also, anything related to data mining/machine learning would be an added bonus (it fits with my research), but it is absolutely not necessary.

Solution

You haven't written anything about your interests. I know that graph mining algorithms have been implemented on top of the Hadoop framework. The PEGASUS software (http://www.cs.cmu.edu/~pegasus/) and the paper "PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations" may give you a starting point.
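As one possible starting point for a graph mining project, a single PageRank iteration can be phrased as a map step (each node sends a share of its rank along its out-links) and a reduce step (each node sums what it receives). The sketch below is a plain in-memory Java illustration, not the PEGASUS code and not the Hadoop API. The class name `PageRankStep`, the method `iterate`, and the toy graph are hypothetical, and it assumes every node has at least one out-link and appears as a key in the adjacency map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PageRankStep {

    /**
     * One PageRank iteration in map/reduce style.
     * Assumption: every node is a key in outLinks and has at least one out-link.
     */
    static Map<String, Double> iterate(Map<String, List<String>> outLinks,
                                       Map<String, Double> rank,
                                       double damping) {
        int n = outLinks.size();

        // "Map" phase: each node emits rank / outDegree to every node it links to.
        // Values emitted to the same target are summed, as a combiner would do.
        Map<String, Double> incoming = new HashMap<>();
        for (Map.Entry<String, List<String>> e : outLinks.entrySet()) {
            double share = rank.get(e.getKey()) / e.getValue().size();
            for (String target : e.getValue()) {
                incoming.merge(target, share, Double::sum);
            }
        }

        // "Reduce" phase: combine the received mass with the teleport term.
        Map<String, Double> next = new HashMap<>();
        for (String node : outLinks.keySet()) {
            next.put(node, (1 - damping) / n + damping * incoming.getOrDefault(node, 0.0));
        }
        return next;
    }

    public static void main(String[] args) {
        // Hypothetical toy graph: A -> B, A -> C, B -> C, C -> A
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("A", Arrays.asList("B", "C"));
        graph.put("B", Arrays.asList("C"));
        graph.put("C", Arrays.asList("A"));

        // Start from a uniform rank vector.
        Map<String, Double> rank = new HashMap<>();
        for (String node : graph.keySet()) {
            rank.put(node, 1.0 / graph.size());
        }

        // Repeat the step; on Hadoop each repetition would be a separate job.
        for (int i = 0; i < 20; i++) {
            rank = iterate(graph, rank, 0.85);
        }
        System.out.println(rank);
    }
}
```

Porting this to Hadoop would mean turning the two phases into a Mapper and a Reducer and chaining one job per iteration, which is a reasonable first learning project.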

Further, this link discusses something similar to your question: http://atbrox.com/2010/02/08/parallel-machine-learning-for-hadoopmapreduce-a-python-example/ but it is in Python. There is also a very good paper co-authored by Andrew Ng, "Map-Reduce for Machine Learning on Multicore".
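The central idea of that paper is that many learning algorithms can be written as sums over the data points, so each mapper can compute partial sums on its own split and a reducer only has to add them up. Below is a minimal sketch of that idea for one-feature least-squares regression, in plain Java with parallel streams standing in for a cluster. The class `SummationFormRegression`, its method names, and the sample data are illustrative assumptions and are not taken from the paper or from Hadoop.

```java
import java.util.Arrays;
import java.util.List;

public class SummationFormRegression {

    /** "Map": partial sums over one data split, in the order n, sum(x), sum(y), sum(x*x), sum(x*y). */
    static double[] mapSplit(double[][] split) {
        double[] s = new double[5];
        for (double[] point : split) {
            double x = point[0];
            double y = point[1];
            s[0] += 1;
            s[1] += x;
            s[2] += y;
            s[3] += x * x;
            s[4] += x * y;
        }
        return s;
    }

    /** "Reduce": element-wise addition of two partial-sum vectors (returns a new array). */
    static double[] reduce(double[] a, double[] b) {
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    /** Fits y = slope * x + intercept from the combined sums; returns {slope, intercept}. */
    static double[] fit(List<double[][]> splits) {
        double[] s = splits.parallelStream()
                .map(SummationFormRegression::mapSplit)
                .reduce(new double[5], SummationFormRegression::reduce);
        double n = s[0], sx = s[1], sy = s[2], sxx = s[3], sxy = s[4];
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double intercept = (sy - slope * sx) / n;
        return new double[] {slope, intercept};
    }

    public static void main(String[] args) {
        // Two hypothetical splits whose points all lie on y = 2x + 1.
        List<double[][]> splits = Arrays.asList(
                new double[][] {{0, 1}, {1, 3}},
                new double[][] {{2, 5}, {3, 7}});
        double[] model = fit(splits);
        System.out.println("slope=" + model[0] + " intercept=" + model[1]);
    }
}
```

Because the reducer only sees five numbers per split, the amount of data shuffled stays constant no matter how large each split is, which is what makes this form attractive for Map/Reduce.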

There was a NIPS 2009 workshop on a similar topic, "Large-Scale Machine Learning: Parallelism and Massive Datasets". You can browse some of the papers to get ideas.

Edit: There is also Apache Mahout (http://mahout.apache.org/): "Our core algorithms for clustering, classification and batch based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm."
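Since re-inventing the wheel is acceptable here, re-implementing one of those clustering algorithms is a natural exercise. The sketch below shows one k-means iteration in map/reduce style: the map step keys each point by its nearest centroid, and the reduce step averages the points that share a key. It is plain in-memory Java, not Mahout's implementation and not the Hadoop API, and the class `KMeansStep`, its methods, and the sample points are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class KMeansStep {

    /** "Map": index of the centroid closest to the point (squared Euclidean distance). */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0;
            for (int i = 0; i < point.length; i++) {
                double diff = point[i] - centroids[c][i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** "Reduce": coordinate-wise mean of all points assigned to one centroid. */
    static double[] mean(List<double[]> points) {
        double[] m = new double[points.get(0).length];
        for (double[] p : points) {
            for (int i = 0; i < m.length; i++) {
                m[i] += p[i];
            }
        }
        for (int i = 0; i < m.length; i++) {
            m[i] /= points.size();
        }
        return m;
    }

    /** One k-means iteration: group points by nearest centroid, then average each group. */
    static double[][] iterate(List<double[]> points, double[][] centroids) {
        // groupingBy plays the role of the shuffle: it collects values by key.
        Map<Integer, List<double[]>> groups = points.stream()
                .collect(Collectors.groupingBy(p -> nearest(p, centroids)));
        double[][] next = new double[centroids.length][];
        for (int c = 0; c < centroids.length; c++) {
            List<double[]> assigned = groups.get(c);
            // A centroid with no assigned points keeps its old position.
            next[c] = (assigned == null) ? centroids[c].clone() : mean(assigned);
        }
        return next;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
                new double[] {0, 0}, new double[] {0, 2},
                new double[] {10, 10}, new double[] {10, 12});
        double[][] centroids = {{1, 1}, {9, 9}};
        System.out.println(Arrays.deepToString(iterate(points, centroids)));
        // prints [[0.0, 1.0], [10.0, 11.0]]
    }
}
```

On Hadoop, each iteration would typically be one job, with the current centroids passed to every mapper and the loop driven from the client until the centroids stop moving.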
