Connecting MySQL to Apache Nutch


Question

I am using Apache Nutch for the first time. How can I store the data in a MySQL database after crawling? I want to be able to use the data easily in other web applications.

I found a related question, but I don't clearly understand which part of the code is going to be replaced by the MySQL connector. Please help with a short code example.

Recommended answer

Download the Nutch 1.2 source: http://mirror.nyi.net/apache//nutch/apache-nutch-1.2-src.zip

Open the org.apache.nutch.crawl.Crawl class in your editor.

Look for the variable Path crawlDb = new Path(dir + "/crawldb");

This variable gives a hint about where to replace code in order to build your own CustomMySQLCrawl class.

The persistence happens during this call:

crawlDbTool.update(crawlDb, segs, true, true); // update crawldb

That is the point where you should save the data to the database. You might want to consider integrating Hibernate here.
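Since the question asks for a short code example, here is a minimal sketch of the database side only, using plain JDBC instead of Hibernate. It is not taken from Nutch: the class name CustomMySQLCrawlSketch, the PageRow holder, the crawled_pages table with its url, status and fetch_time columns (url assumed to be the primary key), and the localhost connection settings are all assumptions made for illustration. It assumes Java 7 or later and MySQL Connector/J on the classpath. How the rows are pulled out of the crawldb after the update call is deliberately left out.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch: writes per-URL rows to MySQL.
final class CustomMySQLCrawlSketch {

    // Hypothetical holder for the data of one crawled URL.
    static final class PageRow {
        final String url;
        final int status;
        final long fetchTimeMillis;

        PageRow(String url, int status, long fetchTimeMillis) {
            this.url = url;
            this.status = status;
            this.fetchTimeMillis = fetchTimeMillis;
        }
    }

    // Assumed table: crawled_pages(url PRIMARY KEY, status, fetch_time).
    // Re-crawling a URL updates its existing row instead of failing.
    static final String UPSERT_SQL =
        "INSERT INTO crawled_pages (url, status, fetch_time) VALUES (?, ?, ?) "
        + "ON DUPLICATE KEY UPDATE status = VALUES(status), "
        + "fetch_time = VALUES(fetch_time)";

    // Builds a basic MySQL JDBC URL from host, port and database name.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Sends all rows in one JDBC batch; returns the number of rows sent.
    static int save(Connection conn, List<PageRow> rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            for (PageRow row : rows) {
                ps.setString(1, row.url);
                ps.setInt(2, row.status);
                ps.setTimestamp(3, new Timestamp(row.fetchTimeMillis));
                ps.addBatch();
            }
            return ps.executeBatch().length;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            // Without credentials, only show what would be executed.
            System.out.println("usage: CustomMySQLCrawlSketch <user> <password>");
            System.out.println("statement: " + UPSERT_SQL);
            return;
        }
        // Placeholder row; the status value 2 is arbitrary sample data.
        List<PageRow> rows = Arrays.asList(
            new PageRow("http://example.com/", 2, System.currentTimeMillis()));
        // Assumed local server and a database named "nutch".
        String url = jdbcUrl("localhost", 3306, "nutch");
        try (Connection conn = DriverManager.getConnection(url, args[0], args[1])) {
            System.out.println("rows sent: " + save(conn, rows));
        }
    }
}
```

In a custom crawl class, something like save(...) would be called right after crawlDbTool.update(...) returns, with the rows built from whatever crawl data you choose to export. A prepared statement with batching keeps the sketch short; Hibernate, as suggested above, would replace the hand-written SQL with mapped entities.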
