PHP Lucene - Indexation - Fails in Linux after 2.000.000 system blocks
Question
I have been creating an index with the latest version of Zend Framework. The interface and everything else work fine.
The problem I have now is the re-indexation, that is, the creation of the index. I have checked everything else, sanitized the data, and double-checked its quality.
The process always stops at around record 15.000, once the index directory reaches 2.000.000 system blocks. So I decided to build a Java application, compiled against Lucene 3.0.3, to run the indexation. Zend Lucene then fails with:
Fatal error: Uncaught exception 'Zend_Search_Lucene_Exception' with message 'Unsupported segments file format' in
It seems the latest index format Zend Lucene supports is 2.3.
Any ideas on how to solve this problem? I would really appreciate your input.
Recommended answer
I customized the example from http://www.techcrony.info/?p=33, which reads text files from a data directory. The customized functions instead read the data from a MySQL database:
public static void main(String[] args) throws Exception {
    ....
    System.out.print("Index dir arg_0 : " + indexDir + "\r");
    String id = "%";
    long start = new Date().getTime();
    int numIndexed = index_main(indexDir, id);
    long end = new Date().getTime();
    System.out.print("End Program... \r");
}

private static int index_main(File indexDir, String id) throws IOException {
    int numIndexed = 0;
    try {
        IndexWriter writer =
            new IndexWriter(indexDir, new StandardAnalyzer(), true);
        writer.setUseCompoundFile(false);
        java.sql.Connection conn = linktodata();
        int rowCount = 0;
        ...
As you can see, I used lucene-core-2.3.0.jar, so the index is written in the 2.3 format that Zend Lucene can read. Compile:
javac -cp .:lucene-core-2.3.0.jar:mysql-connector-java-5.1.16-bin.jar Indexer.java
Run:
java -cp .:lucene-core-2.3.0.jar:mysql-connector-java-5.1.16-bin.jar Indexer /home/public_html/index_main
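Before pointing Zend Lucene at an index built this way, it can help to check which format the Java side actually wrote. The sketch below is not part of the original Indexer; it assumes (as I understand the Lucene 2.x/3.x file layout) that each segments_N file starts with a big-endian 32-bit format marker, and it simply prints that number. The class name SegmentsFormat and the helper readFormat are made up for illustration, and it needs Java 7 or later. To my understanding -4 is the marker for the 2.3 format, and newer Lucene releases write more negative values, but treat that mapping as an assumption to verify.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Lucene or Zend Framework.
class SegmentsFormat {

    // Reads the first 4 bytes of the file as a big-endian signed int.
    // Assumption: in Lucene 2.x/3.x this is the segments format marker.
    static int readFormat(Path segmentsFile) throws IOException {
        try (InputStream in = Files.newInputStream(segmentsFile);
             DataInputStream data = new DataInputStream(in)) {
            return data.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java SegmentsFormat <indexDir>");
            return;
        }
        Path indexDir = Paths.get(args[0]);
        // The glob "segments_*" matches segments_N files and skips segments.gen.
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(indexDir, "segments_*")) {
            for (Path file : files) {
                System.out.println(file.getFileName()
                        + " format marker: " + readFormat(file));
            }
        }
    }
}
```

Run it against the same directory the indexer wrote to, for example: java SegmentsFormat /home/public_html/index_main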
Now the most important question: does anyone know whether PHP Lucene can manage more than 1.000.000 documents?