Lucene.Net index updates when a manual change is made in the SQL database

Question

I am new to Lucene.Net and am currently evaluating it for use in .NET applications. Lucene.Net is a general-purpose library that knows nothing about data sources such as SQL Server or SQLite; it only knows that you have Lucene documents you want indexed. So when we load data into Lucene.Net from a data source such as a SQL database, how do we keep the Lucene.Net documents up to date? One way to keep the two (Lucene.Net and SQL) in sync is to update the Lucene index as part of every database update. But someone may also change the SQL database manually. In that scenario, how can we update the Lucene index?

Recommended answer

I can provide a conceptual overview of how to do this. Fundamentally, you need three things:

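  1. A way to know every time relevant data in the SQL database changes.
  2. A place to capture information about each change, called a change log.
  3. A routine that reads the change log, applies those changes to the Lucene.Net index, and then marks the change log records as processed.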

There are, of course, many different ways to handle each of these.

The easiest way to handle #1 is to use insert, update, and delete triggers, if your database supports them. Add these three triggers to every table that supplies data to the Lucene.Net index. When a record in one of those tables changes, the trigger automatically writes a record to the change log indicating the table, the record ID, and the operation (insert, update, or delete). If your database does not support triggers, it's a bit harder. You could hook into a common API that your app uses to talk to the database when inserting, updating, and deleting, and have that hook record the same sort of information in the change log.
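As a rough illustration of the trigger approach, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for whatever SQL database you actually use. The `products` table, the `change_log` table, their columns, and the trigger names are all hypothetical, and trigger syntax differs between database engines, so treat this as a sketch rather than something to copy as-is.

```python
import sqlite3


def create_schema(conn):
    """Create a sample table, a change log, and three logging triggers.

    All table, column, and trigger names here are illustrative assumptions.
    """
    conn.executescript("""
        CREATE TABLE products (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );

        CREATE TABLE change_log (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            table_name TEXT    NOT NULL,
            record_id  INTEGER NOT NULL,
            operation  TEXT    NOT NULL,          -- 'insert', 'update' or 'delete'
            processed  INTEGER NOT NULL DEFAULT 0 -- 0 = not yet applied to the index
        );

        CREATE TRIGGER products_after_insert AFTER INSERT ON products
        BEGIN
            INSERT INTO change_log (table_name, record_id, operation)
            VALUES ('products', NEW.id, 'insert');
        END;

        CREATE TRIGGER products_after_update AFTER UPDATE ON products
        BEGIN
            INSERT INTO change_log (table_name, record_id, operation)
            VALUES ('products', NEW.id, 'update');
        END;

        CREATE TRIGGER products_after_delete AFTER DELETE ON products
        BEGIN
            INSERT INTO change_log (table_name, record_id, operation)
            VALUES ('products', OLD.id, 'delete');
        END;
    """)


# Any write to the table, including one typed by hand in a SQL console,
# leaves a row in change_log.
conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO products (id, name) VALUES (1, 'widget')")
conn.execute("UPDATE products SET name = 'gadget' WHERE id = 1")
conn.execute("DELETE FROM products WHERE id = 1")
logged = conn.execute(
    "SELECT table_name, record_id, operation FROM change_log ORDER BY id"
).fetchall()
```

Because the triggers run inside the database, they also fire for manual changes, which is the case the question asks about.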

The change log (#2) can take many forms, but the easiest is probably a table in the SQL database. That way the insert, update, and delete triggers can record what they observe directly by inserting a row into the changeLog table. A SQL table also works if you are writing to the change log from an API wrapper instead.
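For the case without triggers, the following sketch shows what an API wrapper writing to the same kind of change log table might look like. It again uses Python's `sqlite3` as a stand-in; the `TrackedProducts` class, its method names, and the table layout are hypothetical and exist only for illustration.

```python
import sqlite3


class TrackedProducts:
    """Hypothetical data-access wrapper that logs every write it performs."""

    def __init__(self, conn):
        self.conn = conn
        # Illustrative schema; a real application would already have its tables.
        conn.executescript("""
            CREATE TABLE IF NOT EXISTS products (
                id   INTEGER PRIMARY KEY,
                name TEXT NOT NULL
            );
            CREATE TABLE IF NOT EXISTS change_log (
                id         INTEGER PRIMARY KEY AUTOINCREMENT,
                table_name TEXT    NOT NULL,
                record_id  INTEGER NOT NULL,
                operation  TEXT    NOT NULL,
                processed  INTEGER NOT NULL DEFAULT 0
            );
        """)

    def _log(self, record_id, operation):
        self.conn.execute(
            "INSERT INTO change_log (table_name, record_id, operation) "
            "VALUES (?, ?, ?)",
            ("products", record_id, operation),
        )

    def insert(self, record_id, name):
        # "with conn" commits on success and rolls back on error, so the
        # data change and its log entry are written together or not at all.
        with self.conn:
            self.conn.execute(
                "INSERT INTO products (id, name) VALUES (?, ?)",
                (record_id, name),
            )
            self._log(record_id, "insert")

    def update(self, record_id, name):
        with self.conn:
            self.conn.execute(
                "UPDATE products SET name = ? WHERE id = ?", (name, record_id)
            )
            self._log(record_id, "update")

    def delete(self, record_id):
        with self.conn:
            self.conn.execute("DELETE FROM products WHERE id = ?", (record_id,))
            self._log(record_id, "delete")


repo = TrackedProducts(sqlite3.connect(":memory:"))
repo.insert(1, "widget")
repo.update(1, "gadget")
repo.delete(1)
```

Note the limitation: a wrapper like this only sees writes that go through it. A manual change made directly in the database bypasses the wrapper and is never logged, which is why triggers are the better fit when manual edits are expected.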

There are many ways to implement #3, but probably the most robust is to use a timer to kick off a background thread that checks for unprocessed changeLog records every few seconds. If it finds any, it reads them in and checks, for each one, whether it describes an insert, update, or delete, and for which table and record ID. For an insert or update, it reads the record from the SQL database and inserts or updates the corresponding document in Lucene.Net. For a delete, it deletes the document from Lucene.Net directly. It then sets a boolean on the changeLog record to indicate that the record has been processed.
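The processing routine could be sketched roughly as follows. This is a simplified illustration in Python with `sqlite3`: a plain dictionary stands in for the Lucene.Net index, the sketch assumes a single indexed table named `products`, and every table, column, and function name is hypothetical.

```python
import sqlite3


def process_change_log(conn, index):
    """Apply unprocessed change_log rows to `index`, then mark them processed.

    `index` is a dict standing in for the search index, keyed by
    (table_name, record_id). Returns the number of log rows handled.
    """
    pending = conn.execute(
        "SELECT id, table_name, record_id, operation FROM change_log "
        "WHERE processed = 0 ORDER BY id"
    ).fetchall()

    for log_id, table_name, record_id, operation in pending:
        key = (table_name, record_id)
        if operation == "delete":
            # Deletes need no lookup: remove the entry from the index directly.
            index.pop(key, None)
        else:
            # Insert or update: re-read the current row and upsert it.
            # Assumption: only the hypothetical `products` table is indexed.
            row = conn.execute(
                "SELECT name FROM products WHERE id = ?", (record_id,)
            ).fetchone()
            if row is None:
                # The row was deleted after this change was logged.
                index.pop(key, None)
            else:
                index[key] = {"name": row[0]}
        conn.execute(
            "UPDATE change_log SET processed = 1 WHERE id = ?", (log_id,)
        )

    conn.commit()
    return len(pending)


# Small demonstration with one pending insert and one pending delete.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE change_log (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        table_name TEXT    NOT NULL,
        record_id  INTEGER NOT NULL,
        operation  TEXT    NOT NULL,
        processed  INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO products (id, name) VALUES (1, 'widget');
    INSERT INTO change_log (table_name, record_id, operation)
    VALUES ('products', 1, 'insert'), ('products', 2, 'delete');
""")
index = {("products", 2): {"name": "stale entry"}}
handled = process_change_log(conn, index)
```

In a real application a timer would call a routine like this every few seconds. With Lucene.Net, the dictionary upsert would correspond to something like `IndexWriter.UpdateDocument` with a term on the record ID field, and the removal to `IndexWriter.DeleteDocuments`; how documents are keyed and built is left out here.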

More bells and whistles can be added, but this should give you a fairly clear picture of how to keep the Lucene.Net index up to date in near real time.
