Indexing for Sphinx of legacy data generating some errors


Problem description

A Rails 3.2.18 application is being created, migrating data from a Rails 2.3.10 application. Data is being ported over via pg_dump and loaded via the psql command, without any errors.

One model of the 13 that are indexed via thinking_sphinx is getting some errors. Only about 1 in 8.5 documents is being indexed overall.

indexing index 'norm_core'...
ERROR: index 'norm_core': sql_range_query: ERROR:  integer out of range
 (DSN=pgsql://jerdvo:***@localhost:5432/fna_development).
total 1019 docs, 234688 bytes

The index file is:

ThinkingSphinx::Index.define :norm, :with => :active_record do
    indexes data
    indexes titolo
    indexes massima
    indexes numero
    indexes norm_fulltext
    indexes region.name, :as => :region
    indexes normtype.name, :as => :normtype

    has region_id
    has normtype_id
    has data, :as => :data_timestamp
end

I'm unsure about the syntax of the last element with data_timestamp, as it could be legacy syntax... It applies to a date field - from the schema:

    t.date     "data"

Other models have the same indexing scenario on a date, but none have generated the error.
[Assuming that line has to change, should one first do rake ts:configure before index or rebuild?]

Recommended answer

Two tips for debugging this:

  • Comment out all of the attributes (the has calls), run the ts:index task, and confirm it works. Then introduce each attribute back in one at a time to see which one is causing the error.
  • Check the maximum values of any attribute columns that don't work (e.g. SELECT MAX(data) FROM norms), and see whether that data is valid and within the range of a 32-bit unsigned integer (a console sketch for this check follows the list).
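A quick way to run that second check from a Rails console (a sketch only; it assumes the model behind the :norm index above is called Norm, and that the unsigned 32-bit ceiling is 2**32 - 1):

# Console sketch: find the largest values Sphinx would have to store
# as 32-bit unsigned attributes, and flag anything past the ceiling.
limit = 2**32 - 1  # 4294967295

[:region_id, :normtype_id].each do |column|
  max = Norm.maximum(column)
  puts "#{column}: #{max} (overflows 32-bit: #{max.to_i > limit})"
end

# For the date column, the relevant value is its Unix timestamp.
max_date = Norm.maximum(:data)
if max_date
  epoch = max_date.to_time.to_i
  puts "data: #{max_date} -> #{epoch} (overflows 32-bit: #{epoch > limit})"
end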

If it's one of the foreign keys that's ventured into 64-bit int territory, then you can specify that as the data type:

has normtype_id, :type => :bigint

If it's the date column, then you'll need to inform Thinking Sphinx to translate date/time values to be 64-bit integer timestamps by adding the following to each necessary environment in config/thinking_sphinx.yml:

development:
  64bit_timestamps: true
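As a rough illustration of why this matters (plain Ruby with a hypothetical far-future date, not Thinking Sphinx internals): any date past 2038-01-19 yields a timestamp beyond the signed 32-bit range, which is exactly the kind of value legacy data can contain.

# Dates beyond 2038-01-19 overflow a signed 32-bit integer timestamp.
timestamp = Time.utc(2040, 1, 1).to_i
puts timestamp              # 2208988800
puts timestamp > 2**31 - 1  # true: past the signed 32-bit ceiling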

A third source of the issue, I guess, is the primary key being bigger than a 32-bit integer, but TS should detect bigint columns and handle document ids appropriately. Of course, Sphinx also needs to be compiled to handle 64-bit document ids, but I would expect this to be the default (the compile flag, for reference's sake, is --enable-id64).
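To rule that out, comparing the largest primary key against the 32-bit ceiling is quick (again a console sketch assuming the Norm model):

# Is the largest document id past the unsigned 32-bit ceiling?
max_id = Norm.maximum(:id)
puts max_id
puts max_id.to_i > 2**32 - 1  # true would call for 64-bit document ids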

And if none of that helps... then, well, I'm at a loss as to what the cause may be.

