Postgres UPDATE to_tsvector将所有行更新为相同的值 [英] Postgres UPDATE to_tsvector updating all rows to same value
问题描述
我想设置 to_tsvector
语言(例如:法语),以便在渲染FTS矢量时使用正确的字典。
I want to set the to_tsvector
language (e.g.: 'French') so it uses the proper dictionary when rendering the FTS vector.
表消息的 locale_id
列位于 locales
表。然后,我需要将 locales
表联接到locale_id的 languages
表中以获得实际的语言名称。
Table messages has a locale_id
column, which is on the locales
table. Which I then need to join the locales
table to the languages
table on locale_id to get the actual language name.
此UPDATE应该循环遍历消息
中的所有行并设置 vector
列= to_tsvector(连接的语言名称,消息)
,但它会将每一行更新为相同的值和相同的语言字典(例如: to_tsvector('french',stringX)
。
This UPDATE is supposed to loop through all rows in messages
and set the vector
column = to_tsvector(joined language name, message)
, yet it updated each row to the same value and same language dictionary (e.g.: to_tsvector('french', stringX)
.
这是为什么呢?每一行都有不同的消息
字符串和不同的 locale_id
(含义,不同的语言名称)。
Why is this? Each row has a different message
string and a different locale_id
(meaning, different language name).
只需忽略pg_dictionary的语言配置即可:
So if I simply disregard the language config for pg_dictionary and do:
UPDATE messages
SET vector = to_tsvector(message);
表结果:
消息:
message | locale_id | vector
-----------------------------
Hi there | 1 | 'hi':1
Is a test | 2 | 'test':3
Le french | 3 | 'french':2 'le':1 --'le' SHOULD BE omitted since it's a stop word in French pg_dictionary
这很好。显然,它没有为每一行加载正确的语言命令。但是,执行以下操作会对每一行产生相同的结果:
This works fine. Obviously, it's not loading in the correct language dict for each row. However, doing the following yields the same result for each row:
UPDATE messages
SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message)
FROM (
select t3.language, t1.message
from messages as t1
inner join locales as t2 on (t1.locale_id = t2.id)
inner join languages as t3 on (t2.language_id = t3.id)
) messages_languages;
我也尝试过使用 ,结果相同:
I've also tried it with a WITH
, same results:
WITH messages_languages as (
select t3.language, t1.message
from messages as t1
inner join locales as t2 on (t1.locale_id = t2.id)
inner join languages as t3 on (t2.language_id = t3.id)
)
UPDATE messages
SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message)
FROM messages_languages;
表结果:
消息:
message | locale_id | vector
-----------------------------
Hi there | 1 | 'french':2
Is a test | 2 | 'french':2
Le french | 3 | 'french':2 --'le' omitted correctly in french pg_dictionary as it's a STOP word
'法语,因为pg_dictionary_name =法语,应该是此表中唯一的法语:2矢量结果,但所有行都相同
'french', for pg_dictionary_name = 'French', should be the only 'french':2 vector result in this table, yet all rows are the same
Locales:
Locales:
id | language_id
------------------
1 | 4
2 | 5
3 | 6
语言:
id | language
------------------
4 | 'English'
5 | 'German'
6 | 'French'
推荐答案
- you不需要子查询
- 您不需要重新选择消息,(目标表已经在范围表中)
- 您需要关联结果行的源查询
UPDATE messages msg
SET vector = to_tsvector(lang.language::regconfig, msg.message)
FROM locales as loco
JOIN languages as lang ON loco.language_id = lang.id
WHERE msg.locale_id = loco.id
;
这篇关于Postgres UPDATE to_tsvector将所有行更新为相同的值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!