MySQL: index JSON arrays of variable length?


Question

I want to create a tags column of type JSON.

For example:

id  |  tags
=========================================
1   |  '["tag1", "tag2", "tag3"]'
2   |  '["tag1", "tag3", "tag5", "tag7"]'
3   |  '["tag2", "tag5"]'

I want to index each tag in the arrays, without knowing the length of the arrays (variable length).

So if I query for rows that contain tag2, it should return rows 1 and 3.


https://dev.mysql.com/doc/refman/5.7/en/json.html

JSON columns cannot be indexed. You can work around this restriction by creating an index on a generated column that extracts a scalar value from the JSON column

By "extracts a scalar value", does this mean I must extract & index each item in the arrays individually (meaning I must know the maximum length of the array to index them all)? How do I index a variable length array?

Recommended answer


By "extracts a scalar value", does this mean I must extract & index each item in the arrays individually [...]?

You can extract as many items as you want. They will be stored as scalars (e.g. string), rather than as compound values (which JSON is).

CREATE TABLE mytags (
    id INT NOT NULL AUTO_INCREMENT,
    tags JSON,
    PRIMARY KEY (id)
);

INSERT INTO mytags (tags) VALUES
    ('["tag1", "tag2", "tag3"]'),
    ('["tag1", "tag3", "tag5", "tag7"]'),
    ('["tag2", "tag5"]');

SELECT * FROM mytags;

+----+----------------------------------+
| id | tags                             |
+----+----------------------------------+
|  1 | ["tag1", "tag2", "tag3"]         |
|  2 | ["tag1", "tag3", "tag5", "tag7"] |
|  3 | ["tag2", "tag5"]                 |
+----+----------------------------------+
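As a baseline before adding any index, the query from the question can be sketched with JSON_CONTAINS, one of the JSON search functions available since MySQL 5.7. This is only an illustration against the sample table above; without a supporting index it has to examine every row.

```sql
-- Find every row whose tags array contains the string "tag2".
-- The second argument must itself be valid JSON, hence the inner double quotes.
SELECT id, tags
FROM mytags
WHERE JSON_CONTAINS(tags, '"tag2"');
-- With the three sample rows, this matches ids 1 and 3.
```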

Let's create an index on one item only (the first value of the JSON array):

ALTER TABLE mytags
    ADD COLUMN tags_scalar VARCHAR(255) GENERATED ALWAYS AS (json_extract(tags, '$[0]')),
    ADD INDEX tags_index (tags_scalar);

SELECT * FROM mytags;

+----+----------------------------------+-------------+
| id | tags                             | tags_scalar |
+----+----------------------------------+-------------+
|  1 | ["tag1", "tag2", "tag3"]         | "tag1"      |
|  2 | ["tag1", "tag3", "tag5", "tag7"] | "tag1"      |
|  3 | ["tag2", "tag5"]                 | "tag2"      |
+----+----------------------------------+-------------+

Now you have an index on the VARCHAR column tags_scalar. The stored value still includes the JSON double quotes, which json_unquote removes:

ALTER TABLE mytags DROP COLUMN tags_scalar, DROP INDEX tags_index;

ALTER TABLE mytags
    ADD COLUMN tags_scalar VARCHAR(255) GENERATED ALWAYS AS (json_unquote(json_extract(tags, '$[0]'))),
    ADD INDEX tags_index (tags_scalar);

SELECT * FROM mytags;

+----+----------------------------------+-------------+
| id | tags                             | tags_scalar |
+----+----------------------------------+-------------+
|  1 | ["tag1", "tag2", "tag3"]         | tag1        |
|  2 | ["tag1", "tag3", "tag5", "tag7"] | tag1        |
|  3 | ["tag2", "tag5"]                 | tag2        |
+----+----------------------------------+-------------+
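To see what this index does and does not cover, here is a small sketch that queries the generated column defined above. It assumes the sample rows shown in the table; whether the optimizer actually chooses tags_index on such a tiny table is not guaranteed, so check the EXPLAIN output yourself.

```sql
-- Equality lookups on the generated column can be served by tags_index.
SELECT id FROM mytags WHERE tags_scalar = 'tag1';   -- ids 1 and 2

-- Only the first array element is stored, so this misses row 1,
-- where "tag2" sits at position $[1].
SELECT id FROM mytags WHERE tags_scalar = 'tag2';   -- id 3 only

-- Inspect the plan; the "key" column shows which index, if any, is used.
EXPLAIN SELECT id FROM mytags WHERE tags_scalar = 'tag1';
```

This is why a single extracted element does not answer the original question: the index only knows about one position in the array.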

As you can already imagine, the generated column can include more items from the JSON:

ALTER TABLE mytags DROP COLUMN tags_scalar, DROP INDEX tags_index;

ALTER TABLE mytags
    ADD COLUMN tags_scalar VARCHAR(255) GENERATED ALWAYS AS (json_extract(tags, '$[0]', '$[1]', '$[2]')),
    ADD INDEX tags_index (tags_scalar);

SELECT * from mytags;

+----+----------------------------------+--------------------------+
| id | tags                             | tags_scalar              |
+----+----------------------------------+--------------------------+
|  1 | ["tag1", "tag2", "tag3"]         | ["tag1", "tag2", "tag3"] |
|  2 | ["tag1", "tag3", "tag5", "tag7"] | ["tag1", "tag3", "tag5"] |
|  3 | ["tag2", "tag5"]                 | ["tag2", "tag5"]         |
+----+----------------------------------+--------------------------+

Or use any other valid expression to generate a string from the JSON structure, so that you obtain something that can be easily indexed and searched, like "tag1tag3tag5tag7".


[...](meaning I must know the maximum length of the array to index them all)?

As explained above, you don't need to know: paths that match nothing are simply skipped (see row 3 in the last result), and any valid expression can handle the missing values. But of course it's always better to know.

Now comes the architecture decision: is the JSON data type the most appropriate way to achieve the goal and solve this particular problem? Is JSON the right tool here? Will it speed up searching?
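One common alternative to weigh against JSON is a plain child table with one row per tag. The sketch below is not part of the original answer; the table name mytags_normalized and its column names are hypothetical, chosen only for illustration.

```sql
-- Hypothetical child table: one row per (item, tag) pair.
CREATE TABLE mytags_normalized (
    item_id INT NOT NULL,
    tag VARCHAR(64) NOT NULL,
    PRIMARY KEY (item_id, tag),
    INDEX tag_index (tag)        -- ordinary B-tree index on every tag
);

-- Same data as the JSON sample rows, flattened.
INSERT INTO mytags_normalized (item_id, tag) VALUES
    (1, 'tag1'), (1, 'tag2'), (1, 'tag3'),
    (2, 'tag1'), (2, 'tag3'), (2, 'tag5'), (2, 'tag7'),
    (3, 'tag2'), (3, 'tag5');

-- Array length no longer matters: every tag is its own indexed row.
SELECT item_id FROM mytags_normalized WHERE tag = 'tag2';   -- item_ids 1 and 3
```

The trade-off is an extra table and a join when the full tag list is needed, in exchange for an index that covers every tag regardless of how many an item has.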


How do I index a variable length array?

If you insist, cast the JSON value to a string:

ALTER TABLE mytags DROP COLUMN tags_scalar, DROP INDEX tags_index;

ALTER TABLE mytags
    ADD COLUMN tags_scalar VARCHAR(255) GENERATED ALWAYS AS (replace(replace(replace(cast(tags as char), '"', ''), '[', ''), ']', '')),
    ADD INDEX tags_index (tags_scalar);

SELECT * from mytags;

+----+----------------------------------+------------------------+
| id | tags                             | tags_scalar            |
+----+----------------------------------+------------------------+
|  1 | ["tag1", "tag2", "tag3"]         | tag1, tag2, tag3       |
|  2 | ["tag1", "tag3", "tag5", "tag7"] | tag1, tag3, tag5, tag7 |
|  3 | ["tag2", "tag5"]                 | tag2, tag5             |
+----+----------------------------------+------------------------+
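A short sketch of how the flattened column might be searched, assuming the tags_scalar definition just above. Note that the B-tree index helps prefix searches only; a pattern with a leading wildcard cannot use it for a range scan.

```sql
-- Substring search: finds "tag2" anywhere in the flattened list,
-- but cannot use tags_index because of the leading wildcard.
-- It would also match longer tags such as "tag20".
SELECT id FROM mytags WHERE tags_scalar LIKE '%tag2%';   -- ids 1 and 3 for the sample rows

-- Prefix search: can use tags_index, but only says something about the first tag.
SELECT id FROM mytags WHERE tags_scalar LIKE 'tag1%';    -- ids 1 and 2
```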

One way or another, you end up with a VARCHAR or TEXT column, to which you apply the most suitable index structure.
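For readers on a newer server: MySQL 8.0.17 and later add multi-valued indexes, which index each element of a JSON array directly. This goes beyond the 5.7 documentation quoted in the question, and the sketch below is an assumption-laden illustration rather than a tested recipe; the index name tags_mv_index and the CHAR(64) element length are hypothetical choices.

```sql
-- Requires MySQL 8.0.17 or later (not available in 5.7).
-- Index every element of the tags array, whatever its length.
ALTER TABLE mytags
    ADD INDEX tags_mv_index ((CAST(tags->'$' AS CHAR(64) ARRAY)));

-- MEMBER OF is one of the predicates that can use a multi-valued index.
-- The expression must match the one used in the index definition.
SELECT id FROM mytags WHERE 'tag2' MEMBER OF (tags->'$');
```

Run EXPLAIN on the SELECT to confirm that the index is actually chosen on your version and data.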

Further reading:

  • Indexing a Generated Column to Provide a JSON Column Index
  • Functions That Search JSON Values
