Searching on array items in a DynamoDB table

Question

I need to understand how to search on an attribute of a DynamoDB item when that attribute is part of an array.

Suppose I am denormalising a table in which a person can have many email addresses. I would create an array in the person table to store the email addresses.

Now, since the email address is not part of the sort key, if I need to search by email address to find the person record, I need to index the email attribute.

  1. Can I create an index on the email address, which has a one-to-many relationship with a person record and, as I understand it, is stored as an array in DynamoDB?
  2. Would this secondary index be global or local, assuming I have billions of person records?
  3. If I can create it as either an LSI or a GSI, please explain the pros and cons of each.

Many thanks!

Recommended answer

Stu's answer has some great information in it, and he is right: you can't use an Array itself as a key.

What you CAN sometimes do is concatenate several variables (or an Array) into a single string with a known separator (for example '_'), and then use that string as a Sort Key.

I used this concept to create a composite Sort Key made up of multiple ISO 8601 dates (DynamoDB has no native date type, so dates are commonly stored as ISO 8601 strings in String attributes). I also included several attributes that were not dates but integers with a fixed character length.

By using the BETWEEN comparison, I am able to query the variables concatenated into the Sort Key individually, or to construct a more complex query that matches against all of them as a group.
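As a rough illustration of why fixed-width components matter, here is a minimal Python sketch that builds such a composite key and applies a BETWEEN-style range to it locally. The function names, the '_' separator and the sample values are assumptions made for this example, not part of the original setup. Because strings are compared left to right, a range like this constrains the leading component most directly.

```python
from datetime import date

SEP = "_"  # assumed separator; it must not occur inside any component


def make_sort_key(start: date, end: date, priority: int) -> str:
    """Join fixed-width components so that string order follows component order."""
    # ISO 8601 dates are always 10 characters; the integer is zero-padded to 4.
    return SEP.join([start.isoformat(), end.isoformat(), f"{priority:04d}"])


def between(value: str, low: str, high: str) -> bool:
    """Mimic a BETWEEN comparison: inclusive and lexicographic on strings."""
    return low <= value <= high


keys = [
    make_sort_key(date(2020, 1, 15), date(2020, 2, 1), 7),
    make_sort_key(date(2020, 3, 2), date(2020, 3, 9), 12),
    make_sort_key(date(2021, 6, 30), date(2021, 7, 4), 3),
]

# Range on the leading component: every key whose first date falls in 2020.
in_2020 = [k for k in keys if between(k, "2020-01-01", "2021-01-01")]

for key in in_2020:
    print(key)
```

Zero-padding the integer is what keeps 12 sorting after 7; without it, "12" would sort before "7" as a string.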

In other words, a data object could use a Sort Key like this: email@gmail.com_email@msn.com_email@someotherplace.com

Then you could query that (assuming you know what the partition key is) with something like this, written as pseudo-SQL to show the intent:

SELECT * FROM Users WHERE User='Bob' AND Emails LIKE '%email@msn.com%'
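Note that DynamoDB key conditions do not offer a substring match like LIKE '%...%'; on a sort key they support comparisons such as =, <, >, BETWEEN and begins_with. The closest real query is therefore a prefix match, which only reaches the first email in the concatenated key. The sketch below builds the parameters for such a Query call as a plain dictionary; the table name Users and the attribute names User and Emails are taken from the pseudo-SQL above and are assumptions about the schema.

```python
def build_email_prefix_query(user: str, email_prefix: str) -> dict:
    """Build Query parameters for a prefix match on the sort key.

    Table and attribute names are assumed for illustration only.
    """
    return {
        "TableName": "Users",
        # Placeholders (#u, #e) avoid clashes with DynamoDB reserved words.
        "KeyConditionExpression": "#u = :user AND begins_with(#e, :prefix)",
        "ExpressionAttributeNames": {"#u": "User", "#e": "Emails"},
        "ExpressionAttributeValues": {
            ":user": {"S": user},
            ":prefix": {"S": email_prefix},
        },
    }


params = build_email_prefix_query("Bob", "email@gmail.com")
print(params["KeyConditionExpression"])

# With boto3 installed and AWS credentials configured, the dictionary could be
# passed to the low-level client: boto3.client("dynamodb").query(**params)
```

If the lookup has to work for any email in the list, not just the first one, a prefix match on a concatenated key is not enough, which is where the choice of keys discussed next comes in.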

I think the real question you are asking is: what should my sort keys and partition keys be? That will depend on exactly which queries you want to make and how frequently each type of query is used.

I have found that I have far more success with DynamoDB if I think about the queries I want to make first, and then work back from there.

The issue here is that you still need to 'know' the Partition Key for your secondary data structure. GSIs and LSIs help you avoid creating additional DynamoDB tables for the sole purpose of improving data access.

From Amazon: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SecondaryIndexes.html

To me it sounds more like the issue is selecting the Keys.

LSI (Local Secondary Index): If, for your Query case, you don't know the Partition Key to begin with (as it seems you don't), then a Local Secondary Index won't help, since it has the SAME Partition Key as the base table.

GSI (Global Secondary Index): A Global Secondary Index could help, in that it can have a DIFFERENT Partition Key and Sort Key from the base table (presumably a partition key that you could 'know' for this query).

So you could use the Email attribute (perhaps composite) as the Sort Key on your GSI, and then something like a service name or sign-up stage as your Partition Key. This would let you 'know' which partition a user is in, based on their progress or the service they signed up from (for example).
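A minimal sketch of what that could look like, expressed as parameter dictionaries for the DynamoDB CreateTable and Query operations. The table name Users, the index name ServiceEmailIndex, and the attribute names User, Service and Email are all hypothetical, and on-demand billing is assumed so that no throughput settings are needed.

```python
def build_table_definition() -> dict:
    """CreateTable parameters with a GSI keyed on (Service, Email); names are assumed."""
    return {
        "TableName": "Users",
        "BillingMode": "PAY_PER_REQUEST",  # assumed; avoids ProvisionedThroughput
        # Only attributes used in a key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "User", "AttributeType": "S"},
            {"AttributeName": "Service", "AttributeType": "S"},
            {"AttributeName": "Email", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "User", "KeyType": "HASH"}],
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "ServiceEmailIndex",
                "KeySchema": [
                    {"AttributeName": "Service", "KeyType": "HASH"},  # partition key
                    {"AttributeName": "Email", "KeyType": "RANGE"},  # sort key
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
    }


def build_index_query(service: str, email: str) -> dict:
    """Query parameters that look a user up through the GSI."""
    return {
        "TableName": "Users",
        "IndexName": "ServiceEmailIndex",
        "KeyConditionExpression": "#s = :service AND #e = :email",
        "ExpressionAttributeNames": {"#s": "Service", "#e": "Email"},
        "ExpressionAttributeValues": {
            ":service": {"S": service},
            ":email": {"S": email},
        },
    }


definition = build_table_definition()
query = build_index_query("newsletter", "email@msn.com")
print(definition["GlobalSecondaryIndexes"][0]["IndexName"])
```

With boto3 available, these dictionaries could be passed to the client's create_table and query methods. The trade-off in this design is that every lookup by email has to supply the service name as well.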

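Keep in mind that, unlike the base table's primary key, the key values in a GSI or LSI do not have to be unique, so a query against an index can return more than one item for the same key.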
