DynamoDB Query with multiple tags


Question

I am rather new to DynamoDB, and we are currently thinking about migrating an existing project to a serverless application using DynamoDB, where we want to adapt the following setup from an RDBMS:

Tables:

  • Projects (ProjectID)
  • Files (FileID, ProjectID, Filename)
  • Tags (FileID, Tag)

We want to make a query with DynamoDB to fetch all Files for a specific Project (by ProjectID) that have one or more given Tags (by Tag). In an RDBMS this query would be simple, with something like:

SELECT * FROM Files JOIN Tags ON Tags.FileID = Files.FileID WHERE Files.ProjectID = ?PROJECT AND Tags.Tag IN (?TAG_1, ?TAG_2, ...)

At the moment, we have the following DynamoDB setup (but it can still be changed):

  • Projects (ProjectID [HashKey], ...)
  • Files (ProjectID [HashKey], FileID [RangeKey], ...)

Please also consider that the number of projects is large (between 1,000 and 30,000), as is the number of files per project (between 50 and 100,000), and that the query should be really fast.

How can this be achieved using a DynamoDB Query, ideally without filter expressions, since they are applied after the data has been read? It would be perfect if the Files table could have a StringSet attribute Tags, but I guess this cannot be used for an efficient DynamoDB Query (i.e. without resorting to a Scan), since DynamoDB index keys can only be of type String, Binary, or Number and not StringSet. Is this maybe an applicable use case for a Global Secondary Index (GSI)?

Recommended answer

A bit late, just saw this question referenced from another one.

I guess you've gone and solved it with something like this?

DynamoDB tables

  • Projects (ProjectID [HashKey], ...)
  • Files (ProjectID [HashKey], FileID [RangeKey], ...)
  • FileTags (Tag [HashKey], FileID [RangeKey], ProjectID [LSI Sort Key])
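As a rough illustration, the tag table in the list above could be declared with parameters like the following. This is only a sketch: the table name `FileTags`, the index name `TagProjectIndex`, the string attribute types, the `KEYS_ONLY` projection, and on-demand billing are all assumptions, not something stated in the answer. The dictionary follows the shape accepted by the `create_table` call of the boto3 DynamoDB client.

```python
def file_tags_table_definition(table_name="FileTags", index_name="TagProjectIndex"):
    """Build create_table parameters for the tag table (names are hypothetical)."""
    return {
        "TableName": table_name,
        # Only attributes used in a key schema are declared up front.
        "AttributeDefinitions": [
            {"AttributeName": "Tag", "AttributeType": "S"},
            {"AttributeName": "FileID", "AttributeType": "S"},
            {"AttributeName": "ProjectID", "AttributeType": "S"},
        ],
        # Primary key: Tag + FileID, so each (tag, file) pair is unique.
        "KeySchema": [
            {"AttributeName": "Tag", "KeyType": "HASH"},
            {"AttributeName": "FileID", "KeyType": "RANGE"},
        ],
        # A Local Secondary Index shares the table's hash key (Tag)
        # and swaps in ProjectID as the sort key.
        "LocalSecondaryIndexes": [
            {
                "IndexName": index_name,
                "KeySchema": [
                    {"AttributeName": "Tag", "KeyType": "HASH"},
                    {"AttributeName": "ProjectID", "KeyType": "RANGE"},
                ],
                # Assumption: the table and index keys are enough for lookups.
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


# With boto3 installed and credentials configured, this could be passed as:
#   boto3.client("dynamodb").create_table(**file_tags_table_definition())
```

Note that a Local Secondary Index can only be defined when the table is created, so this choice has to be made up front.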

On the FileTags table, you need the FileID to make the primary key unique, but you can add the ProjectID as the sort key of a Local Secondary Index, so you can query on Tag + ProjectID.
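A Tag + ProjectID lookup on such an index might be sketched as below: one Query per tag, with the resulting FileIDs merged on the client, since a single Query targets exactly one hash key value. The table name `FileTags`, the index name `TagProjectIndex`, string-typed IDs, and both helper functions are hypothetical. The request dictionaries follow the low-level boto3 `query` parameter shape, and the responses are assumed to use the low-level attribute format (`{"S": ...}`).

```python
def build_tag_queries(project_id, tags, table_name="FileTags",
                      index_name="TagProjectIndex"):
    """Return one Query request per distinct tag for the given project."""
    unique_tags = list(dict.fromkeys(tags))  # drop duplicates, keep order
    return [
        {
            "TableName": table_name,
            "IndexName": index_name,
            # Both conditions are key conditions, so no filter expression is needed.
            "KeyConditionExpression": "#tag = :tag AND #project = :project",
            "ExpressionAttributeNames": {"#tag": "Tag", "#project": "ProjectID"},
            "ExpressionAttributeValues": {
                ":tag": {"S": tag},
                ":project": {"S": project_id},
            },
        }
        for tag in unique_tags
    ]


def collect_file_ids(responses):
    """Union the FileIDs found in a sequence of Query responses."""
    file_ids = set()
    for response in responses:
        for item in response.get("Items", []):
            file_ids.add(item["FileID"]["S"])
    return file_ids


# With boto3, each request could be sent as client.query(**request);
# large result sets are paginated via LastEvaluatedKey, which the caller
# would have to follow.
```

Taking the union gives files that carry any of the tags, which matches the "one or multiple Tags" requirement in the question; intersecting the per-tag sets instead would give files that carry all of them.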

This is a form of data denormalization, but that's what it takes to go NoSQL :-(. For example, if a File is moved to another Project, you'll need to update the ProjectID not only on the File, but also on all of its Tags.
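The extra write cost of moving a File can be made concrete with a small sketch that builds one update per tag item. It assumes the caller already knows the file's tags (for example because they are also stored on the Files item, which the answer does not specify), and the table name `FileTags` and the helper function are hypothetical. The dictionaries follow the low-level boto3 `update_item` parameter shape.

```python
def build_project_move_updates(file_id, tags, new_project_id,
                               table_name="FileTags"):
    """Return one update_item request per tag item of the moved file."""
    return [
        {
            "TableName": table_name,
            # Tag + FileID identifies the tag item; these keys do not change.
            "Key": {"Tag": {"S": tag}, "FileID": {"S": file_id}},
            # ProjectID is only an index sort key here, so it can be updated in place.
            "UpdateExpression": "SET #project = :project",
            "ExpressionAttributeNames": {"#project": "ProjectID"},
            "ExpressionAttributeValues": {":project": {"S": new_project_id}},
        }
        for tag in tags
    ]


# With boto3, each request could be sent as client.update_item(**request).
```

On the Files table itself ProjectID is the hash key, and primary key attributes cannot be changed by an update, so moving the Files item would mean writing a new item and deleting the old one.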
