Preferred way of retrieving a row with multiple related rows

Question

I'm currently hand-writing a DAL in C# with SqlDataReader and stored procedures. Performance is important, but it still should be maintainable...

Let's say there is a table recipes

(recipeID, author, timeNeeded, yummyFactor, ...)

and a table ingredients

(recipeID, name, amount, yummyContributionFactor, ...)

Now I'd like to query about 200 recipes together with their ingredients. I see the following possibilities:

  • Query all recipes, then query the ingredients for each recipe.
    This would of course result in maaany queries.
  • Query all recipes and their ingredients in one big joined list. This causes a lot of useless traffic, because each recipe's data is transmitted multiple times.
  • Query all recipes, then query all the ingredients at once by passing the list of recipeIDs back to the database. Alternatively, issue both queries at once and return multiple result sets. Back in the DAL, associate the ingredients with the recipes by their recipeID.
  • Exotic way: cursor through all recipes and return two separate result sets per recipe, one for the recipe and one for its ingredients. Is there a limit on the number of result sets?

For more variety, the recipes can be selected by a list of IDs from the DAL or by some parametrized SQL condition.

Which one do you think has the best performance/mess ratio?

Recommended answer

If you only need to join two tables and an "ingredient" isn't a huge amount of data, the best balance of performance and maintainability is likely to be a single joined query. Yes, you are repeating some data in the results, but unless you have 100,000 rows and it's overloading the database server/network, it's too soon to be optimizing.
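As a rough illustration of the single-joined-query route, the sketch below runs one join and folds the repeated recipe columns back into one object per recipe. It uses Python with an in-memory SQLite database purely so that it is self-contained; in the C# DAL the same loop would sit around a SqlDataReader. The sample rows and the `load_recipes` helper are assumptions made for this example, not part of the original question.

```python
import sqlite3

# In-memory stand-in for the real database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (recipeID INTEGER PRIMARY KEY, author TEXT, timeNeeded INTEGER);
    CREATE TABLE ingredients (recipeID INTEGER, name TEXT, amount TEXT);
    INSERT INTO recipes VALUES (1, 'alice', 30), (2, 'bob', 45);
    INSERT INTO ingredients VALUES
        (1, 'flour', '200 g'), (1, 'egg', '2'), (2, 'rice', '1 cup');
""")


def load_recipes(conn):
    """Run one joined query and group the rows into one dict per recipe."""
    rows = conn.execute("""
        SELECT r.recipeID, r.author, r.timeNeeded, i.name, i.amount
        FROM recipes AS r
        LEFT JOIN ingredients AS i ON i.recipeID = r.recipeID
        ORDER BY r.recipeID
    """)
    recipes = {}
    for recipe_id, author, time_needed, name, amount in rows:
        # The recipe columns repeat on every joined row; keep only the first copy.
        recipe = recipes.setdefault(recipe_id, {
            "recipeID": recipe_id,
            "author": author,
            "timeNeeded": time_needed,
            "ingredients": [],
        })
        if name is not None:  # LEFT JOIN yields NULLs for recipes without ingredients
            recipe["ingredients"].append({"name": name, "amount": amount})
    return list(recipes.values())


recipes = load_recipes(conn)
```

The duplication the question worries about is visible here: `author` and `timeNeeded` arrive once per ingredient row and are simply ignored after the first.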

The story is a little bit different if you have many layers of joins each with decreasing cardinality. For example, in one of my apps I have something like the following:

Event -> EventType -> EventCategory
                   -> EventPriority
                   -> EventSource   -> EventSourceType -> Vendor

A query like this results in a significant amount of duplication which is unacceptable when there are 100k events to retrieve, 1000 event types, maybe 10 categories/priorities, 50 sources, and 5 vendors. So in that case, I have a stored procedure that returns multiple result sets:

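  • All 100k Events with just the EventTypeID
  • The 1000 EventTypes, with CategoryID, PriorityID, etc., that apply to these Events
  • The 10 EventCategories and EventPriorities that apply to the above EventTypes
  • The 50 EventSources that generated the 100k Events
  • And so on, you get the idea.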

Because the cardinality goes down so drastically, it is much quicker to download only what is needed here and use a few dictionaries on the client side to piece it together (if that is even necessary). In some cases the low-cardinality data may even be cached in memory and never retrieved from the database at all (except on app start or when the data is changed).
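A minimal sketch of the client-side stitching might look like the following. Each list stands in for one result set from the stored procedure; the column layouts and sample values are hypothetical, since the answer does not show the actual schema. It is written in Python for brevity; a C# DAL would use `Dictionary<int, T>` in the same way.

```python
# Each list stands in for one result set (hypothetical columns and values).
events = [(1, 10), (2, 10), (3, 11)]                  # (EventID, EventTypeID)
event_types = [(10, "DiskFull", 100),                 # (EventTypeID, Name, CategoryID)
               (11, "LoginFailed", 101)]
event_categories = [(100, "Hardware"),                # (CategoryID, Name)
                    (101, "Security")]

# Build the lookups from the lowest-cardinality set upwards.
categories_by_id = {category_id: name for category_id, name in event_categories}
types_by_id = {
    type_id: {"name": name, "category": categories_by_id[category_id]}
    for type_id, name, category_id in event_types
}


def resolve(event):
    """Attach type and category details to a bare (EventID, EventTypeID) row."""
    event_id, type_id = event
    event_type = types_by_id[type_id]
    return {
        "EventID": event_id,
        "type": event_type["name"],
        "category": event_type["category"],
    }


resolved = [resolve(event) for event in events]
```

Because the lookup dictionaries are small, they could also be built once and kept in memory, which matches the caching idea mentioned above.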

The determining factors in using an approach such as this are a very high number of results and a steep decrease in cardinality for the joins, in other words fanning in. This is actually the reverse of most usages and probably the reverse of what you are doing here. If you are selecting "recipes" and joining to "ingredients", you are probably fanning out, which can make this approach wasteful, especially if there are only two tables to join.
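To get a feel for the difference between fanning in and fanning out, here is a back-of-the-envelope estimate of how many cells cross the wire in each case. The row counts for the event hierarchy come from the answer; the figure of 5 columns per table and 10 ingredients per recipe are assumptions chosen only to make the arithmetic concrete, and the model ignores column sizes and protocol overhead.

```python
def joined_cells(rows, column_counts):
    """Cells sent when every joined row carries the columns of every table."""
    return rows * sum(column_counts)


def split_cells(row_counts, column_counts):
    """Cells sent when each table is returned once as its own result set."""
    return sum(rows * cols for rows, cols in zip(row_counts, column_counts))


COLS = 5  # hypothetical number of columns per table

# Fan-in: events, types, categories, priorities, sources, vendors.
event_rows = [100_000, 1_000, 10, 10, 50, 5]
event_cols = [COLS] * len(event_rows)
fan_in_joined = joined_cells(event_rows[0], event_cols)
fan_in_split = split_cells(event_rows, event_cols)

# Fan-out: 200 recipes with an assumed 10 ingredients each.
recipe_rows = [200, 200 * 10]
recipe_cols = [COLS, COLS]
fan_out_joined = joined_cells(recipe_rows[1], recipe_cols)
fan_out_split = split_cells(recipe_rows, recipe_cols)

print(f"fan-in:  joined={fan_in_joined:,} split={fan_in_split:,}")
print(f"fan-out: joined={fan_out_joined:,} split={fan_out_split:,}")
```

Under these assumptions the split result sets cut the event case by roughly a factor of six, while for two tables the saving stays below a factor of two and the absolute volume is small either way, which is consistent with the advice to start with the plain join.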

So I'm just putting it out there that this is a possible alternative if performance becomes an issue down the road; at this point in your design, before you have real-world performance data, I would simply go the route of using a single joined result set.
