How to efficiently prune data
Problem description
I am currently working on a problem at work where I need to take data and prune the generated scenarios based on user-defined restrictions. I have tried a multitude of things, but can't get anything to run as efficiently as I would like. I might have to run outside the DB so I can scale the run, but I thought I should try to do it inside the DB if possible. For instance, suppose I have 3 entities:
Transportation Type:
Car
Boat
Plane
Color:
Blue
Green
Red
Purple
White
Accessories:
Trailer
Wheels
Propeller
Parachute
A user can enter a restriction such as:
Transportation_Type=Boat, Accessories=Wheels
Any scenario that contains both Boat and Wheels would then be restricted.
Example of a valid scenario under this restriction: Boat/Red/Trailer
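To make the restriction rule concrete, here is a minimal sketch in Python. The representation is an assumption, not something from the post: a scenario is a dict from level name to value, a restriction is a set of (level, value) pairs, and `is_restricted` is a hypothetical helper name.

```python
# Assumed representation: a restriction is a set of (level, value) pairs
# that must not ALL appear together in one scenario.
restrictions = [
    frozenset({("Transportation_Type", "Boat"), ("Accessories", "Wheels")}),
]


def is_restricted(scenario: dict, restrictions: list) -> bool:
    """Return True if the scenario contains every pair of at least one restriction."""
    pairs = set(scenario.items())
    return any(restriction <= pairs for restriction in restrictions)


valid = {"Transportation_Type": "Boat", "Color": "Red", "Accessories": "Trailer"}
invalid = {"Transportation_Type": "Boat", "Color": "Red", "Accessories": "Wheels"}

print(is_restricted(valid, restrictions))    # False
print(is_restricted(invalid, restrictions))  # True
```

A subset test handles restrictions of any size, so a three-level restriction would work the same way as the two-level Boat/Wheels one.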
Where this gets complicated: building all possible scenarios for the 3 entities isn't too bad, even with user-defined restrictions. But what if there are something like 22 entities? (An entity is basically a level with values.) The set of scenarios could get huge, and applying restrictions to it would be difficult, especially when a restriction is made up of a set of level/value pairs (like Boat and Wheels).
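To put numbers on that growth, the sketch below counts the unrestricted scenarios as the product of the level sizes. The three levels are the ones listed above. The 22-level case is purely hypothetical and assumes 4 values per level, since the post gives no sizes.

```python
import math

levels = {
    "Transportation_Type": ["Car", "Boat", "Plane"],
    "Color": ["Blue", "Green", "Red", "Purple", "White"],
    "Accessories": ["Trailer", "Wheels", "Propeller", "Parachute"],
}


def scenario_count(levels: dict) -> int:
    """Number of scenarios in the full cross product of all levels."""
    return math.prod(len(values) for values in levels.values())


print(scenario_count(levels))  # 3 * 5 * 4 = 60

# Hypothetical: 22 levels with 4 values each (sizes are an assumption).
big_levels = {f"Level_{i}": ["a", "b", "c", "d"] for i in range(22)}
print(scenario_count(big_levels))  # 4 ** 22 = 17592186044416
```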
Does anyone have any ideas?
I was able to get it to be really performant through about 14-16 levels by building dynamic LIKE statements that I could check each derived scenario against. But after that the processing time explodes (and it could explode at fewer levels if the levels had a lot more values).
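For comparison, one alternative to checking each fully derived scenario is to prune while the scenario is being built, so that a restricted partial combination is never extended. This is not the approach described in the post. It is a sketch of that swapped-in technique in Python, run outside the DB, and `generate_valid` plus the data layout are assumptions made for illustration.

```python
levels = {
    "Transportation_Type": ["Car", "Boat", "Plane"],
    "Accessories": ["Trailer", "Wheels", "Propeller", "Parachute"],
    "Color": ["Blue", "Green", "Red", "Purple", "White"],
}
restrictions = [
    frozenset({("Transportation_Type", "Boat"), ("Accessories", "Wheels")}),
]


def generate_valid(levels: dict, restrictions: list):
    """Yield every scenario that does not fully contain any restriction."""
    names = list(levels)

    def extend(index: int, chosen: dict):
        if index == len(names):
            yield dict(chosen)  # copy, because `chosen` keeps being mutated
            return
        name = names[index]
        for value in levels[name]:
            chosen[name] = value
            pairs = set(chosen.items())
            # Prune: drop this branch as soon as a restriction is fully matched.
            if not any(r <= pairs for r in restrictions):
                yield from extend(index + 1, chosen)
            del chosen[name]

    yield from extend(0, {})


valid_scenarios = list(generate_valid(levels, restrictions))
print(len(valid_scenarios))  # 60 total minus 5 Boat/Wheels colors = 55
```

Pruning only pays off if the levels that appear in restrictions are assigned early, which is why Accessories is placed before Color here. With the restricted levels last, every branch would be built almost completely before being rejected.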
Recommended answer
If I understand correctly, the goal is to generate scenarios that meet certain criteria. The scenarios would be generated from combinations of attributes.
Assuming that each entity is in a separate table, you could write the query as:
select *
from TransportationType tt cross join
Color c cross join
Accessories a
where tt.val in (<accepted transportation types>) and
c.val in (<accepted colors>) and
a.val in (<accepted accessories>)
If my understanding is correct, this cross join will generate a very large number of scenarios as the number of entities increases. If you have a table of allowable scenarios (combinations of entities), that would help filter things down.
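The question describes forbidden combinations rather than an allow-list, so a filter table could plausibly hold restrictions instead. The sketch below runs the cross join in an in-memory SQLite database and excludes restricted pairs with NOT EXISTS. The `Restriction` table and its columns are hypothetical, and it only covers restrictions between transportation type and accessories.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TransportationType (val TEXT);
    CREATE TABLE Color (val TEXT);
    CREATE TABLE Accessories (val TEXT);
    -- Hypothetical table: one row per forbidden (type, accessory) pair.
    CREATE TABLE Restriction (tt_val TEXT, a_val TEXT);

    INSERT INTO TransportationType VALUES ('Car'), ('Boat'), ('Plane');
    INSERT INTO Color VALUES ('Blue'), ('Green'), ('Red'), ('Purple'), ('White');
    INSERT INTO Accessories VALUES ('Trailer'), ('Wheels'), ('Propeller'), ('Parachute');
    INSERT INTO Restriction VALUES ('Boat', 'Wheels');
""")

rows = conn.execute("""
    SELECT tt.val, c.val, a.val
    FROM TransportationType tt
    CROSS JOIN Color c
    CROSS JOIN Accessories a
    WHERE NOT EXISTS (
        SELECT 1
        FROM Restriction r
        WHERE r.tt_val = tt.val AND r.a_val = a.val
    )
""").fetchall()

print(len(rows))  # 60 combinations minus 5 Boat/Wheels rows = 55
```

The database still has to consider the full cross product here, so this shows the shape of the filter rather than a fix for the growth problem.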
I've shown this with a separate table for each entity, but you can replace them with subqueries:
from (select *
from table t
where t.type = 'TransportationType'
) TransportationType cross join
...
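As a runnable illustration of the subquery form, the sketch below keeps all values in one table with a `type` column and builds each entity from a subquery. The table name `entity_values` and its columns are assumptions, and SQLite is used only because it runs in memory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical single table holding the values of every level.
    CREATE TABLE entity_values (type TEXT, val TEXT);
    INSERT INTO entity_values VALUES
        ('TransportationType', 'Car'),
        ('TransportationType', 'Boat'),
        ('TransportationType', 'Plane'),
        ('Color', 'Blue'),
        ('Color', 'Green'),
        ('Color', 'Red'),
        ('Color', 'Purple'),
        ('Color', 'White'),
        ('Accessories', 'Trailer'),
        ('Accessories', 'Wheels'),
        ('Accessories', 'Propeller'),
        ('Accessories', 'Parachute');
""")

rows = conn.execute("""
    SELECT tt.val, c.val, a.val
    FROM (SELECT val FROM entity_values WHERE type = 'TransportationType') tt
    CROSS JOIN (SELECT val FROM entity_values WHERE type = 'Color') c
    CROSS JOIN (SELECT val FROM entity_values WHERE type = 'Accessories') a
""").fetchall()

print(len(rows))  # 3 * 5 * 4 = 60
```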