Get the most efficient combination of a large List of objects based on a field


Problem description

I'm looking to maximize the number of stars given a certain budget and max limit on the combination.

Example question:

With a budget of 500 euro, and visiting at most the maximum allowed number of restaurants, dine and collect the most stars possible.

I'm looking to write an efficient algorithm that could potentially process 1 million Restaurant instances with a limit of up to 10 restaurants.

Note, this is a cross post from a question I asked yesterday: Java: Get the most efficient combination of a large List of objects based on a field

The solution below assigns $15 per star to the r8 restaurant, so when generating the list it puts r8 in first, and with the remaining $70 it can only get 2 more stars, giving a total of 4 stars. However, if it was smart enough to skip the r8 restaurant (even though it has the best dollar-per-star ratio), the r1 restaurant would actually be a better choice for the budget, as it costs $100 and has 5 stars.

Can anyone help attempt the problem and beat the current solution?

import itertools

class Restaurant():
  def __init__(self, cost, stars):
    self.cost = cost
    self.stars = stars
    self.ratio = cost / stars

  def display(self):
    print("Cost: $" + str(self.cost))
    print("Stars: " + str(self.stars))
    print()

r1 = Restaurant(100, 5)
r2 = Restaurant(140, 3)
r3 = Restaurant(90, 4)
r4 = Restaurant(140, 3)
r5 = Restaurant(120, 4)
r6 = Restaurant(60, 1)
r7 = Restaurant(40, 1)
r8 = Restaurant(30, 2)
r9 = Restaurant(70, 2)
r10 = Restaurant(250, 5)

print()
print("***************")
print("** Unsorted: **")
print("***************")
print()

restaurants = [r1, r2, r3, r4, r5, r6, r7, r8, r9, r10]

for restaurant in restaurants:
  print(restaurant.ratio, restaurant.stars)

print()
print("***************")
print("**  Sorted:  **")
print("***************")
print()

# ratio is cost per star: a lower ratio is a better deal, so sort ascending
sorted_restaurants = sorted(restaurants, key = lambda x: x.ratio)

for restaurant in sorted_restaurants:
  print(restaurant.ratio, restaurant.stars)

print()
print("*********************")
print("** Begin Rucksack: **")
print("*********************")
print()

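max_restaurants = 5  # renamed from max so the built-in max() is not shadowed
budget = 100

spent = 0

rucksack = []

for i in itertools.count():

  # stop when the visit limit is reached or every restaurant has been considered
  if len(rucksack) >= max_restaurants or i == len(sorted_restaurants):
    break

  sorted_restaurants[i].display()

  # greedy step: take the restaurant only if it still fits in the budget
  if sorted_restaurants[i].cost + spent <= budget:
    spent = spent + sorted_restaurants[i].cost
    rucksack.append(sorted_restaurants[i])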
  
print("Total Cost: $" + str(sum([x.cost for x in rucksack])))
print("Total Stars: " + str(sum([x.stars for x in rucksack])))

print()
print("*****************")
print("** Final List: **")
print("*****************")
print()

for restaurant in rucksack:
  restaurant.display()

Solution

Sounds like your problem is pretty much the same as the Knapsack problem: Maximize value given certain weight and volume constraints. Basically value = total stars, weight = price, rucksack limit = total budget. Now there's an additional constraint of total "items" (restaurant visits) but that doesn't change the gist.

As you may or may not know, the knapsack problem is NP-hard, which means no algorithm with polynomial time scaling is known.

However, there are efficient pseudo-polynomial algorithms using dynamic programming (their running time grows with the numeric size of the budget, not just with the number of items), and of course there are efficient heuristics, such as the "greedy" heuristic you seem to have discovered. This heuristic involves starting to fill up with the highest "density" items (most stars per buck) first. As you have seen, this heuristic fails to find the true optimum in some cases.
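To make that failure concrete, here is a minimal sketch of the greedy heuristic run on the ten restaurants from the question. The function name `greedy_stars` and the `(cost, stars)` tuple layout are choices made for this illustration, not part of the original code.

```python
# (cost, stars) pairs copied from r1..r10 in the question
RESTAURANTS = [(100, 5), (140, 3), (90, 4), (140, 3), (120, 4),
               (60, 1), (40, 1), (30, 2), (70, 2), (250, 5)]


def greedy_stars(restaurants, budget, max_visits):
    """Greedy heuristic: try restaurants in order of cost per star, cheapest star first."""
    spent = 0
    stars = 0
    visits = 0
    for cost, star in sorted(restaurants, key=lambda r: r[0] / r[1]):
        if visits >= max_visits:
            break
        if spent + cost <= budget:  # take it only if it still fits
            spent += cost
            stars += star
            visits += 1
    return stars


# With a budget of 100 the greedy pass takes r8 (30, 2) and then r9 (70, 2),
# so it ends with 4 stars although r1 alone (100, 5) would give 5.
print(greedy_stars(RESTAURANTS, budget=100, max_visits=5))
```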

The dynamic programming approach should be pretty good here. It's based on a recursion: Given a budget B and a number of remaining visits V, what is the best set of restaurants to visit out of a total set of restaurants R?

See here: https://en.wikipedia.org/wiki/Knapsack_problem#0/1_knapsack_problem

Basically we define an array m for "max stars", where m[i, b, v] is the maximum amount of stars we can get when we are allowed to visit restaurants up to (and including) restaurant number i, spending at most b, and visiting at most v restaurants (the limit).

Now, we bottom-up fill this array. For example, m[0, b, v] = 0 for all values of b and v because if we can't go to any restaurants, we can't get any stars.

Also, m[i, b, 0] = 0 for all values of i and b because if we used up all our visits, we can't get any more stars.
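As a small sketch of those two base cases: if the table is allocated with zeros everywhere, both m[0, b, v] = 0 and m[i, b, 0] = 0 already hold before any filling happens. The helper name `make_table` and the nested-list layout are assumptions for this illustration, and it assumes integer budgets.

```python
def make_table(n_restaurants, budget, max_visits):
    """Return m with m[i][b][v] == 0 for i in 0..n, b in 0..budget, v in 0..max_visits."""
    return [[[0] * (max_visits + 1) for _ in range(budget + 1)]
            for _ in range(n_restaurants + 1)]


m = make_table(n_restaurants=3, budget=10, max_visits=2)
print(m[0][10][2])  # no restaurants allowed: 0 stars
print(m[3][10][0])  # no visits left: 0 stars
```

Note that the table has (n + 1) * (budget + 1) * (max_visits + 1) entries, so for very large inputs it is worth thinking about memory before allocating it in full.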

Next line isn't too hard either:

m[i, b, v] = m[i - 1, b, v] if p[i] > b

where p[i] is the price of dining at restaurant i. What does this line say? Well, if restaurant i is more expensive than the money we have left (b), then we can't go there. Which means the max amount of stars we can get is the same whether we include restaurants up to i or just up to i - 1.

Next line is a bit tricky:

m[i, b, v] = max(m[i-1, b, v], m[i-1, b - p[i], v-1] + s[i]) if p[i] <= b

Phew. s[i] is the amount of stars you get from restaurant i btw.

What does this line say? It's the heart of the dynamic programming approach. When considering the max amount of stars we can get when looking at restaurants up to and including i, then in the resulting solution we either go there or we don't, and we "just" have to see which of these two paths leads to more stars:

If we don't go to restaurant i, then we keep the same amount of money and remaining visits. The max amount of stars we can get in this path is the same as if we didn't even look at restaurant i. That's the first part in the max.

But if we do go to restaurant i, then we're left with p[i] less money, one fewer visit, and s[i] more stars. That's the second part in the max.

Now the question is simple: which of the two is larger.
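Put together, the base cases and the two recurrence lines can be sketched as three nested loops. This is a minimal illustration under some assumptions: prices and the budget are non-negative integers, and the name `max_stars` and the parallel lists `PRICES` / `STARS` are made up for the example (the values are those of r1..r10 from the question).

```python
PRICES = [100, 140, 90, 140, 120, 60, 40, 30, 70, 250]
STARS = [5, 3, 4, 3, 4, 1, 1, 2, 2, 5]


def max_stars(prices, stars, budget, max_visits):
    """Fill m[i][b][v] bottom-up and return the best total number of stars."""
    n = len(prices)
    # All zeros: covers m[0][b][v] = 0 and m[i][b][0] = 0.
    m = [[[0] * (max_visits + 1) for _ in range(budget + 1)]
         for _ in range(n + 1)]
    for i in range(1, n + 1):
        p, s = prices[i - 1], stars[i - 1]  # restaurant i in the text is index i - 1 here
        for b in range(budget + 1):
            for v in range(1, max_visits + 1):
                skip = m[i - 1][b][v]  # first part of the max: don't go to restaurant i
                if p > b:
                    m[i][b][v] = skip  # too expensive, no choice
                else:
                    go = m[i - 1][b - p][v - 1] + s  # second part: go there
                    m[i][b][v] = max(skip, go)
    return m[n][budget][max_visits]


# Finds the 5 stars of r1 that the greedy heuristic misses with a budget of 100.
print(max_stars(PRICES, STARS, budget=100, max_visits=5))
```

The running time is proportional to n * budget * max_visits, which is the pseudo-polynomial scaling mentioned earlier.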

You can create this array and fill it with a relatively simple for loop (take inspiration from the wiki). This just gives you the amount of stars though, not the actual list of restaurants to visit. For that, add some extra bookkeeping to the calculation of m.
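One possible form of that bookkeeping, sketched here as an assumption and not as the only way to do it, is to keep the full table and walk it backwards: whenever m[i][b][v] differs from m[i-1][b][v], restaurant i must have been visited. The function name `best_restaurants` and the example data layout are hypothetical; prices and budget are assumed to be non-negative integers.

```python
PRICES = [100, 140, 90, 140, 120, 60, 40, 30, 70, 250]
STARS = [5, 3, 4, 3, 4, 1, 1, 2, 2, 5]


def best_restaurants(prices, stars, budget, max_visits):
    """Return (best_stars, chosen) where chosen holds 0-based restaurant indices."""
    n = len(prices)
    m = [[[0] * (max_visits + 1) for _ in range(budget + 1)]
         for _ in range(n + 1)]
    for i in range(1, n + 1):
        p, s = prices[i - 1], stars[i - 1]
        for b in range(budget + 1):
            for v in range(1, max_visits + 1):
                m[i][b][v] = m[i - 1][b][v]
                if p <= b:
                    m[i][b][v] = max(m[i][b][v], m[i - 1][b - p][v - 1] + s)

    # Walk back from the final cell: a change between row i - 1 and row i
    # can only come from the "go there" branch of the max.
    chosen = []
    b, v = budget, max_visits
    for i in range(n, 0, -1):
        if m[i][b][v] != m[i - 1][b][v]:
            chosen.append(i - 1)
            b -= prices[i - 1]
            v -= 1
    chosen.reverse()
    return m[n][budget][max_visits], chosen


best, chosen = best_restaurants(PRICES, STARS, budget=100, max_visits=5)
print(best, chosen)  # index 0 is r1 from the question
```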


I hope that information is enough to set you off in the right direction.

Alternatively, you can write your problem in terms of binary variables and a quadratic objective function and solve it on the D-Wave quantum annealer :-p Message me if you want to know more about that.
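For reference, the "binary variables" part can be written down directly, reusing p[i] and s[i] from above with one 0/1 variable x_i per restaurant (x_i = 1 meaning we dine there), budget B and visit limit V. This is only the constrained form of the problem. How the two constraints are folded into a quadratic objective for an annealer is not spelled out here; a common approach, assumed rather than taken from this answer, is to add them as penalty terms.

```latex
\begin{aligned}
\max_{x \in \{0,1\}^n} \quad & \sum_{i=1}^{n} s_i \, x_i \\
\text{subject to} \quad & \sum_{i=1}^{n} p_i \, x_i \le B, \\
& \sum_{i=1}^{n} x_i \le V
\end{aligned}
```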
