如何处理猜数字游戏(带有扭曲)算法? [英] How to approach a number guessing game (with a twist) algorithm?

查看:32
本文介绍了如何处理猜数字游戏(带有扭曲)算法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

更新(2020 年 7 月):这个问题已有 9 年历史了,但我仍然非常感兴趣.从那时起,机器学习(RNN、CNN、GANS 等)、新方法和廉价 GPU已经上升,使新的方法成为可能.我认为重新讨论这个问题以查看是否有新方法会很有趣.

我正在学习编程(Python 和算法),并试图从事一个我觉得有趣的项目.我已经创建了一些基本的 Python 脚本,但我不确定如何为我正在尝试构建的游戏找到解决方案.

I am learning programming (Python and algorithms) and was trying to work on a project that I find interesting. I have created a few basic Python scripts, but I’m not sure how to approach a solution to a game I am trying to build.

游戏的运行方式如下:

用户将获得带有值的项目.例如,

Users will be given items with a value. For example,

Apple = 1
Pears = 2
Oranges  = 3

然后他们将有机会选择他们喜欢的任何组合(即 100 个苹果、20 个梨和一个橙子).计算机获得的唯一输出是总价值(在本例中,当前为 143 美元).计算机将尝试猜测他们有什么.这显然无法正确获得第一回合.

They will then get a chance to choose any combo of them they like (i.e. 100 apples, 20 pears, and one orange). The only output the computer gets is the total value (in this example, it's currently $143). The computer will try to guess what they have. Which obviously it won’t be able to get correctly the first turn.

         Value    quantity(day1)    value(day1)
Apple      1        100                100
Pears      2         20                 40
Orange     3          1                  3
Total               121                143

下一回合用户可以修改他们的数字,但不能超过总数量的 5%(或我们可能选择的其他百分比.例如,我将使用 5%.).水果的价格可能会改变(随机),因此总价值也可能会因此而改变(为简单起见,我没有在这个例子中改变水果价格).使用上面的示例,在游戏的第 2 天,用户返回值 152 美元,第 3 天返回值 164 美元.示例如下:

The next turn the user can modify their numbers but no more than 5% of the total quantity (or some other percent we may chose. I’ll use 5% for example.). The prices of fruit can change(at random) so the total value may change based on that also (for simplicity I am not changing fruit prices in this example). Using the above example, on day 2 of the game, the user returns a value of $152 and $164 on day 3. Here's an example:

Quantity (day2)   %change (day2)    Value (day2)   Quantity (day3)   %change (day3)   Value(day3)
 104                                 104            106                                106
  21                                  42             23                                 46
   2                                   6              4                                 12
 127               4.96%             152            133               4.72%            164

*(我希望表格显示正确,我不得不手动将它们隔开,所以希望它不仅仅是在我的屏幕上做,如果它不起作用,请告诉我,我会尝试上传屏幕截图.)

*(I hope the tables show up right, I had to manually space them so hopefully it's not just doing it on my screen, if it doesn't work let me know and I'll try to upload a screenshot.)

我想看看我是否能弄清楚随着时间的推移数量是多少(假设用户有耐心继续输入数字).我现在知道我唯一的限制是总值不能超过 5%,所以我现在不能在 5% 以内,所以用户将永远输入它.

I am trying to see if I can figure out what the quantities are over time (assuming the user will have the patience to keep entering numbers). I know right now my only restriction is the total value cannot be more than 5% so I cannot be within 5% accuracy right now so the user will be entering it forever.

到目前为止我所做的

这是我目前的解决方案(不多).基本上,我获取所有值并找出它们的所有可能组合(我完成了这部分).然后我将所有可能的组合作为字典放在数据库中(例如,对于 143 美元,可能会有一个字典条目 {apple:143, Pears:0, Oranges :0}..all way to {apple:0, Pears:1, Oranges:47}.每次我得到一个新数字时我都会这样做,所以我有一个所有可能性的列表.

Here’s my solution so far (not much). Basically, I take all the values and figure out all the possible combinations of them (I am done this part). Then I take all the possible combos and put them in a database as a dictionary (so for example for $143, there could be a dictionary entry {apple:143, Pears:0, Oranges :0}..all the way to {apple:0, Pears:1, Oranges :47}. I do this each time I get a new number so I have a list of all possibilities.

这就是我被卡住的地方.在使用上述规则时,我如何找出最佳解决方案?我想我需要一个适应度函数,它可以自动比较两天的数据,并删除任何与前几天数据差异超过 5% 的可能性.

Here’s where I’m stuck. In using the rules above, how can I figure out the best possible solution? I think I’ll need a fitness function that automatically compares the two days data and removes any possibilities that have more than 5% variance of the previous days data.

问题:

所以我的问题是用户更改总数并且我有所有概率的列表,我应该如何解决这个问题?我需要学习什么?是否有任何适用的算法或我可以使用的理论?或者,为了帮助我理解我的错误,你能否建议我可以添加哪些规则来使这个目标可行(如果它不是当前状态.我正在考虑添加更多水果并说他们必须至少选择 3 个,等等.)?另外,我对遗传算法的理解也很模糊,但我想我可以在这里使用它们,如果有我可以使用的东西吗?

So my question with user changing the total and me having a list of all the probabilities, how should I approach this? What do I need to learn? Is there any algorithms out there or theories that I can use that are applicable? Or, to help me understand my mistake, can you suggest what rules I can add to make this goal feasible (if it's not in its current state. I was thinking adding more fruits and saying they must pick at least 3, etc..)? Also, I only have a vague understanding of genetic algorithms, but I thought I could use them here, if is there something I can use?

我非常非常渴望学习,所以任何建议或提示将不胜感激(只是请不要告诉我这个游戏是不可能的).

I'm very very eager to learn so any advice or tips would be greatly appreciated (just please don't tell me this game is impossible).

更新:收到反馈说这很难解决.所以我想我会在游戏中添加另一个条件,它不会干扰玩家正在做的事情(游戏对他们来说保持不变)但每天水果的价值都会改变价格(随机).这会更容易解决吗?因为在 5% 的变动和某些水果价值的变化内,随着时间的推移,只有少数组合是可能的.

UPDATE: Getting feedback that this is hard to solve. So I thought I'd add another condition to the game that won't interfere with what the player is doing (game stays the same for them) but everyday the value of the fruits change price (randomly). Would that make it easier to solve? Because within a 5% movement and certain fruit value changes, only a few combinations are probable over time.

第 1 天,一切皆有可能,获得足够接近的范围几乎是不可能的,但随着水果价格的变化,用户只能选择 5% 的变化,那么不应该(随着时间的推移)范围缩小和狭窄的.在上面的例子中,如果价格波动足够大,我想我可以暴力破解一个给我一个猜测范围的解决方案,但我试图弄清楚是否有更优雅的解决方案或其他解决方案来不断缩小这个范围时间.

Day 1, anything is possible and getting a close enough range is almost impossible, but as the prices of fruits change and the user can only choose a 5% change, then shouldn't (over time) the range be narrow and narrow. In the above example, if prices are volatile enough I think I could brute force a solution that gave me a range to guess in, but I'm trying to figure out if there's a more elegant solution or other solutions to keep narrowing this range over time.

UPDATE2:阅读并询问后,我相信这是一个隐藏的马尔可夫/维特比问题,它跟踪水果价格的变化以及总和(加权最后一个数据点最重).我不确定如何应用这种关系.我认为是这种情况并且可能是错误的,但至少我开始怀疑这是某种类型的机器学习问题.

UPDATE2: After reading and asking around, I believe this is a hidden Markov/Viterbi problem that tracks the changes in fruit prices as well as total sum (weighting the last data point the heaviest). I'm not sure how to apply the relationship though. I think this is the case and could be wrong but at the least I'm starting to suspect this is a some type of machine learning problem.

更新 3:我创建了一个测试用例(数字较小)和一个生成器来帮助自动生成用户生成的数据,我正在尝试从中创建一个图表以查看更有可能发生的情况.

Update 3: I am created a test case (with smaller numbers) and a generator to help automate the user generated data and I am trying to create a graph from it to see what's more likely.

这是代码,以及总值和对用户实际水果数量的评论.

Here's the code, along with the total values and comments on what the users actually fruit quantities are.

#!/usr/bin/env python
import itertools

# Fruit price data
fruitPriceDay1 = {'Apple':1, 'Pears':2, 'Oranges':3}
fruitPriceDay2 = {'Apple':2, 'Pears':3, 'Oranges':4}
fruitPriceDay3 = {'Apple':2, 'Pears':4, 'Oranges':5}

# Generate possibilities for testing (warning...will not scale with large numbers)
def possibilityGenerator(target_sum, apple, pears, oranges):
    allDayPossible = {}
    counter = 1
    apple_range = range(0, target_sum + 1, apple)
    pears_range = range(0, target_sum + 1, pears)
    oranges_range = range(0, target_sum + 1, oranges)
    for i, j, k in itertools.product(apple_range, pears_range, oranges_range):
        if i + j + k == target_sum:
            currentPossible = {}

            #print counter
            #print 'Apple', ':', i/apple, ',', 'Pears', ':', j/pears, ',', 'Oranges', ':', k/oranges
            currentPossible['apple'] = i/apple
            currentPossible['pears'] = j/pears
            currentPossible['oranges'] = k/oranges

            #print currentPossible
            allDayPossible[counter] = currentPossible
            counter = counter +1
    return allDayPossible

# Total sum being returned by user for value of fruits
totalSumDay1=26 # Computer does not know this but users quantities are apple: 20, pears 3, oranges 0 at the current prices of the day
totalSumDay2=51 # Computer does not know this but users quantities are apple: 21, pears 3, oranges 0 at the current prices of the day
totalSumDay3=61 # Computer does not know this but users quantities are apple: 20, pears 4, oranges 1 at the current prices of the day
graph = {}
graph['day1'] = possibilityGenerator(totalSumDay1, fruitPriceDay1['Apple'], fruitPriceDay1['Pears'], fruitPriceDay1['Oranges'] )
graph['day2'] = possibilityGenerator(totalSumDay2, fruitPriceDay2['Apple'], fruitPriceDay2['Pears'], fruitPriceDay2['Oranges'] )
graph['day3'] = possibilityGenerator(totalSumDay3, fruitPriceDay3['Apple'], fruitPriceDay3['Pears'], fruitPriceDay3['Oranges'] )

# Sample of dict = 1 : {'oranges': 0, 'apple': 0, 'pears': 0}..70 : {'oranges': 8, 'apple': 26, 'pears': 13}
print graph

推荐答案

我们将结合图论和概率:

We'll combine graph-theory and probability:

在第一天,构建一组所有可行的解决方案.让我们将解集表示为 A1={a1(1), a1(2),...,a1(n)}.

On the 1st day, build a set of all feasible solutions. Lets denote the solutions set as A1={a1(1), a1(2),...,a1(n)}.

第二天您可以再次构建解决方案集 A2.

On the second day you can again build the solutions set A2.

现在,对于 A2 中的每个元素,您需要检查它是否可以从 A1 的每个元素到达(给定 x% 容差).如果是这样 - 将 A2(n) 连接到 A1(m).如果无法从 A1(m) 中的任何节点访问它 - 您可以删除该节点.

Now, for each element in A2, you'll need to check if it can be reached from each element of A1 (given x% tolerance). If so - connect A2(n) to A1(m). If it can't be reached from any node in A1(m) - you can delete this node.

基本上我们正在构建一个连通的有向无环图.

Basically we are building a connected directed acyclic graph.

图中所有路径的可能性均等.只有当从 Am 到 Am+1(从 Am 中的一个节点到 Am+1 中的一个节点)只有一条边时,您才能找到精确解.

All paths in the graph are equally likely. You can find an exact solution only when there is a single edge from Am to Am+1 (from a node in Am to a node in Am+1).

当然,有些节点出现在比其他节点更多的路径中.可以根据包含该节点的路径数直接推导出每个节点的概率.

Sure, some nodes appear in more paths than other nodes. The probability for each node can be directly deduced based on the number of paths that contains this node.

通过为每个节点分配一个权重,该权重等于通向该节点的路径数,不需要保留所有历史记录,而只保留前一天.

By assigning a weight to each node, which equals to the number of paths that leads to this node, there is no need to keep all history, but only the previous day.

另外,看看 non-负值线性二番图方程 - 我刚才问的一个问题.接受的答案是在每个步骤中枚举所有组合的好方法.

Also, have a look at non-negative-values linear diphantine equations - A question I asked a while ago. The accepted answer is a great way to enumarte all combos in each step.

这篇关于如何处理猜数字游戏(带有扭曲)算法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆