Machine Learning Algorithm for Predicting Order of Events?


Problem description


Simple machine learning question. Probably numerous ways to solve this:

There is an infinite stream of 4 possible events:

'event_1', 'event_2', 'event_3', 'event_4'

The events do not arrive in completely random order. We will assume that most events follow some complex patterns in their order, and that the rest are just random. We do not know the patterns ahead of time, though.

After each event is received, I want to predict what the next event will be, based on the order in which events have arrived in the past. So my question is: What machine learning algorithm should I use for this predictor?

The predictor will then be told what the next event actually was:

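predictor = new_predictor()

prev_event = None
while True:
    event = get_event()
    if prev_event is not None:
        # Tell the predictor which event actually preceded the current one.
        predictor.last_event_was(prev_event)
    predicted_event = predictor.predict_next_event(event)
    # Remember the current event so it can be reported in the next round.
    prev_event = event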

The question arises of how long a history the predictor should maintain, since maintaining an infinite history is not possible. I'll leave this up to you to answer, but for practical reasons the answer can't be infinite.

So I believe that the predictions will have to be done with some kind of rolling history. Adding a new event and expiring an old event should therefore be rather efficient, and not require rebuilding the entire predictor model, for example.

Specific code, rather than research papers, would add immense value to your responses for me. Python or C libraries are nice, but anything will do.

Update: What if more than one event can happen simultaneously in each round? Does that change the solution?

Solution

This is essentially a sequence prediction problem, so you want recurrent neural networks (RNNs) or hidden Markov models (HMMs).

If you only need to look back a fixed amount of time, time window approaches might suffice. You take the sequence data and split it into overlapping windows of length n (e.g. you split a sequence ABCDEFG into ABC, BCD, CDE, DEF, EFG). Then you train a function approximator (e.g. a neural network or linear regression) to map the first n-1 parts of each window onto the nth part.
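The window split described above can be sketched in a few lines of plain Python. The helper names `sliding_windows` and `training_pairs` are made up for this illustration and are not part of any library; the pairs they produce are what a function approximator would be trained on.

```python
def sliding_windows(sequence, n):
    """Return all overlapping windows of length n, in order."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def training_pairs(sequence, n):
    """Split each window into (first n-1 items, nth item)."""
    return [(w[:-1], w[-1]) for w in sliding_windows(sequence, n)]


windows = sliding_windows("ABCDEFG", 3)
pairs = training_pairs("ABCDEFG", 3)
print(windows)   # ['ABC', 'BCD', 'CDE', 'DEF', 'EFG']
print(pairs[0])  # ('AB', 'C')
```

Each pair is one training sample: the first element is the input, the second is the value to predict.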

Your predictor will not be able to look back in time longer than the size of your window. RNNs and HMMs can do so in theory, but are hard to tune or sometimes just don't work.
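As a rough illustration of a window-limited predictor, here is a minimal sketch that keeps a bounded rolling history and updates its model incrementally when an event is added or expires. It is an assumption-laden stand-in, not the method proposed in this answer: a simple frequency table replaces the trained function approximator, and the class name `RollingWindowPredictor` and its parameters `context_len` and `history_len` are hypothetical.

```python
from collections import Counter, defaultdict, deque


class RollingWindowPredictor:
    """Count-based next-event predictor over a bounded rolling history.

    Hypothetical helper: a frequency table stands in for a trained model.
    """

    def __init__(self, context_len=2, history_len=1000):
        if history_len <= context_len:
            raise ValueError("history_len must exceed context_len")
        self.context_len = context_len
        self.history_len = history_len
        self.history = deque()  # oldest event on the left
        # counts[context][next_event] = occurrences inside the history
        self.counts = defaultdict(Counter)

    def _window(self, start):
        """Return (context, target) for the window starting at index start."""
        items = [self.history[start + k] for k in range(self.context_len + 1)]
        return tuple(items[:-1]), items[-1]

    def observe(self, event):
        """Add one event; expire the oldest one if the history is full."""
        self.history.append(event)
        if len(self.history) > self.context_len:
            newest = len(self.history) - self.context_len - 1
            context, target = self._window(newest)
            self.counts[context][target] += 1
        if len(self.history) > self.history_len:
            # Undo the count of the oldest window, then drop its first event.
            context, target = self._window(0)
            self.counts[context][target] -= 1
            if self.counts[context][target] == 0:
                del self.counts[context][target]
            self.history.popleft()

    def predict(self):
        """Return the most frequent follower of the current context, or None."""
        if len(self.history) < self.context_len:
            return None
        start = len(self.history) - self.context_len
        context = tuple(self.history[start + k] for k in range(self.context_len))
        followers = self.counts.get(context)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


predictor = RollingWindowPredictor(context_len=2, history_len=50)
for event in ["event_1", "event_2", "event_3"] * 5:
    predictor.observe(event)
prediction = predictor.predict()
print(prediction)  # event_1
```

Adding and expiring an event each touch only one window, so the table never has to be rebuilt. As noted above, such a predictor cannot use anything older than its context of `context_len` events.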

(State-of-the-art RNN implementations can be found in PyBrain: http://pybrain.org)

Update: Here is the pybrain code for your problem. (I haven't tested it, there might be some typos and stuff, but the overall structure should work.)

from pybrain.datasets import SequentialDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import LSTMLayer, SigmoidLayer

INPUTS = 4
HIDDEN = 10
OUTPUTS = 4

net = buildNetwork(INPUTS, HIDDEN, OUTPUTS, hiddenclass=LSTMLayer, outclass=SigmoidLayer, recurrent=True)

ds = SequentialDataSet(INPUTS, OUTPUTS)

# your_sequences is a list of lists of tuples which each are a bitmask
# indicating the event (so 1.0 at position i if event i happens, 0.0 otherwise)

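for sequence in your_sequences:
    # Start one new sequence per event stream, not one per sample,
    # so the recurrent network sees consecutive events in context.
    ds.newSequence()
    for (inpt, target) in zip(sequence, sequence[1:]):
        ds.appendLinked(inpt, target)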

net.randomize()

trainer = BackpropTrainer(net, ds, learningrate=0.05, momentum=0.99)
for _ in range(1000):
    print(trainer.train())

This will train the recurrent network for 1000 epochs and print out the error after every epoch. Afterwards you can check for correct predictions like this:

net.reset()
for i in sequence:
    # Threshold each output unit at 0.5 to get one boolean per event.
    next_item = net.activate(i) > 0.5
    print(next_item)

This will print an array of booleans for every event.
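The bitmask encoding that `your_sequences` is expected to use can be sketched as follows. The helper names `encode` and `decode` are invented for this illustration, and the four event names are an assumption based on the question. Because each position is an independent 0.0/1.0 flag, the same encoding can also represent several events happening in the same round.

```python
# Assumed event names; position i in the bitmask corresponds to EVENTS[i].
EVENTS = ["event_1", "event_2", "event_3", "event_4"]
INDEX = {name: i for i, name in enumerate(EVENTS)}


def encode(events_in_round):
    """Turn the events of one round into a tuple of 0.0/1.0 flags."""
    mask = [0.0] * len(EVENTS)
    for name in events_in_round:
        mask[INDEX[name]] = 1.0
    return tuple(mask)


def decode(flags):
    """Turn one boolean flag per event back into a list of event names."""
    return [name for name, flag in zip(EVENTS, flags) if flag]


single = encode(["event_2"])
double = encode(["event_1", "event_4"])
print(single)  # (0.0, 1.0, 0.0, 0.0)
print(double)  # (1.0, 0.0, 0.0, 1.0)
print(decode([True, False, False, True]))  # ['event_1', 'event_4']
```

A sequence is then a list of such tuples, one per round, and `decode` can be applied to each boolean array printed by the prediction loop.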
