Machine Learning Algorithm for Predicting Order of Events?


Question

Simple machine learning question. There are probably numerous ways to solve this:

There is an infinite stream of 4 possible events:

'event_1', 'event_2', 'event_3', 'event_4'

The events do not arrive in completely random order. Assume that most events follow some complex patterns in their ordering, and the rest are just random. We do not know the patterns ahead of time, though.

After each event is received, I want to predict what the next event will be, based on the order in which past events arrived. So my question is: what machine learning algorithm should I use for this predictor?

The predictor will then be told what the next event actually was:

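Predictor = new_predictor()

prev_event = None
while True:
    event = get_event()
    if prev_event is not None:
        # feed the previously seen event back to the predictor
        Predictor.last_event_was(prev_event)
    predicted_event = Predictor.predict_next_event(event)
    prev_event = event  # remember this event for the next round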

This raises the question of how much history the predictor should maintain, since maintaining an infinite history is not possible. I'll leave that up to you to answer, but for practical reasons the answer can't be infinite.

So I believe the predictions will have to be made with some kind of rolling history. Adding a new event and expiring an old one should therefore be fairly efficient and should not, for example, require rebuilding the entire predictor model.

Specific code, rather than research papers, would add immense value to your responses. Python or C libraries are nice, but anything will do.

Update: What if more than one event can happen simultaneously in each round? Does that change the solution?

Recommended answer

This is essentially a sequence prediction problem, so you want recurrent neural networks (RNNs) or hidden Markov models (HMMs).

If you only need to look back a fixed amount of time, time window approaches might suffice. You take the sequence data and split it into overlapping windows of length n (e.g. you split the sequence ABCDEFG into ABC, BCD, CDE, DEF, EFG). Then you train a function approximator (e.g. a neural network or linear regression) to map the first n-1 items of each window onto the nth item.
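The windowing step can be sketched in a few lines of plain Python. This is only an illustration: the names make_windows, fit_table, and predict are hypothetical, and a simple frequency table stands in for the function approximator (it is not a neural network or linear regression, just the smallest thing that maps the first n-1 items onto the nth).

```python
from collections import Counter, defaultdict


def make_windows(sequence, n):
    """Split a sequence into overlapping windows of length n."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def fit_table(sequence, n):
    """Count which item follows each (n-1)-item context.

    Assumption: a frequency table stands in for the function approximator.
    """
    table = defaultdict(Counter)
    for window in make_windows(sequence, n):
        context, target = window[:-1], window[-1]
        table[context][target] += 1
    return table


def predict(table, context):
    """Return the most frequent follower of context, or None if unseen."""
    followers = table.get(tuple(context))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


windows = make_windows("ABCDEFG", 3)   # ABC, BCD, CDE, DEF, EFG as tuples
table = fit_table("ABCABCABD", 3)      # toy training sequence
guess = predict(table, "AB")           # most common item seen after A, B
```

Swapping the table for a trained classifier keeps the same structure: the contexts become the inputs and the targets become the labels.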

With this approach, your predictor cannot look back further than the size of your window. RNNs and HMMs can do so in theory, but they are hard to tune and sometimes just don't work.
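Because a window-based predictor only depends on counts of short windows, the rolling history asked for in the question can be kept cheap: each new event adds one window and, once the history is full, removes the oldest one, with no rebuild. The sketch below assumes a count-based predictor; the class RollingPredictor and its parameters context_len and history_len are hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict, deque
from itertools import islice


class RollingPredictor:
    """Count-based next-event predictor over a bounded rolling history."""

    def __init__(self, context_len=2, history_len=1000):
        if history_len < context_len + 1:
            raise ValueError("history_len must hold at least one window")
        self.context_len = context_len
        self.history_len = history_len
        self.history = deque()              # oldest event on the left
        self.counts = defaultdict(Counter)  # context -> follower counts

    def _count(self, window, delta):
        """Add delta (+1 or -1) to the count for one window."""
        context, target = tuple(window[:-1]), window[-1]
        followers = self.counts[context]
        followers[target] += delta
        if followers[target] <= 0:
            del followers[target]
        if not followers:
            del self.counts[context]

    def observe(self, event):
        """Add a new event and expire the oldest one if the history is full."""
        n = self.context_len + 1
        self.history.append(event)
        if len(self.history) >= n:
            newest = [self.history[i] for i in range(-n, 0)]
            self._count(newest, +1)
        if len(self.history) > self.history_len:
            oldest = list(islice(self.history, n))
            self._count(oldest, -1)
            self.history.popleft()

    def predict(self):
        """Return the most frequent follower of the current context, or None."""
        if len(self.history) < self.context_len:
            return None
        context = tuple(self.history[i] for i in range(-self.context_len, 0))
        followers = self.counts.get(context)
        if not followers:
            return None
        return followers.most_common(1)[0][0]


predictor = RollingPredictor(context_len=2, history_len=6)
for event in "ABCABCAB":
    predictor.observe(event)
guess = predictor.predict()
```

Each call to observe touches at most two windows, so the cost per event does not grow with the length of the history.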

(State-of-the-art RNN implementations can be found in PyBrain: http://pybrain.org)

Update: Here is the PyBrain code for your problem. (I haven't tested it, so there might be some typos, but the overall structure should work.)

from pybrain.datasets import SequentialDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure import LSTMLayer, SigmoidLayer

INPUTS = 4
HIDDEN = 10
OUTPUTS = 4

net = buildNetwork(INPUTS, HIDDEN, OUTPUTS, hiddenclass=LSTMLayer, outclass=SigmoidLayer, recurrent=True)

ds = SequentialDataSet(INPUTS, OUTPUTS)

# your_sequences is a list of lists of tuples which each are a bitmask
# indicating the event (so 1.0 at position i if event i happens, 0.0 otherwise)

for sequence in your_sequences:
    ds.newSequence()  # start one dataset sequence per event stream
    for (inpt, target) in zip(sequence, sequence[1:]):
        ds.appendLinked(inpt, target)

net.randomize()

trainer = BackpropTrainer(net, ds, learningrate=0.05, momentum=0.99)
for _ in range(1000):
    print(trainer.train())

This will train the recurrent network for 1000 epochs and print out the error after every epoch. Afterwards you can check for correct predictions like this:

net.reset()
for i in sequence:
    next_item = net.activate(i) > 0.5
    print(next_item)

This will print an array of booleans for every event, one flag per possible next event.
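The bitmask encoding expected in your_sequences, and the reverse step for reading the boolean output, might look like the sketch below. It assumes the four events are named event_1 through event_4; the helpers encode and decode are hypothetical and not part of PyBrain. Since each round is a set of flags, several simultaneous events in one round simply set more than one position.

```python
# Assumed event names; adjust to match the real stream.
EVENTS = ["event_1", "event_2", "event_3", "event_4"]
INDEX = {name: i for i, name in enumerate(EVENTS)}


def encode(events_in_round):
    """Turn the events of one round into a bitmask tuple (1.0 = happened)."""
    mask = [0.0] * len(EVENTS)
    for name in events_in_round:
        mask[INDEX[name]] = 1.0
    return tuple(mask)


def decode(flags):
    """Turn a sequence of booleans back into the list of event names."""
    return [name for name, on in zip(EVENTS, flags) if on]


# One sequence of three rounds; the second round has two simultaneous events.
rounds = [["event_1"], ["event_2", "event_4"], ["event_3"]]
sequence = [encode(r) for r in rounds]
your_sequences = [sequence]
```

Passing the thresholded network output to decode gives the predicted event names for the next round.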
