Why does my keras LSTM model get stuck in an infinite loop?

Problem description

I am trying to build a small LSTM that can learn to write code (even if it's garbage code) by training it on existing Python code. I have concatenated a few thousand lines of code together in one file across several hundred files, with each file ending in <eos> to signify "end of sequence".

As an example, my training file looks like:


setup(name='Keras',
...
      ],
      packages=find_packages())
<eos>
import pyux
...
with open('api.json', 'w') as f:
    json.dump(sign, f)
<eos>
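
For context, the concatenation step described above could look roughly like this. This is only a sketch: the directory name, glob pattern, and output file name are assumptions, not details from the original post.

import glob

# Concatenate every .py file under a (hypothetical) source directory into one
# training file, appending <eos> after each file to mark "end of sequence".
with open('training.txt', 'w') as out:
    for path in glob.glob('python_sources/**/*.py', recursive=True):
        with open(path, 'r') as f:
            out.write(f.read())
        out.write('\n<eos>\n')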

I am creating tokens from the words with:

file = open(self.textfile, 'r')
filecontents = file.read()
file.close()
filecontents = filecontents.replace("\n\n", "\n")
filecontents = filecontents.replace('\n', ' \n ')
filecontents = filecontents.replace('    ', ' \t ')

text_in_words = [w for w in filecontents.split(' ') if w != '']

self._words = set(text_in_words)
STEP = 1
self._codelines = []
self._next_words = []
for i in range(0, len(text_in_words) - self.seq_length, STEP):
    self._codelines.append(text_in_words[i: i + self.seq_length])
    self._next_words.append(text_in_words[i + self.seq_length])
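
For the Embedding layer and the sparse_categorical_crossentropy loss used below, these word windows still need to be converted to integer ids. A minimal sketch of that step, assuming a word-to-index mapping that the original code does not show (self._word2index, X, and y are hypothetical names):

import numpy as np

# Hypothetical continuation: map every distinct word to an integer id so the
# windows can be fed to the Embedding layer.
self._word2index = {w: i for i, w in enumerate(sorted(self._words))}

# X holds the input windows, y the next word to predict for each window.
X = np.array([[self._word2index[w] for w in line] for line in self._codelines])
y = np.array([self._word2index[w] for w in self._next_words])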

My keras model is:

model = Sequential()
model.add(Embedding(input_dim=len(self._words), output_dim=1024))

model.add(Bidirectional(
    LSTM(128), input_shape=(self.seq_length, len(self._words))))

model.add(Dropout(rate=0.5))
model.add(Dense(len(self._words)))
model.add(Activation('softmax'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer="adam", metrics=['accuracy'])
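
With inputs encoded as integer ids (X) and next-word ids as targets (y), the training call would then be roughly the following; the batch size and epoch count here are arbitrary, not values from the question:

# X: (num_windows, seq_length) integer word ids, y: (num_windows,) next-word ids
model.fit(X, y, batch_size=128, epochs=20)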

But no matter how much I train it, the model never seems to generate <eos> or even \n. I think it might be because my LSTM size is 128 and my seq_length is 200, but that doesn't quite make sense? Is there something I'm missing?

Recommended answer

Sometimes, when there is no limit on code generation, or when the <EOS> and <SOS> markers are not mapped to numerical tokens, the LSTM never converges. If you could post your outputs or error messages, it would be much easier to debug.

You could create an extra class for keeping track of words and sentences.

# tokens for start of sentence(SOS) and end of sentence(EOS)

SOS_token = 0
EOS_token = 1


class Lang:
    '''
    class for word object, storing sentences, words and word counts.
    '''
    def __init__(self, name):
        self.name = name
        self.word2index = {}
        self.word2count = {}
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2  # Count SOS and EOS

    def addSentence(self, sentence):
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        if word not in self.word2index:
            self.word2index[word] = self.n_words
            self.word2count[word] = 1
            self.index2word[self.n_words] = word
            self.n_words += 1
        else:
            self.word2count[word] += 1
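
A possible way to use this class; the file name and variable names here are only illustrative:

lang = Lang('python-code')
with open('training.txt', 'r') as f:
    for line in f:
        lang.addSentence(line.strip())

# Encode one sentence as integer ids, appending an explicit EOS token.
encoded = [lang.word2index[w] for w in 'import pyux'.split(' ')] + [EOS_token]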

Then, while generating text, just adding a <SOS> token would do. You can use https://github.com/sherjilozair/char-rnn-tensorflow , a character-level RNN, for reference.
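
To connect this back to the original question: generation needs both a hard length limit and an explicit stop on the EOS id, otherwise sampling can run forever. A rough sketch of such a loop, assuming the model from the question and integer word ids (generate, seed_ids, and max_len are hypothetical names):

import numpy as np

def generate(model, seed_ids, max_len=200):
    # seed_ids: list of integer word ids used to prime the model
    out = list(seed_ids)
    for _ in range(max_len):  # hard limit so generation cannot loop forever
        probs = model.predict(np.array([out]))[0].astype('float64')
        probs /= probs.sum()  # renormalize so the probabilities sum to 1
        next_id = np.random.choice(len(probs), p=probs)
        out.append(int(next_id))
        if next_id == EOS_token:  # stop as soon as the model emits EOS
            break
    return out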
