Getting the word from an id in the TensorFlow RNN sample

Question

I'm trying to modify the TensorFlow RNN sample here:

https://www.tensorflow.org/versions/r0.8/tutorials/recurrent/index.html

In ptb_word_lm.py, I guess the input is an int array of word indices (m.input_data: x).

def run_epoch(session, m, data, eval_op, verbose=False):
  """Runs the model on the given data."""
  epoch_size = ((len(data) // m.batch_size) - 1) // m.num_steps
  start_time = time.time()
  costs = 0.0
  iters = 0
  state = m.initial_state.eval()
  for step, (x, y) in enumerate(reader.ptb_iterator(data, m.batch_size,
                                                    m.num_steps)):
    cost, state, _ = session.run([m.cost, m.final_state, eval_op],
                                 {m.input_data: x,
                                  m.targets: y,
                                  m.initial_state: state})

I'd like to see the actual words instead of the ids. How can I do that?

Recommended answer

You need to keep the vocabulary (a mapping from word to id) first.

At the top of main, keep the 4th value returned by reader.ptb_raw_data(), like below.

raw_data = reader.ptb_raw_data(FLAGS.data_path)
train_data, valid_data, test_data, vocabulary = raw_data
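
For illustration, here is a toy sketch of the kind of word-to-id mapping the rest of this answer relies on. The helper name build_vocab and the sample corpus are hypothetical and are not part of reader.py. If the fourth value turns out to be an int (a vocabulary size) in your copy of reader.py rather than a dict, the mapping itself would have to be returned from there instead.

```python
import collections


def build_vocab(words):
    """Hypothetical helper: map each distinct word to an integer id.

    More frequent words get smaller ids; ties are broken alphabetically.
    """
    counter = collections.Counter(words)
    ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: idx for idx, (word, _) in enumerate(ordered)}


# Toy corpus, made up for this example.
corpus = "the cat sat on the mat <eos> the dog sat <eos>".split()
word_to_id = build_vocab(corpus)

# The model only ever sees these integer ids, not the words themselves.
ids = [word_to_id[w] for w in corpus]
print(word_to_id)
print(ids)
```

The important property is that each word gets exactly one id, so the mapping can be inverted without ambiguity.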

Then pass the vocabulary to run_epoch(). Since it becomes a new positional parameter, the other run_epoch() calls in main need it as well.

test_perplexity = run_epoch(session, mtest, test_data, tf.no_op(), vocabulary)

Inside run_epoch(), to convert the ids in the first row of x (x[0], the first sequence in the batch) to words:

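def run_epoch(session, m, data, eval_op, vocabulary, verbose=False):
  ...
  # Invert the word -> id mapping once. Indexing vocabulary.keys() directly
  # only works on Python 2, where keys() and values() return lists.
  id_to_word = {word_id: word for word, word_id in vocabulary.items()}
  for step, (x, y) in enumerate(...
    # x[0] holds the m.num_steps word ids of the first sequence in the batch.
    message = "x: "
    for i in range(m.num_steps):
      message += id_to_word[x[0][i]] + " "
    print(message)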
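
The lookup above can also be wrapped in a small standalone helper, which makes it easy to try outside the training loop. This is a minimal sketch under assumptions: decode_batch and the "<unknown-id>" placeholder are hypothetical names, and the batch is a plain nested list standing in for the integer array x.

```python
def decode_batch(batch, word_to_id):
    """Hypothetical helper: turn rows of word ids back into strings.

    batch is a sequence of rows, each row a sequence of integer ids.
    """
    # Invert the mapping once so every lookup is a dict access.
    id_to_word = {idx: word for word, idx in word_to_id.items()}
    lines = []
    for row in batch:
        # int() also accepts NumPy integer scalars; unknown ids are
        # replaced by a placeholder instead of raising KeyError.
        words = [id_to_word.get(int(i), "<unknown-id>") for i in row]
        lines.append(" ".join(words))
    return lines


# Toy data, made up for this example.
toy_vocab = {"the": 0, "<eos>": 1, "sat": 2, "cat": 3}
toy_batch = [[0, 3, 2], [2, 1, 0]]
decoded = decode_batch(toy_batch, toy_vocab)
for line in decoded:
    print("x: " + line)
```

Inverting the dict once costs a single pass over the vocabulary, whereas searching the values for every id repeats that pass for each word printed.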

Hope this helps.
