Does TensorFlow have cross validation implemented for its users?

Question

I was thinking of choosing hyperparameters (regularization, for example) using cross-validation, or of training a model from multiple initializations and then picking the one with the highest cross-validation accuracy. Implementing k-fold CV is simple but tedious, especially if I am trying to train different models on different CPUs, GPUs, or even different machines. I would expect a library like TensorFlow to have something like this implemented for its users so that we don't have to code the same thing 100 times. So, does TensorFlow have a library or something else that can help me do cross-validation?

Update: it seems one could use scikit-learn or something similar to do this. If so, a simple example of neural network training with cross-validation through scikit-learn would be great. I am not sure whether this scales to multiple CPUs, GPUs, clusters, etc., though.

Recommended answer

As already discussed, TensorFlow doesn't provide its own way to cross-validate a model. The recommended way is to use KFold from scikit-learn. It's a bit tedious, but doable. Here's a complete example of cross-validating an MNIST model with TensorFlow and KFold:

# Written against the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
from sklearn.model_selection import KFold
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Parameters
learning_rate = 0.01
batch_size = 500

# TF graph
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
pred = tf.nn.softmax(tf.matmul(x, W) + b)
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
init = tf.global_variables_initializer()

mnist = input_data.read_data_sets("data/mnist-tf", one_hot=True)
train_x_all = mnist.train.images
train_y_all = mnist.train.labels
test_x = mnist.test.images
test_y = mnist.test.labels

def run_train(session, train_x, train_y):
  print("\nStart training")
  # Re-initialize all variables so each fold starts from fresh weights.
  session.run(init)
  for epoch in range(10):
    total_batch = train_x.shape[0] // batch_size
    for i in range(total_batch):
      batch_x = train_x[i*batch_size:(i+1)*batch_size]
      batch_y = train_y[i*batch_size:(i+1)*batch_size]
      _, c = session.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
      if i % 50 == 0:
        print("Epoch #%d step=%d cost=%f" % (epoch, i, c))

def cross_validate(session, split_size=5):
  # Train on each fold's training split and collect its validation accuracy.
  results = []
  kf = KFold(n_splits=split_size)
  for train_idx, val_idx in kf.split(train_x_all, train_y_all):
    train_x = train_x_all[train_idx]
    train_y = train_y_all[train_idx]
    val_x = train_x_all[val_idx]
    val_y = train_y_all[val_idx]
    run_train(session, train_x, train_y)
    results.append(session.run(accuracy, feed_dict={x: val_x, y: val_y}))
  return results

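with tf.Session() as session:
  result = cross_validate(session)
  print("Cross-validation result: %s" % result)
  # The session still holds the weights trained on the last fold,
  # so the test accuracy below is for that model only.
  print("Test accuracy: %f" % session.run(accuracy, feed_dict={x: test_x, y: test_y}))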
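
To connect this back to the hyperparameter selection the question asks about, here is a minimal, dependency-free sketch of the two moving parts: producing k-fold index splits and picking the candidate with the highest mean validation score. The helpers `kfold_indices` and `select_by_cv` are hypothetical names made up for illustration, not TensorFlow or scikit-learn APIs. The splitting is meant to mirror the contiguous, unshuffled folds that KFold produces by default, and the toy "model" (a shrunken mean with a `penalty` value) is only an assumed stand-in for a real training run such as `run_train` above.

```python
# Illustrative sketch only: kfold_indices and select_by_cv are hypothetical
# helpers, not part of TensorFlow or scikit-learn.

def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, val_idx) lists for contiguous, unshuffled folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("need 2 <= n_splits <= n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds each take one additional sample.
        stop = start + base + (1 if fold < extra else 0)
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop


def select_by_cv(candidates, n_samples, n_splits, train_and_score):
    """Return (best_candidate, mean_scores) by mean validation score.

    train_and_score(candidate, train_idx, val_idx) must train a fresh model
    and return a score where higher is better.
    """
    mean_scores = {}
    for candidate in candidates:
        scores = [train_and_score(candidate, train_idx, val_idx)
                  for train_idx, val_idx in kfold_indices(n_samples, n_splits)]
        mean_scores[candidate] = sum(scores) / len(scores)
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


if __name__ == "__main__":
    targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

    def train_and_score(penalty, train_idx, val_idx):
        # Toy "model": the training mean, shrunk toward zero by `penalty`.
        prediction = sum(targets[i] for i in train_idx) / (len(train_idx) + penalty)
        errors = [(targets[i] - prediction) ** 2 for i in val_idx]
        return -sum(errors) / len(errors)  # negative MSE, higher is better

    best, scores = select_by_cv([0.0, 1.0, 10.0], len(targets), 3, train_and_score)
    print("best penalty:", best)
    print("mean scores:", scores)
```

In the MNIST example, `train_and_score` would roughly correspond to calling `run_train` on the training indices and then evaluating `accuracy` on the validation indices, with the candidate being something like the learning rate or a regularization weight. Since each candidate and fold is independent, the calls could in principle be farmed out to separate devices or machines, though this sketch runs them sequentially.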
