如何使用Keras TensorBoard回调进行网格搜索 [英] How to use Keras TensorBoard callback for grid search

查看:148
本文介绍了如何使用Keras TensorBoard回调进行网格搜索的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用Keras TensorBoard回调. 我想进行网格搜索,并在张量板上可视化每个模型的结果. 问题是不同运行的所有结果都合并在一起,损失图像这样混乱:

I'm using the Keras TensorBoard callback. I would like to run a grid search and visualize the results of each single model in the tensor board. The problem is that all results of the different runs are merged together and the loss plot is a mess like this:

如何重命名每次运行以具有类似于以下内容:

How can I rename each run to have something similar to this:

以下是网格搜索的代码:

Here the code of the grid search:

df = pd.read_csv('data/prepared_example.csv')

df = time_series.create_index(df, datetime_index='DATE', other_index_list=['ITEM', 'AREA'])

target = ['D']
attributes = ['S', 'C', 'D-10','D-9', 'D-8', 'D-7', 'D-6', 'D-5', 'D-4',
       'D-3', 'D-2', 'D-1']

input_dim = len(attributes)
output_dim = len(target)

x = df[attributes]
y = df[target]

param_grid = {'epochs': [10, 20, 50],
              'batch_size': [10],
              'neurons': [[10, 10, 10]],
              'dropout': [[0.0, 0.0], [0.2, 0.2]],
              'lr': [0.1]}

estimator = KerasRegressor(build_fn=create_3_layers_model,
                           input_dim=input_dim, output_dim=output_dim)


tbCallBack = TensorBoard(log_dir='./Graph', histogram_freq=0, write_graph=True, write_images=False)

grid = GridSearchCV(estimator=estimator, param_grid=param_grid, n_jobs=-1, scoring=bug_fix_score,
                            cv=3, verbose=0, fit_params={'callbacks': [tbCallBack]})

grid_result = grid.fit(x.as_matrix(), y.as_matrix())

推荐答案

我认为没有任何方法可以将每次运行"参数传递给GridSearchCV.也许最简单的方法是将KerasRegressor子类化以完成您想要的事情.

I don't think there is any way to pass a "per-run" parameter to GridSearchCV. Maybe the easiest approach would be to subclass KerasRegressor to do what you want.

class KerasRegressorTB(KerasRegressor):

    def __init__(self, *args, **kwargs):
        super(KerasRegressorTB, self).__init__(*args, **kwargs)

    def fit(self, x, y, log_dir=None, **kwargs):
        cbs = None
        if log_dir is not None:
            params = self.get_params()
            conf = ",".join("{}={}".format(k, params[k])
                            for k in sorted(params))
            conf_dir = os.path.join(log_dir, conf)
            cbs = [TensorBoard(log_dir=conf_dir, histogram_freq=0,
                               write_graph=True, write_images=False)]
        super(KerasRegressorTB, self).fit(x, y, callbacks=cbs, **kwargs)

您将以如下方式使用它:

You would use it like:

# ...

estimator = KerasRegressorTB(build_fn=create_3_layers_model,
                             input_dim=input_dim, output_dim=output_dim)

#...

grid = GridSearchCV(estimator=estimator, param_grid=param_grid,
n_jobs=1, scoring=bug_fix_score,
                  cv=2, verbose=0, fit_params={'log_dir': './Graph'})

grid_result = grid.fit(x.as_matrix(), y.as_matrix())

更新:

由于交叉验证,GridSearchCV多次运行相同的模型(即参数的相同配置),因此先前的代码最终将在每次运行中放置多个跟踪.查看源代码(此处此处 ),似乎没有办法检索当前拆分ID".同时,您不应该只是检查现有文件夹并根据需要添加子修补程序,因为这些作业是并行运行的(至少有可能,尽管我不确定Keras/TF是否会并行运行).您可以尝试这样的事情:

Since GridSearchCV runs the same model (i.e. the same configuration of parameters) more than once due to cross-validation, the previous code will end up putting multiple traces in each run. Looking at the source (here and here), there doesn't seem to be a way to retrieve the "current split id". At the same time, you shouldn't just check for existing folders and add subfixes as needed, because the jobs run (potentially at least, although I'm not sure if that's the case with Keras/TF) in parallel. You can try something like this:

import itertools
import os

class KerasRegressorTB(KerasRegressor):

    def __init__(self, *args, **kwargs):
        super(KerasRegressorTB, self).__init__(*args, **kwargs)

    def fit(self, x, y, log_dir=None, **kwargs):
        cbs = None
        if log_dir is not None:
            # Make sure the base log directory exists
            try:
                os.makedirs(log_dir)
            except OSError:
                pass
            params = self.get_params()
            conf = ",".join("{}={}".format(k, params[k])
                            for k in sorted(params))
            conf_dir_base = os.path.join(log_dir, conf)
            # Find a new directory to place the logs
            for i in itertools.count():
                try:
                    conf_dir = "{}_split-{}".format(conf_dir_base, i)
                    os.makedirs(conf_dir)
                    break
                except OSError:
                    pass
            cbs = [TensorBoard(log_dir=conf_dir, histogram_freq=0,
                               write_graph=True, write_images=False)]
        super(KerasRegressorTB, self).fit(x, y, callbacks=cbs, **kwargs)

我正在使用os呼吁实现Python 2兼容性,但是如果您正在使用Python 3,则可以考虑使用更好的

I'm using os calls for Python 2 compatibility, but if you are using Python 3 you may consider the nicer pathlib module for path and directory handling.

注意:我忘记了前面提到的内容,但以防万一,请注意,传递write_graph=True将记录每次运行,根据您的模型,这可能意味着很多(相对说)这个空间.尽管我不知道该功能所需的空间,但write_images也是一样.

Note: I forgot to mention it earlier, but just in case, note that passing write_graph=True will log a graph per run, which, depending on your model, could mean a lot (relatively speaking) of this space. The same would apply to write_images, although I don't know the space that feature requires.

这篇关于如何使用Keras TensorBoard回调进行网格搜索的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆