Profiling TensorFlow using tfprof

Views: 226 | Published: 2020/11/20 0:53:59 | Tags: tensorflow, profiling, gpu

Question

I am trying to profile the computation and memory usage of TensorFlow and found that tfprof is the right tool for my purpose. However, I was not able to get the FLOPs of all operators.

Here is what I did, following the tfprof tutorial and using the CIFAR-10 tutorial from the TensorFlow repository (tensorflow/models/image/cifar10/cifar10_train.py):

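import tensorflow as tf
# Import path as in TensorFlow 1.0-era releases (assumed; it moved in later versions).
from tensorflow.tools.tfprof import tfprof_log_pb2

# Run one training step with full tracing enabled.
run_metadata = tf.RunMetadata()

_, loss_value = sess.run([train_op, loss],
        options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
        run_metadata=run_metadata)

op_log = tfprof_log_pb2.OpLog()

# TODO: add op information

# Write the op log to log_dir for the tfprof command-line tool.
tf.contrib.tfprof.tfprof_logger.write_op_log(
        tf.get_default_graph(),
        log_dir="/tmp/log_dir",
        op_log=op_log,
        run_meta=run_metadata)

# Print the float-operation counts that tfprof can determine.
tf.contrib.tfprof.model_analyzer.print_model_analysis(
        tf.get_default_graph(),
        run_metadata=run_metadata,
        op_log=op_log,
        tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)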

The result is:

Parsing GraphDef...
Parsing RunMetadata...
Parsing OpLog...
Preparing Views...

=========================Options=============================
-max_depth                  10000
-min_bytes                  0
-min_micros                 0
-min_params                 0
-min_float_ops              1
-device_regexes             .*
-order_by                   float_ops
-account_type_regexes       .*
-start_name_regexes         .*
-trim_name_regexes
-show_name_regexes          .*
-hide_name_regexes
-account_displayed_op_only  true
-select                     float_ops
-viz                        false
-dump_to_file

==================Model Analysis Report======================
_TFProfRoot (0/5.23b flops)
  conv2/Conv2D (3.77b/3.77b flops)
  conv1/Conv2D (707.79m/707.79m flops)
  gradients/local3/MatMul_grad/MatMul (226.49m/226.49m flops)
  gradients/local3/MatMul_grad/MatMul_1 (226.49m/226.49m flops)
  local3/MatMul (226.49m/226.49m flops)
  gradients/local4/MatMul_grad/MatMul (18.87m/18.87m flops)
  gradients/local4/MatMul_grad/MatMul_1 (18.87m/18.87m flops)
  local4/MatMul (18.87m/18.87m flops)
  conv1/BiasAdd (4.72m/4.72m flops)
  conv2/BiasAdd (1.18m/1.18m flops)
  gradients/softmax_linear/MatMul_grad/MatMul (491.52k/491.52k flops)
  gradients/softmax_linear/MatMul_grad/MatMul_1 (491.52k/491.52k flops)
  softmax_linear/MatMul (491.52k/491.52k flops)

======================End of Report==========================
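As a sanity check, the figures in the report can be reproduced by hand. The sketch below is plain Python and does not call tfprof. It assumes the CIFAR-10 tutorial defaults, which the question does not state: a batch size of 128, 24x24x3 input crops, 5x5 kernels with 64 output channels, 12x12 feature maps entering conv2, and fully connected layers of 2304x384, 384x192, and 192x10. It also assumes the usual convention of two float operations (one multiply, one add) per multiply-accumulate. The helper names are made up for illustration.

```python
# Assumed batch size from the CIFAR-10 tutorial defaults (not stated in the question).
BATCH = 128


def conv2d_flops(batch, out_h, out_w, k_h, k_w, in_c, out_c):
    """Two float ops (multiply + add) per kernel weight per output element."""
    return 2 * batch * out_h * out_w * k_h * k_w * in_c * out_c


def matmul_flops(m, k, n):
    """Two float ops per multiply-accumulate in an (m x k) by (k x n) product."""
    return 2 * m * k * n


def bias_add_flops(*shape):
    """One add per output element."""
    count = 1
    for dim in shape:
        count *= dim
    return count


flops = {
    "conv1/Conv2D": conv2d_flops(BATCH, 24, 24, 5, 5, 3, 64),
    "conv2/Conv2D": conv2d_flops(BATCH, 12, 12, 5, 5, 64, 64),
    "local3/MatMul": matmul_flops(BATCH, 2304, 384),
    "local4/MatMul": matmul_flops(BATCH, 384, 192),
    "softmax_linear/MatMul": matmul_flops(BATCH, 192, 10),
    "conv1/BiasAdd": bias_add_flops(BATCH, 24, 24, 64),
    "conv2/BiasAdd": bias_add_flops(BATCH, 12, 12, 64),
}

# Each MatMul has two gradient MatMuls (w.r.t. input and weights), and each
# multiplies the same three dimensions, so it costs the same as the forward op.
for layer in ("local3", "local4", "softmax_linear"):
    forward = flops[layer + "/MatMul"]
    flops["gradients/%s/MatMul_grad/MatMul" % layer] = forward
    flops["gradients/%s/MatMul_grad/MatMul_1" % layer] = forward

total = sum(flops.values())

for name, value in sorted(flops.items(), key=lambda item: -item[1]):
    print("%-50s %d" % (name, value))
print("%-50s %d" % ("total", total))
```

Under these assumptions the per-op values round to the numbers in the report (for example 707.79m for conv1/Conv2D and 3.77b for conv2/Conv2D), and the total rounds to 5.23b. This suggests the report covers only Conv2D, MatMul, and BiasAdd ops.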

However, the result does not contain all of the ops, such as max pooling, ReLU, and the gradients of the conv layers. Maybe the flops statistics of those ops are not defined (RegisterStatistics('flops')). Therefore, to provide runtime information, as described in item 11) of the tfprof tutorial, I tried to create an OpLog (see the code above).

However, I am not sure how to add the op information (how can I get the entry names of the ops?). Is there any way to add ALL the ops the graph contains?

Or is there any other tool besides tfprof? Perhaps a profiling tool from NVIDIA?
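One possible way to approach the missing entries is to estimate float-op counts by hand for the ops without registered statistics and collect them as name/count pairs. The following is a minimal plain-Python sketch of that idea, not a tfprof API. The op names, shapes, and cost conventions (one operation per output element for ReLU, one comparison per extra window element for max pooling) are hypothetical. In a real graph the names would presumably come from iterating tf.get_default_graph().get_operations() and reading each op's name. Whether each pair maps onto an OpLog entry with name and float_ops fields is an assumption to check against the tfprof_log.proto of the installed version.

```python
from math import prod


def elementwise_flops(output_shape):
    """Assumed convention: one operation per output element (e.g. ReLU)."""
    return prod(output_shape)


def max_pool_flops(output_shape, window):
    """Assumed convention: (window size - 1) comparisons per output element."""
    k_h, k_w = window
    return prod(output_shape) * (k_h * k_w - 1)


# Hypothetical op names and output shapes, for illustration only.
missing_ops = [
    ("example/relu", elementwise_flops((128, 24, 24, 64))),
    ("example/max_pool", max_pool_flops((128, 12, 12, 64), (3, 3))),
]

# Plain dictionaries standing in for op-log entries; the field names
# "name" and "float_ops" are assumed, not confirmed by the question.
entries = [{"name": name, "float_ops": count} for name, count in missing_ops]

for entry in entries:
    print(entry["name"], entry["float_ops"])
```

Keeping the estimates in a plain list first makes it easy to inspect or adjust the conventions before copying anything into the profiler's log.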
