Understand Op Registration and Kernel Linking in TensorFlow


Problem Description


I am fairly new to TensorFlow and am currently looking into custom op development. I have already read the official tutorial, but I feel a lot happens behind the scenes, and I do not always want to put my custom ops in the user_ops directory.

As such, I took up the word2vec example,

which uses a custom "Skipgram" op whose registration is defined here:
/word2vec_ops.cc
and whose kernel implementation is here:
/word2vec_kernels.cc

Looking at the build file, I tried to build individual targets:

1) bazel build -c opt tensorflow/models/embedding:word2vec_ops
This generates a bunch of object files, as expected.

2) bazel build -c opt tensorflow/models/embedding:word2vec_kernels
Same for this.

3) bazel build -c opt tensorflow/models/embedding:word2vec_kernels:gen_word2vec

This last build uses a custom rule, namely tf_op_gen_wrapper_py: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tensorflow.bzl#L197-L231

It is interesting to note that this depends only on the op registration, not on the kernel itself.

After all of the above, if I build the py_binary itself using

bazel build -c opt tensorflow/models/embedding:word2vec

it works fine, but I fail to see where and how the kernel C++ code is linked.

Additionally, I would also like to understand the tf_op_gen_wrapper_py rule and the whole compilation/linking procedure that goes on behind the scenes for op registration.

Thanks.

Solution

When adding a new kind of operation to TensorFlow, there are two main steps:

1. Registering the "op", which involves defining an interface for the operation, and

2. Registering one or more "kernels", which involves defining implementation(s) for the operation, perhaps with specialized implementations for different data types, or device types (like CPU or GPU).

Both steps involve writing C++ code. Registering an op uses the REGISTER_OP() macro, and registering a kernel uses the REGISTER_KERNEL_BUILDER() macro. These macros create static initializers that run when the module containing them is loaded. There are two main mechanisms for op and kernel registration:
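The two-step pattern described above can be mimicked with a toy Python registry. This is only a sketch of the mechanism; all names are illustrative, and TensorFlow's real implementation uses C++ static initializers via REGISTER_OP() and REGISTER_KERNEL_BUILDER(), not this code:

```python
# Toy model of TensorFlow's two-step registration: a global registry
# populated as a side effect at module-load time, mirroring how the
# C++ macros' static initializers fire when their module is loaded.

OP_REGISTRY = {}      # op name -> interface description
KERNEL_REGISTRY = {}  # (op name, device) -> implementation

def register_op(name, interface):
    """Analogue of REGISTER_OP(): records only the op's interface."""
    OP_REGISTRY[name] = interface

def register_kernel(name, device):
    """Analogue of REGISTER_KERNEL_BUILDER(): records one
    implementation for a specific (op, device) pair."""
    def deco(fn):
        KERNEL_REGISTRY[(name, device)] = fn
        return fn
    return deco

# These calls run "at load time", like the C++ static initializers.
register_op("Skipgram",
            {"inputs": ["filename"], "outputs": ["examples", "labels"]})

@register_kernel("Skipgram", "CPU")
def skipgram_cpu(filename):
    # A stand-in for the real kernel's work.
    return "parsed %s on CPU" % filename
```

Note how the interface and the implementation live in separate registries: several kernels (CPU, GPU, different dtypes) can be registered against a single op interface.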

1. Static linking into the core TensorFlow library, and static initialization.

2. Dynamic linking at runtime, using the tf.load_op_library() function.

In the case of "Skipgram", we use option 1 (static linking). The ops are linked into the core TensorFlow library here, and the kernels are linked in here. (Note that this is not ideal: the word2vec ops were created before we had tf.load_op_library(), and so there was no mechanism for linking them dynamically.) Hence the ops and kernels are registered when you first load TensorFlow (in import tensorflow as tf). If they were created today, they would be dynamically loaded, such that they would only be registered if they were needed. (The SyntaxNet code has an example of dynamic loading.)
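The difference between the two mechanisms can be sketched in plain Python: static registration happens when the main library is imported, while dynamic registration is deferred until an explicit load call. The loader below is a toy stand-in for tf.load_op_library() (which actually loads a compiled shared library, not a Python file); nothing here is TensorFlow code:

```python
# Toy model of dynamic op loading: a "plugin" module whose top-level
# code performs the registration as a side effect of being loaded,
# just as a shared library's static initializers fire inside
# tf.load_op_library().
import importlib.util
import os
import tempfile

REGISTRY = {}

PLUGIN_SOURCE = 'REGISTRY["Skipgram"] = "registered by plugin"\n'

def load_op_library(path):
    """Toy analogue of tf.load_op_library(): executing the module's
    top-level code performs the registration."""
    spec = importlib.util.spec_from_file_location("word2vec_plugin", path)
    module = importlib.util.module_from_spec(spec)
    module.REGISTRY = REGISTRY       # inject the shared registry
    spec.loader.exec_module(module)  # registration happens here
    return module

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "plugin.py")
    with open(path, "w") as f:
        f.write(PLUGIN_SOURCE)
    assert "Skipgram" not in REGISTRY  # nothing registered yet
    load_op_library(path)              # ...until we load the plugin
```

The point is the timing: under option 1 the registry is populated as soon as the core library loads; under option 2 it stays empty until the caller explicitly asks for the library.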

The tf_op_gen_wrapper_py() rule in Bazel takes a list of op-library dependencies and generates Python wrappers for those ops. The reason that this rule depends only on the op registration is that the Python wrappers are determined entirely by the interface of the op, which is defined in the op registration. Notably, the Python interface has no idea whether there are specialized kernels for a particular type or device. The wrapper generator links the op registrations into a simple C++ binary that generates Python code for each of the registered ops. Note that, if you use tf.load_op_library(), you do not need to invoke the wrapper generator yourself, because tf.load_op_library() will generate the necessary code at runtime.
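The key point above, that wrapper generation needs only the op interface and never sees the kernels, can be illustrated with a toy generator. This is illustrative code, not the real tf_op_gen_wrapper_py (which is a C++ binary), and the `_execute_op` helper it emits is hypothetical:

```python
# Toy wrapper generator: given op registrations (interfaces only),
# emit Python wrapper source. Kernels never enter the picture, which
# is why the Bazel rule depends only on the op registration targets.

OP_REGISTRY = {
    "Skipgram": {"inputs": ["filename", "batch_size"],
                 "outputs": ["examples", "labels"]},
}

def generate_wrapper(op_name, interface):
    """Produce Python source for one op, from its interface alone."""
    args = ", ".join(interface["inputs"])
    return (
        f"def {op_name.lower()}({args}):\n"
        f'    """Auto-generated wrapper for the {op_name} op."""\n'
        f"    return _execute_op({op_name!r}, {args})\n"
    )

source = "\n".join(generate_wrapper(name, iface)
                   for name, iface in OP_REGISTRY.items())
```

Here `source` contains a `def skipgram(filename, batch_size): ...` stub built purely from the registered interface; whether a CPU or GPU kernel exists for Skipgram is invisible at this stage, exactly as in the real wrapper generator.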
