TensorFlow: why was Python the chosen language?


Question

I recently started studying deep learning and other ML techniques, and went looking for frameworks that simplify building and training a network. That is how I found TensorFlow. I have little experience in the field, but it seems to me that speed matters a great deal for a large ML system, even more so with deep learning. So why did Google choose Python to build TensorFlow? Wouldn't it be better to build it on a language that is compiled rather than interpreted?

What are the advantages of using Python over a language like C++ for machine learning?

Recommended answer

The most important thing to realize about TensorFlow is that, for the most part, the core is not written in Python: it is written in a combination of highly optimized C++ and CUDA (Nvidia's language for programming GPUs). Much of that work is done, in turn, by Eigen (a high-performance C++ and CUDA numerical library) and Nvidia's cuDNN (a highly optimized DNN library for Nvidia GPUs, used for functions such as convolutions).

The model for TensorFlow is that the programmer uses "some language" (most likely Python!) to express the model. This model, written in TensorFlow constructs such as:

h1 = tf.nn.relu(tf.matmul(l1, W1) + b1)
h2 = ...

is not actually executed when the Python code runs. Instead, what is actually created is a dataflow graph that says to take particular inputs, apply particular operations, supply the results as the inputs to other operations, and so on. This model is executed by fast C++ code, and for the most part, the data going between operations is never copied back to the Python code.
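As a rough illustration of that deferred-execution idea, the sketch below records operations as graph nodes instead of computing anything. The `Node` class and the `placeholder`, `matmul`, `add`, and `relu` helpers are hypothetical stand-ins invented for this example; they are not TensorFlow's API or its implementation.

```python
# Toy illustration only: these names are hypothetical, not TensorFlow's API.
class Node:
    """One vertex in a dataflow graph: an op name plus its input nodes."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.name = name

    def __repr__(self):
        if self.op == "placeholder":
            return self.name
        return f"{self.op}({', '.join(map(repr, self.inputs))})"


def placeholder(name):
    return Node("placeholder", name=name)


def matmul(a, b):
    return Node("matmul", (a, b))


def add(a, b):
    return Node("add", (a, b))


def relu(x):
    return Node("relu", (x,))


l1, W1, b1 = placeholder("l1"), placeholder("W1"), placeholder("b1")

# This line only records structure; no arithmetic is performed here.
h1 = relu(add(matmul(l1, W1), b1))

graph_description = repr(h1)
print(graph_description)
```

Running the Python line that defines `h1` produces a description of the computation rather than a numeric result, which is the property the answer relies on.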

Then the programmer "drives" the execution of this model by pulling on nodes: for training, usually in Python; for serving, sometimes in Python and sometimes in raw C++:

sess.run(eval_results)

This one Python (or C++) function call uses either an in-process call to C++ or, in the distributed version, an RPC to call into the C++ TensorFlow server and tell it to execute, and then copies back the results.
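To make "pulling on nodes" concrete, here is a minimal sketch in plain Python. The dict-based `graph` and the `run` function are assumptions made up for this example. They only mimic the idea that a single call names the node to fetch, that only the operations it depends on are executed, and that the final value is handed back to the caller. The real call crosses into C++, which this toy does not attempt to model.

```python
# Toy illustration only: `graph` and `run` are hypothetical, not TensorFlow's API.
import operator

# Graph: node name -> (function, names of input nodes).
graph = {
    "sum": (operator.add, ("x", "y")),
    "product": (operator.mul, ("x", "y")),
    "result": (lambda s: s * 2, ("sum",)),
}


def run(fetch, feed):
    """Evaluate `fetch` by pulling on it; only its ancestors are computed."""
    values = dict(feed)  # fed inputs are already known
    executed = []        # order in which ops actually ran

    def evaluate(name):
        if name not in values:
            fn, inputs = graph[name]
            args = [evaluate(i) for i in inputs]
            values[name] = fn(*args)
            executed.append(name)
        return values[name]

    return evaluate(fetch), executed


value, executed = run("result", {"x": 3, "y": 4})
print(value, executed)  # "product" is never computed
```

Because `result` does not depend on `product`, that node is skipped entirely; only the fetched value travels back to the caller.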

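So, with that said, let's rephrase the question: why did TensorFlow choose Python as the first well-supported language for expressing and controlling the training of models?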

The answer to that is simple: Python is probably the most comfortable language for a large range of data scientists and machine learning experts that is also easy to integrate with, and use to control, a C++ backend, while also being general, widely used both inside and outside of Google, and open source. Given that, with the basic model of TensorFlow, the performance of Python isn't that important, it was a natural fit. It is also a huge plus that NumPy makes it easy to do pre-processing in Python, also with high performance, before feeding the data in to TensorFlow for the truly CPU-heavy things.

There is also a bunch of complexity in expressing the model that isn't used when executing it: shape inference (e.g., if you do matmul(A, B), what is the shape of the resulting data?) and automatic gradient computation. It turns out to have been nice to be able to express those in Python, though I think in the long term they'll probably move to the C++ backend to make adding other languages easier.
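For the matmul(A, B) question above, shape inference can be sketched as a function that works on shapes alone, with no tensor data involved. The `matmul_shape` helper below is hypothetical and handles only the 2-D case; using `None` for a dimension that is unknown until run time is a convention assumed for this example.

```python
# Toy illustration only: `matmul_shape` is hypothetical, not TensorFlow's API.
def matmul_shape(a_shape, b_shape):
    """Infer the result shape of a 2-D matrix product without any data.

    A dimension of None means "unknown until run time".
    """
    if len(a_shape) != 2 or len(b_shape) != 2:
        raise ValueError("this sketch handles only 2-D shapes")
    (m, k_a), (k_b, n) = a_shape, b_shape
    # The inner dimensions must agree whenever both are known.
    if k_a is not None and k_b is not None and k_a != k_b:
        raise ValueError(f"inner dimensions differ: {k_a} vs {k_b}")
    return (m, n)


# A batch of unknown size with 784 features, times a 784 x 100 weight matrix.
inferred = matmul_shape((None, 784), (784, 100))
print(inferred)
```

A check like this can reject a mismatched model while the graph is being built, before any expensive computation is started.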

(The hope, of course, is to support other languages in the future for creating and expressing models. It's already quite straightforward to run inference using several other languages: C++ works now, someone from Facebook contributed Go bindings that we're reviewing now, etc.)
