Global Interpreter Lock and access to data (e.g. for NumPy arrays)

Question

I am writing a C extension for Python, which should release the Global Interpreter Lock while it operates on data. I think I have understood the mechanism of the GIL fairly well, but one question remains: Can I access data in a Python object while the thread does not own the GIL? For example, I want to read data from a (big) NumPy array in the C function while I still want to allow other threads to do other things on the other CPU cores. The C function should

  • release the GIL with Py_BEGIN_ALLOW_THREADS
  • read and work on the data without using Python functions
  • even write data to previously constructed NumPy arrays
  • reacquire the GIL with Py_END_ALLOW_THREADS

Is this safe? Of course, other threads are not supposed to change the variables which the C function uses. But maybe there is one hidden source for errors: could the Python interpreter move an object, eg. by some sort of garbage collection, while the C function works on it in a separate thread?

To illustrate the question with a minimal example, consider the (minimal but complete) code below. Compile it (on Linux) with

gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -fPIC -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 -c gilexample.c -o gilexample.o
gcc -pthread -shared gilexample.o -o gilexample.so

and test it in Python with

import gilexample
gilexample.sum([1,2,3])

Is the code between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS safe? It accesses the contents of a Python object, and I do not want to duplicate the (possibly large) array in memory.

#include <Python.h>
#include <numpy/arrayobject.h>

// The relevant function
static PyObject * sum(PyObject * const self, PyObject * const args) {
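  PyObject * X;
  if (!PyArg_ParseTuple(args, "O", &X))
    return NULL;  // Bad arguments; the exception is already set.
  // Request an aligned, C-contiguous array of doubles so that data[i] below
  // is a valid linear index. This returns a new reference and may copy X
  // if it is not already in that layout.
  PyObject * const X_double = PyArray_FROM_OTF(X, NPY_DOUBLE, NPY_IN_ARRAY);
  if (X_double == NULL)
    return NULL;  // Conversion failed; the exception is already set.
  npy_intp const size = PyArray_SIZE((PyArrayObject *) X_double);
  double const * const data =
    (double const *) PyArray_DATA((PyArrayObject *) X_double);
  double sum = 0;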

  Py_BEGIN_ALLOW_THREADS // IS THIS SAFE?

  npy_intp i;
  for (i=0; i<size; i++)
    sum += data[i];

  Py_END_ALLOW_THREADS

  Py_DECREF(X_double);
  return PyFloat_FromDouble(sum);
}

// Python interface code
// List the C methods that this extension provides.
static PyMethodDef gilexampleMethods[] = {
  {"sum", sum, METH_VARARGS},
  {NULL, NULL, 0, NULL}     /* Sentinel - marks the end of this structure */
};

// Tell Python about these methods.
PyMODINIT_FUNC initgilexample(void)  {
  (void) Py_InitModule("gilexample", gilexampleMethods);
  import_array();  // Must be present for NumPy.
}

Recommended answer

Is this safe?

Strictly, no. I think you should move the calls to PyArray_SIZE and PyArray_DATA outside the GIL-less block; if you do that, you'll be operating on C data only. You might also want to increment the reference count on the object before going into the GIL-less block and decrement it afterwards.

After your edits, it should be safe. Don't forget to decrement the reference count afterwards.
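
One way to make the "C data only" rule hard to break is to move the GIL-free work into a helper that accepts nothing but a raw pointer and a length. The sketch below assumes that approach. The name sum_doubles is hypothetical and does not come from the question. The helper needs no Python headers, so it can be compiled and tested on its own. It relies on the array reference obtained before the block staying alive until after the block. As far as I know, CPython does not relocate objects in memory, so the data pointer should stay valid for that long, provided no other thread resizes or mutates the array.

```c
#include <assert.h>
#include <stddef.h>

/* Pure-C kernel: it sees only a raw pointer and a length, so it cannot
 * touch any Python object and is a candidate for running without the GIL.
 * The name sum_doubles is hypothetical, not part of the original code. */
static double sum_doubles(const double *data, size_t n) {
  assert(data != NULL || n == 0);
  double total = 0.0;
  for (size_t i = 0; i < n; i++)
    total += data[i];
  return total;
}

/* Inside the extension function, with X_double, size and data obtained
 * while the GIL is still held (as in the question's code), the call
 * would look roughly like this:
 *
 *   Py_BEGIN_ALLOW_THREADS
 *   sum = sum_doubles(data, (size_t) size);
 *   Py_END_ALLOW_THREADS
 *   Py_DECREF(X_double);   // drop the reference only after the loop
 */
```

Keeping the kernel free of PyObject pointers means a later edit cannot accidentally introduce a Python API call into the region where the GIL is released.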
