Python multiprocessing and database access with pyodbc "is not safe"?

Problem description

Question:

I am getting the following traceback and don't understand what it means or how to fix it:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python26\lib\multiprocessing\forking.py", line 342, in main
    self = load(from_parent)
  File "C:\Python26\lib\pickle.py", line 1370, in load
    return Unpickler(file).load()
  File "C:\Python26\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\Python26\lib\pickle.py", line 1083, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: object.__new__(pyodbc.Cursor) is not safe, use pyodbc.Cursor.__new__()

Situation:

I've got a SQL Server database full of data to be processed. I'm trying to use the multiprocessing module to parallelize the work and take advantage of the multiple cores on my computer. My general class structure is as follows:

• MyManagerClass
  • This is the main class, where the program starts.
  • It creates two multiprocessing.Queue objects, one work_queue and one write_queue.
  • It also creates and launches the other processes, then waits for them to finish.
  • NOTE: this is not an extension of multiprocessing.managers.BaseManager()
• MyReaderClass(multiprocessing.Process)
  • This class reads data from the SQL Server database.
  • It puts items in the work_queue.
• MyWorkerClass(multiprocessing.Process)
  • This is where the work processing happens.
  • It gets items from the work_queue and puts completed items in the write_queue.
• MyWriterClass(multiprocessing.Process)
  • This class is in charge of writing the processed data back to the SQL Server database.
  • It gets items from the write_queue.

The idea is that there will be one manager, one reader, one writer, and many workers.
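
As a rough sketch of that layout (assuming MyReaderClass, MyWorkerClass and MyWriterClass are the multiprocessing.Process subclasses from the list above; the worker count and the manager's method name are illustrative), the manager side might look like this:

import multiprocessing

class MyManagerClass(object):
    """Creates the queues, launches the other processes, and waits for them."""
    def __init__(self, num_workers=4):
        self.work_queue = multiprocessing.Queue()
        self.write_queue = multiprocessing.Queue()
        self.reader = MyReaderClass(self.work_queue)
        self.writer = MyWriterClass(self.write_queue)
        self.workers = [MyWorkerClass(self.work_queue, self.write_queue)
                        for _ in range(num_workers)]

    def run(self):
        processes = [self.reader, self.writer] + self.workers
        for p in processes:
            p.start()   # spawn the child processes
        for p in processes:
            p.join()    # wait for all of them to finish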

Other details:

I get the traceback twice in stderr, so I'm thinking that it happens once for the reader and once for the writer. My worker processes get created fine, but just sit there until I send a KeyboardInterrupt because they have nothing in the work_queue.

Both the reader and writer have their own connection to the database, created on initialization.
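
In code, that initialization looked roughly like the hypothetical reconstruction below; the connection string and query are placeholders, not from the original post:

import multiprocessing
import pyodbc

class MyReaderClass(multiprocessing.Process):
    def __init__(self, work_queue):
        super(MyReaderClass, self).__init__()
        self.work_queue = work_queue
        # Created in the parent process and stored on the Process object:
        self.conn = pyodbc.connect("DSN=mydsn")   # placeholder connection string
        self.cursor = self.conn.cursor()

    def run(self):
        self.cursor.execute("SELECT id, payload FROM work_items")  # placeholder query
        for row in self.cursor:
            self.work_queue.put((row.id, row.payload))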

Solution:

Thanks to Mark and Ferdinand Beyer for their answers and questions that led to this solution. They rightfully pointed out that the Cursor object is not "pickle-able"; pickling is the mechanism that multiprocessing uses to pass information between processes.

The issue with my code was that MyReaderClass(multiprocessing.Process) and MyWriterClass(multiprocessing.Process) both connected to the database in their __init__() methods. I created both of these objects (i.e. called their init methods) in MyManagerClass, then called start().

So it would create the connection and cursor objects, then try to send them to the child process via pickle. My solution was to move the instantiation of the connection and cursor objects into the run() method, which isn't called until the child process is fully created.
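
The same reader with the fix applied, so the connection and cursor exist only inside the child process and never have to be pickled (connection string and query are still placeholders):

import multiprocessing
import pyodbc

class MyReaderClass(multiprocessing.Process):
    def __init__(self, work_queue):
        super(MyReaderClass, self).__init__()
        self.work_queue = work_queue
        # No database objects here: __init__ runs in the parent, and the
        # Process instance gets pickled when start() is called.

    def run(self):
        # run() executes in the child process, so it is safe to open the
        # connection and create the cursor here.
        conn = pyodbc.connect("DSN=mydsn")                     # placeholder connection string
        cursor = conn.cursor()
        cursor.execute("SELECT id, payload FROM work_items")   # placeholder query
        for row in cursor:
            # Put plain picklable values on the queue, not Row/Cursor objects.
            self.work_queue.put((row.id, row.payload))
        conn.close()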

Recommended answer

Multiprocessing relies on pickling to communicate objects between processes. The pyodbc connection and cursor objects cannot be pickled.

      >>> cPickle.dumps(aCursor)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/usr/lib64/python2.5/copy_reg.py", line 69, in _reduce_ex
          raise TypeError, "can't pickle %s objects" % base.__name__
      TypeError: can't pickle Cursor objects
      >>> cPickle.dumps(dbHandle)
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/usr/lib64/python2.5/copy_reg.py", line 69, in _reduce_ex
          raise TypeError, "can't pickle %s objects" % base.__name__
      TypeError: can't pickle Connection objects
      

"It puts items in the work_queue", what items? Is it possible the cursor object is getting passed as well?
