Python multiprocessing and database access with pyodbc "is not safe"?
Question:

I am getting the following traceback and don't understand what it means or how to fix it:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Python26\lib\multiprocessing\forking.py", line 342, in main
    self = load(from_parent)
  File "C:\Python26\lib\pickle.py", line 1370, in load
    return Unpickler(file).load()
  File "C:\Python26\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\Python26\lib\pickle.py", line 1083, in load_newobj
    obj = cls.__new__(cls, *args)
TypeError: object.__new__(pyodbc.Cursor) is not safe, use pyodbc.Cursor.__new__()
Situation:

I've got a SQL Server database full of data to be processed. I'm trying to use the multiprocessing module to parallelize the work and take advantage of the multiple cores on my computer. My general class structure is as follows:
- MyManagerClass
  - This is the main class, where the program starts.
  - It creates two multiprocessing.Queue objects, one work_queue and one write_queue.
  - It also creates and launches the other processes, then waits for them to finish.
  - NOTE: this is not an extension of multiprocessing.managers.BaseManager()
- MyReaderClass
  - This class reads data from the SQL Server database.
  - It puts items in the work_queue.
- MyWorkerClass
  - This is where the work processing happens.
  - It gets items from the work_queue and puts completed items in the write_queue.
- MyWriterClass
  - This class is in charge of writing the processed data back to the SQL Server database.
  - It gets items from the write_queue.
The idea is that there will be one manager, one reader, one writer, and many workers.
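The queue-based architecture described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the original code: the names worker and run_pipeline are hypothetical, the doubling step stands in for the real processing, and the database reader/writer processes are omitted so the queue flow is visible on its own.

```python
import multiprocessing

def worker(work_queue, write_queue):
    # Pull items until a sentinel arrives, process them, push results.
    while True:
        item = work_queue.get()
        if item is None:               # sentinel: no more work
            break
        write_queue.put(item * 2)      # stand-in for real processing

def run_pipeline(items, n_workers=2):
    work_queue = multiprocessing.Queue()
    write_queue = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=worker, args=(work_queue, write_queue))
        for _ in range(n_workers)
    ]
    for p in workers:
        p.start()
    for item in items:
        work_queue.put(item)
    for _ in workers:
        work_queue.put(None)           # one sentinel per worker
    results = [write_queue.get() for _ in items]  # drain before joining
    for p in workers:
        p.join()
    return results

if __name__ == "__main__":
    print(sorted(run_pipeline([1, 2, 3])))
```

Only picklable objects (here, plain integers) ever travel through the queues, which is exactly the property the original code violated.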
More details:

I get the traceback twice in stderr, so I'm thinking it happens once for the reader and once for the writer. My worker processes get created fine, but just sit there until I send a KeyboardInterrupt because they have nothing in the work_queue. Both the reader and writer have their own connection to the database, created on initialization.
Solution:

Thanks to Mark and Ferdinand Beyer for their answers and questions that led to this solution. They rightfully pointed out that the Cursor object is not "pickle-able", and pickling is the method that multiprocessing uses to pass information between processes.
The issue with my code was that MyReaderClass(multiprocessing.Process) and MyWriterClass(multiprocessing.Process) both connected to the database in their __init__() methods. I created both of these objects (i.e. called their init methods) in MyManagerClass, then called start(). So each would create its connection and cursor objects in the parent process, and multiprocessing would then try to send them to the child process via pickle. My solution was to move the instantiation of the connection and cursor objects into the run() method, which isn't called until the child process is fully created.
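A minimal sketch of that fix, using the stdlib sqlite3 module as a stand-in for pyodbc since the pattern is identical (the table name work_items and the db_path parameter are illustrative, not from the original code): __init__() keeps only picklable attributes, and the connection and cursor are created inside run(), which executes in the child process after pickling has already happened.

```python
import multiprocessing
import sqlite3  # stand-in for pyodbc; the same pattern applies

class MyReaderClass(multiprocessing.Process):
    def __init__(self, work_queue, db_path):
        super().__init__()
        # Only picklable attributes here: queues, strings, numbers.
        self.work_queue = work_queue
        self.db_path = db_path
        self.connection = None  # created later, in the child process

    def run(self):
        # run() executes in the child process, after the Process object
        # has been pickled and sent over, so connecting here is safe.
        self.connection = sqlite3.connect(self.db_path)
        cursor = self.connection.cursor()
        for (value,) in cursor.execute("SELECT value FROM work_items"):
            self.work_queue.put(value)
        self.work_queue.put(None)  # sentinel: no more work
        self.connection.close()
```

The same change applies to MyWriterClass: pass it whatever it needs to build the connection (a connection string, a path) rather than the connection itself.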
Recommended answer
Multiprocessing relies on pickling to communicate objects between processes. The pyodbc connection and cursor objects cannot be pickled.
>>> cPickle.dumps(aCursor)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle Cursor objects
>>> cPickle.dumps(dbHandle)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle Connection objects
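For readers without a pyodbc setup at hand, the same failure can be reproduced with the stdlib sqlite3 module, whose Connection objects are likewise unpicklable. This is a stand-in demonstration of the general rule, not the original session above:

```python
import pickle
import sqlite3

# Database handles hold OS-level resources (sockets, file handles)
# that cannot be serialized, so pickle refuses them.
conn = sqlite3.connect(":memory:")
try:
    pickle.dumps(conn)
except TypeError as exc:
    print(exc)  # e.g. cannot pickle 'sqlite3.Connection' object
conn.close()
```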
"It puts items in the work_queue": what items? Is it possible the cursor object is getting passed as well?