Manage Python Multiprocessing with MongoDB

Views: 52  Published: 2021/6/3 20:02:07  Tags: python, mongodb, python-2.7, pymongo, python-multiprocessing

Problem description

I'm trying to run my code with a multiprocessing function, but mongo keeps returning:

"MongoClient opened before fork. Create MongoClient with connect=False, or create client after forking."

I really don't understand how to adapt my code to this. Basically, the structure is:

db = MongoClient().database
db.authenticate('user', 'password', mechanism='SCRAM-SHA-1')
collectionW = db['words']
collectionT = db['sinMemo']
collectionL = db['sinLogic']


def findW(word):
    rows = collectionW.find({"word": word})
    ind = 0
    for row in rows:
        ind += 1
        id = row["_id"]

    if ind == 0:
        a = ind
    else:
        a = id
    return a



def trainAI(stri):
    ...
    if findW(word) == 0:
        _id = db['words'].insert(
            {"_id": getNextSequence(db.counters, "nodeid"), "word": word})
        story = _id
    else:
        story = findW(word)
    ...


def train(index):
    # searching progress
    progFile = "./train/progress{0}.txt".format(index)
    trainFile = "./train/small_file_{0}".format(index)
    if os.path.exists(progFile):
        f = open(progFile, "r")
        ind = f.read().strip()
        if ind != "":

            pprint(ind)
            i = int(ind)
        else:
            pprint("No progress saved or progress lost!")
            i = 0
        f.close()

    else:
        i = 0
    # get the number of lines in the file
    rangeC = rawbigcount(trainFile)

    #fix unicode
    non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
    files = io.open(trainFile, "r", encoding="utf8")
    str1 = ""
    str2 = ""

    filex = open(progFile, "w")

    with progressbar.ProgressBar(max_value=rangeC) as bar:
        for line in files:
            line = line.replace("\n", "")
            if i % 2 == 0:
                str1 = line.translate(non_bmp_map)
            else:
                str2 = line.translate(non_bmp_map)

            bar.update(i)
            trainAI(str1 + " " + str2)
            filex.seek(0)
            filex.truncate()
            filex.write(str(i))
            i += 1

#multiprocessing function

maxProcess = 3

def f(l, i):
    l.acquire()
    train(i + 1)
    l.release()

if __name__ == '__main__':
    lock = Lock()

    for num in range(maxProcess):
        pprint("start " + str(num))
        Process(target=f, args=(lock, num)).start()

This code is meant to read 4 different files in 4 different processes and, at the same time, insert the data into the database. I copied only part of the code to show its structure.

I tried adding connect=False to this code, but nothing changed:

  db = MongoClient(connect=False).database
  db.authenticate('user', 'password', mechanism='SCRAM-SHA-1')
  collectionW = db['words']
  collectionT = db['sinMemo']
  collectionL = db['sinLogic']

Then I tried moving that connection code into the f function (right before the call to train()), but then the program could not find collectionW, collectionT and collectionL.
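
The "not found" symptom is probably ordinary Python scoping rather than anything MongoDB-specific: names assigned inside f are local to f, so other functions that look them up as globals fail with a NameError. A minimal sketch of the effect, using plain dicts as hypothetical stand-ins for the collections (all function names here are illustrative), together with one way around it, passing the handle as an argument:

```python
def find_word(word):
    # Looks up collection_w as a global name; locals of the caller are invisible.
    return collection_w.get(word, 0)


def worker_broken():
    collection_w = {"hello": 1}  # local to worker_broken only
    return find_word("hello")    # raises NameError inside find_word


def find_word_fixed(collection, word):
    # The collection is handed in explicitly, so no global is needed.
    return collection.get(word, 0)


def worker_fixed():
    collection_w = {"hello": 1}
    return find_word_fixed(collection_w, "hello")
```

Applied to the code above, this would mean creating the client and collections inside f (that is, inside the child process) and passing them down to train(), trainAI() and findW() as parameters.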

I'm not an expert in Python or MongoDB, so I hope this is not a silly question.

The code is running under Ubuntu 16.04.2 with Python 2.7.12.
