Python / Django: compare and update model objects


Question

I have only just started with Python but have learned a lot over the last few months. Now I have hit a wall: updating objects on a model at a reasonable speed.

I have a model called Products that is populated from a CSV file. Every day the file is updated with changes such as cost and quantity. I can compare each line of the file with the Products model, but with 120k lines this takes 3-4 hours.

What can I do to process this file faster? I only want to modify the objects if cost and quantity have changed.

Any suggestions on how to tackle this?

Version 3 of what I have tried:

from django.core.management import BaseCommand
from multiprocessing import Pool
from django.contrib.auth.models import User
from pprint import pprint  
from CentralControl.models import Product, Supplier
from CentralControl.management.helpers.map_ingram import *
from CentralControl.management.helpers.helper_generic import *
from tqdm import tqdm

from CentralControl.management.helpers.config import get_ingram
import os, sys, csv, zipfile, CentralControl

# Run Script as 'SYSTEM'
user = User.objects.get(id=1)

# Get Connection config.   
SUPPLIER_CODE, FILE_LOCATION, FILE_NAME = get_ingram()


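class Command(BaseCommand):
    def handle(self, *args, **options):
        list_in = get_file()
        list_current = get_current_list()

        pool = Pool(6)

        # NOTE: this calls compare_lists(...) right here in the parent process;
        # pool.map() then receives only its return value (None) and no iterable,
        # so no work is actually distributed to the pool.
        pool.map(compare_lists(list_in, list_current))

        pool.close()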



def compare_lists(list_in, list_current):
    # Nested loop: every current product is checked against every CSV row.
    for row_current in tqdm(list_current):
        for row_in in list_in:
            if row_in['order_code'] == row_current['order_code']:

                #do more stuff here.

                pass

def get_current_list():
    try:
        supplier = Supplier.objects.get(code='440040')
        current_list = Product.objects.filter(supplier=supplier).values()
        return current_list
    except Supplier.DoesNotExist:
        # Only the .get() above can fail; an empty filter() result does not raise.
        print('Error no products with supplier')
        exit()


def get_file():
    # The with-blocks close the archive and the CSV member automatically.
    with zipfile.ZipFile(FILE_LOCATION + 'incoming/' + FILE_NAME, 'r') as archive:
        with archive.open('228688 .csv') as csvfile:
            reader = csv.DictReader(csvfile)
            list_in = list(reader)

            # Rename the supplier's column headers to the Product field names.
            for row in tqdm(list_in):
                row['order_code'] = row.pop('Ingram Part Number').lstrip("0")
                row['name'] = row.pop('Ingram Part Description')
                row['description'] = row.pop('Material Long Description')
                row['mpn'] = row.pop('Vendor Part Number')
                row['gtin'] = row.pop('EANUPC Code')
                row['nett_cost'] = row.pop('Customer Price')
                row['retail_price'] = row.pop('Retail Price')
                row['qty_at_supplier'] = row.pop('Available Quantity')
                row['backorder_date'] = row.pop('Backlog ETA')
                row['backorder_qty'] = row.pop('Backlog Information')

    # Commented out during development.
    #os.rename(FILE_LOCATION + 'incoming/' + FILE_NAME, FILE_LOCATION + 'processed/' + FILE_NAME)
    return list_in

Recommended answer

Here is a rough idea:

1. When reading the CSV, use pandas (as suggested by @BearBrow) to load it into array_csv.
2. Convert the object data from Django into a NumPy array, array_obj.
3. Don't compare the rows one by one; use NumPy subtraction on whole columns instead:

compare_index = (array_csv[['cost', 'quantity']] - array_obj[['cost', 'quantity']] == 0)

4. Find the rows that need updating. compare_index is True where a value is unchanged, so select the rows where cost and quantity are not both unchanged:

obj_need_updated = array_obj[~np.logical_and(compare_index['cost'], compare_index['quantity'])]
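The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: both data sets are already pandas DataFrames with matching dtypes, the key column is order_code, and the compared fields are nett_cost and qty_at_supplier (names taken from the question's code). The function name find_changed is hypothetical. Instead of subtracting, it joins the two frames on the key so that rows line up, then does one vectorised inequality test.

```python
import numpy as np
import pandas as pd


def find_changed(df_csv, df_db, key="order_code",
                 fields=("nett_cost", "qty_at_supplier")):
    """Return the rows of df_csv whose `fields` differ from the matching row in df_db.

    Hypothetical helper: column names are assumptions based on the question.
    """
    fields = list(fields)
    # Inner join on the key so each merged row holds the CSV and DB values
    # for the same product, whatever order the two inputs were in.
    merged = df_csv[[key] + fields].merge(
        df_db[[key] + fields], on=key, suffixes=("_csv", "_db")
    )
    new = merged[[f + "_csv" for f in fields]].to_numpy()
    old = merged[[f + "_db" for f in fields]].to_numpy()
    # One vectorised comparison over all rows; True where any field differs.
    changed = np.any(new != old, axis=1)
    rename = {f + "_csv": f for f in fields}
    return (
        merged.loc[changed, [key] + list(rename)]
        .rename(columns=rename)
        .reset_index(drop=True)
    )
```

The database side could be built with something like pd.DataFrame(list(Product.objects.filter(supplier=supplier).values('order_code', 'nett_cost', 'qty_at_supplier'))), and the CSV values would need converting from strings to numbers first; otherwise every row would compare as different.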

Then use django-bulk-update (https://github.com/aykut/django-bulk-update) to write the changes in bulk.
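As a sketch of the bulk-write step, the code below collects only the changed objects and hands them to a bulk update callable in one go. It does not use the linked package; it assumes Django 2.2 or later, where QuerySet.bulk_update(objs, fields, batch_size=None) is built in. The helper names collect_changed and save_changed are hypothetical, the field names come from the question's code, and incoming is assumed to be a dict built from the CSV with values already converted to the model's field types.

```python
def collect_changed(products, incoming):
    """Return the products whose cost or quantity differs from the feed.

    products: iterable of objects with order_code, nett_cost, qty_at_supplier
    incoming: dict mapping order_code -> (nett_cost, qty_at_supplier)
    Both helpers are hypothetical illustrations, not part of Django.
    """
    changed = []
    for product in products:
        # Dict lookup is O(1), so the whole pass is linear in the number of products.
        new_values = incoming.get(product.order_code)
        if new_values is None:
            continue  # product is not in today's file
        new_cost, new_qty = new_values
        if product.nett_cost != new_cost or product.qty_at_supplier != new_qty:
            product.nett_cost = new_cost
            product.qty_at_supplier = new_qty
            changed.append(product)
    return changed


def save_changed(changed, bulk_update, batch_size=1000):
    """Write all changed rows in batches instead of one save() per row."""
    if changed:
        bulk_update(changed, ["nett_cost", "qty_at_supplier"], batch_size=batch_size)
    return len(changed)


# Intended use inside the management command (assumes Django >= 2.2):
#   products = Product.objects.filter(supplier=supplier)
#   save_changed(collect_changed(products, incoming), Product.objects.bulk_update)
```

Passing the bulk update callable in as an argument keeps the comparison logic independent of the ORM, so it can be tried out on plain objects before touching the database.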

Hope this gives you some hints for speeding up your code.
