MySQL INSERT ... ON DUPLICATE KEY UPDATE with Django 1.4 for bulk insert


Question

I am having trouble working out how to use MySQL's INSERT ... ON DUPLICATE KEY UPDATE with Django 1.4.

The table I am inserting into has a two-column (composite) unique key. The records come from a third-party source, and their values change over time, except for the fields that make up the unique key. I receive 1 to 5k records at a time, and would need to

Currently I use Model.objects.bulk_create for the bulk insert. Performance is very good because it generally issues a single query no matter how big the record set is. However, because my records can change over time on the third-party end, I need to run MySQL's INSERT ... ON DUPLICATE KEY UPDATE on the record set instead.

I am planning to write raw SQL statements and execute them with something like this:

def raw_insert(sql):
    from django.db import connection, transaction
    cursor = connection.cursor()

    # Data modifying operation - commit required
    cursor.execute(sql)
    transaction.commit_unless_managed()

    return 1

# Placeholder for the real INSERT ... ON DUPLICATE KEY UPDATE statement
sql = "MySQL INSERT ... ON DUPLICATE KEY UPDATE"

raw_insert(sql)

Is there a better solution to my problem? Also, how would I sanitize the field values for a raw insert?
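
On the sanitizing question, the usual approach is to keep values out of the SQL string and pass them as bound parameters, so the database driver does the escaping. The sketch below illustrates the idea with an in-memory SQLite database, which is only a stand-in chosen so the example runs anywhere; the `friend` table and the sample values are hypothetical. SQLite uses `?` as its placeholder, while MySQLdb (and Django's cursor) use `%s`.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for runnability only).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE friend (id TEXT, first_name TEXT)")

# A value that would break a query assembled by string concatenation.
hostile = "Robert'); DROP TABLE friend; --"

# The values travel separately from the SQL text, so the driver treats them
# purely as data. With MySQLdb the placeholder would be %s instead of ?.
cur.execute("INSERT INTO friend (id, first_name) VALUES (?, ?)", ("42", hostile))
conn.commit()

stored = cur.execute(
    "SELECT first_name FROM friend WHERE id = ?", ("42",)
).fetchone()[0]
row_count = cur.execute("SELECT COUNT(*) FROM friend").fetchone()[0]
```

The hostile string is stored verbatim and the table is left intact. Note that only values can be bound this way; table and column names still have to come from trusted code.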

Solution

So I created a custom manager. Here is the manager:

from django.db import connection, models, transaction


class BulkInsertManager(models.Manager):
    def _bulk_insert_or_update(self, create_fields, update_fields, values):
        cursor = connection.cursor()

        db_table = self.model._meta.db_table

        values_sql = []   # one "(%s,%s,...)" placeholder group per row
        values_data = []  # all row values, flattened in the same order

        for value_list in values:
            values_sql.append("(%s)" % ",".join(["%s"] * len(value_list)))
            values_data.extend(value_list)

        base_sql = "INSERT INTO %s (%s) VALUES" % (db_table, ",".join(create_fields))

        # On a unique-key collision, overwrite each update field with the incoming value
        on_duplicates = []
        for field in update_fields:
            on_duplicates.append(field + "=VALUES(" + field + ")")

        sql = "%s %s ON DUPLICATE KEY UPDATE %s" % (base_sql, ", ".join(values_sql), ",".join(on_duplicates))

        # One multi-row statement; the driver escapes the bound values
        cursor.execute(sql, values_data)
        transaction.commit_unless_managed()
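
To make the string assembly easier to inspect and test without a database, the same idea can be sketched as a pure function that returns the SQL text and the flat parameter list. This is an illustrative variant, not part of the manager above: `build_upsert_sql`, the identifier check, and the table name `app_user_friend` are all assumptions introduced for the example.

```python
import re

# Identifiers cannot be bound as parameters, so they are validated instead.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_upsert_sql(table, create_fields, update_fields, rows):
    """Return (sql, params) for one multi-row INSERT ... ON DUPLICATE KEY UPDATE."""
    if not rows:
        raise ValueError("at least one row is required")
    for name in [table] + list(create_fields) + list(update_fields):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % (name,))

    # One "(%s, %s, ...)" group per row; the values are flattened into params.
    row_placeholder = "(" + ", ".join(["%s"] * len(create_fields)) + ")"
    params = []
    for row in rows:
        if len(row) != len(create_fields):
            raise ValueError("row length does not match create_fields")
        params.extend(row)

    sql = "INSERT INTO %s (%s) VALUES %s ON DUPLICATE KEY UPDATE %s" % (
        table,
        ", ".join(create_fields),
        ", ".join([row_placeholder] * len(rows)),
        ", ".join("%s = VALUES(%s)" % (f, f) for f in update_fields),
    )
    return sql, params


sql, params = build_upsert_sql(
    "app_user_friend",  # hypothetical table name
    ["id", "user_id", "first_name"],
    ["first_name"],
    [["f1", "7", "Ann"], ["f2", "7", "Bob"]],
)
```

The returned pair would then be handed to `cursor.execute(sql, params)`, leaving the escaping of values to the driver.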

And a sample model:

class User_Friend(models.Model):
    objects = BulkInsertManager() # assign a custom manager to handle bulk insert

    id = models.CharField(max_length=255)
    user = models.ForeignKey(User, null=False, blank=False)
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
    city = models.CharField(max_length=50, null=True, blank=True)
    province = models.CharField(max_length=50, null=True, blank=True)
    country = models.CharField(max_length=30, null=True, blank=True)

And sample implementation:

def save_user_friends(user, friends):
    # Column order here must match the order of the values in each row below
    create_fields = ['id', 'user_id', 'first_name', 'last_name', 'city', 'province', 'country']
    update_fields = ['first_name', 'last_name', 'city', 'province', 'country']

    user_friends = []
    for friend in friends:
        user_friends.append(
            [
                str(friend['id']),
                str(user.id),
                friend['first_name'],
                friend['last_name'],
                friend['city'],
                friend['province'],
                friend['country'],
            ]
        )

    if user_friends:  # an empty VALUES list would be invalid SQL
        User_Friend.objects._bulk_insert_or_update(create_fields, update_fields, user_friends)
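
Because the manager matches values to columns purely by position, each row must list its values in exactly the order of `create_fields`. One way to rule out a mismatch is to let the field list drive the row construction. The helper below is a hypothetical sketch (`rows_for_fields` and the sample data are not from the original answer):

```python
def rows_for_fields(records, fields):
    """Build one value list per record, ordered exactly like `fields`."""
    return [[record[name] for name in fields] for record in records]


create_fields = ["id", "user_id", "first_name"]
friends = [
    {"id": "f1", "first_name": "Ann"},
    {"id": "f2", "first_name": "Bob"},
]
user_id = "7"  # hypothetical id of the owning user

# Attach the owner's id to every record, then order values by the field list.
records = [dict(friend, user_id=user_id) for friend in friends]
rows = rows_for_fields(records, create_fields)
```

A missing key raises `KeyError` immediately, which is usually preferable to silently writing a value into the wrong column.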

Here is the gist.
