Duplicate elements in Django Paginate after `order_by` call

Question

I'm using Django 1.7.7.

I'm wondering if anyone has experienced this. This is my query:

events = Event.objects.filter(
    Q(date__gt=my_date) | Q(date__isnull=True)
).filter(type__in=[...]).order_by('date')

When I try to paginate it:

p = Paginator(events, 10)
p.count  # Gives 91

event_ids = []
for i in xrange(1, p.count / 10 + 2):
    event_ids += [i.id for i in p.page(i)]

print len(event_ids)  # Still 91
print len(set(event_ids))  # 75

I noticed that if I remove the .order_by, I don't get any duplicates. I then tried .order_by on its own, with Event.objects.all().order_by('date'), which gave no duplicates.

Finally, I tried this:

events = Event.objects.filter(
    Q(date__gt=my_date) | Q(date__isnull=True)
).order_by('date')

p = Paginator(events, 10)
events.count()  # Gives 131
p.count  # Gives 131

event_ids = []
for i in xrange(1, p.count / 10 + 2):
    event_ids += [i.id for i in p.page(i)]

len(event_ids)  # Gives 131
len(set(event_ids))  # Gives 118

... and there are duplicates. Can anyone explain what's going on?

I dug into the Django source (https://github.com/django/django/blob/master/django/core/paginator.py#L46-L55) and it seems to have something to do with how Django slices the object_list.

Any help is appreciated. Thanks.

distinct() has no effect on the duplicates. There aren't any duplicates in the database, and I don't think the query introduces any ([e for e in events.iterator()] doesn't produce any duplicates). They only appear when the Paginator slices the queryset.

Edit2: Here's a more complete example

In [1]: from django.core.paginator import Paginator

In [2]: from datetime import datetime, timedelta

In [3]: my_date = timezone.now()

In [4]: events = Event.objects.filter(
   ...:     Q(date__gt=my_date) | Q(date__isnull=True)
   ...: ).order_by('date')

In [5]: events.count()
Out[5]: 134

In [6]: p = Paginator(events, 10)

In [7]: p.count
Out[7]: 134

In [8]: event_ids = []

In [9]: for i in xrange(1, p.num_pages + 1):
   ...:     event_ids += [j.id for j in p.page(i)]

In [10]: len(event_ids)
Out[10]: 134

In [11]: len(set(event_ids))
Out[11]: 115

Answer

Oh, a shot in the dark, but I think I might know what it is. I wasn't able to reproduce it in SQLite, but I could with MySQL. I think that when MySQL sorts on a column where several rows share the same value, the order among those tied rows isn't fixed, so separate slices can return some of the same rows.

Pagination slicing basically executes an SQL statement of the form SELECT ... FROM ... WHERE (date > D OR date IS NULL) ORDER BY date ASC LIMIT X OFFSET Y

But when date is NULL, I'm not sure how MySQL orders those rows relative to each other. When I ran two SQL queries, one with LIMIT 10 and one with LIMIT 10 OFFSET 10, the two result sets contained some of the same rows, while a single LIMIT 20 query produced a set of unique rows.
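To illustrate why tied rows can show up on two pages, here is a small pure-Python simulation. It is only a sketch: the `fetch_page` helper, the `reverse_scan` flag and the 20 sample rows are made up for illustration, and the flag merely stands in for the database picking a different execution plan for each query.

```python
def fetch_page(rows, limit, offset, reverse_scan=False):
    """Simulate `ORDER BY date ASC LIMIT limit OFFSET offset`.

    `reverse_scan` is a hypothetical stand-in for a different execution
    plan: it changes the order in which tied rows are encountered.
    """
    scanned = list(reversed(rows)) if reverse_scan else list(rows)
    # Sort on date only, NULLs first. sorted() is stable, so rows with
    # equal dates keep whatever order the "scan" produced.
    ordered = sorted(scanned, key=lambda row: (row[1] is not None, row[1]))
    return ordered[offset:offset + limit]


# 20 hypothetical (id, date) rows, all with a NULL date, so every row ties.
rows = [(i, None) for i in range(1, 21)]

page1 = fetch_page(rows, limit=10, offset=0)
page2 = fetch_page(rows, limit=10, offset=10, reverse_scan=True)

ids_page1 = {row_id for row_id, _ in page1}
ids_page2 = {row_id for row_id, _ in page2}
overlap = ids_page1 & ids_page2
print(sorted(overlap))
```

Because nothing in the sort key distinguishes the tied rows, each query is free to order them differently, and the two pages end up containing the same ids.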

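You can try changing your order_by to order_by('id', 'date') so that it sorts by a unique field first, and that may fix it. Note that with id first the results are effectively ordered by id; if you want to keep the date ordering and only break ties, order_by('date', 'id') should work as well.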
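As a rough sketch of the tie-breaker idea at the SQL level, the following uses Python's built-in sqlite3 module rather than MySQL or Django. The `event` table, its sample rows and the `page_ids` helper are assumptions made for the example, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, date TEXT)")

# 25 hypothetical rows: most have a NULL date, so they tie on `date`.
conn.executemany(
    "INSERT INTO event (id, date) VALUES (?, ?)",
    [(i, None if i % 5 else "2015-04-%02d" % i) for i in range(1, 26)],
)


def page_ids(conn, page, per_page=10):
    """Return the ids on a 1-based page, using id as a tie-breaker."""
    cur = conn.execute(
        "SELECT id FROM event ORDER BY date ASC, id ASC LIMIT ? OFFSET ?",
        (per_page, (page - 1) * per_page),
    )
    return [row[0] for row in cur]


all_ids = []
for page in range(1, 4):
    all_ids += page_ids(conn, page)

print(len(all_ids), len(set(all_ids)))
```

With id as the last sort column, no two rows compare equal, so every LIMIT/OFFSET window is cut from the same total order. In the Django ORM the equivalent ordering would be order_by('date', 'id').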
