Duplicate elements in Django Paginate after `order_by` call
Question
I'm using Django 1.7.7.
I'm wondering if anyone has experienced this. This is my query:
events = Event.objects.filter(
Q(date__gt=my_date) | Q(date__isnull=True)
).filter(type__in=[...]).order_by('date')
When I try to paginate it:
p = Paginator(events, 10)
p.count # Gives 91
event_ids = []
for i in xrange(1, p.count / 10 + 2):
event_ids += [i.id for i in p.page(i)]
print len(event_ids) # Still 91
print len(set(event_ids)) # 75
I noticed that if I remove the .order_by, I don't get any duplicates. I then tried just .order_by on its own, with Event.objects.all().order_by('date'), which gave no duplicates.
Finally, I tried this:
events = Event.objects.filter(
Q(date__gt=my_date) | Q(date__isnull=True)
).order_by('date')
p = Paginator(events, 10)
events.count() # Gives 131
p.count # Gives 131
event_ids = []
for i in xrange(1, p.count / 10 + 2):
event_ids += [i.id for i in p.page(i)]
len(event_ids) # Gives 131
len(set(event_ids)) # Gives 118
... and there are duplicates. Can anyone explain what's going on?
I dug into the Django source (https://github.com/django/django/blob/master/django/core/paginator.py#L46-L55) and it seems to have something to do with how Django slices the object_list.
Any help is appreciated. Thanks.
distinct() has no effect on the duplicates. There aren't any duplicates in the database, and I don't think the query introduces any ([e for e in events.iterator()] doesn't produce any duplicates). They only appear when the Paginator is slicing.
Edit2: Here's a more complete example
In [1]: from django.core.paginator import Paginator
In [2]: from datetime import datetime, timedelta
In [3]: my_date = timezone.now()
In [4]: events = Event.objects.filter(
   ...:     Q(date__gt=my_date) | Q(date__isnull=True)
   ...: ).order_by('date')
In [5]: events.count()
Out[5]: 134
In [6]: p = Paginator(events, 10)
In [7]: p.count
Out[7]: 134
In [8]: event_ids = []
In [9]: for i in xrange(1, p.num_pages + 1):
   ...:     event_ids += [j.id for j in p.page(i)]
In [10]: len(event_ids)
Out[10]: 134
In [11]: len(set(event_ids))
Out[11]: 115
Recommended answer
Oh, shot in the dark, but I think I might know what it is. I wasn't able to reproduce it in SQLite, but I could with MySQL. I think that when MySQL sorts on a column where many rows share the same value, it does not return those tied rows in the same order for every sliced query, so the same row can show up on more than one page.
The pagination slicing basically runs an SQL statement like
SELECT ... FROM ... WHERE (date > D OR date IS NULL) ORDER BY date ASC LIMIT X OFFSET Y
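To make the LIMIT/OFFSET values concrete, here is a minimal sketch of how a 1-based page number maps to them. The helper name page_slice is hypothetical and is not part of Django; the real Paginator also handles orphans and the final partial page, which this sketch ignores.

```python
def page_slice(number, per_page):
    """Return (offset, limit) for a 1-based page number.

    Simplified illustration: ignores orphans and the last partial page.
    """
    offset = (number - 1) * per_page
    return offset, per_page


for number in (1, 2, 3):
    offset, limit = page_slice(number, 10)
    print("page {}: LIMIT {} OFFSET {}".format(number, limit, offset))
```

Each page is fetched by its own query, so nothing ties the row order of one page to that of the next unless the ORDER BY is fully deterministic.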
But when date is null, I'm not sure how MySQL sorts it. When I tried two SQL queries, one with LIMIT 10 and one with LIMIT 10 OFFSET 10, they returned sets that shared some of the same rows, while a single LIMIT 20 query produced a unique set.
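You can try adding a unique field to your order_by so that the ordering is fully deterministic, and it may fix it. For example, order_by('date', 'id') keeps the date ordering and uses id only to break ties. order_by('id', 'date') is deterministic too, but because id is unique it effectively sorts by id alone.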
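The effect of the tie-breaker can be sketched without a database. This is a simulation under stated assumptions, not MySQL's actual behaviour: the rows are made up, and the shuffle stands in for a backend that is free to return rows with equal sort keys in a different order on each query. All names here (ROWS, fetch_page, paginate_ids) are hypothetical.

```python
import random

# Made-up rows of (id, date); every third row has date=None to mimic NULL dates.
ROWS = [(i, None if i % 3 == 0 else i // 10) for i in range(1, 135)]


def date_key(row):
    # Put None (NULL) dates first, then order by date value.
    _id, date = row
    return (date is not None, date if date is not None else 0)


def fetch_page(rows, offset, limit, tie_break_on_id, rng):
    """Simulate one 'ORDER BY ... LIMIT limit OFFSET offset' query."""
    candidates = list(rows)
    # Assumption: tied rows may come back in a different order on each query.
    rng.shuffle(candidates)
    if tie_break_on_id:
        candidates.sort(key=lambda r: (date_key(r), r[0]))  # like ORDER BY date, id
    else:
        candidates.sort(key=date_key)  # like ORDER BY date only
    return candidates[offset:offset + limit]


def paginate_ids(rows, per_page, tie_break_on_id, seed=0):
    """Collect ids page by page, one simulated query per page."""
    rng = random.Random(seed)
    ids = []
    for offset in range(0, len(rows), per_page):
        page = fetch_page(rows, offset, per_page, tie_break_on_id, rng)
        ids += [r[0] for r in page]
    return ids


unstable = paginate_ids(ROWS, 10, tie_break_on_id=False)
stable = paginate_ids(ROWS, 10, tie_break_on_id=True)
print(len(unstable), len(set(unstable)))  # the counts can differ when ties are reordered
print(len(stable), len(set(stable)))      # the counts match: every id appears once
```

With the id tie-breaker, every simulated query sorts the rows into the same total order, so the page slices cannot overlap.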