Sort (and sorted) not sorting

Problem description

I have a data file that's built the following way:

source_id, target_id, impressions, clicks

To which I add the following columns:

  • pair: a tuple of source and target
  • CTR: basically clicks / impressions
  • Lower Bound
  • Upper Bound

The lower and upper bounds are calculated values (irrelevant to my question, but for the curious, these are the bounds of the Wilson confidence interval).

The thing is, I'm trying to sort the list by the lower bound (index 6) in descending order. I have tried several things (sort vs. sorted, lambda vs. itemgetter, creating a new list without the header and sorting just that), and still nothing appears to change. My code is below.

import csv
from math import sqrt
from operator import itemgetter

#----- Read CSV ----------------------------------------------------------------
raw_data_csv  = open('rawdile', "rb")
raw_reader = csv.reader(raw_data_csv)

#  transform the values to ints.
raw_data = []
for rownum,row in enumerate(list(raw_reader)):
    if rownum == 0:                                                             # Header
        raw_data.append(row)
    else:
        r = []                                                            # Col header
        r.extend([int(x) for x in row])                                     # Transforming the values to ints
        raw_data.append(r)



# Add cols for pairs (as tuple) and CTR
raw_data[0].append("pair")


for row in raw_data[1:]:
    row.append((row[0],row[1]))         # tuple
#    row.append(float(row[3])/row[2])    # CTR



# ------------------------------------------------------------------------------


z = 1.95996398454005


def confidence(n, clicks):

    if n == 0:
        return 0.0, 0.0, 0.0        # keep the 3-value shape the caller unpacks
    phat = float(clicks) / n
    l_bound = ((phat + z*z/(2*n) - z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))        # lower bound
    u_bound = ((phat + z*z/(2*n) + z * sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n))        # upper bound
    return phat, l_bound, u_bound


raw_data[0].extend(["CTR","Lower Bound","Upper Bound"])


for row in raw_data[1:]:
    phat, l_bound, u_bound  = confidence(row[2],row[3])
    row.extend([phat, l_bound, u_bound])



# raw_data[1:].sort(key=lambda x: x[6], reverse=True) 

sorted(raw_data[1:], key=itemgetter(6), reverse=True)



outputfile= open('outputfile.csv', 'wb')
wr = csv.writer(outputfile,quoting = csv.QUOTE_ALL)

wr.writerows(raw_data)


raw_data_csv.close()
outputfile.close()

Can anybody tell me why? Thanks!

Recommended answer

In one attempt you are sorting a slice (slicing creates a new list object, so the original list is untouched), and in your other attempt you ignore the return value of sorted().
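To make both failure modes concrete, here is a minimal sketch. The `table` list and its values are made up for illustration and are not the asker's data; only `list.sort`, `sorted`, and `itemgetter` come from the question.

```python
from operator import itemgetter

# Hypothetical rows: a header followed by data rows (not the asker's data).
table = [["name", "score"], ["a", 1], ["b", 3], ["c", 2]]

# Attempt 1: table[1:] is a brand-new list. sort() reorders that temporary
# copy, which is then thrown away; `table` itself is untouched.
table[1:].sort(key=itemgetter(1), reverse=True)
after_slice_sort = [row[0] for row in table[1:]]

# Attempt 2: sorted() never modifies its argument; it returns a new list.
# Discarding the return value therefore changes nothing.
sorted(table[1:], key=itemgetter(1), reverse=True)
after_ignored_sorted = [row[0] for row in table[1:]]

# Binding the result to a name is what makes the sorted order usable.
ordered = sorted(table[1:], key=itemgetter(1), reverse=True)
ordered_names = [row[0] for row in ordered]
```

After running this, `after_slice_sort` and `after_ignored_sorted` still hold the original order, while `ordered_names` holds the rows sorted by score, highest first.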

You cannot sort part of a list like that; create a new list by concatenating the header row with the sorted data rows instead:

raw_data = raw_data[:1] + sorted(raw_data[1:], key=itemgetter(6), reverse=True)
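The following sketch applies that idea to a small table shaped like the one in the question. The helper name `sort_below_header` and all numeric values are assumptions for illustration (the bounds are made-up numbers, not real Wilson intervals). It also shows slice assignment, which is one possible alternative when the existing list object should be updated in place.

```python
from operator import itemgetter

def sort_below_header(rows, column, descending=True):
    """Return a new list: the header row, then the data rows sorted by `column`."""
    return rows[:1] + sorted(rows[1:], key=itemgetter(column), reverse=descending)

# Hypothetical table; index 6 is the lower bound, as in the question.
data = [
    ["source_id", "target_id", "impressions", "clicks",
     "pair", "CTR", "Lower Bound", "Upper Bound"],
    [1, 2, 100, 10, (1, 2), 0.10, 0.05, 0.17],
    [3, 4, 200, 60, (3, 4), 0.30, 0.20, 0.37],
    [5, 6, 150, 30, (5, 6), 0.20, 0.10, 0.27],
]

# Option 1: build a new list and keep the result.
result = sort_below_header(data, 6)
lower_bounds = [row[6] for row in result[1:]]

# Option 2: slice assignment replaces the data rows inside the same list
# object, so other references to `data` see the new order as well.
data[1:] = sorted(data[1:], key=itemgetter(6), reverse=True)
in_place_bounds = [row[6] for row in data[1:]]
```

Either way, the header stays at index 0 and the rows that follow are ordered by lower bound, highest first, so writing the list out with `wr.writerows(...)` afterwards would produce the sorted file.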
