What is the best way to sort a sequence in Python?

Views: 156  Published: 2017/2/25 1:12:59  Tags: python, csv, conditional


Problem description

I am trying to sort a table based on certain conditions that need to occur several times in a row.
Here is a simplified version of the table:
Number  Time
   1    23
   2    45
   3    67
   4    23
   5    11
   6    45
   7    123
   8    34
...

I need to check whether Time was < 40 five times in a row. That is, I need to check rows 1-5, then 2-6, and so on, and then print and save to a file the first and last time. For example, if the condition is met for rows 2-6, I need to print the time for Number 2 and Number 6. Checking should stop once the condition has been met; there is no need to check the remaining rows. So far I have implemented a counter with two temp variables that checks for 3 items in a row, and it works fine. But if I want to check for a condition that happened 30 times in a row, I cannot create 30 temp variables manually. What is the best way to achieve that? I guess I just need some kind of loop. Thanks!

Here is part of my code:
reader = csv.reader(open(filename))
counter, temp1, temp2, numrow = 0, 0, 0, 0

for row in reader:
    numrow += 1
    if numrow < 5:
        # parentheses let the unpacking span several lines
        col0, col1, col4, col5, col6, col23, col24, col25 = (
            float(row[0]), float(row[1]), float(row[4]), float(row[5]),
            float(row[6]), float(row[23]), float(row[24]), float(row[25]))
        if col1 <= 40:
            list1 = (col0, col1, col4, col5, col6, col23, col24, col25)
            counter += 1
            if counter == 3:
                # third matching row in a row: print it with the two saved rows
                print("Cell# %s" % filename[-10:-5])
                print(LAYOUT.format(*headers_short))
                print(LAYOUT.format(*temp1))
                print(LAYOUT.format(*temp2))
                print(LAYOUT.format(*list1))
                print("")

            elif counter == 1:
                temp1 = list1

            elif counter == 2:
                temp2 = list1

        else:
            counter = 0  # the run is broken, start counting again
I implemented the solution suggested by Bakuriu and it seems to be working. But what is the best way to combine several tests? I need to check for several conditions, for example:

efficiency less than 40 for 10 cycles in a row,
capacity less than 40 for 5 cycles in a row,
time less than 40 for 25 cycles in a row,
and some others...


Right now I open a new csv.reader for every test and run the function. I guess it is not the most efficient way, although it works. Sorry, I am just a complete noob.
csvfiles = glob.glob('processed_data/*.stat')
for filename in csvfiles:

    flag = []
    flag.append(filename[-12:-5])
    reader = csv.reader(open(filename))
    for a, row_group in enumerate(row_grouper(reader, 10)):
        if all(float(row[1]) < 40 for row in row_group):
            # a is the index of the first row in the group
            str1 = "Efficiency is less than 40 in cycles " + str(a+1) + '-' + str(a+10)
            flag.append(str1)
            break  # stop processing other rows

    reader = csv.reader(open(filename))
    for b, row_group in enumerate(row_grouper(reader, 5)):
        if all(float(row[3]) < 40 for row in row_group):
            str1 = "Capacity is less than 40 minutes in cycles " + str(b+1) + '-' + str(b+5)
            flag.append(str1)
            break  # stop processing other rows

    reader = csv.reader(open(filename))
    for b, row_group in enumerate(row_grouper(reader, 25)):
        if all(float(row[3]) < 40 for row in row_group):
            str1 = "Time is less than 40 in cycles " + str(b+1) + '-' + str(b+25)
            flag.append(str1)
            break  # stop processing other rows

    if len(flag) > 1:

        for i in flag:
            print(i)
        print('\n')
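The row_grouper helper used above is not shown in the question. A minimal sketch, assuming it yields overlapping groups of n consecutive rows (which is what the index arithmetic from a+1 to a+10 suggests), might look like the following. The actual helper from Bakuriu's answer may differ.

```python
from collections import deque


def row_grouper(rows, n):
    """Yield overlapping tuples of n consecutive rows (a sliding window).

    Hypothetical reconstruction: the original helper is not shown.
    """
    window = deque(maxlen=n)  # keeps only the n most recent rows
    for row in rows:
        window.append(row)
        if len(window) == n:
            yield tuple(window)


# Plain integers stand in for CSV rows here.
groups = list(row_grouper([1, 2, 3, 4], 3))
print(groups)
```

Because the generator reads rows lazily, breaking out of the loop on the first match stops reading the file at that point.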

Solution
You don't really need to sort your data; just keep track of whether the condition you're looking for has held for each of the last N rows of data. Fixed-size collections.deque objects are good for this sort of thing.
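As a brief illustration of why a fixed-size deque fits, here is a small sketch of how maxlen behaves. The values are the first five Time entries from the question's table; the window size of 3 is chosen only to keep the example short.

```python
from collections import deque

# A deque with maxlen=3 keeps only the three most recent items.
window = deque(maxlen=3)
history = []
for value in [23, 45, 67, 23, 11]:
    window.append(value < 40)     # store the test result, not the value
    history.append(list(window))  # snapshot of the window after each row

# Once the deque is full, each append drops the oldest result.
print(history[-1])
```

After the last value, only the results for 67, 23 and 11 remain in the window, so checking the window is the same as checking the last three rows.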
import csv
from collections import deque
filename = 'table.csv'
GROUP_SIZE = 5
THRESHOLD = 40
cond_deque = deque(maxlen=GROUP_SIZE)

with open(filename) as datafile:
    reader = csv.reader(datafile)  # assume delimiter=','
    next(reader)  # skip header row
    for linenum, row in enumerate(reader, start=1):  # process rows of file
        col0, col1, col4, col5, col6, col23, col24, col25 = (
            float(row[i]) for i in (0, 1, 4, 5, 6, 23, 24, 25))
        cond_deque.append(col1 < THRESHOLD)  # record whether this row passes
        # the deque holds at most GROUP_SIZE results, so this is true only
        # when the last GROUP_SIZE rows all passed
        if cond_deque.count(True) == GROUP_SIZE:
            print('lines {}-{} had {} consecutive rows with col1 < {}'.format(
                linenum-GROUP_SIZE+1, linenum, GROUP_SIZE, THRESHOLD))
            break  # found, so stop looking
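The same idea can be extended to the follow-up question about running several tests without reopening the file. The sketch below is one possible approach, not part of the original answer: it keeps one deque per test and reads the rows once. The function name first_runs, the test labels, and the sample data are all hypothetical.

```python
from collections import deque


def first_runs(rows, tests):
    """Find, for each test, the first run of consecutive rows that pass it.

    rows  -- iterable of rows (lists of strings, as csv.reader yields)
    tests -- dict mapping a label to (column_index, threshold, run_length)
    Returns a dict mapping each satisfied label to (first_row, last_row),
    with rows numbered from 1.
    """
    windows = {label: deque(maxlen=spec[2]) for label, spec in tests.items()}
    found = {}
    for rownum, row in enumerate(rows, start=1):
        for label, (col, threshold, run_length) in tests.items():
            if label in found:
                continue  # this test is already satisfied
            window = windows[label]
            window.append(float(row[col]) < threshold)
            if len(window) == run_length and all(window):
                found[label] = (rownum - run_length + 1, rownum)
        if len(found) == len(tests):
            break  # every test satisfied, no need to read further
    return found


# Hypothetical data: column 0 is the cycle number, column 1 the time.
sample = [["1", "23"], ["2", "45"], ["3", "30"], ["4", "23"],
          ["5", "11"], ["6", "45"], ["7", "12"], ["8", "34"]]
result = first_runs(sample, {"time, 3 in a row": (1, 40, 3),
                             "time, 2 in a row": (1, 40, 2)})
print(result)
```

With a real file, rows could be the csv.reader object after the header row has been skipped with next(reader). Each test stops being evaluated once it has been satisfied, and reading stops as soon as all of them are.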


                        