Sorting consecutive pairs of items in a Python list


Problem description

The data I have is actually stored in a column of a pandas dataframe, but for the sake of this post we extract it to get to the nub of the problem.

Suppose we have a dataframe df with a column col1, which we store as a list: L = df.col1.tolist(). I have about 2000 of these columns/lists, and on average each has a length of about 300-400, so there is no great need for performance here.

Back to our MWE list, which is structured roughly like this:

L = [1,2,2,1,3,3,4,4,5,5,6,6,1,2,1,2,7,7,8,8]

The items in the list should form consecutive pairs, but for data-collection reasons they do not. Here is the sorted list we are aiming for:

L = [1,1,2,2,3,3,4,4,5,5,6,6,1,1,2,2,7,7,8,8]

For clarity, here is the same target with each pair written as a tuple:

L = [(1,1),(2,2),(3,3),(4,4),(5,5),(6,6),(1,1),(2,2),(7,7),(8,8)]

This is the problem: the columns contain almost-sequential pairs of items (the numbers in the example above), but some items are out of order and have to be moved back next to their partner (see above).

A few things to note:

  • The list above contains numbers; in reality, we are dealing with strings.
  • The data typically lives in a column of a pandas dataframe (not sure whether this helps, but it may).
  • Performance is not really a concern, since each list only needs to be sorted once.
  • The out-of-order pattern is not consistent, and items move around a lot within each column. What matters is that each item is mapped back to its partner.

I am looking for a method that can sort these lists/columns into the required pair-sequential order. Thanks!

Recommended answer

OK, since you can guarantee that the items are always paired, I'd just keep a running count per item. You need to build a list of the elements in the order in which the first item of each pair is encountered (that is, when that item's count is zero), and reset the count for an item once it reaches 2. Then just "explode" that list of first elements into a list of pairs. Quick and dirty:

In [1]: L = [1,2,2,1,3,3,4,4,5,5,6,6,1,2,1,2,7,7,8,8]

In [2]: from collections import Counter

In [3]: counts = Counter()

In [4]: order = []

In [5]: for x in L:
   ...:     if counts[x] == 0:
   ...:         # first item of a new pair: record where the pair starts
   ...:         order.append(x)
   ...:     counts[x] += 1
   ...:     if counts[x] == 2:
   ...:         # pair complete: reset so a later pair of x counts as new
   ...:         counts[x] = 0
   ...:

In [6]: order
Out[6]: [1, 2, 3, 4, 5, 6, 1, 2, 7, 8]

In [7]: result = []

In [8]: for x in order:
   ...:     result.append(x)
   ...:     result.append(x)
   ...:

In [9]: result
Out[9]: [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 1, 1, 2, 2, 7, 7, 8, 8]

Of course, you should wrap this in a function.
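One way such a function might look is sketched below. The name `pair_sort` and the `ValueError` check for items left without a partner are illustrative additions, not part of the original answer; the sketch also assumes the items are hashable (plain strings or numbers), as in the question.

```python
from collections import Counter


def pair_sort(items):
    """Regroup items into consecutive pairs, ordered by where each pair starts.

    Assumes every value occurs an even number of times (hypothetical
    safeguard: a ValueError is raised if some item has no partner).
    """
    open_count = Counter()  # 1 while an item is waiting for its partner, else 0
    order = []              # first item of each pair, in order of appearance
    for x in items:
        if open_count[x] == 0:
            order.append(x)
        open_count[x] += 1
        if open_count[x] == 2:
            open_count[x] = 0  # pair complete; a later x starts a new pair
    unpaired = [x for x, n in open_count.items() if n]
    if unpaired:
        raise ValueError(f"items without a partner: {unpaired}")
    # "explode" each first element into its pair
    return [x for x in order for _ in range(2)]


L = [1, 2, 2, 1, 3, 3, 4, 4, 5, 5, 6, 6, 1, 2, 1, 2, 7, 7, 8, 8]
print(pair_sort(L))
print(pair_sort(["b", "a", "a", "b"]))
```

Because the result has the same length as the input whenever every item is paired, it could presumably be written back to the column it came from, for example `df["col1"] = pair_sort(df["col1"].tolist())`.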
