Add a sequence number to each element in a group using Python

Question

I have a dataframe of individuals, each of whom has multiple records. I want to number each individual's records in sequence in Python. Essentially, I would like to create the 'sequence' column in the following table:

patient  date      sequence
145      20Jun2009        1
145      24Jun2009        2
145      15Jul2009        3
582      09Feb2008        1
582      21Feb2008        2
987      14Mar2010        1
987      02May2010        2
987      12May2010        3

This is essentially the same question as here, but I am working in Python and have been unable to implement the SQL solution. I suspect I can use a groupby statement with an iterable count, but have so far been unsuccessful. Thanks!

Recommended answer

The underlying problem is how to sort on multiple columns of data.

One simple trick is to use the key parameter of the sorted function.

You sort by a string built from the columns of each row, so that rows are ordered by patient first and by date second.

from datetime import datetime

rows = ...  # your source data

def date_to_sortable_string(date):
  # Convert a date string such as "20Jun2009" to "20090620", which sorts chronologically.
  return datetime.strptime(date, "%d%b%Y").strftime("%Y%m%d")

# Assume x[0] is the patient_id (an integer) and x[1] is the encounter date

# Sort by patient_id and date
rows_sorted = sorted(rows, key=lambda x: "%05d-%s" % (x[0], date_to_sortable_string(x[1])))

for row in rows_sorted:
  print(row)
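
Sorting alone does not yet produce the 'sequence' column from the question. The following is a minimal sketch of one way to finish the job, assuming the rows are (patient_id, date_string) tuples and the dates use the format shown in the question's table. It sorts by a (patient, parsed date) tuple instead of a built string, then numbers the rows within each patient using itertools.groupby and enumerate. The names parse_date and add_sequence are hypothetical helpers introduced here for illustration, and the sample rows are a shuffled subset of the question's table.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter


def parse_date(text):
    # Assumes dates look like "20Jun2009" (day, abbreviated English month, year).
    return datetime.strptime(text, "%d%b%Y")


def add_sequence(rows):
    # Sort by patient, then chronologically within each patient.
    ordered = sorted(rows, key=lambda r: (r[0], parse_date(r[1])))
    result = []
    # groupby only groups adjacent items, so the sort above is required.
    for _, group in groupby(ordered, key=itemgetter(0)):
        # Restart the counter at 1 for every patient.
        for seq, (patient, date) in enumerate(group, start=1):
            result.append((patient, date, seq))
    return result


# Sample input: a shuffled subset of the table in the question.
rows = [
    (582, "21Feb2008"),
    (145, "15Jul2009"),
    (145, "20Jun2009"),
    (987, "14Mar2010"),
    (582, "09Feb2008"),
    (145, "24Jun2009"),
]

for row in add_sequence(rows):
    print(row)
```

If the data already lives in a pandas DataFrame that is sorted by patient and date, df.groupby('patient').cumcount() + 1 should give the same numbering without the explicit loop.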
