Create a new data frame column based on the values of two other columns


Question

Let's say I have a data frame with two variables and 213005 observations. It looks like this:

df <- data.frame(nr=c(233, 233, 232, 231, 234, 234, 205), 
        date=c("2012/01/02", "2012/01/01", "2012/01/01", "2012/01/02", "2012/01/01", "2012/01/01", "2012/01/05"))

I need to create a new column called "new" for each different "nr" value according to its "date" value. The result should look like this:

df <- data.frame(nr=c(233, 233, 232, 231, 234, 234, 205), 
        date=c("2012/01/02", "2012/01/01", "2012/01/01", "2012/01/02", 
                  "2012/01/01", "2012/01/01", "2012/01/05"), 
        new=c(1, 2, 3, 4, 5, 5, 6))

(nr=233, date=2012/01/02) => (new=1)

(nr=233, date=2012/01/01) => (new=2) ...

For (nr=234, date=2012/01/01) there should be two identical rows with new=5; repeated rows should stay in the data frame.

Does anyone know how to do that? Any help would be much appreciated! Thank you!

Recommended answer

I'm not entirely sure I understand the logic, but it seems like you want to group by both columns. Here's a simple data.table solution using .GRP:

library(data.table)
# Convert df to a data.table by reference, then number each (nr, date) group with .GRP
setDT(df)[, new := .GRP, by = .(nr, date)][]
#     nr       date new
# 1: 233 2012/01/02   1
# 2: 233 2012/01/01   2
# 3: 232 2012/01/01   3
# 4: 231 2012/01/02   4
# 5: 234 2012/01/01   5
# 6: 234 2012/01/01   5
# 7: 205 2012/01/05   6
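
In the call above, .GRP is data.table's group counter, so every row in the same (nr, date) group gets the same integer, and groups are numbered in order of first appearance. If data.table is not available, the same numbering can probably be reproduced in base R. The following is only a sketch of that idea, not part of the original answer: the names df_base and key are hypothetical, and pasting the two columns together assumes the separator never occurs in the data.

```r
# Rebuild the example data so the sketch is self-contained
df_base <- data.frame(
  nr   = c(233, 233, 232, 231, 234, 234, 205),
  date = c("2012/01/02", "2012/01/01", "2012/01/01", "2012/01/02",
           "2012/01/01", "2012/01/01", "2012/01/05")
)

# Build one key per row from both columns (assumes "|" never occurs in the data)
key <- paste(df_base$nr, df_base$date, sep = "|")

# unique(key) keeps keys in order of first appearance; match() returns each
# row's position in that list, so duplicate rows share the same id
df_base$new <- match(key, unique(key))

print(df_base$new)
# [1] 1 2 3 4 5 5 6
```

Because match() looks each key up in the list of unique keys, duplicate rows get the same number even when they are not adjacent, and no rows are dropped.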
