Add column to dataframe depending on specific row values


Problem description



I have been trying to solve a problem for a few days.

Here is an example of my data.frame, which I hope will carry over to my real one.

df <- read.table(text = 'ID    Day Count
    33012   9526    4
    35004   9526    4
    37006   9526    4
    37008   9526    4
    21009   1913    3
    24005   1913    3
    25009   1913    3
    22317   2286    2
    37612   2286    2
    25009   14329   1
    48007   9525    0
    88662   9524    0
    1845    9524    0
    8872    2285    0
    49002   1912    0
    1664    1911    0', header = TRUE)

I need to add a new column (new_col) to my data.frame containing values from 1 to 4. Each new_col value must cover day x, day x - 1 and day x - 2, where x = 9526, 1913, 2286, 14329 (column Day), so that x = 9526 maps to 1, x = 1913 to 2, x = 2286 to 3 and x = 14329 to 4.

My output should be the following:

   ID    Day Count  new_col
33012   9526    4     1
35004   9526    4     1
37006   9526    4     1
37008   9526    4     1
21009   1913    3     2
24005   1913    3     2
25009   1913    3     2
22317   2286    2     3
37612   2286    2     3
25009   14329   1     4
48007   9525    0     1
88662   9524    0     1
1845    9524    0     1
8872    2285    0     3
49002   1912    0     2
1664    1911    0     2

The data.frame ordered by new_col would then be:

   ID    Day Count  new_col
33012   9526    4     1
35004   9526    4     1
37006   9526    4     1
37008   9526    4     1
48007   9525    0     1
88662   9524    0     1
1845    9524    0     1
21009   1913    3     2
24005   1913    3     2
25009   1913    3     2
49002   1912    0     2
1664    1911    0     2
22317   2286    2     3
37612   2286    2     3
8872    2285    0     3
25009   14329   1     4

My real data.frame is more complex than the example (i.e. more columns and more values in the Count column), so please be patient if I update the question.

Any suggestions would be really helpful.

Solution

I'm not sure I totally understand your question, but it seems like you could use cut() to achieve this, as follows:

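# Breakpoints: the anchor days in ascending order
x <- c(1913, 2286, 9526, 14329)
# Bin each Day into right-closed intervals that end at the anchor days
df$new_col <- cut(df$Day, c(-Inf, x, Inf))
# Number the bins by order of first appearance in df
df$new_col <- as.numeric(factor(df$new_col, levels = unique(df$new_col)))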
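Note that the cut() approach assigns every Day up to an anchor day to that anchor's bin, not only x, x - 1 and x - 2, and its numbering depends on the row order of df. If the real data can contain days outside that three-day window, a stricter variant might look like the sketch below. It assumes the group number is simply the position of the anchor in x as listed in the question, and that days matching no anchor should become NA; the name df_sorted is introduced here only for illustration.

```r
# Example data from the question
df <- read.table(text = 'ID Day Count
33012 9526 4
35004 9526 4
37006 9526 4
37008 9526 4
21009 1913 3
24005 1913 3
25009 1913 3
22317 2286 2
37612 2286 2
25009 14329 1
48007 9525 0
88662 9524 0
1845 9524 0
8872 2285 0
49002 1912 0
1664 1911 0', header = TRUE)

# Anchor days, in the order that defines groups 1..4 (assumption: taken
# from the order in which the question lists them)
x <- c(9526, 1913, 2286, 14329)

# For each Day, pick the first anchor such that Day is x, x - 1 or x - 2;
# days that match no anchor get NA
df$new_col <- vapply(df$Day, function(d) {
  hit <- which(x - d >= 0 & x - d <= 2)
  if (length(hit) == 0) NA_integer_ else hit[1]
}, integer(1))

# Order by group; ties keep their original row order
df_sorted <- df[order(df$new_col), ]
print(df_sorted)
```

On the example data this gives the same new_col values as the cut() version; the two only differ when a Day falls more than two days before an anchor.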
