How to generate a "range" variable in R?


Problem description

I have a dataset that looks something like this:

    Subject Year    X
       A    1990    1
       A    1991    1
       A    1992    2
       A    1993    3
       A    1994    4
       A    1995    4
       B    1990    0
       B    1991    1
       B    1992    1
       B    1993    2
       C    1991    1
       C    1992    2
       C    1993    3
       C    1994    3
       D    1991    1
       D    1992    2
       D    1993    3
       D    1994    4
       D    1995    5
       D    1996    5
       D    1997    6

I want to generate a binary (0/1) variable (call it A) that indicates, for each Subject, whether the X variable has reached 3 (or 1-3). If the X variable has reached 4 or more, A should not capture it.

It should look like this:

Subject Year    X   A
   A    1990    1   0
   A    1991    1   0
   A    1992    2   0
   A    1993    3   0
   A    1994    4   0
   A    1995    4   0
   B    1990    0   0
   B    1991    1   0
   B    1992    1   0
   B    1993    2   0
   C    1991    1   1
   C    1992    2   1
   C    1993    3   1
   C    1994    3   1
   D    1991    1   0
   D    1992    2   0
   D    1993    3   0
   D    1994    4   0
   D    1995    5   0
   D    1996    5   0
   D    1997    6   0

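I tried the following: mydata$A <- as.numeric(mydata$X %in% 1:3), but it doesn't control for the continuation: it flags each row on its own, so a Subject whose X later rises to 4 or more still gets 1 on its earlier rows.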

A reproducible sample:

> dput(mydata)
structure(list(Subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A", 
"B", "C", "D"), class = "factor"), Year = c(1990L, 1991L, 1992L, 
1993L, 1994L, 1995L, 1990L, 1991L, 1992L, 1993L, 1991L, 1992L, 
1993L, 1994L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L
), X = c(1L, 1L, 2L, 3L, 4L, 4L, 0L, 1L, 1L, 2L, 1L, 2L, 3L, 
3L, 1L, 2L, 3L, 4L, 5L, 5L, 6L)), .Names = c("Subject", "Year", 
"X"), class = "data.frame", row.names = c(NA, -21L))

All suggestions are welcome – thanks!

Solution

Here's a base R one-liner using ave (the data frame is called df here):

df$A <- ave(df$X, df$Subject, FUN = function(x) if (max(x) == 3) 1 else 0)

> df
   Subject Year X A
1        A 1990 1 0
2        A 1991 1 0
3        A 1992 2 0
4        A 1993 3 0
5        A 1994 4 0
6        A 1995 4 0
7        B 1990 0 0
8        B 1991 1 0
9        B 1992 1 0
10       B 1993 2 0
11       C 1991 1 1
12       C 1992 2 1
13       C 1993 3 1
14       C 1994 3 1
15       D 1991 1 0
16       D 1992 2 0
17       D 1993 3 0
18       D 1994 4 0
19       D 1995 5 0
20       D 1996 5 0
21       D 1997 6 0
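
For readers who want to run this end to end, here is a minimal sketch that splits the one-liner into two steps. It assumes the question's data, rebuilt by hand under the name mydata (the answer calls it df), and that "reached 3" means the per-Subject maximum of X is exactly 3, which is what the expected output implies. The names peak and flagged are illustrative only and do not appear in the original post.

```r
# Rebuild the question's data by hand (assumption: same values as the dput above,
# with X stored as plain numbers rather than integers).
mydata <- data.frame(
  Subject = factor(rep(c("A", "B", "C", "D"), times = c(6, 4, 4, 7))),
  Year    = c(1990:1995, 1990:1993, 1991:1994, 1991:1997),
  X       = c(1, 1, 2, 3, 4, 4, 0, 1, 1, 2, 1, 2, 3, 3, 1, 2, 3, 4, 5, 5, 6)
)

# Step 1: per-Subject maximum of X, repeated on every row of that Subject.
peak <- ave(mydata$X, mydata$Subject, FUN = max)

# Step 2: A is 1 only when the Subject's maximum is exactly 3.
mydata$A <- as.integer(peak == 3)

# Subjects that end up flagged.
flagged <- unique(as.character(mydata$Subject[mydata$A == 1]))
print(flagged)
```

Keeping the group maximum in its own vector makes the rule easy to change later, for example to a different threshold, without touching the grouping step.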
