Lagged variables in panels with ddply
Problem description
I'm trying to compute the change in precision (based on estimated confidence intervals) in what is essentially a panel data set.
As a simple example, here is the function I've written, applied to a nonsensical example:
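# Percentage change from the previous value, relative to the current value.
# The first element has no predecessor, so it is NA.
precision.gain <- function(x){
  x <- ts(x, start=x[1])              # treat the vector as a time series
  x.lag <- lag(x, -1)                 # previous value aligned to each time point
  x.gain <- ((x - x.lag) * 100) / x   # ts arithmetic keeps only overlapping times
  x.gain <- c(NA, x.gain)             # pad so the result has the same length as x
  x.gain
}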
t <- data.frame(x=1:20)
t <- cbind(t, precision.gain(t$x))
t
x precision.gain(t$x)
1 1 NA
2 2 50.000000
3 3 33.333333
4 4 25.000000
5 5 20.000000
6 6 16.666667
7 7 14.285714
8 8 12.500000
9 9 11.111111
10 10 10.000000
11 11 9.090909
12 12 8.333333
13 13 7.692308
14 14 7.142857
15 15 6.666667
16 16 6.250000
17 17 5.882353
18 18 5.555556
19 19 5.263158
20 20 5.000000
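The same calculation can also be written without ts() and lag(). The following is a minimal sketch, not part of the original post: precision.gain.diff is a hypothetical name, and it assumes x is a plain numeric vector ordered the way the lag should run.

```r
# Approach from the question, based on ts() and lag().
precision.gain <- function(x) {
  x <- ts(x, start = x[1])
  x.lag <- lag(x, -1)
  c(NA, ((x - x.lag) * 100) / x)
}

# Hypothetical alternative using plain vectors.
# diff(x)[i] is x[i + 1] - x[i], and x[-1][i] is x[i + 1],
# so each element is (current - previous) * 100 / current.
precision.gain.diff <- function(x) {
  c(NA_real_, diff(x) * 100 / x[-1])
}

x <- 1:20
gain.ts <- precision.gain(x)
gain.diff <- precision.gain.diff(x)

# Both versions should agree element by element.
all.equal(gain.ts, gain.diff)
head(gain.diff, 4)
```

For x = 1:20 the first values are NA, 50, 33.3 and 25, matching the table above. The diff() form avoids the time-series machinery, which may make it easier to reuse inside grouping functions.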
This works, but I'm having trouble applying it by group to my actual data, a subset of which looks like this:
subset(results.normal.sum, n2 > 20 & n2 < 30, select=c(sd2, n2, ci.width1))
sd2 n2 ci.width1
11 0.4 22 0.6528714
12 0.4 24 0.6167015
13 0.4 26 0.5895856
14 0.4 28 0.5658297
46 0.6 22 0.6529126
47 0.6 24 0.6196544
48 0.6 26 0.5922061
49 0.6 28 0.5642688
81 0.8 22 0.6513849
82 0.8 24 0.6194468
83 0.8 26 0.5923094
84 0.8 28 0.5636396
116 1.0 22 0.6522927
117 1.0 24 0.6191043
118 1.0 26 0.5900129
119 1.0 28 0.5652429
151 1.2 22 0.6518072
152 1.2 24 0.6193353
153 1.2 26 0.5892683
154 1.2 28 0.5632235
186 1.4 22 0.6527031
187 1.4 24 0.6191458
188 1.4 26 0.5899453
189 1.4 28 0.5640431
221 1.6 22 0.6521401
222 1.6 24 0.6191883
223 1.6 26 0.5893458
224 1.6 28 0.5637215
256 1.8 22 0.6512491
257 1.8 24 0.6180401
258 1.8 26 0.5905810
259 1.8 28 0.5647388
291 2.0 22 0.6515769
292 2.0 24 0.6183121
293 2.0 26 0.5896990
294 2.0 28 0.5663394
I've tried using ddply() from Hadley Wickham's plyr package:
ddply(results.normal.sum, .(sd2), precision.gain, x=ci.width1)
Error in .fun(piece, ...) : unused argument(s) (piece)
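A likely reading of this error is that ddply() calls the supplied function with each group's data frame (the "piece") as its first argument. precision.gain() takes only x, which is already supplied by name, so the piece has nowhere to go. The sketch below wraps the function so that it accepts a data frame. It assumes plyr is installed and uses a small data frame (here called results, a hypothetical name) holding the first two sd2 groups from the subset above.

```r
library(plyr)

# Function from the question, unchanged in behaviour.
precision.gain <- function(x) {
  x <- ts(x, start = x[1])
  x.lag <- lag(x, -1)
  c(NA, ((x - x.lag) * 100) / x)
}

# First two sd2 groups copied from the subset shown in the question.
results <- data.frame(
  sd2 = rep(c(0.4, 0.6), each = 4),
  n2 = rep(c(22, 24, 26, 28), times = 2),
  ci.width1 = c(0.6528714, 0.6167015, 0.5895856, 0.5658297,
                0.6529126, 0.6196544, 0.5922061, 0.5642688)
)

# The wrapper receives one group as a data frame, computes the gain
# from its ci.width1 column, and returns the group with the new column.
with.gain <- ddply(results, "sd2", function(piece) {
  piece$prec.gain <- precision.gain(piece$ci.width1)
  piece
})
with.gain
```

The result is a single data frame in which prec.gain restarts with NA at the first row of each sd2 group.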
Using tapply() directly I sort of get there, but it doesn't return a data frame that can be cbind()-ed onto the original data:
> tapply(results.normal.sum$ci.width1, sd2, precision.gain)
$`0.4`
[1] NA -771.332292 -68.852635 -30.514545 -19.877447 -14.515380
[7] -11.147183 -9.282641 -7.680418 -6.836209 -5.954992 -5.865053
[13] -4.599158 -4.198409 -4.155838 -3.529773 -3.590234 -3.432364
[19] -2.899601 -3.092533 -2.721967 -2.506706 -2.498318 -2.321500
[25] -2.299822 -2.187855 -2.116990 -1.896162 -1.853487 -1.604902
[31] -2.194138 -1.473042 -1.710051 -1.701994 -1.417754
$`0.6`
[1] NA -756.196418 -68.222048 -30.566420 -19.216860 -15.162929
[7] -10.645899 -9.628775 -7.326799 -7.178820 -5.770681 -5.367216
[13] -4.634938 -4.951049 -3.949776 -3.761633 -3.326209 -3.387764
[19] -3.009317 -3.074398 -2.397660 -2.678573 -2.626077 -2.268373
[25] -2.426720 -1.956498 -2.119986 -1.859410 -1.992678 -1.707448
[31] -1.991583 -1.595951 -1.765913 -1.415065 -1.655725
....
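Where the goal is a column that lines up with the original rows, base R's ave() may be a closer fit than tapply(), since it returns a vector of the same length as its input. This is a minimal sketch under the same assumptions as the question (rows already ordered by n2 within each sd2), and the data frame name results is hypothetical.

```r
# Function from the question, unchanged in behaviour.
precision.gain <- function(x) {
  x <- ts(x, start = x[1])
  x.lag <- lag(x, -1)
  c(NA, ((x - x.lag) * 100) / x)
}

# First two sd2 groups copied from the subset shown in the question.
results <- data.frame(
  sd2 = rep(c(0.4, 0.6), each = 4),
  n2 = rep(c(22, 24, 26, 28), times = 2),
  ci.width1 = c(0.6528714, 0.6167015, 0.5895856, 0.5658297,
                0.6529126, 0.6196544, 0.5922061, 0.5642688)
)

# ave() applies FUN within each level of the grouping vector and puts
# the results back in the positions of the original rows.
results$prec.gain <- ave(results$ci.width1, results$sd2, FUN = precision.gain)
results
```

Note that FUN has to be passed by name, because any unnamed arguments after the first are treated as grouping variables.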
I found a similar question here but just do not understand the answer/solution provided.
Any help appreciated,
slackline
Recommended answer
If I guessed correctly what you need, the following is a solution based on data.table. Note that the call shown computes the new column with list() grouped by sd2; data.table's handy := operator can be used in the same way to add the column by reference.
First, read in the example data:
testData <- textConnection("sd2 n2 ci.width1
11 0.4 22 0.6528714
12 0.4 24 0.6167015
13 0.4 26 0.5895856
14 0.4 28 0.5658297
46 0.6 22 0.6529126
47 0.6 24 0.6196544
48 0.6 26 0.5922061
49 0.6 28 0.5642688
81 0.8 22 0.6513849
82 0.8 24 0.6194468
83 0.8 26 0.5923094
84 0.8 28 0.5636396
116 1.0 22 0.6522927
117 1.0 24 0.6191043
118 1.0 26 0.5900129
119 1.0 28 0.5652429
151 1.2 22 0.6518072
152 1.2 24 0.6193353
153 1.2 26 0.5892683
154 1.2 28 0.5632235
186 1.4 22 0.6527031
187 1.4 24 0.6191458
188 1.4 26 0.5899453
189 1.4 28 0.5640431
221 1.6 22 0.6521401
222 1.6 24 0.6191883
223 1.6 26 0.5893458
224 1.6 28 0.5637215
256 1.8 22 0.6512491
257 1.8 24 0.6180401
258 1.8 26 0.5905810
259 1.8 28 0.5647388
291 2.0 22 0.6515769
292 2.0 24 0.6183121
293 2.0 26 0.5896990
294 2.0 28 0.5663394")
Then put the data in a data.table and compute the gain within each sd2 group:
library(data.table)
dt <- data.table(read.table(testData, header = TRUE))
dt[, list(n2, ci.width1, prec.gain = precision.gain(ci.width1)), by = sd2]
Here is the output:
> dt[, list(n2, ci.width1, prec.gain = precision.gain(ci.width1)), by = sd2]
sd2 n2 ci.width1 prec.gain
0.4 22 0.6528714 NA
0.4 24 0.6167015 -5.865058
0.4 26 0.5895856 -4.599146
0.4 28 0.5658297 -4.198419
0.6 22 0.6529126 NA
0.6 24 0.6196544 -5.367218
0.6 26 0.5922061 -4.634924
0.6 28 0.5642688 -4.951062
0.8 22 0.6513849 NA
0.8 24 0.6194468 -5.155907
0.8 26 0.5923094 -4.581626
0.8 28 0.5636396 -5.086548
1 22 0.6522927 NA
1 24 0.6191043 -5.360712
1 26 0.5900129 -4.930638
1 28 0.5652429 -4.382187
1.2 22 0.6518072 NA
1.2 24 0.6193353 -5.243024
1.2 26 0.5892683 -5.102430
1.2 28 0.5632235 -4.624239
1.4 22 0.6527031 NA
1.4 24 0.6191458 -5.419935
1.4 26 0.5899453 -4.949696
1.4 28 0.5640431 -4.592238
1.6 22 0.6521401 NA
1.6 24 0.6191883 -5.321774
1.6 26 0.5893458 -5.063666
1.6 28 0.5637215 -4.545560
1.8 22 0.6512491 NA
1.8 24 0.6180401 -5.373276
1.8 26 0.5905810 -4.649506
1.8 28 0.5647388 -4.575956
2 22 0.6515769 NA
2 24 0.6183121 -5.379937
2 26 0.5896990 -4.852153
2 28 0.5663394 -4.124664
cn sd2 n2 ci.width1 prec.gain
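For comparison, here is a minimal sketch of the := form, which adds prec.gain to the existing table instead of building a new one. It assumes a data.table version that supports := together with by, and it uses only the first two sd2 groups of the example data to stay short.

```r
library(data.table)

# Function from the question, unchanged in behaviour.
precision.gain <- function(x) {
  x <- ts(x, start = x[1])
  x.lag <- lag(x, -1)
  c(NA, ((x - x.lag) * 100) / x)
}

# First two sd2 groups copied from the example data.
dt <- data.table(
  sd2 = rep(c(0.4, 0.6), each = 4),
  n2 = rep(c(22, 24, 26, 28), times = 2),
  ci.width1 = c(0.6528714, 0.6167015, 0.5895856, 0.5658297,
                0.6529126, 0.6196544, 0.5922061, 0.5642688)
)

# := modifies dt in place; with by = sd2 the function is applied
# separately to the ci.width1 values of each group.
dt[, prec.gain := precision.gain(ci.width1), by = sd2]
dt
```

Because the column is added by reference, no assignment back to dt is needed, and the other columns are kept without listing them.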