R: nested loop with non-numeric index
Question
I am a political science student and am learning R. I have a problem with a nested loop, one of my indices being non-numeric.
I have a data frame pwt containing, for each country in the world (column country) and each year from 1950 to 2011 (column year), a number of development indicators, among which is GDP.
I would like to add a column that contains the % change in GDP from one year to the next.
This is the error I get:
Error in `[<-.factor`(`*tmp*`, iseq, value = numeric(0)): replacement has length zero
GDPgrowth = rep("NA", length(pwt$country))
pwt <- cbind.data.frame(pwt, GDPgrowth)
countries <- unique(pwt$country)
for(i in countries) # for each country
{
  for(j in 1951:2011) # for each year
  {
    pwt[pwt$country == i & pwt$year == j, "GDPgrowth"] =
      (pwt[pwt$country == i & pwt$year == j, "rdgpo"] /
         pwt[pwt$country == i & pwt$year == j-1, "rdgpo"] - 1) * 100
  }
}
What am I doing wrong?
Recommended answer
Welcome to Stack Overflow!
For this sort of rolling/thing-over-thing calculation, you can use zoo, dplyr, or data.table. I personally prefer the last of these for its flexibility and (running) speed on large datasets. Compared with a loop, these will generally be faster and more syntactically convenient.
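To give a feel for the non-loop style with one of the other packages mentioned, here is a minimal dplyr sketch. It is only an illustration: the numbers are made up, and the column names rgdp and growth_pct are assumptions, not the names in your actual pwt data.

```r
library(dplyr)

# Made-up numbers, for illustration only (column names are assumptions)
pwt_df <- data.frame(
  country = c("USA", "USA", "USA", "SWE", "SWE", "SWE"),
  year    = c(1991, 1992, 1993, 1991, 1992, 1993),
  rgdp    = c(1000, 1200, 1500, 1000, 900, 2000)
)

pwt_df <- pwt_df %>%
  arrange(country, year) %>%   # lag() follows row order, so sort years within country
  group_by(country) %>%        # compute the lag separately for each country
  mutate(growth_pct = (rgdp / dplyr::lag(rgdp) - 1) * 100) %>%
  ungroup()
```

The first year of each country has no previous value, so its growth_pct is NA rather than an error.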
Assuming your data looks something like this (numbers obviously made up):
country year rgdp
USA 1991 1000
USA 1992 1200
USA 1993 1500
SWE 1991 1000
SWE 1992 900
SWE 1993 2000
You can use data.table's shift to calculate values from leading/lagging values. In this case:
library(data.table)
pwt <- as.data.table(list(country=c("USA", "USA", "USA", "SWE", "SWE", "SWE"),
year=c(1991, 1992, 1993, 1991, 1992, 1993),
rgdp=c(1000, 1200, 1500, 1000, 900, 2000)))
pwt[, growth := rgdp/shift(rgdp, n=1, type="lag") - 1, by=c("country")]
which gives:
country year rgdp growth
USA 1991 1000 NA
USA 1992 1200 0.200000
USA 1993 1500 0.250000
SWE 1991 1000 NA
SWE 1992 900 -0.100000
SWE 1993 2000 1.222222
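Two details are worth keeping in mind. shift works on row order within each group, so the rows should be sorted by year within country, and the growth column above is a fraction, whereas the question asks for a percentage. A minimal sketch covering both, on the same made-up numbers (the column name growth_pct is just an illustrative choice):

```r
library(data.table)

# Same made-up numbers as above
pwt <- data.table(country = c("USA", "USA", "USA", "SWE", "SWE", "SWE"),
                  year    = c(1991, 1992, 1993, 1991, 1992, 1993),
                  rgdp    = c(1000, 1200, 1500, 1000, 900, 2000))

# Sort by reference so that the previous row within a country is the previous year
setorder(pwt, country, year)

# Percentage change from the previous year, computed separately per country
pwt[, growth_pct := (rgdp / shift(rgdp, n = 1, type = "lag") - 1) * 100, by = country]
```

If a country has gaps in its years, the previous row is not necessarily the previous calendar year, so it may be worth checking for gaps before interpreting the result as year-on-year growth.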