R: nested loop with non-numeric index


Question

I am a political science student learning R. I have a problem with a nested loop in which one of the indices is non-numeric. I have a data frame pwt containing, for each country in the world (column country) and each year from 1950 to 2011 (column year), a number of development indicators, among them GDP. I would like to add a column containing the percentage change in GDP from one year to the next.

This is the error I get:

Error in `[<-.factor`(`*tmp*`, iseq, value = numeric(0)):  replacement has length zero

GDPgrowth = rep("NA", length(pwt$country))
pwt <- cbind.data.frame(pwt, GDPgrowth)
countries <- unique(pwt$country)
for(i in countries)  # for each country
{
  for(j in 1951:2011) # for each year
  {
    pwt[pwt$country == i & pwt$year == j,"GDPgrowth"] = (pwt[pwt$country == i 
& pwt$year == j,"rdgpo"]/pwt[pwt$country == i & pwt$year == j-1,"rdgpo"] - 
1)*100
  }
}

What am I doing wrong?

Recommended answer

Welcome to Stack Overflow!
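
As for the error itself, a likely cause (I can't see your data, so treat this as an educated guess) is that for some country the row for year j exists but the row for year j-1 does not. The right-hand side then has length zero, and R refuses to assign a zero-length value into a selected element. Separately, rep("NA", ...) creates the string "NA" rather than a missing value, and in R versions before 4.0 cbind.data.frame turns that into a factor column, which is why the message mentions `[<-.factor`. A minimal sketch with a made-up toy data frame (the names toy, cur, prev and growth are mine, not from your code):

```r
# Hypothetical toy data: SWE has no 1991 row, so its "previous year" lookup is empty
toy <- data.frame(country = c("USA", "USA", "SWE"),
                  year    = c(1991, 1992, 1992),
                  rgdp    = c(1000, 1200, 900),
                  stringsAsFactors = FALSE)
toy$GDPgrowth <- NA_real_  # a real numeric missing value, not the string "NA"

cur    <- toy[toy$country == "SWE" & toy$year == 1992, "rgdp"]  # length 1
prev   <- toy[toy$country == "SWE" & toy$year == 1991, "rgdp"]  # length 0
growth <- (cur / prev - 1) * 100                                # also length 0

# Assigning a zero-length value into one selected element fails
res <- tryCatch({
  toy$GDPgrowth[toy$country == "SWE" & toy$year == 1992] <- growth
  "assigned"
}, error = function(e) e)
print(res)
```

If you want to keep the loop, checking that both lookups have length 1 before assigning, and initialising the column with NA_real_, should avoid this.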

For this sort of rolling or lagged calculation, where each value is compared with the previous one, you can use zoo, dplyr, or data.table. I personally prefer data.table for its flexibility and its speed on large datasets. Compared with a loop, these will generally be faster and more convenient syntactically.

Assuming your data looks something like this (numbers obviously made up):

country year rgdp
USA     1991 1000
USA     1992 1200
USA     1993 1500
SWE     1991 1000
SWE     1992 900
SWE     1993 2000

You can use data.table's shift to calculate values from leading/lagging values. In this case:

library(data.table)

pwt <- as.data.table(list(country=c("USA", "USA", "USA", "SWE", "SWE", "SWE"),
                          year=c(1991, 1992, 1993, 1991, 1992, 1993),
                          rgdp=c(1000, 1200, 1500, 1000, 900, 2000)))

pwt[, growth := rgdp/shift(rgdp, n=1, type="lag") - 1, by=c("country")]

This gives:

country year rgdp growth
USA     1991 1000 NA
USA     1992 1200 0.200000
USA     1993 1500 0.250000
SWE     1991 1000 NA
SWE     1992 900 -0.100000
SWE     1993 2000 1.222222
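
Note that growth above is a fraction, so multiply by 100 if you want a percentage as in your question. If you would rather not add a package, roughly the same thing can be sketched in base R with ave(). This assumes the rows are sorted by year within each country and that there are no gaps between years; pwt_df and GDPgrowth are just illustrative names:

```r
# Same made-up numbers as above, in a plain data frame (pwt_df is a hypothetical name)
pwt_df <- data.frame(country = c("USA", "USA", "USA", "SWE", "SWE", "SWE"),
                     year    = c(1991, 1992, 1993, 1991, 1992, 1993),
                     rgdp    = c(1000, 1200, 1500, 1000, 900, 2000),
                     stringsAsFactors = FALSE)

# Sort so that, within a country, each row follows the previous year
pwt_df <- pwt_df[order(pwt_df$country, pwt_df$year), ]

# Per country: NA for the first year, then percentage change versus the previous row
pwt_df$GDPgrowth <- ave(pwt_df$rgdp, pwt_df$country,
                        FUN = function(x) c(NA, (x[-1] / x[-length(x)] - 1) * 100))

print(pwt_df)
```

If some years are missing for a country, this compares against the previous available row rather than the previous calendar year, so you may want to check for gaps first.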
