How to get the results of an R lapply function in a specific order, with arguments from a data frame


Question

This follows up on a previous question of mine, which received a great answer.

Here is a quick summary: I want to compute a multidimensional development index from South African data for several years. My list holds individual-level information for each year, so df1 covers year 1 and df2 covers year 2.

df1<-data.frame(var1=c(1, 1,1), var2=c(0,0,1), var3=c(1,1,0))
df2<-data.frame(var1=c(1, 0,1), var2=c(1,0,1), var3=c(0,1,0))
mylist <-list (df1,df2)

var1 could be each person's stance on religion, var2 how she voted in the last national election, and so on. In this very simple case, I have data for 3 different people each year. From there, I compute an index based on a number of variables (not all of them). Here is a very simplified working index function that uses only 2 of the 3 variables, passed as dimX and dimY:

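myindex <- function(x, dimX, dimY) {
    # x[dimX] + x[dimY] is a one-column data frame (element-wise sum per person)
    econ_i <- x[dimX] + x[dimY]
    # length() of a data frame counts its columns (1 here), not its rows
    return((1 / length(econ_i)) * sum(econ_i))
}
myindex(df1, "var2", "var3")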

myindex2 = function(x, d) {
    myindex(x, d[1], d[2])
}

Then I have a data frame of the variables I want to use for my index. I am trying to compute the index for several sets of variables.

args <- data.frame(set1 = c("var1", "var2"), set2 = c("var2", "var3"), stringsAsFactors = FALSE)

I'd like the result to be structured as (a) list(set1 = list(df1, df2), set2 = list(df1, df2)) instead of (b) list(df1 = list(set1, set2), df2 = list(set1, set2)). Case (a) is a time series: for each set of variables, I get the list of index results for every year. Case (b) is the opposite: for each year, I get the index results for every set of variables. Each individual result should be a single numeric value. Hence, I expect a list of 2 sublists, set1 and set2, each containing one numeric value per year (two values in this example).

I was advised to use this command:

lapply(mylist, function(m) lapply(args, myindex2, x = m))

It works great, but I get the result in the "wrong" format, namely the second one, (b), shown above. How can I get the results ordered per set (i.e. case (a), as a time series) instead of per year?
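One way to get layout (a) is to swap the nesting of the two lapply calls, so the outer loop runs over the variable sets and the inner loop over the years. The following is a minimal sketch under the toy data from this question, not a tested solution for the full data; naming the elements of mylist (df1, df2) is an assumption added here so that the inner results carry year labels.

```r
# Toy data and helper functions, repeated from the question so the sketch is self-contained
df1 <- data.frame(var1 = c(1, 1, 1), var2 = c(0, 0, 1), var3 = c(1, 1, 0))
df2 <- data.frame(var1 = c(1, 0, 1), var2 = c(1, 0, 1), var3 = c(0, 1, 0))
mylist <- list(df1 = df1, df2 = df2)  # assumption: named so results are labelled by year

myindex <- function(x, dimX, dimY) {
    econ_i <- x[dimX] + x[dimY]
    (1 / length(econ_i)) * sum(econ_i)
}
myindex2 <- function(x, d) myindex(x, d[1], d[2])

args <- data.frame(set1 = c("var1", "var2"), set2 = c("var2", "var3"),
                   stringsAsFactors = FALSE)

# Outer loop over the sets (columns of args), inner loop over the years
by_set <- lapply(args, function(d) lapply(mylist, myindex2, d = d))
str(by_set)
```

Because lapply iterates over the columns of a data frame, the outer list is named set1 and set2, and each sublist holds one value per element of mylist.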

Thank you very much for your help!

PJ

EDIT: I've found a workaround that doesn't answer the question but still gets my data in the desired order: I convert my list of lists to a matrix and simply transpose it.
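The workaround described in the edit might look roughly like this. The exact conversion used is not shown in the post, so the do.call(rbind, ...) step and the names on mylist are assumptions made for illustration.

```r
# Toy data and helper functions, repeated from the question so the sketch is self-contained
df1 <- data.frame(var1 = c(1, 1, 1), var2 = c(0, 0, 1), var3 = c(1, 1, 0))
df2 <- data.frame(var1 = c(1, 0, 1), var2 = c(1, 0, 1), var3 = c(0, 1, 0))
mylist <- list(df1 = df1, df2 = df2)  # assumption: named so rows/columns are labelled

myindex <- function(x, dimX, dimY) {
    econ_i <- x[dimX] + x[dimY]
    (1 / length(econ_i)) * sum(econ_i)
}
myindex2 <- function(x, d) myindex(x, d[1], d[2])

args <- data.frame(set1 = c("var1", "var2"), set2 = c("var2", "var3"),
                   stringsAsFactors = FALSE)

# Layout (b): one sublist per year, each holding one value per set
nested <- lapply(mylist, function(m) lapply(args, myindex2, x = m))

# Flatten each year to a named numeric vector and stack: rows = years, columns = sets
by_year <- do.call(rbind, lapply(nested, unlist))

# Transpose so that each row is the time series of one set
by_set <- t(by_year)
print(by_set)
```

A matrix only works here because every individual result is a single number; the nested list form is needed if the results have differing shapes.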

Recommended answer

In case it helps, here is my actual index function, taken from this article:

RCI_a_3det <- function(x, econ1, econ2, econ3, perso1, perso2, perso3, civic1, civic2, civic3) {

    # Average the three variables within each dimension, per individual
    econ_i  <- (1/3) * (x[econ1] + x[econ2] + x[econ3])
    perso_i <- (1/3) * (x[perso1] + x[perso2] + x[perso3])
    civic_i <- (1/3) * (x[civic1] + x[civic2] + x[civic3])

    daf <- data.frame(econ_i, perso_i, civic_i)
    colnames(daf) <- c("econ_i", "perso_i", "civic_i")
    # Keep only individuals whose three dimension scores all differ from 1
    df1 <- subset(daf, daf$econ_i != 1 & daf$perso_i != 1 & daf$civic_i != 1)

    sum_xik <- (df1$econ_i + df1$perso_i + df1$civic_i)

    return(1 / (3 * nrow(df1)) * sum(sum_xik, na.rm = TRUE))

}

x is a list of all personal information, for every variable and every year. It is pretty large. I use 9 variables to compute this index, but my data actually contains 30 such variables, so I have set up a data frame of the sets of variables I could use to compute it. This is the equivalent of my args data frame in the simple example. I am actually using 200 such combinations.
