R: How to get a proper LaTeX regression table from a data frame?

Question

Consider the following example:

inds <- c('var1','','var2','')
model1 <- c(10.2,0.00,0.02,0.3)
model2 <- c(11.2,0.01,0.02,0.023)

df <- data.frame(inds, model1, model2)
df
 inds model1 model2
 var1  10.20 11.200
        0.00  0.010
 var2   0.02  0.020
        0.30  0.023

Here you have the output of a custom regression model with coefficients and p-values (I can show other statistics instead if needed, for example the standard errors of the coefficients).

There are two variables, var1 and var2.

For instance, in model1, var1 comes with a coefficient of 10.2 and a P-value of 0.00 while var2 has a coefficient of 0.02 and a P-value of 0.30.
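
The row layout described above (a coefficient row followed by its p-value row) can be made explicit with a few lines of base R. This is only a sketch: the names results, coef_rows, pval_rows, and model1_tidy are hypothetical, and it assumes the table always has an even number of rows.

```r
# Rebuild the example table; `results` is a hypothetical name for it.
results <- data.frame(
  inds   = c("var1", "", "var2", ""),
  model1 = c(10.2, 0.00, 0.02, 0.3),
  model2 = c(11.2, 0.01, 0.02, 0.023),
  stringsAsFactors = FALSE
)

# Odd rows carry coefficients, even rows carry the matching p-values.
coef_rows <- seq(1, nrow(results), by = 2)
pval_rows <- coef_rows + 1

# One row per variable, with coefficient and p-value side by side (model1 only).
model1_tidy <- data.frame(
  term    = results$inds[coef_rows],
  coef    = results$model1[coef_rows],
  p_value = results$model1[pval_rows],
  stringsAsFactors = FALSE
)
print(model1_tidy)
```

Pairing each coefficient with its p-value this way makes it easier to check that nothing is misaligned before the table is formatted.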

Is there a package that handles these (custom) tables automatically and can create a neat LaTeX table with stars for significance?

Thanks!

Recommended answer

Here is a solution using texreg.

Note that texreg >= 1.36.18 is required.

The information you are providing in the data frame (coefs and p-values) could be arranged in arbitrary ways in a data frame. Therefore we need to write code that selects these data from the appropriate places in the data frame and uses them to create a texreg object. As you are requesting a generic (and presumably re-usable) solution, we should wrap the code in a re-usable function. I'll call this function extractFromDataFrame. So here is the function, which extracts the information from the data frame and creates a list of texreg objects for the different models:

library("texreg")

# Build one texreg object per model column of a stacked coefficient/p-value table.
extractFromDataFrame <- function (dataFrame) {
  # Odd rows hold coefficients, even rows hold the matching p-values.
  coef.row.indices <- seq(1, nrow(dataFrame) - 1, 2)
  pval.row.indices <- seq(2, nrow(dataFrame), 2)
  texregObjects <- list()
  # Column 1 holds the variable labels; every other column is a model.
  for (i in 2:ncol(dataFrame)) {
    coefs <- dataFrame[coef.row.indices, i]
    coefnames <- as.character(dataFrame[coef.row.indices, 1])
    pvalues <- dataFrame[pval.row.indices, i]
    tr <- createTexreg(coef = coefs, coef.names = coefnames, pvalues = pvalues)
    texregObjects[[i - 1]] <- tr
  }
  return(texregObjects)
}

In this function, we first define which rows of the data frame hold the coefficients and which rows hold the p-values. Then we create an empty list in which to store the texreg objects. We iterate through all columns except the first, since the first one contains only the labels. For each of these model columns, we save the coefficients, their names, and the p-values, and hand them over to the createTexreg constructor, which creates a texreg object from the data. We add the texreg object to the list. In the end, we return the list of texreg objects.

We can now apply the function to any data frame that looks like the one provided in the question, with arbitrary numbers of columns (> 1). In this case, after applying the function to the df object, we may want to print the contents of the list if we want to make sure that we did everything right:

tr <- extractFromDataFrame(df)
tr

And indeed, the results contain the relevant data:

[[1]]

No standard errors were defined for this texreg object.
No decimal places were defined for the GOF statistics.

     coef.   p
var1 10.20 0.0
var2  0.02 0.3

No GOF block defined.

[[2]]

No standard errors were defined for this texreg object.
No decimal places were defined for the GOF statistics.

     coef.     p
var1 11.20 0.010
var2  0.02 0.023

No GOF block defined.

Now we can simply hand the list of texreg objects over to screenreg, e.g., screenreg(tr), with the following result:

========================
      Model 1    Model 2
------------------------
var1  10.20 ***  11.20 *
var2   0.02       0.02 *
========================
*** p < 0.001, ** p < 0.01, * p < 0.05
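
To see why the p-value of 0.010 in Model 2 gets one star rather than two, the legend can be mirrored in a small base R helper. This is a sketch of the thresholds printed in the legend, not texreg's internal code; the name p_to_stars is hypothetical.

```r
# Hypothetical helper: map p-values to the star labels shown in the legend.
# The comparisons are strict, so p = 0.01 does not qualify for "**".
p_to_stars <- function(p) {
  ifelse(p < 0.001, "***",
         ifelse(p < 0.01, "**",
                ifelse(p < 0.05, "*", "")))
}

# The four p-values from the question: model1 (var1, var2), model2 (var1, var2).
stars <- p_to_stars(c(0.00, 0.30, 0.010, 0.023))
print(stars)
# "***" ""    "*"   "*"
```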

The same list can be passed to htmlreg to create an HTML table or, as requested in the original question, to texreg to create a LaTeX table. The output of texreg(tr, single.row = TRUE) looks like this:

\begin{table}
\begin{center}
\begin{tabular}{l c c }
\hline
 & Model 1 & Model 2 \\
\hline
var1 & $10.20^{***}$ & $11.20^{*}$ \\
var2 & $0.02$        & $0.02^{*}$  \\
\hline
\multicolumn{3}{l}{\scriptsize{$^{***}p<0.001$, $^{**}p<0.01$, $^*p<0.05$}}
\end{tabular}
\caption{Statistical models}
\label{table:coefficients}
\end{center}
\end{table}

This solution can be modified to accommodate standard errors, confidence intervals, or goodness-of-fit statistics.
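
As a rough sketch of such a modification, createTexreg also accepts se, gof.names, gof, and gof.decimal arguments. The standard errors and the number of observations below are made up purely for illustration; only the coefficients and p-values come from the question, and tr_se is a hypothetical name.

```r
library(texreg)

# Hypothetical inputs: the standard errors and the GOF value are invented
# for illustration; coefficients and p-values are model1 from the question.
tr_se <- createTexreg(
  coef.names  = c("var1", "var2"),
  coef        = c(10.2, 0.02),
  se          = c(1.5, 0.02),
  pvalues     = c(0.00, 0.30),
  gof.names   = "Num. obs.",
  gof         = 100,
  gof.decimal = FALSE   # print the GOF value as an integer
)

# Standard errors now appear in parentheses and the GOF block is filled in.
screenreg(list(tr_se))
```

Inside extractFromDataFrame, the same idea would mean reading an extra set of rows (or columns) for the standard errors and passing them through the se argument.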

Various texreg arguments can be used to customize the output, including the use of the booktabs package or decimal alignment via dcolumn, for example.
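
For example, a call along the following lines should switch the table to booktabs rules and dcolumn alignment. It is a minimal sketch: the object m1 is rebuilt by hand from the question's model1 column, and the caption and label strings are placeholders.

```r
library(texreg)

# Rebuild one texreg object from the question's model1 column.
m1 <- createTexreg(coef.names = c("var1", "var2"),
                   coef       = c(10.2, 0.02),
                   pvalues    = c(0.00, 0.30))

# booktabs = TRUE replaces \hline with \toprule, \midrule, and \bottomrule;
# dcolumn = TRUE aligns numbers at the decimal point.
# use.packages = FALSE leaves the \usepackage lines to the document preamble.
latex_lines <- capture.output(
  texreg(m1, single.row = TRUE, booktabs = TRUE, dcolumn = TRUE,
         use.packages = FALSE,
         caption = "Custom regression table",   # placeholder caption
         label = "tab:custom")                  # placeholder label
)
cat(latex_lines, sep = "\n")
```

With use.packages = FALSE, the LaTeX document itself needs \usepackage{booktabs} and \usepackage{dcolumn} in its preamble.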

Please note that you should not name your data frame df, because df is already the name of a function in the stats package (the density of the F distribution).
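
A quick way to see the name clash, and one way around it, is sketched below; reg_table is just a hypothetical replacement name.

```r
# Before any assignment, `df` resolves to the F-density function from stats.
is_fun <- is.function(stats::df)
print(is_fun)

# Density of the F distribution with 2 and 10 degrees of freedom at x = 1.
print(stats::df(1, df1 = 2, df2 = 10))

# A descriptive name (hypothetical) avoids masking that function.
reg_table <- data.frame(
  inds   = c("var1", "", "var2", ""),
  model1 = c(10.2, 0.00, 0.02, 0.3),
  model2 = c(11.2, 0.01, 0.02, 0.023)
)
```

R will still find the data frame if it is called df, but a typo or a missing assignment then produces confusing errors about a function instead of a clear "object not found" message.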
