R dummies package weird column names when knitted via .Rmd

Question

I've just noticed a very weird behavior in the dummies package of R when knitted in .Rmd. Here's the reproducible example.

---
title: "Dummies Package Behavior"
author: "Kim"
date: '`r Sys.Date()`'
output:
  pdf_document:
    toc: yes
    toc_depth: '3'
---

Load the libraries

```{r}
library(tidyverse)
library(dummies)
```

Main data wrangling

```{r}
df <- data_frame(year = c(2016, 2017, 2018))
temp <- dummy(df$year)
temp <- as_data_frame(temp)
df <- bind_cols(df, temp)
```

View output

```{r}
df
```

What I'm expecting to see when I view the df are nice 0-1 columns of year2016, year2017, and year2018, which is the normal behavior for the dummies package.

When you knit this R Markdown document in RStudio, it instead produces the following column names: C:/Users/Kim/Desktop/dummies.Rmd2016, C:/Users/Kim/Desktop/dummies.Rmd2017, and C:/Users/Kim/Desktop/dummies.Rmd2018. That is, it uses the full path of the document as the column-name prefix.

I don't understand why such behavior occurs. Obviously, I want to have column names as year2016, year2017, and year2018.

Recommended answer

The problem is not related to dplyr because we can reproduce it with data.frame(). Apparently there is a problem with assigning column labels in the dummy() function when executed as part of an R Markdown document. As noted in Luke's answer, one workaround is to use dummy.data.frame(). Another would be to use the colnames() function to rename the columns after binding the year and dummy variables with cbind(), which also enables a dplyr-based solution.

This should probably be submitted as a bug report for the dummies package.
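
The answer does not say why the labels go wrong. One plausible cause, stated here as an assumption about the package's internals rather than something the answer confirms, is that dummy() builds the column prefix from the text of the outermost call on the stack (sys.call(1)) instead of from its own call. At the console the outermost call is dummy(df$year) itself, but during knitting it is the render call, whose first argument is the file path. The base R sketch below illustrates that mechanism with hypothetical functions (label_from_outermost, label_from_own_call, render_like); none of them belong to dummies or rmarkdown.

```r
# Hypothetical stand-ins, base R only; not the actual dummies source code.

# Labels its input from the OUTERMOST call on the stack.
label_from_outermost <- function(x) {
  # sys.call(1) is the first call on the stack, which is this function's own
  # call only when it is invoked directly from top level.
  as.character(sys.call(1))[2]
}

# Labels its input from its OWN call, regardless of who called it.
label_from_own_call <- function(x) {
  # sys.call() with no argument is the call to the current function.
  as.character(sys.call())[2]
}

# Hypothetical wrapper playing the role of a renderer that evaluates chunks.
render_like <- function(path) {
  df <- data.frame(year = c(2016, 2017, 2018))
  c(outermost = label_from_outermost(df$year),
    own       = label_from_own_call(df$year))
}

res <- render_like("C:/Users/Kim/Desktop/dummies.Rmd")
print(res)
```

When this script is run at top level, the outermost element is the path handed to render_like, while the own element is the text df$year. If the assumption holds, it would also explain why the renaming workaround is needed only when knitting.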

---
title: "Behavior of dummies package"
author: "anAuthor"
date: "12/26/2017"
output:
  html_document: default
  pdf_document: default
  word_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# first, reproduce error with data.frame()

```{r}
library(dummies)
df <- data.frame(year = c(2016, 2017, 2018))
df
dummyCols <- dummy(df$year)
dummyCols <- as.data.frame(dummyCols)
dummyCols
```

# data.frame() approach to fix the error

```{r}
df <- data.frame(year = c(2016, 2017, 2018))
df
dummyCols <- dummy.data.frame(data=df,dummy.classes="ALL")
dummyCols
df <- cbind(df, dummyCols)
df
```

...and the output, first reproducing the error.

...second, using dummy.data.frame() to avoid the error.

The dplyr correction works as follows.

# dplyr approach 

```{r}
library(tidyverse)
df <- data_frame(year = c(2016, 2017, 2018))
temp <- dummy(df$year)
temp <- as_data_frame(temp)
df <- bind_cols(df, temp)
# paste0() is vectorized, so no lapply()/unlist() is needed to build the labels
colnames(df) <- c("year", paste0("year", 2016:2018))
df
```
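
The renaming step above hardcodes the years. As an alternative sketch that swaps in a different technique, the indicator columns can be built in base R with an explicit prefix, so the names never depend on how the document is executed. The helper make_dummies is hypothetical and is not part of the dummies package.

```r
# Hypothetical helper (not from the dummies package): builds 0/1 indicator
# columns in base R and names them from an explicitly supplied prefix.
make_dummies <- function(x, prefix) {
  f <- factor(x)
  # Compare every value against every level; rows are observations,
  # columns are levels, entries are TRUE/FALSE.
  out <- outer(as.character(f), levels(f), "==")
  # Convert the logical matrix to integer 1/0.
  storage.mode(out) <- "integer"
  colnames(out) <- paste0(prefix, levels(f))
  as.data.frame(out)
}

df <- data.frame(year = c(2016, 2017, 2018))
df <- cbind(df, make_dummies(df$year, prefix = "year"))
df
```

Because the prefix is passed in as a plain string, the column names come out as year2016, year2017, and year2018 whether the code runs at the console or inside a knitted document.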
