Create several dummy variables from one string variable

Views: 193 | Published: 2020/10/16 21:54:58 | Tags: r string dataframe split splitstackshape

Question

I've tried pretty much everything from this similar question, but I can't get the results everyone else seems to be getting. This is my problem:

I have a data frame like this, listing the grades each teacher works with:

> profs <- data.frame(teaches = c("1st", "1st, 2nd",
                                  "2nd, 3rd",
                                  "1st, 2nd, 3rd"))
> profs
        teaches
1           1st
2      1st, 2nd
3      2nd, 3rd
4 1st, 2nd, 3rd

I've been looking for solutions to break the teaches variable into columns, like so:

  teaches1st teaches2nd teaches3rd
1          1          0          0
2          1          1          0
3          0          1          1
4          1          1          1

I understand that this solution, which involves the splitstackshape library and the apparently deprecated concat.split.expanded function, is supposed to do exactly what I want, given the answerer's explanation. However, I can't seem to reach the same results:

> concat.split.expanded(profs, "teaches", fill = 0, drop = TRUE)
Fehler in seq.default(min(vec), max(vec)) : 
  'from' cannot be NA, NaN or infinite

Using cSplit, which I understood supersedes "most of the earlier concat.split* functions", I get this:

> cSplit(profs, "teaches")
   teaches_1 teaches_2 teaches_3
1:       1st        NA        NA
2:       1st       2nd        NA
3:       2nd       3rd        NA
4:       1st       2nd       3rd

I've tried using cSplit's help and tweaking every one of those parameters, but I just can't get that split. I appreciate any help.

Recommended answer

Since your concatenated data are character strings (not numerical values), you'll need to add type = "character" to get the function to work as you expect.

The function's default setting is for numeric values, hence the error about NaN and so on.
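
The failure can be reproduced in a few lines of base R. This is only a sketch: it assumes, based on the seq.default(min(vec), max(vec)) call shown in the question's error message, that the numeric path builds a range from the smallest to the largest parsed value. The variable names are illustrative and this is not the package's actual code.

```r
# Sketch of the failure mode; not splitstackshape's implementation.
# Split one cell on commas and trim the surrounding whitespace.
vals <- trimws(unlist(strsplit("1st, 2nd", ",", fixed = TRUE)))

# Labels such as "1st" are not numbers, so the conversion yields NA.
nums <- suppressWarnings(as.numeric(vals))
print(nums)

# A range with NA endpoints cannot be built, so seq() raises an error.
res <- tryCatch(seq(min(nums), max(nums)), error = function(e) "error")
print(res)
```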

The naming has been made more consistent with the short forms of the other functions in the same family. Thus, it is now cSplit_e (though the old function name would still work).

library(splitstackshape)
cSplit_e(profs, "teaches", ",", type = "character", fill = 0)
#         teaches teaches_1st teaches_2nd teaches_3rd
# 1           1st           1           0           0
# 2      1st, 2nd           1           1           0
# 3      2nd, 3rd           0           1           1
# 4 1st, 2nd, 3rd           1           1           1
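
If installing the package is not an option, roughly the same table can be built with base R alone. The following is a minimal sketch under that assumption, not a description of how cSplit_e works internally. The names parts, levels_found, dummies and result are made up for illustration.

```r
# Base-R sketch of the same result; an alternative, not cSplit_e itself.
profs <- data.frame(teaches = c("1st", "1st, 2nd", "2nd, 3rd", "1st, 2nd, 3rd"),
                    stringsAsFactors = FALSE)

# Split each row on commas and trim the surrounding whitespace.
parts <- lapply(strsplit(profs$teaches, ",", fixed = TRUE), trimws)

# Every distinct label becomes one dummy column.
levels_found <- sort(unique(unlist(parts)))

# One row per teacher: 1 if the label is present, 0 otherwise.
dummies <- t(vapply(parts,
                    function(p) as.integer(levels_found %in% p),
                    integer(length(levels_found))))
colnames(dummies) <- paste0("teaches_", levels_found)

# Keep the original column next to the new indicator columns.
result <- cbind(profs, dummies)
print(result)
```

The column names are prefixed with teaches_ to mirror the output above. Drop the original column from result if only the indicators are needed.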

The help page for ?concat.split.expanded is the same as that of cSplit_e. If you have any tips on making it clearer to understand, please raise an issue at the package's GitHub page.
