Are For loops evil in R?

Question

I've heard that you're not meant to force a procedural programming style onto R. I'm finding this pretty hard. I've just solved a problem with a for loop. Is this wrong? Is there a better, more "R-style" solution?

The problem: I have two columns, Col1 and Col2. Col1 contains job titles that were entered as free-form text. I want to use Col2 to collect these job titles into categories (so "Junior Technician", "Engineering technician" and "Mech. tech." are all listed as "Technician").

Here's how I did it:

# Each name is a regular-expression pattern; each value is its category
jobcategories <- list(
  "Junior Technician|Engineering technician|Mech. tech." = "Technician",
  "Manager|Senior Manager|Group manager|Pain in the ****" = "Manager",
  "Admin|Administrator|Group secretary" = "Administrator")

# For each pattern, label the rows of df whose Col1 matches it
for (currentjob in names(jobcategories)) {
  df$Col2[grep(currentjob, df$Col1)] <- jobcategories[[currentjob]]
}

This produces the right results, but I can't shake the feeling that (because of my procedural experience) I'm not using R properly. Could an R expert put me out of my misery?

Edit

I was asked for the original data. Unfortunately, I can't supply it, because it contains confidential information. It's basically two columns. The first column holds just over 400 rows of different job titles (and the odd personal name). These 400 titles can be split into about 20 different categories. The second column starts off as NA, then gets populated by the for loop.
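Since the real data can't be shared, a hypothetical stand-in with the same shape may help anyone who wants to experiment. The titles and the personal name below are invented for illustration and are not taken from the original data; only the column names Col1 and Col2 come from the question.

```r
# Hypothetical stand-in for the confidential data. All values are invented;
# Col2 starts as NA, as described in the question.
df <- data.frame(
  Col1 = c("Junior Technician", "Senior Manager", "Group secretary",
           "Engineering technician", "John Smith"),
  Col2 = NA_character_,
  stringsAsFactors = FALSE
)

# Inspect the structure: two character columns, five rows
str(df)
```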

Recommended answer

for loops are not "evil" in R, but they are typically slow compared to vectorized methods and are frequently not the best available solution. However, they are easy to implement and easy to understand, and you should not underestimate the value of either of these.

In my view, therefore, you should use a for loop if you need to get something done quickly, can't see a better way to do it, and don't need to worry too much about speed.
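For comparison, here is a minimal sketch of one vector-based alternative: a named character vector used as a lookup table. This is not from the original answer. It assumes the titles can be matched exactly, which swaps out the regular-expression matching used in the question, and the sample data frame is invented for illustration.

```r
# Hypothetical sample data (invented for illustration)
df <- data.frame(
  Col1 = c("Junior Technician", "Senior Manager", "Group secretary",
           "Engineering technician", "John Smith"),
  stringsAsFactors = FALSE
)

# Lookup table: names are the raw titles, values are the categories
lookup <- c(
  "Junior Technician"      = "Technician",
  "Engineering technician" = "Technician",
  "Mech. tech."            = "Technician",
  "Manager"                = "Manager",
  "Senior Manager"         = "Manager",
  "Group manager"          = "Manager",
  "Admin"                  = "Administrator",
  "Administrator"          = "Administrator",
  "Group secretary"        = "Administrator"
)

# Indexing by name is vectorized: one expression covers every row.
# Titles missing from the lookup (here "John Smith") come back as NA.
df$Col2 <- unname(lookup[df$Col1])

print(df$Col2)
# "Technician" "Manager" "Administrator" "Technician" NA
```

The trade-off is that exact matching is stricter than grep: a title with different capitalization or extra words would stay NA. With roughly 400 rows, the loop from the question is unlikely to be a speed problem either way, so the choice is mostly about readability.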
