If statement in user defined function within apply in R

Question

Forgive me if this is a blatantly obvious question; I am a beginner R user eager to learn.

I have a data frame of 4 columns with roughly 1.5 million rows containing coordinate information where each individual row represents a specific location. What I would like to do is run these data into a function that holds a series of if else statements that determine the area of the specific location within a larger box. For example, a point can be in the center, along the edge of the box within 1.5 inches, on the inside of the box but not on the edge nor at the center, or on the outside of the box.

Each if statement determines if a set of points is in a specified area, and, if it is, the result is the if statement putting a '1' in the corresponding row of another data frame.

Here is a visualization of what I am trying to do:

Take this location data from a data frame called 'dimensions':

sz_top | sz_bot |     px |    pz
 3.526 |  1.615 | -1.165 | 3.748

Run it through these statements (the real statements are much longer), where the 'else' condition means the point is outside the box completely:

if(in center) else if(on edge) else if(in box, but not in center or on edge) else

When the program finds which condition is true, it puts a 1 in ANOTHER data frame called 'calls' in the corresponding column (these columns are columns 50-53). This is what the row would look like if the code found the point was in the center:

center | edge | other_in | out
     1 |    0 |        0 |   0

One thing to note that could improve efficiency is that the coordinates are actually also contained in the 'calls' data frame in columns 22,23,26, and 27, but I moved them to 'dimensions' because it was easier for me to work with. This can definitely be changed.

I am now very unclear on how to proceed from here. I have all my if else statements written, but I am unclear on how my program will know which row it is on, so that it can correctly mark the corresponding row with the result of the tests.

Please let me know if you would like any more information from me.

Thank you!

Edit:

Here is a sample of the 'dimensions' data frame:

    sz_top  sz_bot      px     pz
1    3.526   1.615  -1.165  3.748
2    3.29    1.647  -0.412  1.9
3    3.29    1.647  -1.213  1.352
4    3.565   1.75   -1.041  2.419
5    3.565   1.75   -0.357  1.776
6    3.565   1.75    0.838  0.834
7    3.541   1.724  -1.619  3.661
8    3.541   1.724  -2.498  2.421
9    3.541   1.724  -1.673  2.348
10   3.541   1.724  -1.572  2.982
11   3.305   1.5    -1.316  2.842

Here is an example of one of my if statements. The others are fairly similar, just looking at different locations around the box in question:

  if(
    ((as.numeric(as.character(dimensions$px))*12)>= -3)
    &&
      ((as.numeric(as.character(dimensions$px))*12)<= 3)
    &&
      ((as.numeric(as.character(dimensions$pz))*12)<=((as.numeric(as.character(dimensions$sz_top))*12-as.numeric(as.character(dimensions$sz_bot))*12)/2)+(as.numeric(as.character(dimensions$sz_bot))*12)+3)
    &&
      ((as.numeric(as.character(dimensions$pz))*12)>=((as.numeric(as.character(dimensions$sz_top))*12-as.numeric(as.character(dimensions$sz_bot))*12)/2)+(as.numeric(as.character(dimensions$sz_bot))*12)-3)
  ){return(1)
  } 


Recommended answer

If I understand correctly, the following will return a numeric vector of ones and zeros that you can slot into the appropriate column of calls.

dimensions <- read.table(text='sz_top  sz_bot  px  pz
1   3.526   1.615   -1.165  3.748
2   3.29    1.647   -0.412  1.9
3   3.29    1.647   -1.213  1.352
4   3.565   1.75    -1.041  2.419
5   3.565   1.75    -0.357  1.776
6   3.565   1.75    0.838   0.834
7   3.541   1.724   -1.619  3.661
8   3.541   1.724   -2.498  2.421
9   3.541   1.724   -1.673  2.348
10  3.541   1.724   -1.572  2.982
11  3.305   1.5     -1.316  2.842', header = TRUE, row.names = 1)


as.numeric(
  dimensions$px*12 >= -3
  & dimensions$px*12 <= 3
  & dimensions$pz*12 <= 
    (dimensions$sz_top*12 - dimensions$sz_bot*12)/2 + (dimensions$sz_bot*12) + 3
  & dimensions$pz*12 >= 
    (dimensions$sz_top*12 - dimensions$sz_bot*12)/2 + (dimensions$sz_bot*12) - 3)

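Single ampersands (&) are vectorized, so R evaluates the conditional expression element-wise for every row of the data.frame. The double ampersand (&&) used in the question is meant for single TRUE/FALSE values: it does not work row by row, and since R 4.3.0 it raises an error when given a vector longer than one.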
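Because each comparison returns one value per row, in row order, the result lines up with the rows of calls without any bookkeeping about "which row it is on". As a rough sketch of how this might extend to all four columns: the center test below is the one from the question, but the box half-width (half_w, 8.5 inches), the 1.5-inch edge rule as written here, the stand-in calls data frame and the fourth data row are all assumptions made for illustration, not the asker's real conditions.

```r
# Rows 1, 2 and 5 of the sample data, plus one made-up row (the last)
# that lands in the center so every category appears once.
dimensions <- data.frame(
  sz_top = c(3.526, 3.290, 3.565, 3.500),
  sz_bot = c(1.615, 1.647, 1.750, 1.500),
  px     = c(-1.165, -0.412, -0.357, 0.100),
  pz     = c(3.748, 1.900, 1.776, 2.500)
)
calls <- data.frame(id = seq_len(nrow(dimensions)))  # stand-in for the real 'calls'

# Convert feet to inches once instead of repeating *12 everywhere.
x   <- dimensions$px * 12
z   <- dimensions$pz * 12
top <- dimensions$sz_top * 12
bot <- dimensions$sz_bot * 12
mid <- (top + bot) / 2   # same value as (top - bot)/2 + bot
half_w <- 8.5            # ASSUMED half-width of the box, in inches

# One logical vector per condition, each with one element per row.
center    <- abs(x) <= 3 & abs(z - mid) <= 3          # test from the question
in_box    <- abs(x) <= half_w & z >= bot & z <= top   # ASSUMED box definition
near_edge <- (half_w - abs(x) <= 1.5) | (top - z <= 1.5) | (z - bot <= 1.5)

# Mimic the if / else if / else chain: earlier conditions take priority.
edge     <- !center & in_box & near_edge
other_in <- !center & in_box & !near_edge
out      <- !center & !in_box

# Row i of each vector belongs to row i of 'calls'.
calls$center   <- as.integer(center)
calls$edge     <- as.integer(edge)
calls$other_in <- as.integer(other_in)
calls$out      <- as.integer(out)
```

With the real data, the same assignments could target the existing columns of calls by name, and every row should end up with exactly one 1 across the four columns.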

I've removed as.numeric and as.character for clarity (not sure why these are necessary anyway... were these data read in as factors? If so, perhaps try stringsAsFactors = FALSE).
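To illustrate why the as.numeric(as.character(...)) wrapping would matter if the columns really were factors, here is a small sketch. The vector px_factor and the two-row text table are made up for the example. Note that stringsAsFactors has defaulted to FALSE since R 4.0.0, so on a recent R version the argument is mainly a safeguard.

```r
# A factor stores integer level codes; the printed labels are separate.
px_factor <- factor(c("-1.165", "-0.412", "0.838"))

codes  <- as.numeric(px_factor)                # level codes, NOT the coordinates
values <- as.numeric(as.character(px_factor))  # the coordinates themselves

# Reading the data without factors avoids the conversion entirely.
dimensions <- read.table(text = "px pz
-1.165 3.748
-0.412 1.9", header = TRUE, stringsAsFactors = FALSE)

is.numeric(dimensions$px)  # columns are numeric and ready for arithmetic
```

If a column that should be numeric still comes in as character or factor, that usually points to a stray non-numeric entry somewhere in the file, which is worth checking before converting.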
