Is there an error in the round function in R?

Question

There seems to be an error in the round function. I would expect the call below to return 6, but it returns 5.

round(5.5) 
# 5

Other than 5.5, values such as 6.5 and 4.5 return 7 and 5, as we expect.

Is there an explanation?

Recommended answer

This behaviour is explained in the help file for the round function (see ?round):

Note that for rounding off a 5, the IEC 60559 standard is expected to be used, ‘go to the even digit’. Therefore round(0.5) is 0 and round(-1.5) is -2. However, this is dependent on OS services and on representation error (since e.g. 0.15 is not represented exactly, the rounding rule applies to the represented number and not to the printed number, and so round(0.15, 1) could be either 0.1 or 0.2).

round( .5 + 0:10 )
#### [1]  0  2  2  4  4  6  6  8  8 10 10
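As a quick check of the rule quoted above, the sketch below rounds the three values mentioned in the question and then prints the value actually stored for 0.15. On a typical IEEE 754 build I would expect round(5.5) to give 6, with 4.5 and 6.5 being the ties that go down to the even neighbour. The variable names are only illustrative.

```r
# 4.5, 5.5 and 6.5 are exactly representable, so the tie rule applies directly.
ties <- c(4.5, 5.5, 6.5)
rounded <- round(ties)
print(rounded)  # ties go to the even neighbour: 4 6 6

# 0.15 has no exact binary representation; show the value actually stored.
print(sprintf("%.20f", 0.15))

# The result of rounding it to one decimal may differ across R versions
# and platforms, so no expected output is given here.
print(round(0.15, 1))
```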

Another relevant email exchange, by Greg Snow: R: round(1.5) = round(2.5) = 2?:

The logic behind the round to even rule is that we are trying to represent an underlying continuous value and if x comes from a truly continuous distribution, then the probability that x==2.5 is 0 and the 2.5 was probably already rounded once from any values between 2.45 and 2.54999999999999..., if we use the round up on 0.5 rule that we learned in grade school, then the double rounding means that values between 2.45 and 2.50 will all round to 3 (having been rounded first to 2.5). This will tend to bias estimates upwards. To remove the bias we need to either go back to before the rounding to 2.5 (which is often impossible to impractical), or just round up half the time and round down half the time (or better would be to round proportional to how likely we are to see values below or above 2.5 rounded to 2.5, but that will be close to 50/50 for most underlying distributions). The stochastic approach would be to have the round function randomly choose which way to round, but deterministic types are not comfortable with that, so "round to even" was chosen (round to odd should work about the same) as a consistent rule that rounds up and down about 50/50.
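(An illustration of the bias argument in the email paragraph above, not part of the email itself.) The sketch compares the average rounding error of grade-school "round half up" with R's round on a set of exact ties. round_half_up is a hypothetical helper written for this example, not a base R function, and floor(x + 0.5) is only a rough stand-in that can misbehave for values a hair below a tie.

```r
# Hypothetical helper: grade-school "round half up" (not part of base R).
round_half_up <- function(x) floor(x + 0.5)

# One hundred exact ties: 0.5, 1.5, ..., 99.5.
x <- 0.5 + 0:99

bias_half_up   <- mean(round_half_up(x) - x)  # every tie moves up by 0.5
bias_half_even <- mean(round(x) - x)          # ups and downs cancel

print(c(half_up = bias_half_up, half_even = bias_half_even))
```

With only ties in the input, half up shifts every value by +0.5, while round to even alternates between -0.5 and +0.5, so its average error is zero.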

If you are dealing with data where 2.5 is likely to represent an exact value (money for example), then you may do better by multiplying all values by 10 or 100 and working in integers, then converting back only for the final printing. Note that 2.50000001 rounds to 3, so if you keep more digits of accuracy until the final printing, then rounding will go in the expected direction, or you can add 0.000000001 (or other small number) to your values just before rounding, but that can bias your estimates upwards.
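A minimal sketch of the integer approach described above, assuming amounts are stored as whole cents. The amounts and variable names are made up for illustration.

```r
# Amounts held as integer cents (illustrative values, not from the source).
cents <- c(250L, 125L, 1995L)
total <- sum(cents)  # integer arithmetic, no representation error

# Convert back to a decimal string only for display.
formatted <- sprintf("%d.%02d", total %/% 100L, total %% 100L)
print(formatted)

# A value just above the tie is no longer a tie and rounds up.
above_tie <- round(2.50000001)
print(above_tie)
```

Keeping everything as integers until the final print means no tie is ever decided on an inexact binary fraction.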
