How does R represent NA internally?

Question

R seems to support an efficient NA value in floating point arrays. How does it represent it internally?

My (perhaps flawed) understanding is that modern CPUs can carry out floating point calculations in hardware, including efficient handling of Inf, -Inf and NaN values. How does NA fit into this, and how is it implemented without compromising performance?

Solution

R represents NA_real_ as one particular NaN value as defined for IEEE 754 floats; Inf and -Inf are the ordinary IEEE infinities. We can use a simple C++ function to make this explicit:

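Rcpp::cppFunction('void print_hex(double x) {
    // Copy the raw bits of the double into a 64-bit integer, then print them in hex.
    uint64_t y;
    static_assert(sizeof x == sizeof y, "Size does not match!");
    std::memcpy(&y, &x, sizeof y);
    // Switch back to decimal afterwards so later output on Rcout is unaffected.
    Rcpp::Rcout << std::hex << y << std::dec << std::endl;
}', plugins = "cpp11", includes = c("#include <cstdint>", "#include <cstring>"))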
print_hex(NA_real_)
#> 7ff80000000007a2
print_hex(Inf)
#> 7ff0000000000000
print_hex(-Inf)
#> fff0000000000000

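In all three cases the exponent (bits 2 through 12, counting from the most significant bit) is all ones. An IEEE 754 value with an all-ones exponent is an infinity if the mantissa is zero and a NaN otherwise. For Inf and -Inf the mantissa is all zero, while for NA_real_ it is not: its low 32 bits hold 0x7a2 (decimal 1954), and this payload is what distinguishes NA from other NaN values. The R source code contains the corresponding definitions.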
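To make the bit layout concrete, here is a minimal standalone C++ sketch of how a value could be classified from the patterns printed above. The helper names (bits_of, from_bits, is_na_like, check_na_sketch) are made up for illustration and are not R's API; the check simply assumes that "NA" means "a NaN whose low 32 bits are 0x7a2", as the hex output suggests. Since such a value is an ordinary NaN as far as the hardware is concerned, only an explicit NA test like this one needs to look at the payload.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a double as its 64 raw bits.
std::uint64_t bits_of(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "Size does not match!");
    std::uint64_t y;
    std::memcpy(&y, &x, sizeof y);
    return y;
}

// Hypothetical helper: build a double from 64 raw bits.
double from_bits(std::uint64_t y) {
    double x;
    std::memcpy(&x, &y, sizeof x);
    return x;
}

// Assumed NA test: a NaN whose low 32 bits equal 0x7a2 (decimal 1954).
bool is_na_like(double x) {
    return std::isnan(x) && (bits_of(x) & 0xffffffffULL) == 0x7a2ULL;
}

// Small self-check using the bit patterns shown in the answer.
void check_na_sketch() {
    const double na  = from_bits(0x7ff80000000007a2ULL); // pattern printed for NA_real_
    const double inf = from_bits(0x7ff0000000000000ULL); // pattern printed for Inf
    const double nan = from_bits(0x7ff8000000000000ULL); // a quiet NaN with zero payload

    assert(std::isnan(na));    // all-ones exponent, nonzero mantissa
    assert(std::isinf(inf));   // all-ones exponent, zero mantissa
    assert(is_na_like(na));    // payload 0x7a2 marks it as NA in this sketch
    assert(!is_na_like(nan));  // a NaN without that payload is not NA
    assert(!is_na_like(inf));  // infinities are not NaN at all
}
```

Calling check_na_sketch() from a main function runs the assertions; none of them should fire on a platform with IEEE 754 doubles.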
