How does R represent NA internally?
Question
R seems to support an efficient NA value in floating point arrays. How does it represent it internally?
My (perhaps flawed) understanding is that modern CPUs can carry out floating point calculations in hardware, including efficient handling of Inf, -Inf and NaN values. How does NA fit into this, and how is it implemented without compromising performance?
Answer
R represents NA_real_ as an IEEE 754 NaN with a specific payload, while Inf and -Inf are the ordinary IEEE infinities. We can use a simple C++ function to make this explicit:
Rcpp::cppFunction('void print_hex(double x) {
  // Copy the raw bytes of the double into a 64-bit integer and print them in hex.
  uint64_t y;
  static_assert(sizeof x == sizeof y, "Size does not match!");
  std::memcpy(&y, &x, sizeof y);
  Rcpp::Rcout << std::hex << y << std::endl;
}', plugins = "cpp11", includes = c("#include <cstdint>", "#include <cstring>"))
print_hex(NA_real_)
#> 7ff80000000007a2
print_hex(Inf)
#> 7ff0000000000000
print_hex(-Inf)
#> fff0000000000000
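In all three values the exponent (bits 2 through 12, counting from the most significant bit) is all ones. IEEE 754 reserves this exponent for infinities and NaNs: for Inf and -Inf the mantissa is all zero, whereas for NA_real_ it is nonzero, which makes NA_real_ a NaN. Its low bits, 0x7a2, are 1954 in decimal, the payload R uses to tell NA apart from other NaN values. As far as the hardware is concerned NA is just a NaN, so it propagates through floating-point arithmetic like any other NaN. The relevant definitions can be found in the R source code.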
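The bit layout described above can be checked outside of R as well. The following is a minimal standalone C++ sketch, assuming that double is an IEEE 754 binary64 value. The names DoubleFields, to_bits, from_bits, decompose and looks_like_na are hypothetical helpers made up for this illustration; they are not part of R or Rcpp, and the low-word test only mirrors the 0x7a2 pattern visible in the output above.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// The three fields of an IEEE 754 binary64 value (assumed layout of double).
struct DoubleFields {
    unsigned sign;           // 1 bit
    unsigned exponent;       // 11 bits
    std::uint64_t mantissa;  // 52 bits
};

// Copy the bytes of a double into a 64-bit integer.
std::uint64_t to_bits(double x) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "Size does not match!");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Inverse of to_bits: reinterpret a 64-bit pattern as a double.
double from_bits(std::uint64_t bits) {
    double x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Split a double into sign, exponent and mantissa.
DoubleFields decompose(double x) {
    const std::uint64_t bits = to_bits(x);
    DoubleFields f;
    f.sign = static_cast<unsigned>(bits >> 63);
    f.exponent = static_cast<unsigned>((bits >> 52) & 0x7FFu);
    f.mantissa = bits & ((std::uint64_t{1} << 52) - 1);
    return f;
}

// Hypothetical check: a NaN whose low 32 bits hold 1954 (0x7a2),
// the pattern printed for NA_real_ above.
bool looks_like_na(double x) {
    return std::isnan(x) && (to_bits(x) & 0xFFFFFFFFu) == 1954u;
}
```

For example, decompose(from_bits(0x7ff80000000007a2ULL)) yields an all-ones exponent with a nonzero mantissa, while decompose applied to infinity yields the same exponent with a zero mantissa. A plain quiet NaN from std::numeric_limits would normally not pass looks_like_na, since its low word is usually zero.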