How many vectors can be added in DataFrame::create( vec1, vec2 ... )?
Question
I am creating a DataFrame to hold a parsed haproxy HTTP log file, which has quite a few fields (25+).
If I add more than 20 vectors (one for each field), I get the compilation error:
no matching function call to 'create'
The create call:
return DataFrame::create(
    _["clientIp"] = clientIp,
    _["clientPort"] = clientPort,
    _["acceptDate"] = acceptDate,
    _["frontendName"] = frontendName,
    _["backendName"] = backendName,
    _["serverName"] = serverName,
    _["tq"] = tq,
    _["tw"] = tw,
    _["tc"] = tc,
    _["tr"] = tr,
    _["tt"] = tt,
    _["status_code"] = statusCode,
    _["bytes_read"] = bytesRead,
#if CAPTURED_REQUEST_COOKIE_FIELD == 1
    _["capturedRequestCookie"] = capturedRequestCookie,
#endif
#if CAPTURED_REQUEST_COOKIE_FIELD == 1
    _["capturedResponseCookie"] = capturedResponseCookie,
#endif
    _["terminationState"] = terminationState,
    _["actconn"] = actconn,
    _["feconn"] = feconn,
    _["beconn"] = beconn,
    _["srv_conn"] = srvConn,
    _["retries"] = retries,
    _["serverQueue"] = serverQueue,
    _["backendQueue"] = backendQueue
);
Questions:
- Have I hit a hard limit?
- Is there a workaround that lets me add more than 20 vectors to the data frame?
Recommended answer
Yes, you have hit a hard limit. Rcpp is limited by the C++98 standard, which has no variadic templates, so supporting 'variadic' arguments requires explicit code bloat. Essentially, a separate overload of create must be generated for each number of arguments it accepts, and to avoid choking the compiler Rcpp provides overloads for up to 20 arguments only.
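To make the ceiling concrete, here is a minimal standalone sketch in plain C++ (not Rcpp's actual source) of the C++98 approach, where every argument count needs its own overload. The name `count_columns_cxx98` and the ceiling of three are hypothetical, chosen only to keep the example short.

```cpp
#include <cstddef>

// Hypothetical illustration, not Rcpp's real code: without variadic
// templates, each supported argument count is a separate overload.
template <typename T1>
std::size_t count_columns_cxx98(const T1&) { return 1; }

template <typename T1, typename T2>
std::size_t count_columns_cxx98(const T1&, const T2&) { return 2; }

template <typename T1, typename T2, typename T3>
std::size_t count_columns_cxx98(const T1&, const T2&, const T3&) { return 3; }

// A call with four arguments, e.g. count_columns_cxx98(1, 2, 3, 4),
// does not compile: no overload takes four parameters. This is the same
// kind of "no matching function" error reported for DataFrame::create.
```

Raising the ceiling in this style means writing or generating one more overload per extra argument, which is why a library has to pick a cutoff somewhere.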
A workaround would be to use a 'builder' class, where you successively add elements and then convert to a DataFrame at the end. Here is a simple example of such a class: we create a ListBuilder object, to which we successively add new columns. Try running Rcpp::sourceCpp() with this file to see the output.
#include <Rcpp.h>
using namespace Rcpp;

class ListBuilder {

public:

    ListBuilder() {};
    ~ListBuilder() {};

    inline ListBuilder& add(std::string const& name, SEXP x) {
        names.push_back(name);

        // NOTE: we need to protect the SEXPs we pass in; there is
        // probably a nicer way to handle this but ...
        elements.push_back(PROTECT(x));

        return *this;
    }

    inline operator List() const {
        List result(elements.size());
        for (size_t i = 0; i < elements.size(); ++i) {
            result[i] = elements[i];
        }
        result.attr("names") = wrap(names);
        UNPROTECT(elements.size());
        return result;
    }

    inline operator DataFrame() const {
        List result = static_cast<List>(*this);
        result.attr("class") = "data.frame";
        result.attr("row.names") = IntegerVector::create(NA_INTEGER, XLENGTH(elements[0]));
        return result;
    }

private:

    std::vector<std::string> names;
    std::vector<SEXP> elements;

    ListBuilder(ListBuilder const&) {}; // not safe to copy

};

// [[Rcpp::export]]
DataFrame test_builder(SEXP x, SEXP y, SEXP z) {
    return ListBuilder()
        .add("foo", x)
        .add("bar", y)
        .add("baz", z);
}

/*** R
test_builder(1:5, letters[1:5], rnorm(5))
*/
PS: With Rcpp11, we have variadic functions, so this limitation is removed.
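As a rough illustration of why variadic support lifts the ceiling, the following standalone sketch (plain C++11, not code from Rcpp11) uses a single variadic template that accepts any number of arguments. The name `count_columns_variadic` is hypothetical.

```cpp
#include <cstddef>

// Hypothetical illustration: one variadic template handles every
// argument count, so no per-count overloads are needed (requires C++11).
template <typename... Args>
std::size_t count_columns_variadic(const Args&...) {
    // sizeof...(Args) is the number of arguments in the pack.
    return sizeof...(Args);
}
```

With this form a call with 25 arguments compiles just like a call with 2, since the compiler instantiates the template for whatever count is used.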