How many vectors can be added in DataFrame::create( vec1, vec2 ... )?

Question

I am creating a DataFrame to hold a parsed haproxy HTTP log file, which has quite a few fields (25+).

If I add more than 20 vectors (one for each field), I get the compilation error:

    no matching function call to 'create'

The create call:

    return DataFrame::create(
      _["clientIp"]     = clientIp,
      _["clientPort"]   = clientPort,
      _["acceptDate"]   = acceptDate,
      _["frontendName"] = frontendName,
      _["backendName"]  = backendName,
      _["serverName"]   = serverName,
      _["tq"]           = tq,
      _["tw"]           = tw,
      _["tc"]           = tc,
      _["tr"]           = tr,
      _["tt"]           = tt,
      _["status_code"]  = statusCode,
      _["bytes_read"]   = bytesRead,

#if CAPTURED_REQUEST_COOKIE_FIELD == 1
      _["capturedRequestCookie"]   = capturedRequestCookie,
#endif     

#if CAPTURED_REQUEST_COOKIE_FIELD == 1
      _["capturedResponseCookie"]   = capturedResponseCookie,
#endif    

      _["terminationState"] = terminationState,
      _["actconn"]          = actconn,
      _["feconn"]           = feconn,
      _["beconn"]           = beconn,
      _["srv_conn"]         = srvConn,
      _["retries"]          = retries,
      _["serverQueue"]      = serverQueue,
      _["backendQueue"]     = backendQueue 
    );

Questions:

  1. Have I hit a hard limit?
  2. Is there a workaround that lets me add more than 20 vectors to the data frame?

Answer

Yes, you have hit a hard limit. Rcpp is constrained by the C++98 standard, which has no variadic templates, so 'variadic' arguments have to be emulated with explicitly generated code. Essentially, a separate overload of create must be generated for each number of arguments, and to avoid choking the compiler Rcpp provides overloads for up to 20 arguments only.
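To make the overload limit concrete, here is a minimal plain C++ sketch of the two approaches. It is not Rcpp's actual implementation: make_fixed and make_variadic are hypothetical names invented for illustration, and the cap of 3 arguments stands in for Rcpp's 20.

```cpp
#include <cassert>
#include <string>
#include <vector>

// C++98 style: one overload per argument count, written by hand or
// generated by a script. The library has to stop at some arity; this
// hypothetical example stops at 3.
std::vector<std::string> make_fixed(const std::string& a) {
    std::vector<std::string> out;
    out.push_back(a);
    return out;
}

std::vector<std::string> make_fixed(const std::string& a,
                                    const std::string& b) {
    std::vector<std::string> out = make_fixed(a);
    out.push_back(b);
    return out;
}

std::vector<std::string> make_fixed(const std::string& a,
                                    const std::string& b,
                                    const std::string& c) {
    std::vector<std::string> out = make_fixed(a, b);
    out.push_back(c);
    return out;
}

// A call such as make_fixed("a", "b", "c", "d") would not compile:
// no overload takes four arguments.

// C++11 style: a single variadic template accepts any number of arguments.
template <typename... Args>
std::vector<std::string> make_variadic(const Args&... args) {
    // The pack expansion constructs one std::string per argument.
    return std::vector<std::string>{ std::string(args)... };
}
```

Calling make_variadic("a", "b", "c", "d", "e") compiles and yields five elements, whereas make_fixed is rejected by the compiler as soon as a fourth argument is passed, which is the same kind of failure as the 'no matching function' error in the question.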

A workaround is to use a 'builder' class: you add elements one at a time and convert to a DataFrame at the end. Here is a simple example of such a class. We create a ListBuilder object and successively add new columns to it. Try running Rcpp::sourceCpp() on this file to see the output.

#include <Rcpp.h>
using namespace Rcpp;

class ListBuilder {

public:

   ListBuilder() {};
   ~ListBuilder() {};

   inline ListBuilder& add(std::string const& name, SEXP x) {
      names.push_back(name);

      // NOTE: we need to protect the SEXPs we pass in; there is
      // probably a nicer way to handle this but ...
      elements.push_back(PROTECT(x));

      return *this;
   }

   inline operator List() const {
      List result(elements.size());
      for (size_t i = 0; i < elements.size(); ++i) {
         result[i] = elements[i];
      }
      result.attr("names") = wrap(names);
      UNPROTECT(elements.size());
      return result;
   }

   inline operator DataFrame() const {
      List result = static_cast<List>(*this);
      result.attr("class") = "data.frame";
      result.attr("row.names") = IntegerVector::create(NA_INTEGER, XLENGTH(elements[0]));
      return result;
   }

private:

   std::vector<std::string> names;
   std::vector<SEXP> elements;

   ListBuilder(ListBuilder const&) {}; // not safe to copy

};

// [[Rcpp::export]]
DataFrame test_builder(SEXP x, SEXP y, SEXP z) {
   return ListBuilder()
      .add("foo", x)
      .add("bar", y)
      .add("baz", z);
}

/*** R
test_builder(1:5, letters[1:5], rnorm(5))
*/

PS: Rcpp11 supports variadic functions, so this limit is removed there.
