Partitioning/batch/chunk a container into equal-sized pieces using std algorithms
Problem description
I came across a situation where I had to batch process a set of records and send them off to a database. I am wondering how I could accomplish this with std algorithms.
Given 10002 records, I want to partition them into bins of 100 records for processing, with the remainder being a bin of 2.
Hopefully the following code will better illustrate what I'm trying to accomplish. I'm completely open to solutions involving iterators, lambdas, or any sort of modern C++ fun.
#include <cassert>
#include <cstddef>
#include <vector>
#include <algorithm>

template< typename T >
std::vector< std::vector< T > > chunk( std::vector<T> const& container, std::size_t chunk_size )
{
    // Placeholder: filling in this body is what the question asks for.
    return std::vector< std::vector< T > >();
}

int main()
{
    int i = 0;
    const std::size_t test_size = 11;
    std::vector<int> container(test_size);
    std::generate_n( std::begin(container), test_size, [&i](){ return ++i; } );

    auto chunks = chunk( container, 3 );

    assert( chunks.size() == 4 && "should be four chunks" );
    assert( chunks[0].size() == 3 && "first several chunks should have the ideal chunk size" );
    assert( chunks.back().size() == 2 && "last chunk should have the remaining 2 elements" );

    return 0;
}
The problem seems to be a variation on std::for_each, where the "each" you want to operate on is an interval of your collection. Thus you would prefer to write a lambda (or function) that takes two iterators defining the start and end of each interval, and pass that lambda/function to your algorithm.
Here's what I came up with...
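#include <cstddef>
#include <functional>

template < typename Iterator >
void for_each_interval(
    Iterator begin
  , Iterator end
  , std::size_t interval_size
  , std::function<void( Iterator, Iterator )> operation )
{
    // An interval size of zero would never advance `to`, so bail out early.
    if ( interval_size == 0 )
        return;

    auto to = begin;

    while ( to != end )
    {
        auto from = to;

        // Advance `to` by up to interval_size steps without passing `end`.
        auto counter = interval_size;
        while ( counter > 0 && to != end )
        {
            ++to;
            --counter;
        }

        operation( from, to );
    }
}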
(I wish that std::advance would take care of the inner loop that uses counter to increment to, but unfortunately it blindly steps beyond the end [I'm tempted to write my own smart_advance template to encapsulate this]. If that would work, it would reduce the amount of code by about half!)
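As a rough illustration of that idea, here is one way the smart_advance helper could look, and how for_each_interval shrinks once it exists. Only the name smart_advance comes from the remark above; the signature, the return value and the generic Operation parameter are my own assumptions, not something the answer specifies. (In C++20, the three-argument form of std::ranges::advance takes a bound and does much the same job.)

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: moves `it` forward by at most `n` steps and never
// past `end`. Returns the number of steps that could not be taken.
template <typename Iterator>
std::size_t smart_advance(Iterator& it, Iterator end, std::size_t n)
{
    while (n > 0 && it != end) {
        ++it;
        --n;
    }
    return n;
}

// Variant of for_each_interval written on top of smart_advance.
// Taking the callable as a template parameter (an assumption of this sketch)
// lets the iterator type be deduced at the call site.
template <typename Iterator, typename Operation>
void for_each_interval(Iterator begin, Iterator end,
                       std::size_t interval_size, Operation operation)
{
    if (interval_size == 0)
        return; // a zero-sized interval would never make progress

    auto to = begin;
    while (to != end) {
        auto from = to;
        smart_advance(to, end, interval_size);
        operation(from, to);
    }
}
```

With this version the call site no longer needs the explicit template argument, because nothing has to be converted to std::function before the iterator type is deduced.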
Now for some code to test it...
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// for_each_interval as defined above

int main()
{
    // Some test data
    int foo[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    std::vector<int> my_data( foo, foo + 10 );
    std::size_t const interval = 3;

    typedef decltype( my_data.begin() ) iter_t;

    // The explicit <iter_t> is needed: a lambda is not a std::function,
    // so Iterator cannot be deduced from the last argument.
    for_each_interval<iter_t>( my_data.begin(), my_data.end(), interval,
        []( iter_t from, iter_t to )
        {
            std::cout << "Interval:";
            std::for_each( from, to,
                [&]( int val )
                {
                    std::cout << " " << val;
                } );
            std::cout << std::endl;
        } );
}
This produces the following output, which I think represents what you want:
Interval: 0 1 2
Interval: 3 4 5
Interval: 6 7 8
Interval: 9
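To tie this back to the question, the same interval idea can be used to fill in the chunk function so that the asserts in the question hold. This is only a minimal sketch: it assumes copying the elements into new vectors is acceptable, and it returns no chunks for a chunk size of zero, which is my own choice rather than anything the question states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Sketch of chunk(): copies consecutive runs of up to chunk_size elements
// into separate vectors. The last chunk holds whatever remains.
template <typename T>
std::vector<std::vector<T>> chunk(std::vector<T> const& container,
                                  std::size_t chunk_size)
{
    std::vector<std::vector<T>> chunks;
    if (chunk_size == 0)
        return chunks; // assumption: a zero chunk size yields no chunks

    // Round up so the partial last chunk is counted.
    chunks.reserve((container.size() + chunk_size - 1) / chunk_size);

    auto from = container.begin();
    while (from != container.end()) {
        // Never step further than the number of elements left.
        auto remaining =
            static_cast<std::size_t>(std::distance(from, container.end()));
        auto step = static_cast<std::ptrdiff_t>(std::min(chunk_size, remaining));
        auto to = std::next(from, step);

        chunks.emplace_back(from, to); // copy [from, to) into a new chunk
        from = to;
    }
    return chunks;
}
```

For the 10002 records in the question, chunk(records, 100) would give 100 full bins plus a final bin of 2. If copying is too expensive, passing iterator pairs to a callback as in for_each_interval avoids it.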