STL Map with a Vector for the Key

Question

I'm working with binary data stored in arbitrarily long arrays of unsigned ints. Some of the data is duplicated; in the short term I want to ignore the duplicates, and in the long term fix whatever bug is causing them.

I'm looking at inserting each dataset into a map before storing it, but only if it is not already in the map. My initial thought was a map of strings: use memcpy as a hammer to force the ints into a character array, copy that into a string, and store the string. This failed because a good deal of my data contains multiple zero bytes (NUL characters) at the front of the relevant data, so a majority of very real data got thrown out.
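One possible reason for the truncation, assuming the string was built from a plain char pointer: that std::string constructor stops at the first zero byte, while the (pointer, length) constructor copies every byte. A minimal sketch of the latter; make_key is a hypothetical helper name, not something from the original post:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: pack `count` unsigned ints into a std::string key.
// The (pointer, length) constructor copies exactly count * sizeof(unsigned)
// bytes, so leading or embedded zero bytes are kept.
std::string make_key(const unsigned* words, std::size_t count) {
    return std::string(reinterpret_cast<const char*>(words),
                       count * sizeof(unsigned));
}
```

With this, two datasets that both begin with an all-zero word still produce distinct, full-length keys.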

My next attempt is planned to be std::map<std::vector<unsigned char>,int>, but I'm realizing that I don't know whether the map's insert function will work with a vector as the key.

Is this doable, even if ill-advised, or is there a better way to approach this problem?

Edit

It's been remarked that I didn't make clear what I'm doing, so here's a hopefully better description.

I'm working on generating a minimum spanning tree, given that I have a number of trees containing the actual end nodes I'm working with. The goal is to come up with the selection of trees that has the shortest length and covers all of the end nodes, where the chosen trees share at most one node with each other and are all connected. I'm basing my approach on a binary decision tree, with a few changes that should allow greater parallelism.

Rather than taking the binary tree approach, I've opted to build a bit vector out of unsigned integers for each dataset, where a 1 in a bit position indicates that the corresponding tree is included.

For example, if just tree 0 were included in a 5-tree dataset, I would start with

00001

From here I can generate:

00011

00101

01001

10001

Each of these can then be processed in parallel, since none of them depends on the others. I do this for all of the single trees (00010, 00100, etc.) and should be able to generate every value in the range (0, 2^n) once and only once, although I haven't taken the time to prove that formally.
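The post does not spell out the exact expansion rule, so the following is only a sketch of one rule that is consistent with the example above: add a single tree whose index is higher than every tree already in the set. Under that rule each non-empty subset has exactly one parent, so it is produced once. The function name successors and the single 64-bit word are assumptions made for brevity; the original uses arrays of unsigned ints.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: expand `mask` (bit i set = tree i included) by adding
// one tree whose index is above the highest tree already present.
// Assumes mask != 0 and tree_count <= 63 so a single 64-bit word suffices.
std::vector<std::uint64_t> successors(std::uint64_t mask, unsigned tree_count) {
    // Find the index of the highest set bit below tree_count.
    unsigned highest = 0;
    for (unsigned i = 0; i < tree_count; ++i) {
        if (mask & (std::uint64_t{1} << i)) highest = i;
    }
    // Set each higher bit in turn; every result is a distinct child.
    std::vector<std::uint64_t> out;
    for (unsigned i = highest + 1; i < tree_count; ++i) {
        out.push_back(mask | (std::uint64_t{1} << i));
    }
    return out;
}
```

For the 5-tree example, expanding 00001 gives 00011, 00101, 01001 and 10001, and expanding 10000 gives nothing. If the real rule instead sets any clear bit, the same set becomes reachable from several parents (00011 from both 00001 and 00010), which is one way duplicates can arise.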

I started to notice that many datasets were taking far longer to complete than I thought they should, so I enabled debugging output to look at all of the generated results. A quick Perl script later, it was confirmed that multiple processes were generating the same output. Since then I've been trying to work out where the duplicates are coming from, with very little success, and I'm hoping this approach will work well enough to let me verify the generated results without the sometimes 3-day wait on computations.

Recommended answer

You will not have problems with that: std::vector provides the "==", "<" and ">" operators, and std::map by default orders its keys with "<", which for vectors compares element by element (lexicographically):

http://en.cppreference.com/w/cpp/container/vector/operator_cmp
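As a rough illustration of the duplicate check described in the question, here is a minimal sketch that uses a vector as the map key. The names Dataset and record_if_new are hypothetical, and the int value is used here as a hit counter, which is an assumption about what the asker wants to store.

```cpp
#include <map>
#include <utility>
#include <vector>

using Dataset = std::vector<unsigned>;

// Hypothetical helper: record `data` in `seen` and report whether it is new.
// The mapped int counts how many times the dataset has been generated.
bool record_if_new(std::map<Dataset, int>& seen, const Dataset& data) {
    // insert() leaves the map unchanged if the key already exists;
    // the bool in the returned pair says whether an insertion happened.
    auto result = seen.insert(std::make_pair(data, 0));
    ++result.first->second;
    return result.second;
}
```

Zero-valued elements at the front of the vector are compared like any other element, so datasets such as {0, 0, 5} and {0, 5} remain distinct keys. If only membership matters, a std::set<Dataset> would work the same way.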
