How to deal with a platform-dependent issue.

Problem description

I wrote some code which makes heavy use of std::map. In fact, for a
typical instance, I may have on the order of 100K maps, each with only a
small number of elements.

(In case you're curious, this is for a computational geometry
application-- each map represents an edge of a polygon and its elements
are intermediate markers located along the edge, mapping relative
position on the edge to some auxiliary data.)

This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance. Memory
usage jumped by a factor of 10 or more which is unacceptable for my
application.

After spending the better part of a day trying to figure out the
problem, it seems that the RogueWave map performs bulk allocation so
that as soon as one item is inserted into the map, it allocates space
for 32. (I found this by writing an allocator to forward memory ops to
malloc/free and log each such call.) This is much more than the space
needed for a typical case (where I might have say 3 or 4 entries only),
hence the tremendous increase in memory usage compared to the gcc
version (which allocates space one entry at a time).
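
For readers who want to reproduce this kind of measurement, here is a minimal sketch of an allocator that forwards to malloc/free and records each request, passed as the allocator template argument of std::map. It is not the code from the original post: the names LoggingAllocator, AllocStats, alloc_stats and measure_small_map are made up for illustration, it counts calls instead of printing a log line, and it assumes a C++11 compiler (the allocator interface needed by an older toolchain would be more verbose).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical counters; the original post logged each call instead.
struct AllocStats {
    std::size_t calls = 0;  // number of allocate() calls
    std::size_t bytes = 0;  // total bytes requested
};

inline AllocStats& alloc_stats() {
    static AllocStats stats;
    return stats;
}

// Minimal C++11 allocator that forwards to malloc/free and records requests.
template <class T>
struct LoggingAllocator {
    using value_type = T;

    LoggingAllocator() = default;
    template <class U>
    LoggingAllocator(const LoggingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        alloc_stats().calls += 1;
        alloc_stats().bytes += n * sizeof(T);
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

template <class T, class U>
bool operator==(const LoggingAllocator<T>&, const LoggingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const LoggingAllocator<T>&, const LoggingAllocator<U>&) { return false; }

// Map from relative position on an edge to some auxiliary data (an int here).
using EdgeMap = std::map<double, int, std::less<double>,
                         LoggingAllocator<std::pair<const double, int>>>;

// Inserts three markers and returns what the allocator observed.
inline AllocStats measure_small_map() {
    alloc_stats() = AllocStats{};
    EdgeMap edge;
    edge[0.25] = 1;
    edge[0.50] = 2;
    edge[0.75] = 3;
    return alloc_stats();
}
```

Comparing the recorded byte count against three times the size of one entry gives a rough idea of how much the implementation over-allocates for a small map; the exact numbers depend on the standard library in use.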

OK, so I realize this is fundamentally a platform issue and not entirely
on-topic, but I'd welcome any advice. For example, is there a language
feature I don't know about that could circumvent this allocation
strategy? Is there another approach that achieves the functionality of
a map without making explicit use thereof? Any other recommendations?

Thanks for your help,
Mark

Recommended answer

Mark P wrote:
> I wrote some code which makes heavy use of std::map. In fact, for a
> typical instance, I may have on the order of 100K maps, each with only a
> small number of elements.
>
> (In case you're curious, this is for a computational geometry
> application-- each map represents an edge of a polygon and its elements
> are intermediate markers located along the edge, mapping relative
> position on the edge to some auxiliary data.)

[...]

I don't know if this will help or not, but you might want to take a look at:

http://www.boost.org/libs/property_m...perty_map.html


* Mark P:
> I wrote some code which makes heavy use of std::map. In fact, for a
> typical instance, I may have on the order of 100K maps, each with only a
> small number of elements.
>
> (In case you're curious, this is for a computational geometry
> application-- each map represents an edge of a polygon and its elements
> are intermediate markers located along the edge, mapping relative
> position on the edge to some auxiliary data.)

I don't see how that would work, but, accepting that it does work:

> This worked just fine when compiled with gcc on Linux, but after
> recently running on Sun with the CC compiler (using the RogueWave STL
> implementation, I believe) I found horrible memory performance. Memory
> usage jumped by a factor of 10 or more which is unacceptable for my
> application.
>
> After spending the better part of a day trying to figure out the
> problem, it seems that the RogueWave map performs bulk allocation so
> that as soon as one item is inserted into the map, it allocates space
> for 32. (I found this by writing an allocator to forward memory ops to
> malloc/free and log each such call.) This is much more than the space
> needed for a typical case (where I might have say 3 or 4 entries only),
> hence the tremendous increase in memory usage compared to the gcc
> version (which allocates space one entry at a time).

Just out of curiosity, by 'allocator' do you mean replacing global 'new'
or did you write an allocator to pass as template argument to 'map'?

If the latter then the bulk allocation is part of the 'map'
implementation (e.g. a B-tree like thing).

And that would preclude a solution based on supplying your own allocator
template argument.

> OK, so I realize this is fundamentally a platform issue and not entirely
> on-topic, but I'd welcome any advice. For example, is there a language
> feature I don't know about that could circumvent this allocation
> strategy? Is there another approach that achieves the functionality of
> a map without making explicit use thereof? Any other recommendations?

It seems you need a map-like class optimized for 3 to 4 elements, but
also able to handle larger sizes.

I'd try out an implementation based on std::vector first: linear search
is almost always the fastestest ( :-) ) for a very small number of
elements; wrap it in a class providing the 'map' operations you use, and
set the capacity to e.g. 2 or 4 on construction; measure what's best.
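
As a rough illustration of the vector-based idea, the sketch below wraps a std::vector of key/value pairs in a small class with a linear-scan lookup. The class name SmallMap and its members are hypothetical, chosen only for this example. It keeps the entries sorted by key, which is one possible choice (it keeps in-order traversal and range queries available), and it relies on reserve, which only guarantees at least the requested capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical map-like wrapper over a sorted std::vector.
template <class Key, class Value>
class SmallMap {
public:
    using Entry = std::pair<Key, Value>;
    using iterator = typename std::vector<Entry>::iterator;

    // Reserve room for a handful of entries up front.
    explicit SmallMap(std::size_t initial_capacity = 4) {
        entries_.reserve(initial_capacity);
    }

    // Linear scan for the first entry whose key is not less than `key`.
    iterator lower_bound(const Key& key) {
        iterator it = entries_.begin();
        while (it != entries_.end() && it->first < key) ++it;
        return it;
    }

    // Returns end() if the key is absent.
    iterator find(const Key& key) {
        iterator it = lower_bound(key);
        return (it != entries_.end() && !(key < it->first)) ? it : entries_.end();
    }

    // Inserts in sorted position; an existing key is left unchanged,
    // mirroring the behaviour of std::map::insert.
    std::pair<iterator, bool> insert(const Key& key, const Value& value) {
        iterator it = lower_bound(key);
        if (it != entries_.end() && !(key < it->first)) return {it, false};
        return {entries_.insert(it, Entry(key, value)), true};
    }

    iterator begin() { return entries_.begin(); }
    iterator end() { return entries_.end(); }
    std::size_t size() const { return entries_.size(); }

private:
    std::vector<Entry> entries_;
};
```

Insertion in the middle shifts the later entries, which is cheap for three or four elements but grows linearly with the size, so measuring with realistic data is still the deciding step.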

If the number of maps with larger number of elements is sufficiently
large that the simple approach isn''t efficient enough then perhaps
consider more complicated things, like, you already have an example of
an implementation that is efficient enough, and another approach might
be to dynamically switch internal implementation of your wrappers
(what's the pattern name for that, something like handle/envelope?),
although that would introduce some overhead by itself.
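
One way the idea of switching the internal implementation could look is sketched below: a wrapper that stores entries in a plain vector while it is small and moves them into a std::map once a threshold is exceeded. Everything here (the name AdaptiveMap, the set/get interface, the default threshold of 8) is an assumption made for the example, not something taken from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical wrapper: unsorted vector while small, std::map once it grows.
template <class Key, class Value, std::size_t Threshold = 8>
class AdaptiveMap {
public:
    // Inserts or overwrites the value stored for `key`.
    void set(const Key& key, const Value& value) {
        if (big_) { (*big_)[key] = value; return; }
        for (auto& e : small_) {
            if (e.first == key) { e.second = value; return; }
        }
        small_.emplace_back(key, value);
        if (small_.size() > Threshold) promote();
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const Value* get(const Key& key) const {
        if (big_) {
            auto it = big_->find(key);
            return it == big_->end() ? nullptr : &it->second;
        }
        for (const auto& e : small_) {
            if (e.first == key) return &e.second;
        }
        return nullptr;
    }

    std::size_t size() const { return big_ ? big_->size() : small_.size(); }
    bool uses_tree() const { return static_cast<bool>(big_); }

private:
    // Moves all entries into a std::map and releases the vector's storage.
    void promote() {
        big_.reset(new std::map<Key, Value>(small_.begin(), small_.end()));
        std::vector<std::pair<Key, Value>>().swap(small_);
    }

    std::vector<std::pair<Key, Value>> small_;
    std::unique_ptr<std::map<Key, Value>> big_;
};
```

The extra pointer and the branch on every operation are the overhead mentioned above; whether that is worth paying depends on how many maps ever cross the threshold.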

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Alf P. Steinbach wrote:
> * Mark P:
>> I wrote some code which makes heavy use of std::map. In fact, for a
>> typical instance, I may have on the order of 100K maps, each with only a
>> small number of elements.
>>
>> (In case you're curious, this is for a computational geometry
>> application-- each map represents an edge of a polygon and its elements
>> are intermediate markers located along the edge, mapping relative
>> position on the edge to some auxiliary data.)
>
> I don't see how that would work, but, accepting that it does work:

Well, I haven't told you what the application is :) It actually works
quite nicely for its intended purpose (memory issues below notwithstanding).

>> This worked just fine when compiled with gcc on Linux, but after
>> recently running on Sun with the CC compiler (using the RogueWave STL
>> implementation, I believe) I found horrible memory performance. Memory
>> usage jumped by a factor of 10 or more which is unacceptable for my
>> application.
>>
>> After spending the better part of a day trying to figure out the
>> problem, it seems that the RogueWave map performs bulk allocation so
>> that as soon as one item is inserted into the map, it allocates space
>> for 32. (I found this by writing an allocator to forward memory ops to
>> malloc/free and log each such call.) This is much more than the space
>> needed for a typical case (where I might have say 3 or 4 entries only),
>> hence the tremendous increase in memory usage compared to the gcc
>> version (which allocates space one entry at a time).
>
> Just out of curiosity, by 'allocator' do you mean replacing global 'new'
> or did you write an allocator to pass as template argument to 'map'?

The latter.

> If the latter then the bulk allocation is part of the 'map'
> implementation (e.g. a B-tree like thing).

Understood. I believe it's a red-black tree. STL implementations are a
bit inscrutable in my experience, but somewhere buried in the map
implementation is a parameter __buffer_size which at first glance seems
to be responsible for the bulk allocation.

> And that would preclude a solution based on supplying your own allocator
> template argument.

Also understood.

>> OK, so I realize this is fundamentally a platform issue and not entirely
>> on-topic, but I'd welcome any advice. For example, is there a language
>> feature I don't know about that could circumvent this allocation
>> strategy? Is there another approach that achieves the functionality of
>> a map without making explicit use thereof? Any other recommendations?
>
> It seems you need a map-like class optimized for 3 to 4 elements, but
> also able to handle larger sizes.

Yep, that'd be great. Got one?

> I'd try out an implementation based on std::vector first: linear search
> is almost always the fastestest ( :-) ) for a very small number of
> elements; wrap it in a class providing the 'map' operations you use, and
> set the capacity to e.g. 2 or 4 on construction; measure what's best.

That's certainly an idea to consider. The problem is that this
structure is dynamically updated, requiring insertions in the middle
(w.r.t. the sort), range operations, and so on. Still it's worth
considering so thanks for the suggestion.

> If the number of maps with larger number of elements is sufficiently
> large that the simple approach isn't efficient enough then perhaps
> consider more complicated things, like, you already have an example of
> an implementation that is efficient enough, and another approach might
> be to dynamically switch internal implementation of your wrappers
> (what's the pattern name for that, something like handle/envelope?),
> although that would introduce some overhead by itself.

I've come up with a couple of other ideas too:

1. Right now the map holds objects which are about 12 words each. I
could store pointers instead so that the extra unused nodes are smaller.
Unfortunately there's still the size of the node separate from its
data which adds a few words at least, and managing raw pointers can be a
pain.

2. Rather than have one map per edge I could have, say, one map per
polygon and use a more complicated sort to keep all the intermediate
points straight. This also seems like a pain, it breaks the edge
abstraction, and there's an efficiency penalty from searching in a
larger structure.
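
To make idea 2 a little more concrete, here is a minimal sketch of one map per polygon, keyed by a pair of edge index and relative position so that std::pair's lexicographic ordering keeps each edge's markers contiguous. The names MarkerKey, PolygonMarkers and markers_on_edge are hypothetical, and an int stands in for the auxiliary data.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <limits>
#include <map>
#include <utility>

// Hypothetical key: (edge index within the polygon, relative position on that edge).
using MarkerKey = std::pair<int, double>;
using PolygonMarkers = std::map<MarkerKey, int>;  // int stands in for the auxiliary data

// Counts the markers on one edge by walking the half-open key range
// [(edge, -inf), (edge + 1, -inf)).
inline std::size_t markers_on_edge(const PolygonMarkers& markers, int edge) {
    const double lowest = -std::numeric_limits<double>::infinity();
    auto first = markers.lower_bound(MarkerKey(edge, lowest));
    auto last = markers.lower_bound(MarkerKey(edge + 1, lowest));
    return static_cast<std::size_t>(std::distance(first, last));
}
```

The same pair of lower_bound calls yields the iterator range for a single edge, so per-edge traversal remains possible, at the cost of a logarithmic search in the larger structure.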

I think I'd be satisfied, at least provisionally, with a structure that
supports linear time searches and constant time inserts. This suggests
a list except that the list implementation also allocates in blocks of
32 after the first element. On the plus side there's less overhead
associated with the nodes of a list compared to those of a map.
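
For comparison, a list-based structure with linear-time search and constant-time insertion might look like the sketch below. The names Marker, MarkerList, find_position and insert_marker are invented for the example, and an int again stands in for the auxiliary data; how the list allocates its nodes still depends on the library implementation.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <utility>

using Marker = std::pair<double, int>;  // (relative position, auxiliary data)
using MarkerList = std::list<Marker>;

// Linear search for the first marker at or beyond `position`.
inline MarkerList::iterator find_position(MarkerList& markers, double position) {
    MarkerList::iterator it = markers.begin();
    while (it != markers.end() && it->first < position) ++it;
    return it;
}

// Once the position is known, the insertion itself is constant time.
inline MarkerList::iterator insert_marker(MarkerList& markers, double position, int data) {
    return markers.insert(find_position(markers, position), Marker(position, data));
}
```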

All of which makes me wonder, as I pause to wax philosophical, why the
S(tandard)TL isn't more standardized.

